Archive for Web

What is an Ontology?

Posted in Web on November 17, 2006 by wsjoung

Ontology is a key term of the Semantic Web. I just wanted to get a more exact idea of what this word means.

Short answer:
An ontology is a specification of a conceptualization.

The word “ontology” seems to generate a lot of controversy in discussions about AI. It has a long history in philosophy, in which it refers to the subject of existence. It is also often confused with epistemology, which is about knowledge and knowing.

In the context of knowledge sharing, I use the term ontology to mean a specification of a conceptualization. That is, an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents. This definition is consistent with the usage of ontology as set-of-concept-definitions, but more general. And it is certainly a different sense of the word than its use in philosophy.

What is important is what an ontology is for. My colleagues and I have been designing ontologies for the purpose of enabling knowledge sharing and reuse. In that context, an ontology is a specification used for making ontological commitments. The formal definition of ontological commitment is given below. For pragmatic reasons, we choose to write an ontology as a set of definitions of formal vocabulary. Although this isn’t the only way to specify a conceptualization, it has some nice properties for knowledge sharing among AI software (e.g., semantics independent of reader and context). Practically, an ontological commitment is an agreement to use a vocabulary (i.e., ask queries and make assertions) in a way that is consistent (but not complete) with respect to the theory specified by an ontology. We build agents that commit to ontologies. We design ontologies so we can share knowledge with and among these agents.

Tom Gruber

Make it hidden or searchable

Posted in Internet on November 17, 2006 by wsjoung

Some people have personal or other information that they want to keep private, yet they sometimes upload it to public web space anyway. Fortunately, there is a way to hide such information from search engines and their robots.
Currently most search robots don’t support Meta tags, but there is another mechanism: a file called “robots.txt” gives directions to the robots that try to crawl your web site, telling them which directories or files are allowed or disallowed.
Here are some examples.

To exclude all robots from the entire server
User-agent: *
Disallow: /

To allow all robots complete access
User-agent: *
Disallow:
Or create an empty “/robots.txt” file.

To exclude all robots from part of the server
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private/
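You can check how a robot would interpret rules like these with Python’s standard `urllib.robotparser` module — a quick sketch using the partial-exclusion example above (the robot name and example.com URLs are just illustrations):

```python
from urllib.robotparser import RobotFileParser

# The partial-exclusion example from above, fed in as lines
rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Paths outside the disallowed directories are fetchable
print(rp.can_fetch("SomeBot", "http://example.com/index.html"))          # True
# Paths under a disallowed directory are not
print(rp.can_fetch("SomeBot", "http://example.com/private/secret.html")) # False
```

The parser matches by path prefix, so `Disallow: /private/` blocks everything underneath that directory.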

To exclude a single robot
User-agent: BadBot
Disallow: /

To allow a single robot
User-agent: WebCrawler
Disallow:

User-agent: *
Disallow: /
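A robot obeys the record whose User-agent line matches it, which is why the specific WebCrawler record sits above the catch-all. The same `urllib.robotparser` sketch can confirm this behavior (robot names as above, example.com hypothetical):

```python
from urllib.robotparser import RobotFileParser

# The allow-one-robot example: WebCrawler may crawl, everyone else is barred
rules = """\
User-agent: WebCrawler
Disallow:

User-agent: *
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# WebCrawler matches the first record, whose empty Disallow allows everything
print(rp.can_fetch("WebCrawler", "http://example.com/page.html"))  # True
# Any other robot falls through to the catch-all record and is blocked
print(rp.can_fetch("OtherBot", "http://example.com/page.html"))    # False
```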

To exclude all files except one
This is currently a bit awkward, as there is no “Allow” field. The easy way is to put all files to be disallowed into a separate directory, say “docs”, and leave the one file in the level above this directory:

User-agent: *
Disallow: /~joe/docs/

Alternatively you can explicitly disallow all disallowed pages:
User-agent: *
Disallow: /~joe/private.html
Disallow: /~joe/foo.html
Disallow: /~joe/bar.html

Googlebot
robotstxt.org