Dear Jelena,

attached to this mail you can find a part of our latest system documentation, with explanations on extending KIM.
hth, borislav
Atanas Kiryakov wrote:
Dear Jelena,

thanks for your interest! Actually, it is simpler than it looks: you should define your topics as instances of protont:Topic and put them in a hierarchy via protont:subTopicOf, i.e.

<protont:Topic rdf:ID="eCommerce">
</protont:Topic>
<protont:Topic rdf:ID="b2bActivities">
  <protont:subTopicOf rdf:resource="#eCommerce"/>
</protont:Topic>

The rationale behind this solution is to keep the ontology and the KB within OWL DL. Since OWLIM (the semantic repository used in KIM) is rule-based, we do not strictly need this, but we preferred this modelling approach because it also avoids overloading the subClassOf relationship (IS-A overloading is probably the most typical ontology design problem). You can read more on this subject in section 6.3.4 (pp. 49-51) of PROTON's documentation: http://proton.semanticweb.org/D1_8_1.pdf.

Regards,
Naso

----------------------------------------------------------
Atanas Kiryakov
Head of Ontotext Lab, http://www.ontotext.com
Sirma Group Corp, http://www.sirma.bg
Phone: (+359 2) 9768 303; Fax: 9768 311
----------------------------------------------------------

----- Original Message -----
From: Jelena Jovanovic
To: KIM Mailing list
Sent: Friday, August 04, 2006 10:09 PM
Subject: [KIM-discussion] How to use PROTON's Topic class and its subclasses in KIM

Hello everyone,

I would like to use KIM to semantically annotate the content of a course with the topics of the domain ontology for that course (as you might have guessed, I'm doing my research in the learning domain). I read the instructions for extending the KIM platform and it seems clear to me what is to be done :-), however I am not sure how to extend the PROTON ontology. I thought of defining the classes of my domain ontology as subclasses of the protont:Topic class. However, I have doubts here.
Should I model all my classes as direct subclasses of the Topic class and relate them in a hierarchy using the protont:subTopicOf property, or should I model them in a hierarchy in the way it is typically done in ontologies (i.e. using rdfs:subClassOf)? According to the PROTON documentation the first option seems to be preferable, but I'm concerned that this design decision might not be compatible with KIM's extraction modules, and that I might later have problems when using KIM IE functionality to extract entities from text. To clarify my question, I will give you an example of each of the above options:

1) Defining domain classes as direct subclasses of the Topic class and relating them in a hierarchy using the protont:subTopicOf property

<owl:Class rdf:ID="eCommerce">
  <rdfs:subClassOf rdf:resource="http://proton.semanticweb.org/2005/04/protont#Topic"/>
</owl:Class>
<owl:Class rdf:ID="b2bActivities">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="http://proton.semanticweb.org/2005/04/protont#subTopicOf"/>
      <owl:allValuesFrom rdf:resource="#eCommerce"/>
    </owl:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf rdf:resource="http://proton.semanticweb.org/2005/04/protont#Topic"/>
</owl:Class>

2) Using the rdfs:subClassOf property to create a hierarchy of domain classes and referencing protont:Topic only via the top-level domain class

<owl:Class rdf:ID="eCommerce">
  <rdfs:subClassOf rdf:resource="http://proton.semanticweb.org/2005/04/protont#Topic"/>
</owl:Class>
<owl:Class rdf:ID="b2bActivities">
  <rdfs:subClassOf rdf:resource="#eCommerce"/>
</owl:Class>

Initially my idea was to represent domain concepts as instances of the protont:Topic class. However, I realized that in that case KIM would not be able to recognize those concepts in the content of the course: KIM would recognize each of them as a Topic, not as something more specific. Am I right, or have I misinterpreted that part of the KIM documentation?

Thanks in advance!
And sorry for the confusion with my previous email :-)

Cheers,
Jelena

_______________________________________________
NOTE: Please REPLY TO ALL to ensure that your reply reaches all members of this mailing list.
KIM-discussion mailing list
[email protected]
http://ontotext.com/mailman/listinfo/kim-discussion_ontotext.com

Extending the KIM Platform
Extending the KIM Platform to cover a new domain consists of the following:
- Extend the ontology with a domain-specific model, defining classes and relationships (e.g. Robot, createdBy, etc.)
- Extend the instance base with pre-populated entities important in the new domain (e.g. R2D2 (typeOf Robot), R2D2 createdBy Mr.Simpson)
- Change or extend the Information Extraction module (e.g. make it recognize a new class of instances (robots) and relationships with their creators)
Extending the Ontology
When extending the ontology with a domain-specific module, the following rules should be observed:
- Referencing PROTON
Since PROTON is the inherent ontology of the KIM Platform, each domain-specific module should reference it, so as to end up with one linked ontology rather than two independent modules. This is also very important for the (default) IE module, since it recognizes only instances of Entity or one of its subclasses.
- Supported Formats of the Extension Modules
  - RDF/XML
  - N-Triples
  - Turtle
  - TriX
- Configuration
Ontology extension modules (*.owl) should by default reside in the %KIM_CONTEXT%/kb/owl folder, but can be placed anywhere provided they are correctly included in the Semantic Repository Configuration file.
<?xml version='1.0' encoding='UTF-8'?>
<!-- example: in.space.owl -->
<!DOCTYPE owl [
  <!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>
  <!ENTITY owl 'http://www.w3.org/2002/07/owl#' >
  <!ENTITY xsd 'http://www.w3.org/2001/XMLSchema#' >
  <!ENTITY psys 'http://proton.semanticweb.org/2005/04/protons#'>
  <!ENTITY ptop 'http://proton.semanticweb.org/2005/04/protont#'>
  <!ENTITY protonkm 'http://proton.semanticweb.org/2005/04/protonkm#'>
]>
<rdf:RDF
    xmlns:owl="&owl;"
    xmlns:rdf="&rdf;"
    xmlns:rdfs="&rdfs;"
    xmlns:psys="&psys;"
    xmlns:ptop="&ptop;"
    xmlns:protonkm="&protonkm;"
    xmlns="http://in.space#"
    xml:base="http://in.space#">
  <owl:AnnotationProperty rdf:about="http://www.w3.org/2000/01/rdf-schema#comment"/>
  <owl:AnnotationProperty rdf:about="http://www.w3.org/2000/01/rdf-schema#label"/>
  <owl:AnnotationProperty rdf:about="http://www.w3.org/2002/07/owl#versionInfo"/>
  <owl:Ontology rdf:about="">
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">InSpace</rdfs:label>
    <rdfs:comment>InSpace Ontology</rdfs:comment>
    <owl:imports rdf:resource="http://proton.semanticweb.org/2005/04/protonkm"/>
    <owl:versionInfo>"0.1"</owl:versionInfo>
  </owl:Ontology>
  <!-- domain class, linked to PROTON through its superclass -->
  <owl:Class rdf:ID="Robot">
    <rdfs:label>Robot</rdfs:label>
    <rdfs:comment>
      A mechanical or electronic device that resembles a living animal and moves automatically or by remote control.
    </rdfs:comment>
    <rdfs:subClassOf rdf:resource="&protonkm;Device"/>
  </owl:Class>
  <!-- domain relation: a Robot is createdBy a Person -->
  <owl:ObjectProperty rdf:about="#createdBy" rdfs:label="createdBy">
    <rdfs:domain rdf:resource="#Robot"/>
    <rdfs:range rdf:resource="&ptop;Person"/>
  </owl:ObjectProperty>
</rdf:RDF>
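The "Referencing PROTON" rule above can be checked mechanically before a module is loaded. The following is only a minimal sketch in Python, not part of KIM: the function name unlinked_classes and the PROTON_PREFIX constant are assumptions made for illustration (the prefix is derived from the namespace URIs used in the example above). It looks at direct rdfs:subClassOf parents only, so a class that is linked to PROTON through another local class would still be reported and needs a manual look.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
# Assumption: all PROTON module namespaces share this prefix.
PROTON_PREFIX = "http://proton.semanticweb.org/2005/04/proton"


def unlinked_classes(owl_xml):
    """Return names of owl:Class elements without a direct PROTON superclass."""
    root = ET.fromstring(owl_xml)
    missing = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID") or cls.get(f"{{{RDF}}}about")
        # Collect the targets of all direct rdfs:subClassOf children.
        parents = [sub.get(f"{{{RDF}}}resource", "")
                   for sub in cls.findall(f"{{{RDFS}}}subClassOf")]
        if not any(parent.startswith(PROTON_PREFIX) for parent in parents):
            missing.append(name)
    return missing


# Hypothetical two-class module: Robot is linked to PROTON, Orphan is not.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="Robot">
    <rdfs:subClassOf rdf:resource="http://proton.semanticweb.org/2005/04/protonkm#Device"/>
  </owl:Class>
  <owl:Class rdf:ID="Orphan"/>
</rdf:RDF>"""

if __name__ == "__main__":
    print(unlinked_classes(SAMPLE))
```

A class reported by this check would, by the rule above, not be picked up by the default IE module unless it reaches Entity through some other path.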
Extending the Instance Base
Information Extraction can be enhanced by modeling a set of predefined entities in the knowledge base. The choice of these entities is usually driven by two aspects:
- Importance
Entities of high importance in a domain are likely to appear often, and their correct extraction is crucial. Important entities also tend to have nicknames or aliases that arose from frequent references to them. These names are likely to be hard to classify and identify.
- Existing Lists
Focusing KIM on a particular domain, where semantic annotation based on information extraction is expected, can be achieved very easily if one can obtain lists of the instances of the domain-specific classes. An example would be all the robot brands and models in a robot domain, or all the location names in a geographically restricted domain (e.g. a domain like South-African Politics). These lists can be transformed into entities and modeled as an extension of the instance or knowledge base associated with the KIM Platform.
- Namespaces
Instances should use the same namespace as the entities in the default knowledge base (wkb.nt), which currently is:
http://www.ontotext.com/kim/2005/04/wkb
An example URI of an instance would be:
http://www.ontotext.com/kim/2005/04/wkb#Robot_T.123
- Alias Modeling
Entities usually have more than one name (e.g. R2D2 and R-Two D-Two). In KIM this is modeled by the helper classes Alias and MainAlias, and the respective relations hasAlias and hasMainAlias. The MainAlias is the official or most popular name and is used for presentation purposes in the Web UI of KIM or the client applications. Here is an example of modeling R2D2's class and names:

# example: in.space.nt (part 1)
# NOTE: when copying these triples into a file, make sure each one stays on a single line !!!
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://in.space#Robot> .
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1> <http://www.w3.org/2000/01/rdf-schema#label> "R2D2" .
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1.1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://proton.semanticweb.org/2005/04/protons#Alias> .
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1.1> <http://www.w3.org/2000/01/rdf-schema#label> "R2D2" .
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1> <http://proton.semanticweb.org/2005/04/protons#hasAlias> <http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1.1> .
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1.2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://proton.semanticweb.org/2005/04/protons#Alias> .
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1.2> <http://www.w3.org/2000/01/rdf-schema#label> "R-Two D-Two" .
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1> <http://proton.semanticweb.org/2005/04/protons#hasAlias> <http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1.2> .
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1> <http://proton.semanticweb.org/2005/04/protons#hasMainAlias> <http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1.1> .

- Relation Modeling
Entities may have relations to other entities. These relations should be defined in the ontology, with their domain and range. An example of such a relation is assigning a creator to a robot:

# example: in.space.nt (part 2)
# NOTE: when copying these triples into a file, make sure each one stays on a single line !!!
# creator
<http://www.ontotext.com/kim/2005/04/wkb#Person.Mr.Simpson> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://proton.semanticweb.org/2005/04/protont#Person> .
<http://www.ontotext.com/kim/2005/04/wkb#Person.Mr.Simpson> <http://www.w3.org/2000/01/rdf-schema#label> "Mr. Simpson" .
<http://www.ontotext.com/kim/2005/04/wkb#Person.Mr.Simpson.0> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://proton.semanticweb.org/2005/04/protons#Alias> .
<http://www.ontotext.com/kim/2005/04/wkb#Person.Mr.Simpson.0> <http://www.w3.org/2000/01/rdf-schema#label> "Mr. Simpson" .
<http://www.ontotext.com/kim/2005/04/wkb#Person.Mr.Simpson> <http://proton.semanticweb.org/2005/04/protons#hasMainAlias> <http://www.ontotext.com/kim/2005/04/wkb#Person.Mr.Simpson.0> .
<http://www.ontotext.com/kim/2005/04/wkb#Person.Mr.Simpson> <http://proton.semanticweb.org/2005/04/protons#generatedBy> <http://www.ontotext.com/kim/2005/04/wkb#Gazetteer> .
# relation
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1> <http://in.space#createdBy> <http://www.ontotext.com/kim/2005/04/wkb#Person.Mr.Simpson> .

- Source (generatedBy)
The phrase-lookup (gazetteer) phase of the default IE module of KIM distinguishes between pre-populated and automatically recognized entities. The gazetteer looks up only the mentions of pre-defined or trusted entities, and thus prevents the propagation of one-time IE mistakes. Defining an instance as trusted will make it recognizable by the gazetteer. This is modeled by associating the entity with a "trusted source". There are two classes in the system part of PROTON that make this distinction:
http://proton.semanticweb.org/2005/04/protons#Trusted
and
http://proton.semanticweb.org/2005/04/protons#Recognized
To model trusted entities, one should first define a source:
<http://www.ontotext.com/kim/2005/04/wkb#RobotModelsList> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://proton.semanticweb.org/2005/04/protons#Trusted> .
Then the instance is associated with the source like this:
<http://www.ontotext.com/kim/2005/04/wkb#Robot_T.1> <http://proton.semanticweb.org/2005/04/protons#generatedBy> <http://www.ontotext.com/kim/2005/04/wkb#RobotModelsList> .
All these definitions have to be placed in one of the extension modules of the semantic repository, so as to ensure that KIM functions as intended.
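When an existing list of names is turned into instances, the namespace, alias and source rules above have to be repeated for every entity, so generating the triples is less error-prone than typing them. The following Python sketch is only an illustration under stated assumptions: entity_triples and its parameters are hypothetical names, not part of KIM, and the alias numbering (local id plus ".1", ".2", ...) merely imitates the R2D2 example. The first name in the list is treated as the main alias, and every triple is emitted on a single line.

```python
# Namespaces as used in the examples of this document.
WKB = "http://www.ontotext.com/kim/2005/04/wkb#"
PSYS = "http://proton.semanticweb.org/2005/04/protons#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def entity_triples(local_id, class_uri, names, source_id):
    """Return N-Triples lines for one trusted entity (hypothetical helper).

    names[0] becomes both the rdfs:label and the main alias.
    """
    entity = uri(WKB + local_id)
    lines = [
        f"{entity} {uri(RDF_TYPE)} {uri(class_uri)} .",
        f"{entity} {uri(RDFS_LABEL)} {literal(names[0])} .",
    ]
    # One Alias instance per name, attached with hasAlias.
    for index, name in enumerate(names, start=1):
        alias = uri(f"{WKB}{local_id}.{index}")
        lines.append(f"{alias} {uri(RDF_TYPE)} {uri(PSYS + 'Alias')} .")
        lines.append(f"{alias} {uri(RDFS_LABEL)} {literal(name)} .")
        lines.append(f"{entity} {uri(PSYS + 'hasAlias')} {alias} .")
    main_alias = uri(f"{WKB}{local_id}.1")
    lines.append(f"{entity} {uri(PSYS + 'hasMainAlias')} {main_alias} .")
    # Trusted source, so that the gazetteer looks the entity up.
    source = uri(WKB + source_id)
    lines.append(f"{source} {uri(RDF_TYPE)} {uri(PSYS + 'Trusted')} .")
    lines.append(f"{entity} {uri(PSYS + 'generatedBy')} {source} .")
    return lines


if __name__ == "__main__":
    for line in entity_triples("Robot_T.1", "http://in.space#Robot",
                               ["R2D2", "R-Two D-Two"], "RobotModelsList"):
        print(line)
```

Run as a script, this prints the triples of in.space.nt (part 1) plus the two source triples; when several entities share one source, the repeated source definition can be dropped from the output.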
Extending/Changing the Information Extraction Module(s)
After the ontology has been extended with domain-specific classes and the knowledge base enriched with instances, there are in general three options for configuring the information extraction process:
- leave the default IE module and grammars as they are and rely on them to recognize entities based on the instance base, provided the new classes are subclasses of Entity or of one of its subclasses
- leave the default IE module but edit some grammars to provide specific recognition rules (not really recommended)
- create a brand new IE module (*.gapp) and grammars to cover the desired domain (recommended for advanced users only)
