Cristian Petroaca created STANBOL-1121:
------------------------------------------

             Summary: Triple extraction Enhancement Engine
                 Key: STANBOL-1121
                 URL: https://issues.apache.org/jira/browse/STANBOL-1121
             Project: Stanbol
          Issue Type: New Feature
          Components: Enhancement Engines
            Reporter: Cristian Petroaca


Functionality
=========

Develop an Enhancement Engine which would construct a formal knowledge 
representation from natural language text. The knowledge extracted from the 
text would be in the form of Triples (Subject-Verb-Object). This Enhancement 
Engine will be mainly concerned with representation of real-world events.

Example:
Given the text "Google buys Youtube", the extracted triple is 
Google=Subject, buys=Verb, Youtube=Object.
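
As a minimal sketch of the triple described above, the example could be held 
in a small immutable record. The class name Triple and its field names are 
illustrative assumptions, not part of any existing Stanbol API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A Subject-Verb-Object triple; the names here are hypothetical."""
    subject: str
    verb: str
    obj: str


# The example from the text: "Google buys Youtube".
triple = Triple(subject="Google", verb="buys", obj="Youtube")
```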


Implementation
===========

Triple Extraction
-----------------------
The following steps will be applied to the natural language text in order to 
extract the triples:
+ Named entity extraction
+ Co-reference resolution of those named entities
+ POS tagging or dependency parsing to determine which verbs and objects are 
related to the named entities.

Based on the last step we would have the set of triples. 
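
The last step can be sketched as follows. This is a toy illustration only: 
it assumes the earlier steps have already produced a token list in which each 
token carries a simplified, hypothetical tag ("ENTITY", "VERB" or "OTHER"), 
and it pairs each verb with the nearest entity on either side. A real engine 
would rely on POS tags or dependency trees rather than on this heuristic, and 
the function name extract_triples is invented for the example.

```python
from typing import List, Tuple

# (token, tag) pairs; the tag set is a simplified assumption for this sketch.
TaggedToken = Tuple[str, str]


def extract_triples(tagged: List[TaggedToken]) -> List[Tuple[str, str, str]]:
    """Pair each verb with the nearest entity to its left and to its right."""
    triples = []
    for i, (token, tag) in enumerate(tagged):
        if tag != "VERB":
            continue
        # Nearest entity before the verb is taken as the subject.
        subject = next((t for t, g in reversed(tagged[:i]) if g == "ENTITY"), None)
        # Nearest entity after the verb is taken as the object.
        obj = next((t for t, g in tagged[i + 1:] if g == "ENTITY"), None)
        if subject is not None and obj is not None:
            triples.append((subject, token, obj))
    return triples


tagged_sentence = [("Google", "ENTITY"), ("buys", "VERB"), ("Youtube", "ENTITY")]
triples = extract_triples(tagged_sentence)
```

Verbs without an entity on both sides are skipped, so an intransitive 
sentence yields no triple in this sketch.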


Formal representation of triples
---------------------------------------------
The formal representation of the triples will be based on the DOLCE 
foundational ontology. The following data structures will be used:

 * fise:SettingAnnotation
    * {fise:Enhancement} metadata
describes the context of the data

 * fise:ParticipantAnnotation
    * {fise:Enhancement} metadata
    * fise:inSetting {settingAnnotation}
    * fise:hasMention {textAnnotation}
    * fise:suggestion {entityAnnotation} (multiple if there is more than one
suggestion)
    * dc:type one of fise:Agent, fise:Patient, fise:Instrument, fise:Cause
describes the participants in the setting. In our example these would be 
"Google" and "Youtube". In the DOLCE ontology these would be the Endurants.

 * fise:OccurrentAnnotation
    * {fise:Enhancement} metadata
    * fise:inSetting {settingAnnotation}
    * fise:hasMention {textAnnotation}
    * dc:type set to fise:Activity
    * ??:hasRelations (describes the participants linked to this occurrent - TBD)
describes the action performed by the participants. In our example this would 
be "buys". In the DOLCE ontology this would be the Perdurant.
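
To make the proposed structures concrete, the sketch below writes out the 
annotations for "Google buys Youtube" as plain (subject, predicate, object) 
statements. It is an illustration under several assumptions: the urn:example: 
identifiers are placeholders, the subject is mapped to fise:Agent and the 
object to fise:Patient (which would not hold for passive sentences), and the 
fise:suggestion and hasRelations properties are left out because the first 
needs entity linking and the second is still TBD.

```python
from typing import List, Tuple

Statement = Tuple[str, str, str]


def build_annotations(subject: str, verb: str, obj: str) -> List[Statement]:
    """Build the proposed annotation statements for one extracted triple."""
    # Placeholder identifiers; a real engine would mint its own URIs.
    setting = "urn:example:setting-1"
    occurrent = "urn:example:occurrent-1"
    statements: List[Statement] = [
        (setting, "rdf:type", "fise:SettingAnnotation"),
    ]
    # Assumed role mapping: subject -> Agent, object -> Patient.
    roles = [(subject, "fise:Agent"), (obj, "fise:Patient")]
    for index, (mention, role) in enumerate(roles, start=1):
        participant = f"urn:example:participant-{index}"
        statements += [
            (participant, "rdf:type", "fise:ParticipantAnnotation"),
            (participant, "fise:inSetting", setting),
            (participant, "fise:hasMention", f"urn:example:text-annotation:{mention}"),
            (participant, "dc:type", role),
        ]
    statements += [
        (occurrent, "rdf:type", "fise:OccurrentAnnotation"),
        (occurrent, "fise:inSetting", setting),
        (occurrent, "fise:hasMention", f"urn:example:text-annotation:{verb}"),
        (occurrent, "dc:type", "fise:Activity"),
    ]
    return statements


statements = build_annotations("Google", "buys", "Youtube")
```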
