[
https://issues.apache.org/jira/browse/STANBOL-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636530#comment-14636530
]
Cristian Petroaca commented on STANBOL-1121:
--------------------------------------------
I proposed this feature a while ago, but because of other work I never got a
chance to start on it.
The most recent discussion of this feature is here:
https://mail-archives.apache.org/mod_mbox/stanbol-dev/201502.mbox/%3CCAHmcHzNd4HiOYAAToRcgrdLj0chzUiuCU%2BFhfeak0_hwMrcNXw%40mail.gmail.com%3E
> Event extraction Enhancement Engine
> -----------------------------------
>
> Key: STANBOL-1121
> URL: https://issues.apache.org/jira/browse/STANBOL-1121
> Project: Stanbol
> Issue Type: New Feature
> Components: Enhancement Engines
> Reporter: Cristian Petroaca
> Assignee: Rupert Westenthaler
> Labels: extraction, triple
>
> Functionality
> =========
> Develop an Enhancement Engine which would construct a formal knowledge
> representation from natural language text. The knowledge extracted from the
> text would be in the form of Triples (Subject-Verb-Object). This Enhancement
> Engine will be mainly concerned with representation of real-world events.
> Example:
> Given the text "Google buys Youtube", we get Google = subject, buys = verb,
> Youtube = object.
> Implementation
> ===========
> Triple Extraction
> -----------------------
> The following will be applied on the natural language text in order to
> extract the triples:
> + Named entity extraction
> + Co-reference resolution of those named entities
> + POS tagging or dependency trees to determine which verbs and objects are
> associated with the named entities.
> Based on the last step we would have the set of triples.
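
As a rough illustration of the last step only, here is a minimal sketch in Python. It assumes the sentence has already been tokenized and POS-tagged and that the named entities are known; the tags, the `extract_triples` name and the "nearest entity on each side of the verb" heuristic are all assumptions made for this example, not part of the proposal. Co-reference resolution and dependency parsing are left out entirely.

```python
from typing import List, Set, Tuple

# A token is (text, coarse POS tag). The tags are hand-assigned for this
# example; a real engine would take them from a POS tagger.
Token = Tuple[str, str]
Triple = Tuple[str, str, str]


def extract_triples(tokens: List[Token], entities: Set[str]) -> List[Triple]:
    """Return (subject, verb, object) triples from one tagged sentence.

    Heuristic (an assumption of this sketch): the closest named entity
    before a verb is its subject, the closest one after it is its object.
    """
    triples: List[Triple] = []
    for i, (text, tag) in enumerate(tokens):
        if tag != "VERB":
            continue
        # Walk backwards from the verb to find the nearest preceding entity.
        subject = next((t for t, _ in reversed(tokens[:i]) if t in entities), None)
        # Walk forwards from the verb to find the nearest following entity.
        obj = next((t for t, _ in tokens[i + 1:] if t in entities), None)
        if subject is not None and obj is not None:
            triples.append((subject, text, obj))
    return triples


sentence = [("Google", "PROPN"), ("buys", "VERB"), ("Youtube", "PROPN")]
triples = extract_triples(sentence, {"Google", "Youtube"})
print(triples)
```

A verb with no entity on one side produces no triple, so intransitive sentences are simply skipped by this heuristic.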
> Formal representation of triples
> ---------------------------------------------
> The formal representation of the triples will be based on the DOLCE
> foundational ontology. We will have the following data structures:
> * fise:SettingAnnotation
>   * {fise:Enhancement} metadata
>   Describes the context of the data.
> * fise:ParticipantAnnotation
>   * {fise:Enhancement} metadata
>   * fise:inSetting {settingAnnotation}
>   * fise:hasMention {textAnnotation}
>   * fise:suggestion {entityAnnotation} (multiple if there are more
>     suggestions)
>   * dc:type one of fise:Agent, fise:Patient, fise:Instrument, fise:Cause
>   Describes the participants from the context. In our example these would be
>   "Google" and "Youtube". In the DOLCE ontology these would be the Endurants.
> * fise:OccurrentAnnotation
>   * {fise:Enhancement} metadata
>   * fise:inSetting {settingAnnotation}
>   * fise:hasMention {textAnnotation}
>   * dc:type set to fise:Activity
>   * ??:hasRelations (describes the participants linked to this occurrent - TBD)
>   Describes the action performed by the participants. In our example this
>   would be "buys". In the DOLCE ontology this would be the Perdurant.
> For further information see also the Mail Thread related to this Issue:
> http://markmail.org/message/qed6y5avbymvmmgu
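
To make the annotation structures listed in the description more concrete, here is a minimal sketch in Python that writes them out as plain (subject, predicate, object) statements for the "Google buys Youtube" example. The `urn:example:` identifiers, the `annotate_event` name, the use of `rdf:type` for the annotation class, and the mapping of subject to fise:Agent and object to fise:Patient are assumptions made for illustration. The {fise:Enhancement} metadata, fise:suggestion and the still undecided hasRelations property are omitted.

```python
from typing import List, Tuple

Statement = Tuple[str, str, str]  # (subject, predicate, object)


def text_annotation(mention: str) -> str:
    # Hypothetical identifier standing in for the fise:TextAnnotation
    # that selects this mention in the text.
    return f"urn:example:text-annotation:{mention}"


def annotate_event(subject: str, verb: str, obj: str) -> List[Statement]:
    """Describe one extracted triple with the proposed annotation types."""
    setting = "urn:example:setting-1"
    graph: List[Statement] = [(setting, "rdf:type", "fise:SettingAnnotation")]

    # Assumption of this sketch: the subject is the Agent, the object the Patient.
    participants = [(subject, "fise:Agent"), (obj, "fise:Patient")]
    for n, (mention, role) in enumerate(participants, start=1):
        node = f"urn:example:participant-{n}"
        graph += [
            (node, "rdf:type", "fise:ParticipantAnnotation"),
            (node, "fise:inSetting", setting),
            (node, "fise:hasMention", text_annotation(mention)),
            (node, "dc:type", role),
        ]

    occurrent = "urn:example:occurrent-1"
    graph += [
        (occurrent, "rdf:type", "fise:OccurrentAnnotation"),
        (occurrent, "fise:inSetting", setting),
        (occurrent, "fise:hasMention", text_annotation(verb)),
        (occurrent, "dc:type", "fise:Activity"),
    ]
    return graph


graph = annotate_event("Google", "buys", "Youtube")
for statement in graph:
    print(statement)
```

Every participant and the occurrent point at the same setting node, which is what ties the two Endurants and the Perdurant together into one event.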
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)