Rupert Westenthaler created STANBOL-1070:
--------------------------------------------
Summary: Entity Co-Mention Engine
Key: STANBOL-1070
URL: https://issues.apache.org/jira/browse/STANBOL-1070
Project: Stanbol
Issue Type: New Feature
Components: Enhancement Engines
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Entity Co-Mention Engine
======
The goal of this engine is to extract co-mentions of Entities already detected
in a document. A typical example is a person who is mentioned only by family
name after an initial mention with the full name, e.g.
... Barack Obama gave a talk to members of the Labor Union ... Obama
specially mentioned ...
Alternate names used to refer to Entities might also be used for extracting
co-mentions.
NOTE: this Engine does not use NLP-level co-reference resolution (e.g. linking
a pronoun with the Entity it stands for).
Implementation
-----
This Engine will be implemented based on existing Entity linking functionality
as implemented by the EntityLinkingEngine. The main difference is that an
in-memory EntityMentionIndex will be used as controlled vocabulary to link
against. This EntityMentionIndex will implement the EntitySearcher interface as
used by the EntityLinkingEngine to search for Entities.
The EntityMentionIndex will contain both fise:TextAnnotations (such as
NamedEntities) and fise:EntityAnnotations (entities suggested for
fise:TextAnnotations).
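To illustrate the idea, here is a minimal sketch of such an in-memory index. It is not the Stanbol API: the class MentionIndexSketch, its Mention type, the token-based lookup and the example.org URI used in the usage note are all assumptions made for this example, and the real EntityMentionIndex would implement the EntitySearcher interface instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for the planned EntityMentionIndex. */
class MentionIndexSketch {

    /** An earlier mention: surface form, character offset, optional suggested entity. */
    static final class Mention {
        final String label;
        final int start;
        final String entityUri; // null if no entity was suggested for the mention

        Mention(String label, int start, String entityUri) {
            this.label = label;
            this.start = start;
            this.entityUri = entityUri;
        }
    }

    // Maps each lower-cased token of a label to the mentions containing it.
    private final Map<String, List<Mention>> byToken = new HashMap<>();

    void add(Mention mention) {
        for (String token : tokenize(mention.label)) {
            byToken.computeIfAbsent(token, k -> new ArrayList<>()).add(mention);
        }
    }

    /** Returns mentions that start before offset and contain every token of the query. */
    List<Mention> lookup(String query, int offset) {
        Set<String> tokens = tokenize(query);
        List<Mention> result = new ArrayList<>();
        if (tokens.isEmpty()) {
            return result;
        }
        String first = tokens.iterator().next();
        for (Mention m : byToken.getOrDefault(first, Collections.<Mention>emptyList())) {
            if (m.start < offset && tokenize(m.label).containsAll(tokens)) {
                result.add(m);
            }
        }
        return result;
    }

    /** Splits on whitespace and lower-cases; a real engine would use proper tokenization. */
    static Set<String> tokenize(String text) {
        Set<String> tokens = new LinkedHashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}
```

With this sketch, adding new Mention("Barack Obama", 4, "http://example.org/entity/barack-obama") and then calling lookup("Obama", 60) would return that earlier mention, while a lookup at an offset before the initial mention would return nothing.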
Writing results of the co-mention extraction will involve:
* creating new fise:TextAnnotations with suggested fise:EntityAnnotation (e.g.
for additional mentions not previously detected by any other engine)
* modification of existing Suggestions for fise:TextAnnotations (e.g. if
'Svenson' was linked with "Svenson" (http://rdf.freebase.com/ns/m.0n5rh_s), a
fictional character from the 1930 film The Silver Horde, but "Peter Svenson"
(http://rdf.freebase.com/ns/m.05wxvv9), an author, was already mentioned
earlier in the document, the latter would be added as an additional suggestion
to the existing TextAnnotation and the confidence values would be adapted
accordingly)
* creation of relations between enhancements to express entity co-mentions
(most likely dc:relation from the co-mention to the initial mention of an
Entity)
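
The second and third steps above could look roughly like the following sketch, using the Svenson example. Everything here is an assumption for illustration: the classes are plain Java objects rather than RDF enhancements, the relation field only stands in for dc:relation, and the confidence adaptation (scaling competing suggestions by a penalty factor) is one arbitrary policy, since this issue does not define how confidences are adapted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of writing co-mention results; not the Stanbol enhancement model. */
class CoMentionWriterSketch {

    static final class Suggestion {
        final String entityUri;
        double confidence;

        Suggestion(String entityUri, double confidence) {
            this.entityUri = entityUri;
            this.confidence = confidence;
        }
    }

    static final class TextAnnotation {
        final String selectedText;
        final int start;
        final List<Suggestion> suggestions = new ArrayList<>();
        TextAnnotation relation; // stands in for dc:relation to the initial mention

        TextAnnotation(String selectedText, int start) {
            this.selectedText = selectedText;
            this.start = start;
        }
    }

    /**
     * Adds entityUri as a suggestion of coMention, scales all competing
     * suggestions by penalty (assumed to be in (0, 1]) and links the
     * co-mention to the initial mention.
     */
    static void addCoMention(TextAnnotation coMention, TextAnnotation initial,
                             String entityUri, double confidence, double penalty) {
        Suggestion existing = null;
        for (Suggestion s : coMention.suggestions) {
            if (s.entityUri.equals(entityUri)) {
                existing = s;
            } else {
                s.confidence *= penalty; // assumed policy, not defined by the issue
            }
        }
        if (existing == null) {
            coMention.suggestions.add(new Suggestion(entityUri, confidence));
        } else {
            existing.confidence = Math.max(existing.confidence, confidence);
        }
        // Keep the highest-ranked suggestion first.
        coMention.suggestions.sort(
                Comparator.comparingDouble((Suggestion s) -> s.confidence).reversed());
        coMention.relation = initial;
    }
}
```

In this sketch, a later 'Svenson' annotation that only suggests http://rdf.freebase.com/ns/m.0n5rh_s would, after addCoMention is called with the earlier "Peter Svenson" annotation and http://rdf.freebase.com/ns/m.05wxvv9, carry both suggestions with the author ranked first.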