[
https://issues.apache.org/jira/browse/STANBOL-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828577#comment-13828577
]
Rupert Westenthaler commented on STANBOL-1211:
----------------------------------------------
Merged the missing changes with http://svn.apache.org/r1544052
> Improve Chunk support for Entitylinking
> ---------------------------------------
>
> Key: STANBOL-1211
> URL: https://issues.apache.org/jira/browse/STANBOL-1211
> Project: Stanbol
> Issue Type: Improvement
> Components: Enhancement Engines
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Fix For: 0.12.0
>
>
> Neither the EntityLinkingEngine nor the LuceneFstLinkingEngine currently
> makes good use of Chunk information. For now, Chunks are only used to look
> up multiple matchable tokens of the same Chunk in the Vocabulary, which
> increases recall when proper-noun linking is enabled.
> However, Chunks can also increase precision if the span of the Chunk is used
> as the base for calculating the confidence of the linked Entity.
> A typical example is suggestions for person names: if a text mentions the
> given and family name of a person who is not present in the vocabulary,
> Entitylinking may suggest Entities that match just one of the two names with
> 100% confidence. When the span of the Chunk is used, such suggestions would
> be omitted, as the minimum label match score is typically > 50%.
> Other examples include matches for "US {OrgName}", where "US" is linked when
> the whole organization is not found; "{OrgName} {Role}", where the {Role}
> (e.g. president) is linked; and cases like "15. September, 2013", which may
> cause September to be suggested if it is present in the vocabulary.
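
The effect described above can be illustrated with a minimal sketch. It is
not the Stanbol implementation: the function names, the plain token-overlap
score, and the 0.51 threshold are assumptions chosen only to show how using
the Chunk span as the base lowers the score of partial matches.

```python
# Hypothetical sketch, not Stanbol code: score a suggestion against a span.

def span_score(matched_tokens, span_tokens):
    """Fraction of the span's tokens that are covered by the matched label."""
    if not span_tokens:
        return 0.0
    matched = {t.lower() for t in matched_tokens}
    covered = sum(1 for t in span_tokens if t.lower() in matched)
    return covered / len(span_tokens)


def keep_suggestion(matched_tokens, chunk_tokens, min_score=0.51):
    """Keep a suggestion only if it covers enough of the Chunk.

    min_score is an assumed value standing in for a minimum label match
    score slightly above 50%.
    """
    return span_score(matched_tokens, chunk_tokens) >= min_score


# Person name: only the family name is present in the vocabulary.
chunk = ["John", "Smith"]
label = ["Smith"]

# Score relative to the matched tokens only: a full match.
token_only = span_score(label, label)
# Score relative to the whole Chunk: only half of the span is covered.
chunk_based = span_score(label, chunk)

# Date example: "September" covers one of four tokens of the Chunk.
date_chunk = ["15.", "September", ",", "2013"]
date_based = span_score(["September"], date_chunk)
```

With these assumptions the partial person-name match drops from a full score
to one half and is filtered out, while a label that covers the whole Chunk is
still kept.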