[
https://issues.apache.org/jira/browse/STANBOL-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rupert Westenthaler resolved STANBOL-1091.
------------------------------------------
Resolution: Fixed
Fixed with http://svn.apache.org/r1489737
> EntityLinking Engine should not process the same tokens twice
> -------------------------------------------------------------
>
> Key: STANBOL-1091
> URL: https://issues.apache.org/jira/browse/STANBOL-1091
> Project: Stanbol
> Issue Type: Improvement
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Priority: Minor
>
> The EntityLinking Engine currently processes the text by Section
> (typically Sentences, if present). However, when multiple NLP frameworks
> process the parsed text, Sentence annotations may overlap. In such cases
> the EntityLinkingEngine first processes the Sentence with the earlier
> start and/or later end position, but then also processes the other
> Sentence that it (partially) covers. As a result, Tokens and Chunks
> contained in two (or more) overlapping Sentence annotations are processed
> more than once.
> To avoid this, the EntityLinking Engine should keep track of Tokens that
> were already processed and ignore the already processed parts of
> overlapping Sentences.
> NOTE: This will not have any effect on the Entity Linking results. However,
> it will prevent unnecessary processing steps in the cases described above.
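
The bookkeeping described above can be sketched roughly as follows. This is
only an illustration and not the code committed in r1489737: the Span class,
the tokensToProcess method and the processedUntil offset are hypothetical
names, and the sketch assumes Tokens are given in document order and do not
overlap each other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OverlapSketch {

    /** Hypothetical character span [start, end); not a Stanbol type. */
    static final class Span {
        final int start;
        final int end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    /**
     * For each section (in processing order) returns the tokens that still
     * need to be processed, skipping tokens already handled as part of an
     * earlier, overlapping section.
     */
    static List<List<Span>> tokensToProcess(List<Span> sections, List<Span> tokens) {
        List<Span> ordered = new ArrayList<>(sections);
        // Earlier start first; on equal start, the later end first.
        ordered.sort((a, b) -> a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(b.end, a.end));
        List<List<Span>> result = new ArrayList<>();
        int processedUntil = 0; // end offset of the last processed token
        for (Span section : ordered) {
            List<Span> fresh = new ArrayList<>();
            for (Span token : tokens) {
                boolean inSection = token.start >= section.start
                        && token.end <= section.end;
                // Ignore tokens that an earlier section already covered.
                if (inSection && token.start >= processedUntil) {
                    fresh.add(token);
                    processedUntil = token.end;
                }
            }
            result.add(fresh);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two overlapping sentences, e.g. reported by two NLP frameworks.
        List<Span> sentences = Arrays.asList(new Span(0, 20), new Span(10, 30));
        List<Span> tokens = Arrays.asList(
                new Span(0, 5), new Span(6, 12), new Span(13, 19), new Span(21, 29));
        for (List<Span> fresh : tokensToProcess(sentences, tokens)) {
            System.out.println(fresh.size() + " token(s) to process");
        }
    }
}
```

Tracking a single end offset is enough here because sections are visited in
start order; a set of processed Token objects would be an alternative if
that ordering could not be relied on.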