[jira] [Resolved] (STANBOL-1091) EntityLinking Engine should not process the same tokens twice

Rupert Westenthaler (JIRA) Wed, 05 Jun 2013 01:21:05 -0700

     [ 
https://issues.apache.org/jira/browse/STANBOL-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Rupert Westenthaler resolved STANBOL-1091.
------------------------------------------

    Resolution: Fixed

fixed with http://svn.apache.org/r1489737
                
> EntityLinking Engine should not process the same tokens twice
> -------------------------------------------------------------
>
>                 Key: STANBOL-1091
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1091
>             Project: Stanbol
>          Issue Type: Improvement
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>            Priority: Minor
>
> The EntityLinking Engine currently processes the text based on Sections 
> (typically Sentences, if present). However, in cases where multiple NLP 
> frameworks process the parsed text, Sentence annotations might overlap. 
> In such cases the EntityLinkingEngine would first process the Sentence 
> with the earlier start and/or later end position. But it would also 
> process the other Sentence that is (partially) covered by the first one. 
> Because of that, Tokens and Chunks contained in two (or more) 
> overlapping Sentence annotations will be processed twice.
> To avoid this, the EntityLinking Engine should keep track of Tokens that were 
> already processed and ignore the already processed parts of overlapping 
> Sentences.
> NOTE: This will not have any effect on the Entity Linking results. However, 
> it will prevent unnecessary processing steps in cases as described above.
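
The idea described in the issue can be sketched roughly as follows. This is
an illustrative Python sketch only, not the change committed in r1489737 and
not Stanbol's Java API. The function name, the (start, end) tuple
representation of Sentences and Tokens, and the assumption that Tokens do not
overlap each other are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def unprocessed_tokens(sentences: List[Span], tokens: List[Span]) -> List[List[Span]]:
    """For each sentence, return only the tokens no earlier sentence handled.

    Assumes tokens do not overlap each other; sentences may overlap.
    Results are returned in the sorted sentence order used below.
    """
    # Earlier start first; for equal starts, the sentence with the later end first.
    ordered = sorted(sentences, key=lambda s: (s[0], -s[1]))
    sorted_tokens = sorted(tokens)
    processed_until = 0  # end offset of the last token already handed out
    result = []
    for start, end in ordered:
        fresh = [
            t for t in sorted_tokens
            if t[0] >= start and t[1] <= end   # token lies inside this sentence
            and t[0] >= processed_until        # and was not processed before
        ]
        if fresh:
            processed_until = fresh[-1][1]
        result.append(fresh)
    return result


# Two overlapping sentence annotations, e.g. from two different NLP frameworks.
example_sentences = [(0, 20), (10, 30)]
example_tokens = [(0, 4), (5, 9), (10, 14), (15, 20), (21, 30)]
example_result = unprocessed_tokens(example_sentences, example_tokens)
```

With this bookkeeping the second sentence only contributes the token that the
first sentence did not already cover, so each token is visited once.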

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
