Rupert Westenthaler created STANBOL-1123:
--------------------------------------------

             Summary: Label Token matching should consider tokens that are 
marked as "consumed"
                 Key: STANBOL-1123
                 URL: https://issues.apache.org/jira/browse/STANBOL-1123
             Project: Stanbol
          Issue Type: Sub-task
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler


Tokens marked as "consumed" should be considered while matching Labels of 
Entities with the processed Text.

Marking Tokens as "consumed" aims to reduce the number of required vocabulary 
lookups. However, considering those Tokens while matching does not hurt 
performance, while it does increase the quality of the linking process.

Doing so will bring improvements especially for very long noun phrases, 
where an initial query (typically using the first two nouns) might not 
suggest the best matching Entity. Person mentions like "{role} {given} {given} 
{family}" are typical examples of such cases.
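
To illustrate the intended behaviour, here is a minimal sketch. It is not the 
actual Stanbol code: the Token class, the bestMatch method and the sample 
text are all hypothetical, and the matching is reduced to a case-insensitive 
comparison of contiguous tokens. It only shows how a label can match more 
tokens once "consumed" Tokens are allowed to take part in the matching.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of the Stanbol API.
final class ConsumedTokenMatching {

    /** Hypothetical token model: surface form plus a "consumed" flag. */
    static final class Token {
        final String text;
        final boolean consumed;

        Token(String text, boolean consumed) {
            this.text = text;
            this.consumed = consumed;
        }
    }

    /**
     * Returns the length of the longest contiguous run of label tokens
     * that matches a contiguous run of text tokens. If includeConsumed
     * is false, a consumed text token ends the run.
     */
    static int bestMatch(List<Token> text, String[] label, boolean includeConsumed) {
        int best = 0;
        for (int start = 0; start < text.size(); start++) {
            for (int offset = 0; offset < label.length; offset++) {
                int n = 0;
                while (offset + n < label.length && start + n < text.size()) {
                    Token t = text.get(start + n);
                    if (!includeConsumed && t.consumed) {
                        break; // consumed tokens are ignored in this mode
                    }
                    if (!t.text.equalsIgnoreCase(label[offset + n])) {
                        break; // label and text diverge
                    }
                    n++;
                }
                best = Math.max(best, n);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // "{role} {given} {given} {family}": assume the first two tokens
        // were consumed by an initial lookup.
        List<Token> text = Arrays.asList(
                new Token("President", true),
                new Token("Barack", true),
                new Token("Hussein", false),
                new Token("Obama", false));
        String[] label = {"Barack", "Hussein", "Obama"};

        System.out.println(bestMatch(text, label, false)); // prints 2
        System.out.println(bestMatch(text, label, true));  // prints 3
    }
}
```

In this sketch the label only matches completely when the consumed Token 
"Barack" may be used; the number of vocabulary lookups is not affected, as 
only the comparison of an already retrieved label against the text changes.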


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira