Rupert Westenthaler created STANBOL-1064:
--------------------------------------------

             Summary: Ignore "link multiple matchable token in chunk" for 
chunks that do already contain a linkable token.
                 Key: STANBOL-1064
                 URL: https://issues.apache.org/jira/browse/STANBOL-1064
             Project: Stanbol
          Issue Type: Improvement
          Components: Enhancement Engines
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler
            Priority: Minor


This is about the following option of the EntityhubLinkingEngine:

* lmmtip [''/true/false]::boolean - the Link Multiple Matchable Tokens in 
Phrases parameter. As the name says, it enables or disables the linking of 
multiple matchable tokens within the same Chunk. This is especially important 
if Proper Noun Linking is active, as it allows detecting 'named entities' that 
consist of two common nouns. NOTE that 'lmmtip' is short for 
'lmmtip=true'
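
The shorthand rule in the NOTE above can be sketched as follows. This is only
an illustration: the class LmmtipParamSketch, its methods and the
semicolon-separated "key=value" input format are assumptions made for this
example and are not taken from the Stanbol code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Stanbol.
class LmmtipParamSketch {

    /** Splits "key=value;key2;..." into a map; a bare key maps to the empty string. */
    static Map<String, String> parseParams(String config) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String part : config.split(";")) {
            String p = part.trim();
            if (p.isEmpty()) {
                continue; // skip empty segments such as a trailing ';'
            }
            int eq = p.indexOf('=');
            if (eq < 0) {
                params.put(p, ""); // bare key, e.g. "lmmtip"
            } else {
                params.put(p.substring(0, eq).trim(), p.substring(eq + 1).trim());
            }
        }
        return params;
    }

    /** A key without a value counts as true; a missing key yields the default. */
    static boolean booleanParam(Map<String, String> params, String key,
                                boolean defaultValue) {
        String value = params.get(key);
        if (value == null) {
            return defaultValue;
        }
        return value.isEmpty() || Boolean.parseBoolean(value);
    }
}
```

With this sketch, "lmmtip" and "lmmtip=true" both evaluate to true, while
"lmmtip=false" evaluates to false.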

The current implementation converts the first matchable token of a chunk to a 
linkable one if the chunk contains multiple matchable tokens. Because of this, 
in the chunk 

    "Express Tribune newspaper reports"

the token "newspaper" would be converted to a linkable one. However, this is 
unintended, as "Express" and "Tribune" are already linkable tokens, and entity 
lookups for them would consider the other matchable tokens anyway.

Because of that, 'lmmtip' should be deactivated in Chunks that already 
contain a linkable token.
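
A minimal sketch of the proposed behaviour, assuming every token of a chunk
has already been classified as ignored, matchable or linkable. The class
ChunkLinkingSketch, the TokenState enum and the applyLmmtip method are
hypothetical names chosen for this illustration; they do not mirror the
actual engine code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the rule described in this issue, not Stanbol code.
class ChunkLinkingSketch {

    enum TokenState { IGNORED, MATCHABLE, LINKABLE }

    /**
     * Returns the token states of one chunk after applying 'lmmtip'.
     * Proposed change: a chunk that already contains a linkable token
     * is returned unchanged.
     */
    static List<TokenState> applyLmmtip(List<TokenState> chunk, boolean lmmtip) {
        List<TokenState> result = new ArrayList<>(chunk);
        if (!lmmtip || chunk.contains(TokenState.LINKABLE)) {
            return result; // nothing to upgrade
        }
        int firstMatchable = -1;
        int matchableCount = 0;
        for (int i = 0; i < chunk.size(); i++) {
            if (chunk.get(i) == TokenState.MATCHABLE) {
                if (firstMatchable < 0) {
                    firstMatchable = i;
                }
                matchableCount++;
            }
        }
        // Only chunks with multiple matchable tokens are affected.
        if (matchableCount > 1) {
            result.set(firstMatchable, TokenState.LINKABLE);
        }
        return result;
    }
}
```

Under these assumptions, a chunk of two common nouns still gets its first
token upgraded to linkable, while a chunk such as the example above, which
already starts with linkable tokens, is left untouched.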

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
