Rupert Westenthaler created STANBOL-622:
-------------------------------------------
Summary: The KeywordLinkingEngine should check if all Tokens of a
Label match against the text
Key: STANBOL-622
URL: https://issues.apache.org/jira/browse/STANBOL-622
Project: Stanbol
Issue Type: Bug
Components: Enhancer
Affects Versions: 0.9.0-incubating
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Fix For: 0.10.0-incubating
Currently the KeywordLinkingEngine generates many unwanted suggestions for
phrases of the form "{noun/noun phrase} {preposition}".
E.g. for the sentence
"Mexico City is by far the biggest urban area of Mexico"
the KeywordLinkingEngine would suggest
"Urban areas of England"
"Urban areas of Sweden"
for
"urban area of"
However, the intended behavior is to
(a) match "urban area of Mexico" if this concept exists in the controlled
vocabulary (currently not the case in DBpedia), or otherwise
(b) match the text "urban area" with the concept "urban area" and possibly
include "Urban areas of England" and "Urban areas of Sweden" as additional
suggestions.
The cause of the faulty behavior is that the KeywordLinkingEngine currently
only checks whether all Tokens of the selected Region of the Text match the
Label. It does not check whether all Tokens of the Label also match the
selected Region of the Text.
Because of that, "Urban areas of England" and "Urban areas of Sweden" are
treated as FULL matches for "urban area of", which makes this selection
preferable to selecting "urban area" with the EXACT match "Urban Area".
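
The difference between the one-directional and the two-directional check can
be illustrated with a minimal Java sketch. This is not the engine's code: the
class LabelMatchSketch, the Match enum, the methods oneWay and bothWays, and
the crude tokenizer (lower-casing, whitespace splitting, dropping a trailing
plural "s") are all assumptions made up for this example.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class LabelMatchSketch {

    /** Hypothetical match levels for this sketch; not the engine's own types. */
    enum Match { NONE, PARTIAL, FULL }

    /** Crude normalisation (assumption): lower-case, split on whitespace, drop a plural "s". */
    static Set<String> tokens(String s) {
        Set<String> result = new LinkedHashSet<>();
        for (String t : s.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (t.isEmpty()) {
                continue;
            }
            boolean plural = t.length() > 3 && t.endsWith("s");
            result.add(plural ? t.substring(0, t.length() - 1) : t);
        }
        return result;
    }

    /** One-directional check as described above: only text tokens are tested against the label. */
    static Match oneWay(String text, String label) {
        Set<String> t = tokens(text);
        return !t.isEmpty() && tokens(label).containsAll(t) ? Match.FULL : Match.NONE;
    }

    /** Two-directional check: FULL additionally requires every label token to occur in the text. */
    static Match bothWays(String text, String label) {
        Set<String> t = tokens(text);
        Set<String> l = tokens(label);
        if (t.isEmpty() || !l.containsAll(t)) {
            return Match.NONE;
        }
        return t.containsAll(l) ? Match.FULL : Match.PARTIAL;
    }

    public static void main(String[] args) {
        // prints FULL: the label token "england" is never checked
        System.out.println(oneWay("urban area of", "Urban areas of England"));
        // prints PARTIAL: "england" has no counterpart in the selected text
        System.out.println(bothWays("urban area of", "Urban areas of England"));
        // prints FULL: both token sets are equal after normalisation
        System.out.println(bothWays("urban area", "Urban Area"));
    }
}
```

With the two-directional check, "Urban areas of England" is downgraded to a
partial match for "urban area of", so the selection "urban area" with its full
match against "Urban Area" can be ranked first.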