Rupert Westenthaler created STANBOL-1403:
--------------------------------------------
Summary: Add PLAIN linking mode to the FST linking engine
Key: STANBOL-1403
URL: https://issues.apache.org/jira/browse/STANBOL-1403
Project: Stanbol
Issue Type: Improvement
Components: Enhancement Engines
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
The Lucene FST linking engine uses a linking process similar to that of the
entity linking engine. This means that NLP processing results are used to
determine "Linkable" and "Matchable" tokens in the text. "Linkable" tokens are
then used to initiate vocabulary lookups, and both "Linkable" and "Matchable"
tokens are used to check whether entity labels actually match the text.
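The distinction between the two token classes can be illustrated with a small sketch. It is not the engine's code: the `Token` type, the `classify` helper, and the choice of part-of-speech tags in `LINKABLE_POS` and `MATCHABLE_POS` are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical POS categories; the real engine's configuration may differ.
LINKABLE_POS = {"NOUN", "PROPN"}
MATCHABLE_POS = LINKABLE_POS | {"ADJ", "VERB"}


@dataclass
class Token:
    text: str
    pos: str  # part-of-speech tag produced by some NLP component


def classify(tokens):
    """Split tokens into those that may start a lookup and those that may match."""
    linkable = [t.text for t in tokens if t.pos in LINKABLE_POS]
    matchable = [t.text for t in tokens if t.pos in MATCHABLE_POS]
    return linkable, matchable


tokens = [
    Token("Rupert", "PROPN"),
    Token("enjoys", "VERB"),
    Token("skiing", "NOUN"),
    Token("in", "ADP"),
    Token("Salzburg", "PROPN"),
]
linkable, matchable = classify(tokens)
print("linkable: ", linkable)
print("matchable:", matchable)
```

In this sketch only the linkable tokens would trigger a vocabulary lookup, while the larger matchable set is what a candidate label is compared against.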
This issue will introduce a new linking mode in which the FST linking engine
will try to link every single word in the text. Instead of using NLP processing
results, this mode will simply use the Solr Analyzer of the configured field.
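A minimal sketch of the idea behind PLAIN mode, under stated assumptions: a regex-based `analyze` function stands in for the Solr Analyzer, a plain dictionary stands in for the FST, and `link_plain`, `max_len`, and the `urn:example:` identifiers are hypothetical names that are not part of Stanbol.

```python
import re


def analyze(text):
    """Stand-in for the Solr Analyzer: lowercase word tokens, no NLP involved."""
    return re.findall(r"\w+", text.lower())


def link_plain(text, vocabulary, max_len=3):
    """Try a lookup starting at every token, preferring the longest label."""
    tokens = analyze(text)
    matches = []
    for start in range(len(tokens)):
        longest = min(max_len, len(tokens) - start)
        for length in range(longest, 0, -1):
            label = " ".join(tokens[start:start + length])
            if label in vocabulary:
                matches.append((label, vocabulary[label]))
                break  # keep only the longest match for this start position
    return matches


# Hypothetical vocabulary of activities (labels that are not plain nouns).
vocabulary = {
    "running": "urn:example:running",
    "mountain biking": "urn:example:mountain-biking",
}
result = link_plain("She loves running and mountain biking.", vocabulary)
print(result)
```

Because every token starts a lookup, the cost grows with the text length and the vocabulary size rather than with the number of nouns, which is the trade-off this mode accepts.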
The PLAIN mode is intended to be used in cases:
* where no NLP support is available
* for vocabularies that contain entities that appear in text with tokens
other than nouns (e.g. a vocabulary that contains activities)
The PLAIN mode will not work in cases where users have used ProperNoun mode
with big vocabularies.