Rupert Westenthaler created STANBOL-1116:
--------------------------------------------

             Summary: Filter Literals of suggested Entities based on Languages used for Lookups
                 Key: STANBOL-1116
                 URL: https://issues.apache.org/jira/browse/STANBOL-1116
             Project: Stanbol
          Issue Type: Sub-task
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler


EntityLinking uses two languages to look up Entities:

(1) the language of the current document (as detected by language detection)
(2) the default matching language (default: null, i.e. labels without a language tag)

In multi-lingual vocabularies (e.g. DBpedia or Freebase) entities might define 
literal values for a lot of languages (in Freebase some entities have labels 
in more than 100 languages).

Currently the EntityLinkingEngine includes labels of all languages in the 
EnhancementResults. This has two disadvantages:

(1) All values need to be provided by the EntitySearcher. This might require 
converting all those values to Clerezza RDF (such as in the case of the Solr 
based EntitySearcher).

(2) If dereferencing is activated, a lot of additional literals (and therefore 
triples) are added to the Enhancement Results. This has a negative impact on 
both performance and the size of the Enhancement Results.

This issue will adapt the EntitySearcher interface to allow specifying

* selected fields
* selected languages

with all requests. The languages used for the query will always be added to 
the passed selected languages, and the label field, type field and redirect 
field will always be included in the selected fields, as this information is 
required by the linking process itself.

EntitySearcher implementations may ignore these settings and return all 
values for the returned entities instead.
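
The selection rules described above could be sketched roughly as follows. This 
is only an illustration and not the actual Stanbol API: the class 
SelectionSketch, its method names and the field names used in main are 
hypothetical, and a null language stands for literals without a language tag.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, NOT part of Stanbol: illustrates the proposed rules only.
public class SelectionSketch {

    /** Requested languages plus both languages used for the lookup. */
    static Set<String> effectiveLanguages(Set<String> selected,
            String documentLang, String defaultLang) {
        Set<String> result = new LinkedHashSet<>(selected);
        result.add(documentLang); // language detected for the document
        result.add(defaultLang);  // may be null: literals without language tag
        return result;
    }

    /** Requested fields plus the fields the linking process itself needs. */
    static Set<String> effectiveFields(Set<String> selected,
            String labelField, String typeField, String redirectField) {
        Set<String> result = new LinkedHashSet<>(selected);
        result.addAll(Arrays.asList(labelField, typeField, redirectField));
        return result;
    }

    /** A literal is kept only if its language is among the effective ones. */
    static boolean keepLiteral(String literalLang, Set<String> languages) {
        return languages.contains(literalLang);
    }

    public static void main(String[] args) {
        Set<String> langs = effectiveLanguages(
                new LinkedHashSet<>(Arrays.asList("fr")), "de", null);
        // Field names below are placeholders chosen for this example.
        Set<String> fields = effectiveFields(
                new LinkedHashSet<>(Arrays.asList("dbp:abstract")),
                "rdfs:label", "rdf:type", "rdfs:seeAlso");
        System.out.println("languages: " + langs);
        System.out.println("fields:    " + fields);
        System.out.println("keep 'ja' literal? " + keepLiteral("ja", langs));
    }
}
```

An implementation that cannot filter would simply skip the keepLiteral check 
and return all values, which the proposal explicitly permits.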


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
