Hi Edi

On Fri, Feb 6, 2015 at 7:22 PM, Edi Bice <edi_b...@yahoo.com.invalid> wrote:
>
> Quick question. Does the disambiguation-mlt engine work with the FST Linking 
> engine?
>

Quick answer. Currently not :(

> P.S. For the gory details:
> Chain 1: lang, sent, tok, pos, chunker, dbpedia-link, disamb-mlt, dbpedia-deref
> Chain 2: lang, tok, dbpedia-fst-link, disamb-mlt, dbpedia-deref
> Stepping through code while using chain 2 via remote debug I end up in
> if (entityhubSites.isEmpty()) {
>     log.debug("TextAnnotation {} (selectedText: {}, start: {}) has "
>             + "suggestions do not have 'entityhub:site' information. "
>             + "Can not disambiguate because origin is unknown.",
>             new Object[] {entity.uri, entity.name, entity.start});
>     return null; // Ignore TextAnnotatiosn with suggestions of unknown origin.
>
> I prepared a new dbpedia index from DBpedia 2014 leaving everything as
> is (except for modifying the script to obtain 2014 instead of 3.9).

Yes. This is the reason. The FST linking engine does not use an
Entityhub Site but directly looks up a SolrCore. Because of that it
cannot "know" the entityhub:site name for that SolrCore (there might
not even be an Entityhub Site for that core).

STANBOL-1391 [1] added an option for the FST linking engine to
provide origin information for the annotations it creates. To make
the MLT based disambiguation engine work with the FST linking engine,
one could extend the MLT disambiguation engine so that it also
considers "fise:origin" values in addition to "entityhub:site". In
that case users would need to configure the name of the Entityhub
Site as "enhancer.engines.linking.lucenefst.origin" for the FST
linking engine.
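
To illustrate the idea, here is a minimal, self-contained sketch of
such a fallback. It is not the Stanbol implementation: suggestions are
modelled as plain maps, and the class name OriginFallback, the method
resolveOrigins and the origin value "dbpedia" are all made up for this
example. Only the two property names come from the discussion above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: not part of Apache Stanbol.
final class OriginFallback {
    static final String ENTITYHUB_SITE = "entityhub:site";
    static final String FISE_ORIGIN = "fise:origin";

    /**
     * Collects the origin of each suggestion. Prefers "entityhub:site"
     * and falls back to "fise:origin" when no site is present.
     * Suggestions with neither property are skipped.
     */
    static Set<String> resolveOrigins(List<Map<String, String>> suggestions) {
        Set<String> origins = new LinkedHashSet<>();
        for (Map<String, String> suggestion : suggestions) {
            String origin = suggestion.get(ENTITYHUB_SITE);
            if (origin == null) {
                origin = suggestion.get(FISE_ORIGIN); // assumed fallback
            }
            if (origin != null) {
                origins.add(origin);
            }
        }
        return origins;
    }

    public static void main(String[] args) {
        // A suggestion as the FST linking engine might write it once an
        // origin is configured ("dbpedia" is an assumed value).
        List<Map<String, String>> fromFst =
                List.of(Map.of(FISE_ORIGIN, "dbpedia"));
        System.out.println(resolveOrigins(fromFst)); // prints [dbpedia]
    }
}
```

With a fallback like this, the isEmpty() check quoted above would only
trigger for suggestions that carry neither property. The value set for
"enhancer.engines.linking.lucenefst.origin" would have to match the
name of an existing Entityhub Site for the disambiguation to find it.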

I created STANBOL-1411 [2] to track this feature request.

best
Rupert


[1] https://issues.apache.org/jira/browse/STANBOL-1391
[2] https://issues.apache.org/jira/browse/STANBOL-1411


-- 
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO 
..........................................................................
| http://redlink.co/
