Re: Sensitivity setting of the Stanbol extraction

Olivier Grisel Sun, 25 Mar 2012 08:09:29 -0700

On 25 March 2012 16:53, Allel Benbrahim <[email protected]> wrote:
> Hello
>
> Regarding the previous issue, is there any means to set the sensitivity of
> the Stanbol extraction engine, that is, to define an acceptability
> threshold when it queries the index?
> We looked at the Stanbol "osgi" configuration screen in order to
> improve the matching of the detected words (person, location,
> organisation) but did not find a way to do this.
>
> Is there a way to set the sensitivity, or, if it is an open issue, is it
> planned on the project road-map?


It really depends on which engine you are referring to. Which kind of
failure is most annoying to you? Phrases detected as names of people,
locations or organizations whereas they should not be detected at all
(e.g. verbs, adverbs or other non-name noun phrases)? Or names of
people, locations or organizations that are linked to the wrong entity
in the knowledge base?

For the first type of error, there is a relevance score on the
TextAnnotation object returned in the results. This score behaves as a
normalized probability, so you can filter on it with a threshold of
your choice to tune the sensitivity. Alternatively, you could use the
OpenNLP tools to train new NameFinder models on hand-annotated text,
but this is a big effort.
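
As a rough illustration of thresholding on that score, here is a
minimal client-side sketch. It assumes the annotations have already
been parsed into plain dictionaries; the "score" key, the sample
values and the filter_annotations helper are hypothetical and are not
part of the Stanbol API.

```python
def filter_annotations(annotations, threshold=0.5):
    """Keep only annotations whose score is at least `threshold`.

    `annotations` is a list of dicts with a hypothetical "score" key
    holding a value between 0 and 1; a missing score counts as 0.
    """
    return [a for a in annotations if a.get("score", 0.0) >= threshold]


# Made-up sample data, standing in for parsed enhancement results.
annotations = [
    {"text": "Paris", "type": "Place", "score": 0.92},
    {"text": "quickly", "type": "Person", "score": 0.18},
    {"text": "Apache", "type": "Organisation", "score": 0.61},
]

kept = filter_annotations(annotations, threshold=0.6)
```

Raising the threshold drops more false detections at the cost of
missing some genuine names, so the right value has to be tuned on
your own documents.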

For the second type of error:
- build a larger index of entities to increase the rate of unambiguous
exact name matches,
- build a new engine in charge of entity disambiguation based on
contextual information, which would make it possible to return both
better linking results and an ambiguity or confidence score.
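
To make the disambiguation idea a bit more concrete, here is a toy
sketch that ranks candidate entities by word overlap with the text
around the mention and reports the winner's share of the overlap as a
confidence score. This is only one simple way to use context, not an
existing Stanbol engine; the disambiguate function, the candidate
identifiers and their descriptions are all made up for illustration.

```python
def disambiguate(context_words, candidates):
    """Pick the candidate whose description best overlaps the context.

    `candidates` maps a (hypothetical) entity id to a list of words
    describing it. Returns (best_id, confidence), where confidence is
    the best candidate's share of the total overlap, or (None, 0.0)
    when no candidate shares any word with the context.
    """
    context = {w.lower() for w in context_words}
    overlaps = {
        entity_id: len(context & {w.lower() for w in description})
        for entity_id, description in candidates.items()
    }
    total = sum(overlaps.values())
    if total == 0:
        return None, 0.0
    best = max(overlaps, key=overlaps.get)
    return best, overlaps[best] / total


# Made-up candidates for the ambiguous mention "Paris".
candidates = {
    "Paris_France": ["capital", "france", "seine", "city"],
    "Paris_Texas": ["city", "texas", "usa"],
}

best, confidence = disambiguate(
    ["flight", "to", "the", "capital", "of", "France"], candidates
)
```

A confidence close to 1 means a single candidate dominates, while a
value near an even split between candidates signals an ambiguous
link that a client may prefer to discard.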

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
