2011/9/14 Stefane Fermigier <[email protected]>:
>
> Anyway, yes, he is trashing Stanbol (at least, he does not say that the
> Stanbol version he is using is still an early prototype), but he is fair
> in his conclusions.
>
> And I think that recall and precision ~= 50% for a project where entity
> extraction is just a side project is already a promising result!

No, it's not: it's completely useless in its current state. But there
are easy ways to greatly improve it:

- give the NamedEntityTaggingEngine an option to ignore potential
matches that are not an exact name match (that should improve the
precision dramatically).

- build and distribute more complete indexes, and document on the
project homepage how to download and deploy them (that should improve
the recall). This is improving but is still not easy for end users,
so nobody does it and everyone uses the small index that ships by
default instead.
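
To illustrate the first suggestion, here is a minimal sketch in plain
Python. It is not Stanbol code: the Candidate class, the helper names
and the toy data are all made up for illustration, and the real engine
may compare names differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one suggested entity (not a Stanbol type)."""
    mention: str   # text span found in the document
    label: str     # name of the index entry suggested for that span
    correct: bool  # hand-assigned judgement, used for evaluation only


def exact_matches(candidates):
    """Keep only candidates whose label equals the mention, ignoring case."""
    return [
        c for c in candidates
        if c.mention.strip().casefold() == c.label.strip().casefold()
    ]


def precision(candidates):
    """Fraction of the given candidates that are judged correct."""
    if not candidates:
        return 0.0
    return sum(c.correct for c in candidates) / len(candidates)


# Made-up toy data: two good suggestions and two spurious partial matches.
candidates = [
    Candidate("Paris", "Paris", True),
    Candidate("Paris", "Paris Hilton", False),
    Candidate("Apache", "Apache Software Foundation", True),
    Candidate("Bob", "Bob Marley", False),
]

kept = exact_matches(candidates)
print(precision(candidates))  # 0.5
print(precision(kept))        # 1.0
```

On this toy data the filter also drops one correct partial match
("Apache"), which is why it would be better as an option than as the
only behaviour.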

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
