Re: Apache Stanbol vs. Ontotext KIM

Andreas Gruber Wed, 14 Sep 2011 08:20:55 -0700

Hi,

Olivier Grisel wrote:

2011/9/14 Stefane Fermigier <[email protected]>:

Anyway, yes, he is trashing Stanbol (or at least not saying that the Stanbol 
version he is using is still an early prototype), but he is fair in his 
conclusions.


And I think that recall and precision of ~50% for a project where entity 
extraction is just a side project is already a promising result!


No it's not, it's completely useless in its current state. But there
are easy ways to greatly improve the current state:

- make the NamedEntityTaggingEngine have an option to ignore potential
matches that are not an exact name match (that should improve the
precision dramatically)

+1


- build and distribute more complete indexes and document on the
homepage of the project how to download and deploy them (that should
improve the recall): this is improving but still not easy to do for
the end users => nobody does it and instead uses the small index that
comes by default

FYI: I am working on a howto [1] for creating and using indexes (still a staging draft), where I could also link to pre-generated indexes served by [2].
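
To illustrate the exact-name-match option suggested above, here is a minimal sketch in Python. It is not Stanbol code and does not use the NamedEntityTaggingEngine API: the function names, the candidate structure and the "ex:" identifiers are all hypothetical, and the normalization (case folding plus whitespace collapsing) is just one assumed definition of "exact".

```python
# Hypothetical sketch: keep only entity candidates whose label equals the
# extracted mention after light normalization. Not the Stanbol API.

def normalize(name):
    """Case-fold and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.casefold().split())


def exact_matches(mention, candidates):
    """Return the candidates that have at least one label equal to the mention."""
    target = normalize(mention)
    return [
        c for c in candidates
        if any(normalize(label) == target for label in c["labels"])
    ]


# Made-up candidates, as a fuzzy index lookup for "paris" might return them.
candidates = [
    {"id": "ex:Paris", "labels": ["Paris"]},
    {"id": "ex:Paris_Hilton", "labels": ["Paris Hilton"]},
    {"id": "ex:Paris_Texas", "labels": ["Paris, Texas", "Paris"]},
]

kept = exact_matches("paris", candidates)
kept_ids = [c["id"] for c in kept]
```

With these made-up candidates, the partial match "Paris Hilton" is dropped while both entities labelled exactly "Paris" are kept. That is the trade-off behind the suggestion: fewer spurious suggestions (higher precision), at the cost of missing entities that are only mentioned by a partial or variant name.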


Andreas

[1] http://stanbol.staging.apache.org/stanbol/docs/trunk/customvocabulary.html

[2] http://dev.iks-project.eu/downloads/stanbol-indices/
