Re: Apache "Text Analysis" top-level project?

Jörn Kottmann Mon, 09 Jul 2012 01:50:10 -0700

On 07/09/2012 05:56 AM, Lance Norskog wrote:

Would it make sense to join OpenNLP, UIMA, and Open Relevance into one
top-level "Text Analysis" project? There are already cross-project
connections between UIMA and OpenNLP. ORP seems dormant. It also seems
a more natural place than OpenNLP for a database of tagged text.


OpenNLP and UIMA align nicely in my opinion. OpenNLP just implements
engines for various NLP tasks without any further support.
UIMA, on the other hand, can do a lot of the additional things you need to
run OpenNLP in a production system, e.g. scaling the engines to many
machines, providing workflow support, resource loading and management, etc.

So there is not really an overlap between the two.

UIMA has some NLP-related add-ons in its sandbox. Some of them duplicate
functionality that is also provided by OpenNLP, e.g. POS tagging or the
dictionary annotator, but that does not seem to be that much.

Lucene contains a lot of NLP code for stemming and word segmentation in
different languages. That's probably the biggest NLP-related code base
next to OpenNLP at Apache.


Jörn
