Re: UIMA

2014-01-15 Thread Jens Grivolla
Hello Burcu, UIMA has an entirely different purpose actually, and doesn't do classification or clustering. You would rather use UIMA to enrich documents (individually) through text analysis and then use the result to create better feature vectors to use with Solr, Mahout, etc. We typically

Re: UIMA

2014-01-15 Thread Burcu B
Hi, Thank you, Jens. I was planning to use OpenNLP directly for named entity recognition, for the analysis you've mentioned, and Lucene for tokenization. However, UIMA has an OpenNLP component, too. What is the reason to use UIMA instead of using OpenNLP and SOLR together? I am planning to use

Re: UIMA

2014-01-15 Thread Jens Grivolla
Hi, the advantage of using UIMA over plain OpenNLP is that it lets you more easily combine components from different sources, e.g. a tokenizer and POS tagger from OpenNLP, a parser from Stanford, etc. You then have components for input that deal with the different sources

UIMA

2014-01-14 Thread Burcu B
Hi, I'd like to know why someone should prefer UIMA when developing an application for end users to classify and cluster general-purpose documents. I have two options: 1- integrating Mahout, SOLR, R, Hadoop and other file sources such as document management systems or the file system. 2- or doing