subject:"mahout text mining"

mahout text mining

2014-01-16 Thread qiaoresearcher

Mahout has an example of using Naive Bayes to classify the 20 Newsgroups dataset. But how can I classify individual paragraphs (e.g. Twitter messages, movie reviews) stored in text files? The text files have content like: -- text paragraph 1 class

Re: mahout text mining

2014-01-16 Thread Suneel Marthi

See http://chimpler.wordpress.com/2013/03/13/using-the-mahout-naive-bayes-classifier-to-automatically-classify-twitter-messages/ for classifying Twitter messages. Lucene has support for n-grams, stop words, the Porter stemmer, the Snowball stemmer, language-specific analyzers, etc. Mahout uses Lucene
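To make the analysis step concrete, here is a rough sketch in plain Python of what an analyzer chain does before classification: lowercasing, tokenizing, dropping stop words, and emitting n-grams. This is not Lucene's or Mahout's API. The function name `analyze`, the tiny `STOP_WORDS` set, and the token regex are all assumptions made up for illustration, and stemming is left out.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real analyzers ship larger ones.
STOP_WORDS = {"a", "an", "the", "is", "of", "to", "and"}

def analyze(text, ngram_size=2):
    """Lowercase, tokenize, drop stop words, then return unigrams plus n-grams."""
    tokens = [t for t in re.findall(r"[a-z0-9']+", text.lower())
              if t not in STOP_WORDS]
    # Word n-grams are built from the tokens that survived stop-word removal.
    ngrams = [" ".join(tokens[i:i + ngram_size])
              for i in range(len(tokens) - ngram_size + 1)]
    return tokens + ngrams

print(analyze("The movie is a waste of time"))
# ['movie', 'waste', 'time', 'movie waste', 'waste time']
```

The resulting terms are what would then be counted and turned into a vector per message.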

Re: mahout text mining

2014-01-16 Thread qiaoresearcher

Suneel, thanks a lot. I assume the example you mentioned generates a numerical vector for each paragraph, is that right? Now, to further improve performance, I may add features from other data sets to this vector, making it much longer, and then use the enriched vector for naive
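The idea of one numerical vector per paragraph, extended with features from elsewhere, might look roughly like the following. This is a plain-Python sketch with simple term counts, not Mahout's vectorization code. The names `build_vocabulary`, `vectorize`, and `extra_features` are hypothetical, and whitespace tokenization is assumed for brevity.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map each distinct token to a column index (sorted so the order is stable)."""
    vocab = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(document, vocabulary, extra_features=()):
    """Term-count vector for one paragraph, followed by extra numeric features."""
    counts = Counter(document.lower().split())
    # Iterating the dict follows insertion order, which matches the column index.
    text_part = [float(counts.get(tok, 0)) for tok in vocabulary]
    return text_part + [float(x) for x in extra_features]

docs = ["great movie", "bad movie bad plot"]
vocab = build_vocabulary(docs)  # columns: bad, great, movie, plot
print(vectorize(docs[1], vocab, extra_features=[0.5, 3]))
# [2.0, 0.0, 1.0, 1.0, 0.5, 3.0]
```

One caveat, as far as I understand it: multinomial naive Bayes treats every column as a non-negative count-like weight, so appended features may need to be scaled or binned before they behave sensibly next to term counts.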

