Deprecated code for Text NaiveBayesClassifier in Mahout

2014-08-08 Thread François Bossière
Hi, I am currently working on P. Giacomelli's great Mahout Cookbook and I am stuck with a problem of deprecated code. I am building the Text NaiveBayesClassifier using the code, and it is written: final BayesParameters params = new BayesParameters(); params.setGramSize( 1 );

Re: RowSimilarityJob implementation with Spark

2014-08-08 Thread Ted Dunning
On Thu, Aug 7, 2014 at 3:22 AM, Reinis Vicups mah...@orbit-x.de wrote: During my tests I observed that there were always 2 to 4 long-running tasks that determined the critical path of the whole Spark job (as in, there was one task running for a whole 18 minutes). Also I observed that only through

Re: UserBasedRecommender question

2014-08-08 Thread Pat Ferrel
BTW, I ran across this page on the Mahout wiki that explains what runs on single machines, MapReduce, and Spark: http://mahout.apache.org/users/basics/algorithms.html On Aug 6, 2014, at 1:31 PM, Pat Ferrel pat.fer...@gmail.com wrote: Most people use Mahout as a library, so they write Java to use

Re: Deprecated code for Text NaiveBayesClassifier in Mahout

2014-08-08 Thread Ted Dunning
What does the author of the book say? On Thu, Aug 7, 2014 at 11:36 PM, François Bossière francois.bossi...@gmail.com wrote: Hi, I am currently working on P. Giacomelli's great Mahout Cookbook and I am stuck with a problem of deprecated code. I am building the Text NaiveBayesClassifier

Re: Problem with Mahout Text Classifier following Apache Mahout Cookbook examples

2014-08-08 Thread Ted Dunning
Piero, it might help if you put your examples with updates on GitHub so that you can point people to that. On Thu, Aug 7, 2014 at 2:30 AM, Piero Giacomelli pgiac...@gmail.com wrote: OK, nice. In case you have more problems, please do not hesitate to ask me. Piero Giacomelli 2014-08-07 11:29

CSV to Mahout Seqfile

2014-08-08 Thread Aniket
Hi, I am working on a project and want to run a dataset through Mahout's naive Bayes classifier. The dataset is in CSV format with columns (id, rating, summary, review, label). id: numeric. rating: numeric (1 to 5). summary: 4-5 text strings. review: more text and strings. label: positive or negative.
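
A minimal sketch of the first step for the dataset described above: turning each CSV row into a text key/value pair. It assumes a header row, exactly five columns in the order given (id, rating, summary, review, label), and no line breaks inside quoted fields. The key layout "/label/id" follows the convention I believe Mahout's naive Bayes examples use to carry the category in the key; treat that as an assumption to check against your Mahout version. The class and method names (CsvToKeyValue, parseLine, toPairs) are hypothetical, and the rating column is simply ignored here.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvToKeyValue {

    /** Splits one CSV line, honouring double-quoted fields and "" escapes. */
    public static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cur.append('"'); // escaped quote inside a quoted field
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }

    /**
     * Returns {key, value} pairs: key is "/label/id" (assumed convention),
     * value is the summary and review joined by a space.
     * The first line is treated as a header; malformed rows are skipped.
     */
    public static List<String[]> toPairs(List<String> lines) {
        List<String[]> pairs = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            List<String> f = parseLine(lines.get(i));
            if (f.size() != 5) {
                continue; // not id, rating, summary, review, label
            }
            String id = f.get(0).trim();
            String summary = f.get(2).trim();
            String review = f.get(3).trim();
            String label = f.get(4).trim();
            pairs.add(new String[] {"/" + label + "/" + id, summary + " " + review});
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<String> lines = List.of(
            "id,rating,summary,review,label",
            "1,5,Great phone,\"Fast, light and cheap\",positive",
            "2,1,Broke quickly,Stopped working after a week,negative");
        for (String[] kv : toPairs(lines)) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

To get an actual sequence file, each pair would then be appended through Hadoop's SequenceFile.Writer with Text keys and Text values; that part is left out so the sketch runs without Hadoop on the classpath.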

Re: CSV to Mahout Seqfile

2014-08-08 Thread Suneel Marthi
See http://stackoverflow.com/questions/13663567/mahout-csv-to-vector-and-running-the-program On Fri, Aug 8, 2014 at 11:05 PM, Aniket sankhe@gmail.com wrote: Hi, I am working on a project and want to run a dataset through Mahout's naive Bayes classifier. The dataset is in CSV format with columns (