MapReduce Training phase

2013-12-07 Thread unmesha sreeveni
I want to know more about how algorithms like SVM are made parallel: whether the training phase, the prediction phase, or both are run as MapReduce (MR) jobs. In normal cases the training phase is apt for MR, as it takes a lot of time. Do we need to run prediction as MR as well? -- *Thanks Regards* Unmesha Sreeveni U.B *Junior

Re: SVM Implementation for mahout?

2013-12-07 Thread Fernando Santos
Thanks Manuel. It seems that these two patches (https://issues.apache.org/jira/browse/MAHOUT-334 and https://issues.apache.org/jira/browse/MAHOUT-232) might work, although not in parallel. Has anyone successfully used either of these patches and could share some comments about it?

Re: Test naivebayes task running really slowly and not in distributed mode

2013-12-07 Thread Fernando Santos
I realized what the problem was. First of all, the data was not big enough to split the job into more than one task. The training file was 30MB and my block size was 64MB. Besides that, I set the number of map (mapred.map.tasks) and reduce (mapred.reduce.tasks) tasks in the mapred-site.xml file of
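
The arithmetic behind the single-task behaviour can be sketched as follows. This is a rough estimate only, assuming one map task per input split and a split size equal to the block size; the function name `estimated_map_tasks` is hypothetical, and the sketch ignores task-count hints such as mapred.map.tasks and any minimum or maximum split-size settings, which can change the real number.

```python
import math


def estimated_map_tasks(file_size_mb: float, block_size_mb: float) -> int:
    """Estimate map tasks as ceil(file size / block size).

    Assumption: split size == block size, one map task per split.
    """
    if file_size_mb <= 0:
        return 0
    return math.ceil(file_size_mb / block_size_mb)


# The case from the message: a 30MB training file with 64MB blocks
# fits in a single split, so only one map task is expected.
print(estimated_map_tasks(30, 64))   # 1

# A hypothetical larger input for comparison.
print(estimated_map_tasks(200, 64))  # 4
```

Under these assumptions, the input has to exceed one block before a second map task appears.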

Re: SVM Implementation for mahout?

2013-12-07 Thread Suneel Marthi
Any specific reason you are looking for an SVM implementation only? Are you sure that those patches are still relevant given the codebase today? On Saturday, December 7, 2013 2:58 PM, Fernando Santos fernandoleandro1...@gmail.com wrote: Thanks Manuel. It seems that these two

Re: SVM Implementation for mahout?

2013-12-07 Thread Fernando Santos
Hello Suneel, I want to check whether SVM gives better performance. I've been using naive bayes, but my data is quite unbalanced and therefore I'm getting pretty bad results with it. I also tried the complementary naive bayes, but got the same bad results. I read about this difference

Re: SVM Implementation for mahout?

2013-12-07 Thread Lucas Fernandes Brunialti
Hello Fernando, The naive bayes approach makes the assumption that your features are independent; if your features are highly correlated, naive bayes won't be a good choice. I would advise you to try neural networks (MLP), which can get a better decision surface than logistic regression...
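
The effect of correlated features on naive bayes can be illustrated with a small sketch. The function `nb_posterior` and all probabilities below are made up for illustration and do not come from the thread or from Mahout. The sketch feeds the same piece of evidence once, and then three times as if it were three independent features, which is the extreme case of correlation.

```python
def nb_posterior(prior_pos, likelihoods_pos, likelihoods_neg):
    """Return P(pos | features) for two classes under the naive
    independence assumption: per-feature likelihoods are multiplied."""
    score_pos = prior_pos
    score_neg = 1.0 - prior_pos
    for l_pos, l_neg in zip(likelihoods_pos, likelihoods_neg):
        score_pos *= l_pos
        score_neg *= l_neg
    return score_pos / (score_pos + score_neg)


# One feature with P(x | pos) = 0.8 and P(x | neg) = 0.4 (illustrative values).
single = nb_posterior(0.5, [0.8], [0.4])

# The same feature repeated three times: no new information is added,
# but naive bayes counts the evidence three times.
duplicated = nb_posterior(0.5, [0.8, 0.8, 0.8], [0.4, 0.4, 0.4])

print(round(single, 3))      # 0.667
print(round(duplicated, 3))  # 0.889
```

The posterior becomes more confident even though the repeated features carry nothing new, which is one way highly correlated features can distort naive bayes predictions.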