Fwd: Algorithms in Mahout

2013-11-25 Thread unmesha sreeveni
I have gone through http://mahout.apache.org looking for data mining algorithms already implemented on the Hadoop platform. From that I understood that 1. K-means 2. Decision Tree 3. Naive Bayes have implementations on the Hadoop platform. And for 4. DBSCAN 5. k-nearest neighbor 6. SVM 7. Logistic

Re: Fwd: Algorithms in Mahout

2013-11-25 Thread Pavan K Narayanan
k-nearest neighbor, SVM, logistic regression, and neural nets exist in Mahout. Just type mahout and press Enter and you'll see the list of available algorithms, then type mahout algo-name -h to get detailed information about how to use and configure them. Pavan On Nov 25, 2013 2:44 PM, unmesha sreeveni

Re: Fwd: Algorithms in Mahout

2013-11-25 Thread Sebastian Schelter
Of the algorithms listed, only logistic regression (non-distributed) is implemented. Sorry for the confusion; we are currently reworking the wiki. On 25.11.2013 10:24, Pavan K Narayanan wrote: k-nearest neighbor, SVM, logistic regression, neural nets exist in Mahout. Just type mahout and

Re: Fwd: Algorithms in Mahout

2013-11-25 Thread unmesha sreeveni
So currently we don't have Decision Tree in the Mahout 0.6 release. On Mon, Nov 25, 2013 at 2:59 PM, Sebastian Schelter ssc.o...@googlemail.com wrote: From the algorithms listed, only logistic regression (non-distributed) is implemented. Sorry, for the confusion, we are currently reworking the

Re: Algorithms in Mahout

2013-11-25 Thread Manuel Blechschmidt
Hi Unmesha, please also consult JIRA as a source for algorithms; there you will find implementations or discussions, e.g. for neural networks, a.k.a. multilayer perceptrons: https://issues.apache.org/jira/browse/MAHOUT-1265 https://issues.apache.org/jira/browse/MAHOUT-976 SVM:

Re: HELP for implicit data feed back - beginner

2013-11-25 Thread Antony Adopo
Hello, I discovered an ebook and an article that helped me with my problem. The article: http://www.csulb.edu/web/journals/jecr/issues/20044/Paper1.pdf The ebook: http://www.amazon.fr/gp/product/B00BEQ82FY/ref=oh_d__o00_details_o00__i00?ie=UTF8psc=1 Very interesting. 2013/11/23 Manuel

Re: Canopy threshold limitation

2013-11-25 Thread Chih-Hsien Wu
Hey Suneel, thanks for the reply. I'm trying to create hierarchical clusters via a top-down approach. I'm caught in the trade-off between a lower canopy threshold and running out of heap memory. Streaming k-means sounds ideal for top clustering. What are the major differences between Streaming kmeans

Re: Algorithms in Mahout

2013-11-25 Thread Ted Dunning
On Mon, Nov 25, 2013 at 3:14 AM, Manuel Blechschmidt manuel.blechschm...@gmx.de wrote: There are/were multiple kNN implementations in Mahout: Recommender knn

Recommender Streaming with EMR

2013-11-25 Thread Bryan Marble
Hello - If this isn't the best forum to ask, please let me know. TL;DR; Is there a way to stream preference/user data to an EMR recommender workflow without having to go through the pain of re-uploading all preference data, and starting brand new jobs over and over, etc? I am trying to

Re: Recommender Streaming with EMR

2013-11-25 Thread Manuel Blechschmidt
Hi Bryan, On 25.11.2013, at 17:14, Bryan Marble wrote: Hello - If this isn't the best forum to ask, please let me know. This is the correct forum to ask this question. TL;DR; Is there a way to stream preference/user data to an EMR recommender workflow without having to go through

java.io.IOException: Failed to set permissions of path

2013-11-25 Thread Antony Adopo
Hello, on my first install of Mahout, I get this error in Eclipse on many tests: java.io.IOException: Failed to set permissions of path. Could someone please help me fix it? Thanks

Only one reducer running on canopy generator

2013-11-25 Thread Chih-Hsien Wu
Hi all, I have been experiencing memory issues while running the Mahout canopy algorithm on a big data set on Hadoop. I noticed that only one reducer was running while the other nodes were idle. I was wondering if increasing the number of reduce tasks would reduce the memory usage and speed up

Re: Only one reducer running on canopy generator

2013-11-25 Thread Suneel Marthi
Canopy Clustering is a two-step process: Canopy Generation followed by Canopy Clustering. Canopy Generation uses a single reducer (and this cannot be overridden), while the Clustering task uses multiple reducers. You seem to be hitting OOM during the Canopy Generation phase. On
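
For readers unfamiliar with the generation step discussed in this thread, here is a minimal single-machine sketch of textbook canopy generation with a loose threshold t1 and a tight threshold t2. It is an illustration only, not Mahout's code: the function names, the Euclidean distance, and the sample points are all assumptions made for the example. It may help show why a lower tight threshold tends to produce more canopies, and therefore more state to hold in memory.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def generate_canopies(points, t1, t2, distance=euclidean):
    """Textbook canopy generation (hypothetical helper, not Mahout's API).

    t1 is the loose threshold and t2 the tight threshold, with t1 > t2.
    Returns a list of (center, members) pairs.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    canopies = []
    while remaining:
        # Take the next unassigned point as a new canopy center.
        center = remaining.pop(0)
        members = [center]
        still_remaining = []
        for p in remaining:
            d = distance(center, p)
            if d < t1:
                # Within the loose threshold: p belongs to this canopy.
                members.append(p)
            if d >= t2:
                # Outside the tight threshold: p may still seed or join
                # another canopy later.
                still_remaining.append(p)
        remaining = still_remaining
        canopies.append((center, members))
    return canopies


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 5.0), (10.0, 10.0)]
    for t2 in (1.0, 0.1):
        result = generate_canopies(sample, t1=3.0, t2=t2)
        print("t2 =", t2, "canopies =", len(result))
```

In this sketch, lowering t2 removes fewer points per canopy, so more points survive to become centers of their own canopies.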

Re: Algorithms in Mahout

2013-11-25 Thread unmesha sreeveni
Thanks for the replies. I will go through those links. Thanks for taking the time to help me :) On Mon, Nov 25, 2013 at 11:59 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: Dhruv, could you update the patch to the present trunk codebase and also create a Wiki page for this? On Monday, November 25,