Re: Is mahout kmeans slow ?

2012-09-13 Thread Paritosh Ranjan
You can also try to find initial clusters first using canopy clustering; it's a fast, single-iteration clustering algorithm. https://cwiki.apache.org/confluence/display/MAHOUT/Canopy+Clustering Canopy clustering would provide you with better initial clusters, which you can feed into kmeans for faster
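The idea of seeding kmeans from canopies can be sketched as follows. This is a simplified, in-memory illustration and not Mahout's implementation: the function name `canopy_centers`, the thresholds `t1` and `t2` (with `t1 > t2`), the Euclidean distance, and the sample points are all assumptions made for the example.

```python
import math


def canopy_centers(points, t1, t2):
    """Single-pass canopy clustering; returns one centroid per canopy.

    t1 is the loose threshold and t2 the tight one (t1 > t2).
    """
    remaining = list(points)
    centers = []
    while remaining:
        # Take the next unclaimed point as the seed of a new canopy.
        seed = remaining.pop(0)
        members = [seed]
        still_remaining = []
        for p in remaining:
            d = math.dist(seed, p)
            if d < t1:
                # Loosely close to the seed: counts as a canopy member.
                members.append(p)
            if d >= t2:
                # Not tightly bound to this seed: may still seed or join
                # another canopy later.
                still_remaining.append(p)
        remaining = still_remaining
        # Use the mean of the members as the canopy center.
        centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return centers


# Hypothetical sample data: two well-separated groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers = canopy_centers(points, t1=3.0, t2=2.0)
print(centers)
```

The returned centers would then serve as the initial clusters for kmeans, so kmeans starts near the dense regions instead of from random points.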

Re: How to use kmeans clustering algorithm of Mahout

2012-09-13 Thread Paritosh Ranjan
Please describe the problem you are facing in detail here; I hope you will get an answer. On 13-09-2012 08:29, Don.Tan wrote: I have tried it by following the way of the sample code, and I noticed that I should not use seq2sparse directory. That leads to the sparse

Re: Newbie question on modeling a Recommender using Mahout when the matrix is sparse

2012-09-13 Thread Sean Owen
Well there are only 7 products in the universe! If you ask for 10 recommendations, you will always get all unrated items back in the recommendations. That's always true unless the algorithm can't actually establish a value for some items. What result were you expecting, less than 10 recs? less
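The point about asking for more recommendations than there are unrated items can be illustrated with a small sketch. It is not Mahout's Recommender API: the `recommend` function, the item names, and the estimated preference values are hypothetical and only show how a top-N request is capped by the number of candidate items.

```python
def recommend(user_ratings, all_items, estimate, how_many):
    """Return up to how_many unrated items, best estimated preference first."""
    # Only items the user has not rated yet are candidates.
    candidates = [item for item in all_items if item not in user_ratings]
    scored = [(item, estimate(item)) for item in candidates]
    # Items for which no preference can be estimated are dropped.
    scored = [(item, score) for item, score in scored if score is not None]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:how_many]


# Hypothetical universe of 7 products; the user has rated one of them.
all_items = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
user_ratings = {"p1": 4.0}
# Hypothetical estimated preferences for the unrated items.
estimates = {"p2": 3.0, "p3": 3.5, "p4": 4.5, "p5": 2.0, "p6": 1.0, "p7": 5.0}

top10 = recommend(user_ratings, all_items, estimates.get, 10)
top5 = recommend(user_ratings, all_items, estimates.get, 5)
print(len(top10), len(top5))
```

Asking for 10 returns every unrated item that has an estimate, while asking for 5 returns only the five highest-scoring ones.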

Re: Running a single test

2012-09-13 Thread Dhruv
Your command is correct, and it should run a single test. I just tried running a new test I wrote from the $MAHOUT_HOME/core/ directory. From where are you firing this command? On Wed, Sep 12, 2012 at 12:09 PM, Nick Kolegraff nickkolegr...@gmail.com wrote: Does this work for anyone? mvn

Building Mahout

2012-09-13 Thread David Scarlatti
Hi, I'm installing Mahout, following these steps ( http://cloudblog.8kmiles.com/2012/01/31/apache-mahout-installation-on-hadoop-cluster/ ): user1@ubuntu-server:~$ apt-get install maven2 user1@ubuntu-server:~$ cd /opt user1@ubuntu-server:~$ svn co http://svn.apache.org/repos/asf/mahout/trunk

Re: Building Mahout

2012-09-13 Thread Paritosh Ranjan
The current build is broken, as sometimes happens during development: https://builds.apache.org/job/Mahout-Quality/1658/console. Until it is fixed, I would suggest skipping the tests and building. On 13-09-2012 15:59, David Scarlatti wrote: Hi, I'm installing Mahout, following these steps (

Re: Running a single test

2012-09-13 Thread Nick Kolegraff
Thanks for the response. I have tried firing the command from both $MAHOUT_HOME (as this suggests https://cwiki.apache.org/MAHOUT/buildingmahout.html) and $MAHOUT_HOME/core, both with the same errors as reported above. ($MAHOUT_HOME is set to the root of mahout) using: git:

Re: Running a single test

2012-09-13 Thread Nick Kolegraff
Ok, sorry for the bother. I took a recent pull and everything seems to be working fine now. Working from this, for others' reference: commit: 79313f55c3c3d38a4999a5cb0656170bc9e29434 git-svn-id: https://svn.apache.org/repos/asf/mahout/trunk@1381780 13f79535-47bb-0310-9956-ffa450edef68 On Thu, Sep

RE: Building Mahout

2012-09-13 Thread I-Scarlatti, David
Ok. So tests are just tests... not needed for having mahout running. Thanks! -----Original Message----- From: Paritosh Ranjan [mailto:pran...@xebia.com] Sent: Thursday, September 13, 2012 1:15 PM To: user@mahout.apache.org; d_scarla...@yahoo.es Subject: Re: Building Mahout The current

Re: Is mahout kmeans slow ?

2012-09-13 Thread Pat Ferrel
What distance measure? On Sep 12, 2012, at 10:37 PM, Elaine Gan elaine-...@gmo.jp wrote: My -cd was quite loose, set it at 0.1 Hmm.. maybe the data is too small, causing the low performance..? 200 iterations? What is your convergence delta? If it is too small for your distance measure

Re: Is mahout kmeans slow ?

2012-09-13 Thread Pat Ferrel
Actually, if it is really taking 200 iterations then it is never matching your convergence delta. That means either your data does not cluster well or your convergence delta is still too tight. I was suggesting that you loosen the convergence delta until it only takes 10-20 iterations to cluster
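How the convergence delta interacts with the iteration count can be seen in a minimal k-means sketch. This is plain Python, not Mahout's code: the function `kmeans`, its parameter names, the Euclidean distance, and the sample points are assumptions for illustration. The loop stops as soon as no centroid moves by more than the delta, or when the iteration cap is reached.

```python
import math


def kmeans(points, centroids, convergence_delta, max_iterations):
    """Lloyd's k-means; returns (centroids, iterations_run, converged)."""
    for iteration in range(1, max_iterations + 1):
        # Assign each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its assigned points.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster in place
        # Converged when no centroid moved by more than the delta.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= convergence_delta:
            return centroids, iteration, True
    # The cap was hit without ever satisfying the delta.
    return centroids, max_iterations, False


# Hypothetical sample data and deliberately poor initial centroids.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
initial = [(0.0, 0.0), (1.0, 0.0)]

loose = kmeans(points, initial, convergence_delta=5.0, max_iterations=200)
tight = kmeans(points, initial, convergence_delta=0.01, max_iterations=200)
capped = kmeans(points, initial, convergence_delta=0.01, max_iterations=1)
print(loose[1], tight[1], capped[2])
```

A looser delta stops earlier, a tighter one needs more passes, and a run that reaches the iteration cap reports that it never met the delta, which is the situation described above for a run that uses all 200 iterations.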

Re: Building Mahout

2012-09-13 Thread Ted Dunning
Yes. It is a grave embarrassment to us, but not a functional requirement. On Thu, Sep 13, 2012 at 6:42 AM, I-Scarlatti, David david.scarla...@boeing.com wrote: Ok. So tests are just tests... not needed for having mahout running Thanks! -----Original Message----- From: Paritosh

Re: Mahout Kmeans

2012-09-13 Thread Gustavo Enrique Salazar Torres
Hi Paritosh: I made it work in Hadoop mode, not local. I don't know if that's desirable. I also got this error: Hadoop libraries are missing when running local and, from what I saw in the mahout script, it simply discards all libraries when MAHOUT_LOCAL is set. So, is the local mode used for

hadoop-0.19 and mahout 0.7: throwing incompatible errors, how can I fix it?

2012-09-13 Thread Phoenix Bai
Hi guys, I am trying to compile my application code using mahout 0.7 and hadoop 0.19. During the compile process, it throws the errors below: $ hadoop jar cluster-0.0.1-SNAPSHOT-jar-with-dependencies.jar mahout.sample.ClusterVideos 12/09/13 20:36:18 INFO

Re: Newbie question on modeling a Recommender using Mahout when the matrix is sparse

2012-09-13 Thread Gokul Pillai
Very true, good catch. I think I was interpreting the results the wrong way. I expect only the top 5, so I changed the parameter to 5 instead of 10 and the results are as expected now. Thanks. On Wed, Sep 12, 2012 at 11:36 PM, Sean Owen sro...@gmail.com wrote: Well there are only 7 products in