Thanks. Was there any fix for this, or is it still an open issue?

-----Original Message-----
From: Stevo Slavić [mailto:ssla...@gmail.com]
Sent: Saturday, July 27, 2013 1:27 AM
To: user@mahout.apache.org
Cc: Suneel Marthi
Subject: Re: mahout kmeans not generating clusteredPoint dir?

The current Mahout examples cluster Reuters build has the same issue:
https://builds.apache.org/user/sslavic/my-views/view/Mahout/job/Mahout-Examples-Cluster-Reuters/395/console

Kind regards,
Stevo Slavic.

On Wed, Jul 17, 2013 at 11:42 AM, Fuhrmann Alpert, Galit <galp...@ebay.com> wrote:

> Thanks Suneel.
> I tried to add this flag (though I think the clusteredPoints directory was
> supposed to be created by default?).
> Either way, for some reason whenever I add '-cl' (I tried running it on
> several data sets), I get the following error:
> "There is no queue named default"
> (even though I do specify a queue with -Dmapred.job.queue.name=...).
> I don't get this error otherwise.
>
> Has anyone ever encountered this error?
> Is there some sort of configuration I'm missing?
>
> Thanks,
>
> Galit.
>
> -----Original Message-----
> From: Suneel Marthi [mailto:suneel_mar...@yahoo.com]
> Sent: Wednesday, July 10, 2013 5:30 PM
> To: user@mahout.apache.org
> Subject: Re: mahout kmeans not generating clusteredPoint dir?
>
> It's been a while since I last worked with this, but I believe you are
> missing the clustering option '-cl'.
> Give that a try.
>
> ________________________________
> From: "Fuhrmann Alpert, Galit" <galp...@ebay.com>
> To: "user@mahout.apache.org" <user@mahout.apache.org>
> Sent: Wednesday, July 10, 2013 5:17 AM
> Subject: mahout kmeans not generating clusteredPoint dir?
>
> Hello,
>
> I ran mahout kmeans (using random seeds) on a Hadoop cluster. It ran
> successfully and created a directory containing clusters-*, including
> the last one, which was clusters-3-final.
> However, it did not create the clusteredPoints directory, or at least I
> cannot find it under the same dir (or anywhere else).
>
> My call was:
> mahout kmeans -k 4000 -i inputSeq.dat -o outputPath --maxIter 3 --clusters outputSeeds
>
> Was there an extra argument I needed to specify in order for it to
> generate the clusteredPoints?
> (BTW, I also can't see the outputSeeds. Was it created for the seeds and
> then deleted?)
>
> According to mahout in action:
>
> The k-means clustering implementation creates two types of directories
> in the output folder. The clusters-* directories are formed at the end
> of each iteration: the clusters-0 directory is generated after the first
> iteration, clusters-1 after the second iteration, and so on. These
> directories contain information about the clusters: centroid, standard
> deviation, and so on. The clusteredPoints directory, on the other hand,
> contains the final mapping from cluster ID to document ID. This data is
> generated from the output of the last MapReduce operation.
> The directory listing of the output folder looks something like this:
>
> $ ls -l reuters-kmeans-clusters
> drwxr-xr-x 4 user 5000 136 Feb 1 18:56 clusters-0
> drwxr-xr-x 4 user 5000 136 Feb 1 18:56 clusters-1
> drwxr-xr-x 4 user 5000 136 Feb 1 18:56 clusters-2
> ...
> drwxr-xr-x 4 user 5000 136 Feb 1 18:59 clusteredPoint
>
> Again, my call did not generate the clusteredPoint directory.
> I would appreciate your help.
>
> Thanks a lot,
>
> Galit.
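
For anyone hitting the same symptom, here is a minimal sketch of a check for the output layout described in the quoted passage. It is not part of Mahout: `check_kmeans_output` is a hypothetical helper name, and it assumes the k-means output folder has already been copied to the local filesystem (for example with `hadoop fs -get`). It looks for a `clusters-*-final` directory and for `clusteredPoints`, the spelling used in the book's text, and reports which are present.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout): report which of the expected
# k-means output directories exist under a LOCAL copy of the output folder.
check_kmeans_output() {
    out_dir="$1"
    status=0

    # The last clusters-N directory carries a -final suffix
    # (clusters-3-final in the run described in this thread).
    final_dir=""
    for d in "$out_dir"/clusters-*-final; do
        if [ -d "$d" ]; then
            final_dir="$d"
        fi
    done

    if [ -n "$final_dir" ]; then
        echo "final clusters: $(basename "$final_dir")"
    else
        echo "final clusters: missing"
        status=1
    fi

    # clusteredPoints holds the mapping from cluster ID to document ID.
    if [ -d "$out_dir/clusteredPoints" ]; then
        echo "clusteredPoints: present"
    else
        echo "clusteredPoints: missing"
        status=1
    fi

    return "$status"
}
```

Calling `check_kmeans_output ./outputPath` on a local copy of the run above would be expected to report the final clusters directory and a missing `clusteredPoints`, returning a non-zero status that a wrapper script can test.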