Canopy and KMeans run independently and do not call each other. KMeans does not infer K on its own; the K value has to be specified when invoking KMeans.
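
To make the hand-off concrete, here is a rough sketch in plain Python. It is not Mahout code and does not use Mahout's API: the function names (canopy_centers, kmeans), the thresholds t1 and t2, and the toy data are all made up for illustration. It only shows the shape of the idea: a canopy pass runs on its own, and its output supplies K and the initial centers to a separate k-means call.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy_centers(points, t1, t2):
    """Simplified single-pass canopy clustering (illustrative, not Mahout's code).

    t1 is the loose threshold, t2 the tight one (t1 > t2).
    Returns one centroid per canopy.
    """
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        # Every point within t1 contributes to this canopy's centroid.
        members = [center] + [p for p in remaining if dist(center, p) < t1]
        centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        # Points within t2 can no longer start a canopy of their own.
        remaining = [p for p in remaining if dist(center, p) >= t2]
    return centers


def kmeans(points, initial_centers, max_iterations=10, convergence_delta=1e-3):
    """Plain Lloyd-style k-means; K is simply len(initial_centers)."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: recompute each center; keep the old one if a group is empty.
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        moved = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if moved <= convergence_delta:
            break
    return centers


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

# Step 1: canopy runs by itself and yields candidate centers.
seeds = canopy_centers(points, t1=3.0, t2=2.0)
k = len(seeds)

# Step 2: k-means is invoked separately, with K fixed by the seeds passed in.
centers = kmeans(points, seeds)
print(k, centers)
```

The number of canopies depends heavily on the thresholds chosen, so in practice t1 and t2 usually need some tuning before the resulting K is useful.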
Typically you run Canopy first and then invoke KMeans with the appropriate K value as inferred from Canopy.

On Tuesday, March 18, 2014 4:33 AM, hiroshi leon <hiroshi_8...@hotmail.com> wrote:

Thank you Wei and Suneel,

By the way, does somebody know if the Parallel K-means of Mahout is using Canopy clustering at the beginning to generate the initial K in the K-Means driver class?

Best regards,
Hiroshi

> Date: Mon, 17 Mar 2014 13:05:01 -0700
> Subject: Re: Mahout parallel K-Means - algorithms analysis
> From: weish...@gmail.com
> To: user@mahout.apache.org
> CC: ted.dunn...@gmail.com
>
> You could take a look
> at org.apache.mahout.clustering.classify/ClusterClassificationMapper
>
> Enjoy,
> Wei Shung
>
>
> On Sat, Mar 15, 2014 at 2:51 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
>
> > The clustering code is CIMapper and CIReducer. Following the clustering,
> > there is cluster classification, which is mapper only.
> >
> > Not sure about the reference paper, this stuff's been around for long, but
> > the documentation for kmeans on mahout.apache.org should explain the
> > approach.
> >
> > Sent from my iPhone
> >
> > > On Mar 15, 2014, at 5:36 PM, hiroshi leon <hiroshi_8...@hotmail.com> wrote:
> > >
> > > Hello Ted,
> > >
> > > Thank you so much for your reply, the program that I was checking is the
> > > KMeansDriver class with the run function,
> > > the buildCluster function in the same class and following the
> > > ClusterIterator class with
> > > the iterateMR function.
> > >
> > > I would like to know where I can check the code that is implemented
> > > for the mapper and the
> > > reducer. Is it in the CIMapper.class and CIReducer.class?
> > >
> > > Is there a research paper or pseudo-code on which Mahout parallel
> > > K-means was based?
> > >
> > > Thank you so much and have a nice day.
> > >
> > > Best regards
> > >
> > >
> > >> From: ted.dunn...@gmail.com
> > >> Date: Sat, 15 Mar 2014 13:56:56 -0700
> > >> Subject: Re: Mahout parallel K-Means - algorithms analysis
> > >> To: user@mahout.apache.org
> > >>
> > >> We would love to help.
> > >>
> > >> Can you say which program and which classes you are looking at?
> > >>
> > >>
> > >> On Sat, Mar 15, 2014 at 12:58 PM, hiroshi leon <hiroshi_8...@hotmail.com> wrote:
> > >>
> > >>> To whom it may concern,
> > >>>
> > >>> Hello, I have been checking the algorithm of Mahout 0.9 version k-means
> > >>> using MapReduce and I would like to know where I can check the code of
> > >>> what is happening inside the map function and in the reducer.
> > >>>
> > >>> I was debugging using NetBeans and I was not able to find what exactly is
> > >>> implemented in the Map and Reduce functions...
> > >>>
> > >>> The reason why I am doing this is because I would like to know what
> > >>> exactly is implemented in the version of Mahout 0.9 in order to see
> > >>> which parts were optimized in the K-Means MapReduce algorithm.
> > >>>
> > >>> Do you know which research paper the Mahout K-means was based on or
> > >>> where I can read the pseudo-code?
> > >>>
> > >>> Thank you so much!
> > >>>
> > >>> Best regards!
> > >>>
> > >>> Hiroshi