Thanks Suneel,

Could someone please explain a little about the ClusteringPolicy and the 
ClusterClassifier, and what the benefits are of using them with parallel 
K-Means?

Thank you so much,

Best regards.
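
For reference, the workflow described in the quoted replies below (run Canopy
first to infer K, run K-Means as a map step plus a reduce step per iteration,
then classify the points in a map-only pass) can be sketched roughly as
follows. This is a minimal single-process illustration in Python, not Mahout's
code: the function names (canopy_centers, map_step, reduce_step, classify),
the single T2 threshold, and the sample points are all assumptions made for
the example, and the canopy pass is deliberately simplified.

```python
from collections import defaultdict
from math import dist


def canopy_centers(points, t2):
    """Simplified canopy pass (assumption: only a T2 threshold is used).

    A point farther than t2 from every center chosen so far starts a new canopy.
    """
    centers = []
    for p in points:
        if all(dist(p, c) > t2 for c in centers):
            centers.append(p)
    return centers


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))


def map_step(points, centers):
    """Map: emit (cluster index, point) for each input point."""
    for p in points:
        yield nearest(p, centers), p


def reduce_step(pairs, centers):
    """Reduce: average the points emitted for each cluster index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centers = list(centers)  # clusters that received no points keep their center
    for idx, members in groups.items():
        new_centers[idx] = tuple(sum(x) / len(members) for x in zip(*members))
    return new_centers


def kmeans(points, centers, max_iter=10):
    """Repeat map + reduce until the centers stop changing or max_iter is hit."""
    for _ in range(max_iter):
        updated = reduce_step(map_step(points, centers), centers)
        if updated == centers:
            break
        centers = updated
    return centers


def classify(points, centers):
    """Map-only pass: label each point with its nearest final center."""
    return [nearest(p, centers) for p in points]


# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
initial = canopy_centers(points, t2=3.0)  # K is inferred as len(initial)
final = kmeans(points, initial)
labels = classify(points, final)
print(len(initial), labels)
```

On this sample the canopy pass yields two centers, so K-Means runs with K = 2
and the final pass assigns the first three points to one cluster and the last
three to the other.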

> Date: Tue, 18 Mar 2014 04:35:14 -0700
> From: suneel_mar...@yahoo.com
> Subject: Re: Mahout parallel K-Means - algorithms analysis
> To: user@mahout.apache.org
> 
> Canopy and KMeans run independently and do not call each other. 
> 
> For KMeans, the K value has to be specified at invocation.
> 
> Typically you run Canopy first and then invoke KMeans with the appropriate 
> K value as inferred from Canopy.
> 
> On Tuesday, March 18, 2014 4:33 AM, hiroshi leon <hiroshi_8...@hotmail.com> 
> wrote:
>  
> Thank you Wei and Suneel, 
> 
> By the way, does anybody know whether Mahout's parallel K-Means uses 
> Canopy clustering at the beginning to generate the initial K in the K-Means 
> driver class?
> 
> Best regards,
> 
> Hiroshi
> 
> > Date: Mon, 17 Mar 2014 13:05:01 -0700
> > Subject: Re: Mahout parallel K-Means - algorithms analysis
> > From: weish...@gmail.com
> > To: user@mahout.apache.org
> > CC: ted.dunn...@gmail.com
> > 
> > You could take a look at
> > org.apache.mahout.clustering.classify.ClusterClassificationMapper
> > 
> > Enjoy,
> > Wei Shung
> > 
> > 
> > On Sat, Mar 15, 2014 at 2:51 PM, Suneel Marthi 
> > <suneel_mar...@yahoo.com> wrote:
> > 
> > > The clustering code is in CIMapper and CIReducer.  Following the
> > > clustering, there is cluster classification, which is mapper-only.
> > >
> > > Not sure about the reference paper; this stuff has been around for a
> > > long time, but the documentation for kmeans on mahout.apache.org should
> > > explain the approach.
> > >
> > > Sent from my iPhone
> > >
> > > > On Mar 15, 2014, at 5:36 PM, hiroshi leon <hiroshi_8...@hotmail.com> wrote:
> > > >
> > > > Hello Ted,
> > > >
> > > > Thank you so much for your reply. The program I was checking is the
> > > > KMeansDriver class with its run function, the buildCluster function
> > > > in the same class, and then the ClusterIterator class with its
> > > > iterateMR function.
> > > >
> > > > I would like to know where I can check the code that implements the
> > > > mapper and the reducer. Is it in CIMapper.class and CIReducer.class?
> > > >
> > > > Is there a research paper or pseudo-code on which Mahout's parallel
> > > > K-means was based?
> > > >
> > > > Thank you so much and have a nice day.
> > > >
> > > > Best regards
> > > >
> > > >
> > > >> From: ted.dunn...@gmail.com
> > > >> Date: Sat, 15 Mar 2014 13:56:56 -0700
> > > >> Subject: Re: Mahout parallel K-Means - algorithms analysis
> > > >> To: user@mahout.apache.org
> > > >>
> > > >> We would love to help.
> > > >>
> > > >> Can you say which program and which classes you are looking at?
> > > >>
> > > >>
> > > >> On Sat, Mar 15, 2014 at 12:58 PM, hiroshi leon <hiroshi_8...@hotmail.com> wrote:
> > > >>
> > > >>> To whom it may concern,
> > > >>>
> > > >>> Hello, I have been studying the MapReduce k-means algorithm in Mahout
> > > >>> 0.9, and I would like to know where I can find the code that runs
> > > >>> inside the map function and the reducer.
> > > >>>
> > > >>>
> > > >>> I was debugging with NetBeans and could not find exactly what is
> > > >>> implemented in the Map and Reduce functions...
> > > >>>
> > > >>>
> > > >>>
> > > >>> The reason I am doing this is that I would like to know exactly what
> > > >>> is implemented in Mahout 0.9, in order to see which parts of the
> > > >>> K-Means MapReduce algorithm were optimized.
> > > >>>
> > > >>>
> > > >>>
> > > >>> Do you know which research paper the Mahout K-means was based on, or
> > > >>> where I can read the pseudo-code?
> > > >>>
> > > >>>
> > > >>>
> > > >>> Thank you so much!
> > > >>>
> > > >>>
> > > >>>
> > > >>> Best regards!
> > > >>>
> > > >>> Hiroshi
> > > >
> > >