Excellent! (An excellent paper and an excellent contribution to Mahout.)
On Mon, Jul 27, 2009 at 2:39 PM, Panagiotis Papadimitriou (JIRA) <[email protected]> wrote:

> Implement kmeans++ for initial cluster selection in kmeans
> ----------------------------------------------------------
>
> Key: MAHOUT-153
> URL: https://issues.apache.org/jira/browse/MAHOUT-153
> Project: Mahout
> Issue Type: New Feature
> Components: Clustering
> Affects Versions: 0.2
> Environment: OS Independent
> Reporter: Panagiotis Papadimitriou
>
>
> The current implementation of k-means includes the following algorithms for
> initial cluster selection (seed selection): 1) random selection of k points,
> 2) use of canopy clusters.
>
> I plan to implement k-means++. The details of the algorithm are available
> here:
> http://www.stanford.edu/~darthur/kMeansPlusPlus.pdf
> .
>
> Design Outline: I will create an abstract class SeedGenerator and a
> subclass KMeansPlusPlusSeedGenerator. The existing class RandomSeedGenerator
> will become a subclass of SeedGenerator.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>

--
Ted Dunning, CTO
DeepDyve
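
For anyone following the thread, here is a rough sketch of how the proposed class hierarchy might look. Only the three class names come from the design outline above. The chooseSeeds method, the double[] point representation, and the constructors are assumptions made for illustration and are not Mahout's actual API. The k-means++ step follows the usual D^2 weighting: the first center is picked uniformly, and each later center is picked with probability proportional to its squared distance from the nearest center chosen so far.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: the method name and point type are assumptions.
abstract class SeedGenerator {
    /** Picks k initial cluster centers from the given points. */
    abstract List<double[]> chooseSeeds(List<double[]> points, int k);

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    static void checkK(List<double[]> points, int k) {
        if (k < 1 || k > points.size()) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
    }
}

// Stand-in for the existing random strategy: k points chosen uniformly.
class RandomSeedGenerator extends SeedGenerator {
    private final Random random;

    RandomSeedGenerator(Random random) {
        this.random = random;
    }

    @Override
    List<double[]> chooseSeeds(List<double[]> points, int k) {
        checkK(points, k);
        List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, random);
        return new ArrayList<>(shuffled.subList(0, k));
    }
}

class KMeansPlusPlusSeedGenerator extends SeedGenerator {
    private final Random random;

    KMeansPlusPlusSeedGenerator(Random random) {
        this.random = random;
    }

    @Override
    List<double[]> chooseSeeds(List<double[]> points, int k) {
        checkK(points, k);
        List<double[]> seeds = new ArrayList<>();
        // First center: chosen uniformly at random.
        seeds.add(points.get(random.nextInt(points.size())));

        // minDist[i] = squared distance from point i to its nearest chosen center.
        double[] minDist = new double[points.size()];
        for (int i = 0; i < minDist.length; i++) {
            minDist[i] = squaredDistance(points.get(i), seeds.get(0));
        }

        while (seeds.size() < k) {
            double total = 0.0;
            for (double d : minDist) {
                total += d;
            }
            int chosen = minDist.length - 1;
            if (total == 0.0) {
                // Every point coincides with a chosen center: fall back to uniform.
                chosen = random.nextInt(points.size());
            } else {
                // Sample an index with probability minDist[i] / total.
                double target = random.nextDouble() * total;
                double cumulative = 0.0;
                for (int i = 0; i < minDist.length; i++) {
                    cumulative += minDist[i];
                    if (cumulative > target) {
                        chosen = i;
                        break;
                    }
                }
            }
            double[] center = points.get(chosen);
            seeds.add(center);
            // Update nearest-center distances with the new center.
            for (int i = 0; i < minDist.length; i++) {
                minDist[i] = Math.min(minDist[i], squaredDistance(points.get(i), center));
            }
        }
        return seeds;
    }
}
```

This is a single-machine sketch only. Each new center needs a full pass over the data, so a Hadoop-based version would presumably need one distance-update pass per seed, and how best to do that is left open here.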
