Re: [jira] Created: (MAHOUT-153) Implement kmeans++ for initial cluster selection in kmeans

Ted Dunning Mon, 27 Jul 2009 15:55:42 -0700

Excellent!

(excellent paper and excellent contribution to Mahout)


On Mon, Jul 27, 2009 at 2:39 PM, Panagiotis Papadimitriou (JIRA) <
[email protected]> wrote:

> Implement kmeans++ for initial cluster selection in kmeans
> ----------------------------------------------------------
>
>                 Key: MAHOUT-153
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-153
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Clustering
>    Affects Versions: 0.2
>         Environment: OS Independent
>            Reporter: Panagiotis Papadimitriou
>
>
> The current implementation of k-means includes the following algorithms for
> initial cluster selection (seed selection): 1) random selection of k points,
> 2) use of canopy clusters.
>
> I plan to implement k-means++. The details of the algorithm are available
> here: http://www.stanford.edu/~darthur/kMeansPlusPlus.pdf
>
> Design Outline: I will create an abstract class SeedGenerator and a
> subclass KMeansPlusPlusSeedGenerator. The existing class RandomSeedGenerator
> will become a subclass of SeedGenerator.
>
>
>
>
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
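
For readers following the design outline quoted above, here is a rough sketch of how the seeding step could look. It is only an illustration, not the patch attached to MAHOUT-153. The class names SeedGenerator and KMeansPlusPlusSeedGenerator come from the outline, but the method chooseSeeds, its signature, the helper squaredDistance, and the use of plain double[] points instead of Mahout vectors are all assumptions made for the example. It runs in a single JVM and does not attempt the map/reduce side. The first seed is picked uniformly at random, and each later seed is picked with probability proportional to its squared distance from the nearest seed chosen so far, as in the paper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical base class from the design outline; the real Mahout API may differ.
abstract class SeedGenerator {
    /** Picks k initial cluster centers from the given points (assumed signature). */
    abstract List<double[]> chooseSeeds(List<double[]> points, int k, Random rng);

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}

// Sketch of k-means++ seeding (D^2 weighting), in memory only.
class KMeansPlusPlusSeedGenerator extends SeedGenerator {
    @Override
    List<double[]> chooseSeeds(List<double[]> points, int k, Random rng) {
        int n = points.size();
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        List<double[]> seeds = new ArrayList<>();
        // Step 1: the first seed is chosen uniformly at random.
        seeds.add(points.get(rng.nextInt(n)));

        // minSqDist[i] holds the squared distance from point i to its nearest seed.
        double[] minSqDist = new double[n];
        Arrays.fill(minSqDist, Double.POSITIVE_INFINITY);

        while (seeds.size() < k) {
            // Only the most recently added seed can shrink the distances.
            double[] latest = seeds.get(seeds.size() - 1);
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                minSqDist[i] = Math.min(minSqDist[i], squaredDistance(points.get(i), latest));
                total += minSqDist[i];
            }

            // Step 2: sample an index with probability minSqDist[i] / total.
            double target = rng.nextDouble() * total;
            double cumulative = 0.0;
            int chosen = -1;
            for (int i = 0; i < n; i++) {
                if (minSqDist[i] > 0.0) {
                    cumulative += minSqDist[i];
                    chosen = i; // keeps the last candidate if rounding prevents the break
                    if (cumulative > target) {
                        break;
                    }
                }
            }
            // Every point coincides with an existing seed: fall back to a uniform pick.
            if (chosen < 0) {
                chosen = rng.nextInt(n);
            }
            seeds.add(points.get(chosen));
        }
        return seeds;
    }
}
```

Under these assumptions a caller would do something like new KMeansPlusPlusSeedGenerator().chooseSeeds(points, k, new Random()) and pass the result to k-means as the initial centers. A RandomSeedGenerator subclass would override the same method with a plain uniform sample.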


-- 
Ted Dunning, CTO
DeepDyve
