Re: Contributing to MLlib: Proposal for Clustering Algorithms

Dmitriy Lyubimov Tue, 08 Jul 2014 13:25:31 -0700

Hector, could you share the references for hierarchical k-means? Thanks.


On Tue, Jul 8, 2014 at 1:01 PM, Hector Yee <[email protected]> wrote:

> I would say that for big data applications the most useful would be
> hierarchical k-means with backtracking and the ability to support k
> nearest centroids.
>
>
> On Tue, Jul 8, 2014 at 10:54 AM, RJ Nowling <[email protected]> wrote:
>
> > Hi all,
> >
> > MLlib currently has one clustering algorithm implementation, KMeans.
> > It would benefit from having implementations of other clustering
> > algorithms such as MiniBatch KMeans, Fuzzy C-Means, Hierarchical
> > Clustering, and Affinity Propagation.
> >
> > I recently submitted a PR [1] for a MiniBatch KMeans implementation,
> > and I saw an email on this list about interest in implementing Fuzzy
> > C-Means.
> >
> > Based on Sean Owen's review of my MiniBatch KMeans code, it became
> > apparent that before I implement more clustering algorithms, it would
> > be useful to hammer out a framework to reduce code duplication and
> > implement a consistent API.
> >
> > I'd like to gauge the interest and goals of the MLlib community:
> >
> > 1. Are you interested in having more clustering algorithms available?
> >
> > 2. Is the community interested in specifying a common framework?
> >
> > Thanks!
> > RJ
> >
> > [1] - https://github.com/apache/spark/pull/1248
> >
> >
> > --
> > em [email protected]
> > c 954.496.2314
> >
>
>
>
> --
> Yee Yang Li Hector <http://google.com/+HectorYee>
>
