Re: Possible contribution to MLlib
I think it would be valuable to make the distance function pluggable and also to provide some built-in distance functions. This might also be useful for algorithms other than KMeans.

On Tue, Jun 21, 2016 at 7:48 PM, Simon NANTY wrote:

> Hi all,
>
> In my team, we are currently developing a fork of Spark MLlib that extends
> the K-means method so that the user can set a custom distance function.
> In this implementation, a distance function can be passed directly as an
> argument to the K-means train function, with the signature:
> (VectorWithNorm, VectorWithNorm) => Double.
>
> We have found the JIRA issue SPARK-11665, which proposes supporting new
> distances in bisecting K-means. There has also been the JIRA issue
> SPARK-3219, which proposes adding Bregman divergences as distance
> functions, but it has not been added to MLlib. We are therefore wondering
> whether such an extension of the MLlib K-means algorithm would be
> appreciated by the community and would have a chance of being included in
> future Spark releases.
>
> Regards,
>
> Simon Nanty

--
Best Regards

Jeff Zhang
Possible contribution to MLlib
Hi all,

In my team, we are currently developing a fork of Spark MLlib that extends the K-means method so that the user can set a custom distance function. In this implementation, a distance function can be passed directly as an argument to the K-means train function, with the signature: (VectorWithNorm, VectorWithNorm) => Double.

We have found the JIRA issue SPARK-11665, which proposes supporting new distances in bisecting K-means. There has also been the JIRA issue SPARK-3219, which proposes adding Bregman divergences as distance functions, but it has not been added to MLlib. We are therefore wondering whether such an extension of the MLlib K-means algorithm would be appreciated by the community and would have a chance of being included in future Spark releases.

Regards,

Simon Nanty
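To make the proposal above concrete, here is a minimal sketch of K-means with a pluggable distance function. It is plain Python and deliberately not Spark or MLlib code: the names `kmeans`, `squared_euclidean` and `manhattan` are hypothetical, and plain sequences of floats stand in for `VectorWithNorm`. The `distance` parameter plays the role of the `(VectorWithNorm, VectorWithNorm) => Double` argument described in the message. Initial centers are passed in explicitly to keep the sketch deterministic.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical stand-in for the (VectorWithNorm, VectorWithNorm) => Double signature.
Distance = Callable[[Vector, Vector], float]


def squared_euclidean(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance, the usual K-means default."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a: Vector, b: Vector) -> float:
    """L1 distance, as an example of an alternative built-in distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(
    points: Sequence[Vector],
    initial_centers: Sequence[Vector],
    distance: Distance = squared_euclidean,
    max_iterations: int = 20,
) -> Tuple[List[List[float]], List[int]]:
    """Lloyd-style iterations in which only the assignment step uses `distance`."""
    centers = [list(c) for c in initial_centers]
    assignments: List[int] = []
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest center under `distance`.
        new_assignments = [
            min(range(len(centers)), key=lambda j: distance(p, centers[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the centers would not move
        assignments = new_assignments
        # Update step: move each non-empty cluster's center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignments


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)], distance=manhattan))
```

One design caveat: the sketch always updates centers with the arithmetic mean. The mean minimizes the within-cluster cost for squared Euclidean distance and for Bregman divergences, but not for arbitrary distances (for Manhattan distance the per-coordinate median would be the matching update), so a pluggable distance may also call for a pluggable center update.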
Re: Contribution to MLlib
I don't know if anyone is working on it either. If that JIRA has not been moved to the Apache JIRA, feel free to create a new one and make a note that you are working on it.

Thanks!
-Xiangrui

On Wed, Jul 9, 2014 at 4:56 AM, RJ Nowling wrote:

> Hi Meethu,
>
> There is no code for a Gaussian Mixture Model clustering algorithm in the
> repository, but I don't know if anyone is working on it.
>
> RJ
>
> On Wednesday, July 9, 2014, MEETHU MATHEW wrote:
>
>> Hi,
>>
>> I am interested in contributing a clustering algorithm to Spark's MLlib.
>> I am focusing on the Gaussian Mixture Model.
>> However, I saw a JIRA at https://spark-project.atlassian.net/browse/SPARK-952
>> regarding the same. I would like to know whether a Gaussian Mixture Model
>> is already implemented or not.
>>
>> Thanks & Regards,
>> Meethu M
Re: Contribution to MLlib
Hi Meethu,

There is no code for a Gaussian Mixture Model clustering algorithm in the repository, but I don't know if anyone is working on it.

RJ

On Wednesday, July 9, 2014, MEETHU MATHEW wrote:

> Hi,
>
> I am interested in contributing a clustering algorithm to Spark's MLlib.
> I am focusing on the Gaussian Mixture Model.
> However, I saw a JIRA at https://spark-project.atlassian.net/browse/SPARK-952
> regarding the same. I would like to know whether a Gaussian Mixture Model
> is already implemented or not.
>
> Thanks & Regards,
> Meethu M
Contribution to MLlib
Hi,

I am interested in contributing a clustering algorithm to Spark's MLlib. I am focusing on the Gaussian Mixture Model. However, I saw a JIRA at https://spark-project.atlassian.net/browse/SPARK-952 regarding the same. I would like to know whether a Gaussian Mixture Model is already implemented or not.

Thanks & Regards,
Meethu M