[ https://issues.apache.org/jira/browse/SPARK-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136564#comment-14136564 ]
Derrick Burns commented on SPARK-3219:
--------------------------------------

I went ahead and created a pull request. I would appreciate your comments.

https://github.com/apache/spark/pull/2419

> K-Means clusterer should support Bregman distance functions
> -----------------------------------------------------------
>
>                 Key: SPARK-3219
>                 URL: https://issues.apache.org/jira/browse/SPARK-3219
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Derrick Burns
>            Assignee: Derrick Burns
>
> The K-Means clusterer supports the Euclidean distance metric. However, it is
> rather straightforward to support Bregman
> (http://machinelearning.wustl.edu/mlpapers/paper_files/BanerjeeMDG05.pdf)
> distance functions which would increase the utility of the clusterer
> tremendously.
> I have modified the clusterer to support pluggable distance functions.
> However, I notice that there are hundreds of outstanding pull requests. If
> someone is willing to work with me to sponsor the work through the process, I
> will create a pull request. Otherwise, I will just keep my own fork.
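
For readers unfamiliar with the idea in the issue description, here is a minimal, illustrative sketch of a K-Means loop that takes the distance function as a parameter. It is not the code from the pull request and does not use the MLlib API; it is plain Python, and every name in it (`squared_euclidean`, `generalized_kl`, `mean`, `kmeans`) is hypothetical. It relies on a known property of Bregman divergences: the arithmetic mean of a cluster minimizes the total divergence to the center, so only the assignment step needs the pluggable function.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Divergence = Callable[[Vector, Vector], float]


def squared_euclidean(x: Vector, c: Vector) -> float:
    """Bregman divergence generated by phi(v) = ||v||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def generalized_kl(x: Vector, c: Vector) -> float:
    """Bregman divergence generated by phi(v) = sum(v_i * log(v_i)).

    Assumes every component of c is strictly positive; 0 * log(0) is taken as 0.
    """
    total = 0.0
    for a, b in zip(x, c):
        if a > 0:
            total += a * math.log(a / b)
        total += b - a
    return total


def mean(points: List[Vector]) -> List[float]:
    """Component-wise arithmetic mean of a non-empty list of points."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def kmeans(points: List[Vector], centers: List[Vector],
           divergence: Divergence, iterations: int = 10) -> List[List[float]]:
    """Lloyd-style iteration with a pluggable divergence (hypothetical helper)."""
    current = [list(c) for c in centers]
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in current]
        # Assignment step: the only place the divergence is used.
        for p in points:
            nearest = min(range(len(current)),
                          key=lambda j: divergence(p, current[j]))
            clusters[nearest].append(p)
        # Update step: the mean is the optimal center for any Bregman divergence.
        # An empty cluster keeps its previous center.
        current = [mean(members) if members else current[j]
                   for j, members in enumerate(clusters)]
    return current
```

Switching the clustering criterion is then a matter of passing a different function, for example `kmeans(points, initial_centers, generalized_kl)` instead of `kmeans(points, initial_centers, squared_euclidean)`.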