[ https://issues.apache.org/jira/browse/SPARK-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng updated SPARK-3220: --------------------------------- Labels: clustering (was: ) > K-Means clusterer should perform K-Means initialization in parallel > ------------------------------------------------------------------- > > Key: SPARK-3220 > URL: https://issues.apache.org/jira/browse/SPARK-3220 > Project: Spark > Issue Type: Improvement > Components: MLlib > Reporter: Derrick Burns > Labels: clustering > > The LocalKMeans method should be replaced with a parallel implementation. As > it stands now, it becomes a bottleneck for large data sets. > I have implemented this functionality in my version of the clusterer. > However, I see that there are hundreds of outstanding pull requests. If > someone on the team wants to sponsor the pull request, I will create one. > Otherwise, I will just maintain my own private fork of the clusterer. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org