[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters
[ https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984574#comment-14984574 ]

Yu Ishikawa commented on SPARK-6001:

[~josephkb] can we close this issue?

> K-Means clusterer should return the assignments of input points to clusters
> ---
>
> Key: SPARK-6001
> URL: https://issues.apache.org/jira/browse/SPARK-6001
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Affects Versions: 1.2.1
> Reporter: Derrick Burns
> Priority: Minor
>
> The K-Means clusterer returns a KMeansModel that contains the cluster
> centers. However, when available, I suggest that the K-Means clusterer also
> return an RDD of the assignments of the input data to the clusters. While the
> assignments can be computed given the KMeansModel, why not return assignments
> if they are available to save re-computation costs.
> The K-means implementation at
> https://github.com/derrickburns/generalized-kmeans-clustering returns the
> assignments when available.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
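The trade-off described in the issue can be illustrated with a small sketch. This is plain Python, not Spark or MLlib code, and the names `nearest` and `kmeans_with_assignments` are hypothetical. It assumes a basic Lloyd's iteration with Euclidean distance: when the loop converges, the last assignment pass already matches the final centers, so returning it avoids a second pass over the data; when the loop stops at the iteration limit, the assignments are stale and one extra pass is needed.

```python
import math
import random


def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans_with_assignments(points, k, max_iter=20, seed=0):
    """Hypothetical helper: Lloyd's algorithm returning (centers, assignments)."""
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    assignments = None
    converged = False
    for _ in range(max_iter):
        new_assignments = [nearest(p, centers) for p in points]
        if new_assignments == assignments:
            # Assignments did not change, so they match the current centers.
            converged = True
            break
        assignments = new_assignments
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    if not converged:
        # Centers moved after the last assignment pass; recompute once.
        assignments = [nearest(p, centers) for p in points]
    return centers, assignments


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, assignments = kmeans_with_assignments(points, k=2)

# Recomputing from the centers alone gives the same answer, at the cost of
# another full pass over the input.
recomputed = [nearest(p, centers) for p in points]
assert assignments == recomputed
```

The final check is the re-computation the issue would like to avoid: it produces nothing new, but on a large distributed dataset it would cost another full pass.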
[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters
[ https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629808#comment-14629808 ]

Manoj Kumar commented on SPARK-6001:

I just started to work on this.
[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters
[ https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629981#comment-14629981 ]

Manoj Kumar commented on SPARK-6001:

Oops. I just figured out we do not have a KMeans yet in spark.ml.
[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters
[ https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630309#comment-14630309 ]

Joseph K. Bradley commented on SPARK-6001:

It's almost ready to merge...
[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters
[ https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614720#comment-14614720 ]

Venkata Vineel commented on SPARK-6001:

[~derrickburns] Can you please assign this to me?
[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters
[ https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615502#comment-14615502 ]

Joseph K. Bradley commented on SPARK-6001:

[~yalamart] This should probably be done under the Pipelines API, via the R-like stats design linked above. I'd recommend we wait to include this until the initial (LinearRegression) PR for R-like stats is merged, after which this JIRA can follow that design as an example.