[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters

2015-11-01 Thread Yu Ishikawa (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984574#comment-14984574
 ] 

Yu Ishikawa commented on SPARK-6001:


[~josephkb] Can we close this issue?

> K-Means clusterer should return the assignments of input points to clusters
> ---
>
> Key: SPARK-6001
> URL: https://issues.apache.org/jira/browse/SPARK-6001
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Affects Versions: 1.2.1
> Reporter: Derrick Burns
> Priority: Minor
>
> The K-Means clusterer returns a KMeansModel that contains the cluster 
> centers. However, I suggest that the K-Means clusterer also return an RDD of 
> the assignments of the input data to the clusters when they are available. 
> Although the assignments can be computed from the KMeansModel, returning them 
> when they are already available would save re-computation costs.
> The K-means implementation at 
> https://github.com/derrickburns/generalized-kmeans-clustering returns the 
> assignments when available.
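
For readers unfamiliar with the re-computation being discussed: given only the cluster centers, the assignment of each point is the index of its nearest center, which costs one more pass over the data. The sketch below illustrates that pass in plain Python, without Spark. The function names `nearest_center` and `assign` and the sample data are hypothetical, and squared Euclidean distance is assumed; in MLlib the equivalent step is calling `predict` on the KMeansModel.

```python
import math

def nearest_center(point, centers):
    """Return the index of the center closest to `point`.

    Assumes squared Euclidean distance; ties go to the lower index.
    """
    best_index, best_dist = -1, math.inf
    for i, center in enumerate(centers):
        dist = sum((p - c) ** 2 for p, c in zip(point, center))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index

def assign(points, centers):
    """Recompute the cluster assignment of every point from the centers alone."""
    return [nearest_center(p, centers) for p in points]

# Hypothetical centers, standing in for what a fitted model would hold.
centers = [(0.0, 0.0), (10.0, 10.0)]
points = [(1.0, 1.0), (9.0, 8.0), (0.5, -0.5)]

assignments = assign(points, centers)
print(assignments)  # [0, 1, 0]
```

The final iteration of K-Means already performs this nearest-center search, which is why returning its result would avoid the extra pass.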



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters

2015-07-16 Thread Manoj Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629808#comment-14629808
 ] 

Manoj Kumar commented on SPARK-6001:


I just started to work on this.




[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters

2015-07-16 Thread Manoj Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629981#comment-14629981
 ] 

Manoj Kumar commented on SPARK-6001:


Oops. I just realized that we do not have a KMeans in spark.ml yet.




[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters

2015-07-16 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630309#comment-14630309
 ] 

Joseph K. Bradley commented on SPARK-6001:

It's almost ready to merge...




[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters

2015-07-06 Thread Venkata Vineel (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614720#comment-14614720
 ] 

Venkata Vineel commented on SPARK-6001:

[~derrickburns] Could you please assign this to me?




[jira] [Commented] (SPARK-6001) K-Means clusterer should return the assignments of input points to clusters

2015-07-06 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615502#comment-14615502
 ] 

Joseph K. Bradley commented on SPARK-6001:

[~yalamart] This should probably be done under the Pipelines API, via the 
R-like stats design linked above. I'd recommend waiting to include this until 
the initial (LinearRegression) PR for R-like stats is merged, after which this 
JIRA can follow that design as an example.
