[jira] [Commented] (SPARK-21244) KMeans applied to processed text day clumps almost all documents into one cluster

2017-07-01 Thread Nassir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071447#comment-16071447 ] Nassir commented on SPARK-21244: Hi, the PySpark k-means implementation is on the same 20 newsgroup

[jira] [Updated] (SPARK-21244) KMeans applied to processed text day clumps almost all documents into one cluster

2017-06-28 Thread Nassir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nassir updated SPARK-21244: Description: I have observed this problem for quite a while now regarding the implementation of PySpark

[jira] [Created] (SPARK-21244) KMeans applied to processed text day clumps almost all documents into one cluster

2017-06-28 Thread Nassir (JIRA)
Nassir created SPARK-21244: Summary: KMeans applied to processed text day clumps almost all documents into one cluster Key: SPARK-21244 URL: https://issues.apache.org/jira/browse/SPARK-21244 Project: Spark

[jira] [Commented] (SPARK-20696) tf-idf document clustering with K-means in Apache Spark putting points into one cluster

2017-06-28 Thread Nassir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066652#comment-16066652 ] Nassir commented on SPARK-20696: Unfortunately, I have not found a place to make this known to the Spark

[jira] [Commented] (SPARK-20696) tf-idf document clustering with K-means in Apache Spark putting points into one cluster

2017-06-09 Thread Nassir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044356#comment-16044356 ] Nassir commented on SPARK-20696: This appears to be a problem with the implementation of K-means

[jira] [Created] (SPARK-20696) tf-idf document clustering with K-means in Apache Spark putting points into one cluster

2017-05-10 Thread Nassir (JIRA)
Nassir created SPARK-20696: Summary: tf-idf document clustering with K-means in Apache Spark putting points into one cluster Key: SPARK-20696 URL: https://issues.apache.org/jira/browse/SPARK-20696 Project: