[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-03-05 Thread Parth Gandhi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784831#comment-16784831 ] Parth Gandhi commented on SPARK-26947: -- [~srowen] Yes, your suggestion to limit the vocab size

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-03-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783858#comment-16783858 ] Sean Owen commented on SPARK-26947: --- That doesn't sound "very big" but how big are the vectors you

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-03-04 Thread Parth Gandhi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783849#comment-16783849 ] Parth Gandhi commented on SPARK-26947: -- [~srowen] for this particular case, k is set to 1.

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782076#comment-16782076 ] Sean Owen commented on SPARK-26947: --- How big is k? Yes, you're going to run out of memory eventually

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1665#comment-1665 ] Marco Gaido commented on SPARK-26947: - Could you also please provide the heap dump of the JVM? You

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-02-20 Thread Parth Gandhi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773457#comment-16773457 ] Parth Gandhi commented on SPARK-26947: -- I am unable to attach the dummy dataset as the size of the