Alessio created SPARK-15904:
-------------------------------

             Summary: High Memory Pressure using MLlib K-means
                 Key: SPARK-15904
                 URL: https://issues.apache.org/jira/browse/SPARK-15904
             Project: Spark
          Issue Type: Bug
          Components: MLlib
    Affects Versions: 1.6.1
         Environment: Mac OS X 10.11.6beta on Macbook Pro 13" mid-2012. 16GB of RAM.
            Reporter: Alessio

I'm running MLlib K-Means on a ~400MB dataset, persisted to memory and disk. Everything runs fine. At the end of K-Means, after the number of iterations, the cost function value and the running time have been reported, there is a "Removing RDD <idx> from persistent list" stage. During this stage, however, memory pressure is high, which is odd given that the RDDs are about to be removed.

I'm running this cluster analysis on a 16GB machine, with the Spark context master set to local[*]. My machine has a hyperthreaded dual-core i5, so [*] means 4 threads. I'm launching the application through spark-submit with --driver-memory 10G.
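
For context on the figures in this report, here is a rough sketch of how a 10G heap would be split under Spark 1.6's unified memory manager, and of the thread count that local[*] resolves to on this machine. It assumes the Spark 1.6 defaults (300MB reserved, spark.memory.fraction = 0.75, spark.memory.storageFraction = 0.5), which the report does not say were in effect. The function names are hypothetical, and the real JVM-reported heap is usually somewhat below the requested size.

```python
# Back-of-the-envelope memory budget, assuming Spark 1.6 defaults.
# None of these settings are confirmed by the report; they are assumptions.
RESERVED_MB = 300          # memory the unified memory manager sets aside
MEMORY_FRACTION = 0.75     # assumed spark.memory.fraction (1.6 default)
STORAGE_FRACTION = 0.5     # assumed spark.memory.storageFraction (default)


def memory_budget_mb(heap_mb,
                     memory_fraction=MEMORY_FRACTION,
                     storage_fraction=STORAGE_FRACTION):
    """Estimate the unified, storage and execution regions in MB."""
    usable = heap_mb - RESERVED_MB
    unified = usable * memory_fraction
    storage = unified * storage_fraction
    return {
        "unified": unified,
        "storage": storage,
        "execution": unified - storage,
    }


def local_star_threads(physical_cores, threads_per_core):
    """local[*] uses one worker thread per logical core."""
    return physical_cores * threads_per_core


if __name__ == "__main__":
    # --driver-memory 10G; in local mode executors share the driver JVM.
    budget = memory_budget_mb(10 * 1024)
    for region, size in budget.items():
        print(f"{region}: {size:.1f} MB")
    # Hyperthreaded dual-core i5, as described in the report.
    print("local[*] threads:", local_star_threads(2, 2))
```

The storage region is a soft boundary in this model: execution can borrow from it and cached blocks can be evicted, so these numbers are only an upper-level estimate, not a measurement of what the reporter observed.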