With fewer partitions, I keep losing executors during the collect at KMeans.scala:283. The error message is "ExecutorLostFailure (executor lost)". The program recovers by automatically repartitioning the whole dataset (126G), which takes a very long time and seems only to delay the inevitable failure.
Is there a recommended solution to this issue?