Hi all, I was wondering why the recommended level of parallelism is 2-3 times the number of cores in your cluster. Is this heuristic explained in any of the Spark papers, or is it more of an agreed-upon rule of thumb?
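For concreteness, here is a minimal sketch of what the rule of thumb implies; the cluster numbers are assumed purely for illustration:

```python
# Hypothetical cluster sizing (all numbers assumed for illustration).
executors = 10          # assumed executor count
cores_per_executor = 4  # assumed cores per executor
total_cores = executors * cores_per_executor  # 40 cores

# The rule of thumb in question: 2-3 partitions (tasks) per core.
min_partitions = 2 * total_cores  # lower bound: 80
max_partitions = 3 * total_cores  # upper bound: 120
print(min_partitions, max_partitions)
```

In PySpark one would then feed a value in this range to, e.g., `sc.parallelize(data, numSlices=min_partitions)` or set it as `spark.default.parallelism` in the Spark configuration.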
Thanks,
Rahul P