The Spark programming guide recommends 2-4 partitions for each CPU in the cluster. In that case, how does a single CPU core process 2-4 partitions at the same time?
Does it context-switch between tasks, or does it run them in parallel? If it context-switches, how is that more efficient than a 1:1 ratio of partitions to cores?

PS: With the Kafka direct API, the number of Kafka partitions equals the number of RDD partitions. Does that mean we should create 40 Kafka partitions for 10 CPU cores?
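
To make the arithmetic in the question concrete, here is a rough sketch in plain Python (no Spark involved). It assumes the default setting where each task occupies one core (spark.task.cpus = 1), so a core works through its partitions one after another rather than switching between them. The function names and the task durations are hypothetical, chosen only for illustration, and the greedy loop is a simplification of what a real scheduler does.

```python
import heapq


def recommended_partitions(cores, low=2, high=4):
    """Partition range implied by the 2-4 partitions per CPU rule of thumb."""
    return cores * low, cores * high


def simulate(task_durations, cores):
    """Toy scheduler: every core runs one task at a time, and the next
    pending task goes to whichever core becomes free first.
    Returns the total wall-clock time for all tasks (the makespan)."""
    free_at = [0.0] * cores  # time at which each core becomes idle
    heapq.heapify(free_at)
    for duration in task_durations:
        start = heapq.heappop(free_at)          # earliest available core
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# 10 cores -> somewhere between 20 and 40 partitions
print(recommended_partitions(10))

# 40 equal one-second tasks on 10 cores run as 4 sequential waves
print(simulate([1.0] * 40, cores=10))

# Same total work (12 s) on 2 cores: two uneven partitions vs. twelve small ones
print(simulate([8.0, 4.0], cores=2))
print(simulate([1.0] * 12, cores=2))
```

In this toy model the two uneven partitions leave one core idle while the other finishes, whereas the twelve smaller ones keep both cores busy. That load-balancing effect is one plausible reading of why the guide suggests more partitions than cores; it is not a measurement of real Spark behaviour.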