Hello,

The Spark programming guide says "we should have 2-4 partitions for each CPU in your cluster." In that case, how does one CPU core process 2-4 partitions at the same time?

Link - http://spark.apache.org/docs/latest/programming-guide.html (under the RDDs section)
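For concreteness, here is a minimal sketch of how I read that guideline (the 10-core cluster size and the app name are made up for illustration):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("PartitionRatio").setMaster("local[10]")
    val sc = new SparkContext(conf)

    // 10 cores, 3 partitions per core (middle of the 2-4 range) = 30 partitions.
    val totalCores = 10
    val rdd = sc.parallelize(1 to 1000000, totalCores * 3)
    println(rdd.partitions.length) // 30, i.e. 3 partitions' worth of tasks per core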
Does Spark do context switching between those tasks, or does it run them in parallel? If it does context switching, how is that more efficient than a 1:1 partition-to-core ratio?

PS: If we are using the Kafka direct API, in which Kafka partitions = RDD partitions, does that mean we should create 40 Kafka partitions for 10 CPU cores? (A sketch of the setup I mean follows below.)
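To make the PS concrete, here is a sketch of the kind of job I am asking about, assuming the spark-streaming-kafka 0.8 direct API (the broker address, topic name, and batch interval are made up):

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf().setAppName("KafkaDirect")
    val ssc = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    // With the direct API, each batch's RDD has exactly one partition per
    // Kafka partition of the topic, so a 40-partition topic on a 10-core
    // cluster would mean 4 tasks per core in every batch.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("myTopic"))

    ssc.start()
    ssc.awaitTermination()

--
Regards,
Hemalatha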