Hello,

The Spark programming guide says we should have "2-4 partitions for each
CPU in your cluster." In that case, how does a single CPU core process 2-4
partitions at the same time?
Link - http://spark.apache.org/docs/latest/programming-guide.html (under
the RDD section)
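
For concreteness, here is a minimal sketch of what I mean, assuming a
hypothetical 10-core cluster sized at 4 partitions per core (the app name
and data are just placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object PartitionSizing {
  def main(args: Array[String]): Unit = {
    // Assumed setup: a 10-core cluster, sized at 4 partitions per core.
    val sc = new SparkContext(new SparkConf().setAppName("partition-sizing"))

    // 40 partitions => 40 tasks, but only 10 of them can run at any one moment.
    val rdd = sc.parallelize(1 to 1000000, numSlices = 40)
    println(rdd.partitions.length)  // prints 40

    sc.stop()
  }
}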

Does it context-switch between tasks or run them in parallel? If it
context-switches, how is that more efficient than a 1:1 mapping of
partitions to cores?

PS: If we are using the Kafka direct API, in which Kafka partitions = RDD
partitions, does that mean we should create 40 Kafka partitions for 10 CPU
cores?
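
A sketch of the kind of direct stream I have in mind (assuming the Kafka
0.8 direct API; the broker address and topic name are placeholders):

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaDirectSizing {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-direct-sizing")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder broker and topic; the topic itself would carry the 40 partitions.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val topics = Set("events")

    // With the direct API, each batch RDD gets exactly one partition per
    // Kafka partition, so 40 Kafka partitions => 40 tasks per batch.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream.foreachRDD(rdd => println(rdd.partitions.length))

    ssc.start()
    ssc.awaitTermination()
  }
}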

-- 


Regards
Hemalatha