Number of executors in Spark - Kafka

2016-01-21 Thread Guillermo Ortiz
I'm using Spark Streaming and Kafka with Direct Approach. I have created a topic with 6 partitions so when I execute Spark there are six RDD. I understand than ideally it should have six executors to process each one one RDD. To do it, when I execute spark-submit (I use YARN) I specific the

Re: Number of executors in Spark - Kafka

2016-01-21 Thread Cody Koeninger
6 kafka partitions will result in 6 spark partitions, not 6 spark rdds. The question of whether you will have a backlog isn't just a matter of having 1 executor per partition. If a single executor can process all of the partitions fast enough to complete a batch in under the required time, you