Partition number of Spark Streaming Kafka receiver-based approach

2015-05-18 Thread Bill Jay
Hi all, I am reading the docs for the receiver-based Kafka consumer. The last parameter of KafkaUtils.createStream is the per-topic number of Kafka partitions to consume. My question is: does the number of partitions given for a topic in this parameter need to match the number of partitions in Kafka? For

Re: Partition number of Spark Streaming Kafka receiver-based approach

2015-05-18 Thread Saisai Shao
Hi Bill, You don't need to match the number of threads to the number of partitions in the topic. For example, if you have 3 partitions in topic1 but set only 2 threads, ideally one thread will receive 2 partitions and the other thread will receive the remaining partition; it depends on the scheduling
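
To make the 3-partitions / 2-threads example concrete, here is a rough sketch in plain Python (no Spark or Kafka needed). It assumes a simple range-style split where earlier threads pick up the leftover partitions; the helper name assign_partitions and that split rule are illustrative assumptions, not something taken from the Spark or Kafka source, and the real assignment depends on how the consumer schedules things. The topics dict only mirrors the shape of the last argument of KafkaUtils.createStream (topic name mapped to a thread count).

```python
def assign_partitions(num_partitions, num_threads):
    """Split partition ids 0..num_partitions-1 into contiguous ranges, one per thread.

    Hypothetical helper: assumes a range-style split, which may differ
    from what the Kafka consumer actually does at runtime.
    """
    base, extra = divmod(num_partitions, num_threads)
    assignment = {}
    start = 0
    for thread_id in range(num_threads):
        # The first `extra` threads each take one additional partition.
        size = base + (1 if thread_id < extra else 0)
        assignment[thread_id] = list(range(start, start + size))
        start += size
    return assignment


# Same shape as the last createStream argument: topic name -> number of threads.
topics = {"topic1": 2}

# topic1 is assumed to have 3 partitions, as in the example above.
print(assign_partitions(3, topics["topic1"]))  # {0: [0, 1], 1: [2]}
```

Under the same assumption, setting more threads than partitions (say 3 threads for 2 partitions) would leave one thread with an empty list, i.e. idle.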