Cody, adding partitions to Kafka is a last resort; I was wondering whether I can decrease the processing time without touching my Kafka cluster.
Adrian, repartition looks like a good option; let me check whether it gains me any performance.
Dibyendu, I will surely try out this consumer.
Thanks all, will share my findings.

On Thu, Oct 29, 2015 at 7:16 PM, Cody Koeninger <c...@koeninger.org> wrote:

> Consuming from Kafka is inherently limited to using a number of consumer
> nodes less than or equal to the number of Kafka partitions. If you think
> about it, you're going to be paying some network cost to repartition that
> data from a consumer to different processing nodes, regardless of what
> Spark consumer library you use.
>
> If you really need finer grained parallelism, and want to do it in a more
> efficient manner, you need to move that partitioning to the producer (i.e.
> add more partitions to Kafka).
>
> On Thu, Oct 29, 2015 at 6:11 AM, Adrian Tanase <atan...@adobe.com> wrote:
>
>> You can call .repartition on the DStream created by the Kafka direct
>> consumer. You take the one-time hit of a shuffle but gain the ability to
>> scale out processing beyond your number of partitions.
>>
>> We’re doing this to scale up from 36 partitions / topic to 140 partitions
>> (20 cores * 7 nodes) and it works great.
>>
>> -adrian
>>
>> From: varun sharma
>> Date: Thursday, October 29, 2015 at 8:27 AM
>> To: user
>> Subject: Need more tasks in KafkaDirectStream
>>
>> Right now, there is a one-to-one correspondence between Kafka partitions
>> and Spark partitions.
>> I don't have a requirement for one-to-one semantics.
>> I need more tasks to be generated in the job so that it can be
>> parallelised and the batch can be completed faster. In the previous
>> Receiver-based approach, the number of tasks created was independent of
>> the Kafka partitions; I need something like that.
>> Is there any config available if I don't need one-to-one semantics?
>> Is there any way I can repartition without incurring any additional cost?
>>
>> Thanks
>> *VARUN SHARMA*
>>
>

--
*VARUN SHARMA*
*Flipkart*
*Bangalore*