> … in the ability to scale out processing beyond your number of
> partitions.
>
> We’re doing this to scale up from 36 partitions / topic to 140
> partitions (20 cores * 7 nodes) and it works great.
>
> -adrian
>
> From: varun sharma
> Date: Thursday, October 29, 2015 at 8:27 AM
> To: user
> Subject: Need more tasks in KafkaDirectStream
>
> Right now, there is a one-to-one correspondence between Kafka partitions
> and Spark partitions.
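Adrian's numbers — spreading 36 Kafka partitions' worth of records over 140 Spark partitions to match 20 cores * 7 nodes — can be sanity-checked with a toy model. This is plain Python, not the Spark API, and the per-batch record count is made up for illustration:

```python
# Toy model (not Spark API): records arrive in 36 Kafka partitions but the
# cluster has 140 cores (20 cores * 7 nodes). With the direct stream's
# one-to-one mapping only 36 tasks run per batch; redistributing the same
# records over 140 tasks shrinks the largest (slowest) task.

def repartition(records, n_target):
    """Round-robin records into n_target partitions."""
    parts = [[] for _ in range(n_target)]
    for i, r in enumerate(records):
        parts[i % n_target].append(r)
    return parts

records = list(range(14000))          # one hypothetical batch of records
direct = repartition(records, 36)     # one-to-one with Kafka partitions
wide = repartition(records, 140)      # after widening to 140 partitions

# Batch time is bounded by the biggest partition a single core must process.
print(max(len(p) for p in direct))    # 389 records in the largest task
print(max(len(p) for p in wide))      # 100 records in the largest task
```

In Spark itself this widening is what a `repartition(140)` call on the DStream returned by `KafkaUtils.createDirectStream` does, at the cost of one shuffle per batch.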
Right now, there is a one-to-one correspondence between Kafka partitions and
Spark partitions.
I don't have a requirement of one-to-one semantics.
I need more tasks to be generated in the job so that it can be parallelised
and the batch can be completed fast. In the previous Receiver-based approach …
If you do not need one-to-one semantics and do not want a strict ordering
guarantee, you can very well use the Receiver-based approach, and this
consumer from Spark Packages
(https://github.com/dibbhatt/kafka-spark-consumer) can give much better
alternatives in terms of performance and …
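For context on why the receiver-based approach sidesteps the partition-count limit: there, the number of tasks per batch is governed by configuration rather than by Kafka — each receiver cuts its input into one block per `spark.streaming.blockInterval`, and each block becomes one task. A back-of-the-envelope helper (plain Python, not Spark; the intervals and receiver count below are illustrative, not from the thread):

```python
# Rough model of receiver-based parallelism: tasks per batch is about
# (batch interval / block interval) per receiver. Kafka's partition count
# does not appear in this formula at all.

def tasks_per_batch(batch_ms: int, block_ms: int, n_receivers: int) -> int:
    return (batch_ms // block_ms) * n_receivers

# 10 s batch, default 200 ms block interval, 4 receivers:
print(tasks_per_batch(10_000, 200, 4))  # 200 tasks per batch
```

So parallelism is tuned by shrinking the block interval or adding receivers, which is the knob the Receiver-based consumer exposes instead of a per-batch shuffle.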