You will have to repartition after creating the DStream if you want to utilize all
cores: createDirectStream gives Spark exactly the same number of partitions as the Kafka topic has.
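A minimal sketch of that approach, assuming the Spark 1.x Kafka direct API, a local broker, and an illustrative topic name and partition count (all of those are placeholders, not values from this thread):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object RepartitionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("direct-stream-repartition")
    val ssc  = new StreamingContext(conf, Seconds(5))

    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics      = Set("mytopic") // hypothetical topic name

    // The direct stream has exactly as many partitions as the Kafka topic.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // Spread records across more partitions so every core gets work.
    // Note: repartition shuffles data, so weigh that cost against the
    // extra parallelism it buys you.
    val widened = stream.repartition(16) // e.g. total cores in the cluster

    widened.foreachRDD { rdd => println(s"records in batch: ${rdd.count()}") }

    ssc.start()
    ssc.awaitTermination()
  }
}
```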
Thanks
Best Regards
On Thu, Dec 3, 2015 at 9:42 AM, Charan Ganga Phani Adabala <
char...@eiqnetworks.com> wrote:
> Hi,
>
> We have* 1 kafka
There's a 1:1 relationship between Kafka partitions and Spark partitions.
Have you read
https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md ?
A direct stream job will use up to spark.executor.cores cores.
If you have fewer partitions than cores, there probably won't be enough
tasks to keep all of those cores busy.
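One way to see that concretely is to log the partition count of each micro-batch and compare it to the cores you have; a sketch assuming a `stream` DStream created as above, with `totalCores` as an illustrative value you would set for your cluster:

```scala
// With a direct stream, each batch RDD has one partition per Kafka
// partition, and each partition becomes one task. If this number is
// smaller than the cores available, some cores will sit idle.
val totalCores = 16 // hypothetical cluster-wide core count
stream.foreachRDD { rdd =>
  val parts = rdd.partitions.length
  println(s"batch partitions: $parts, cores available: $totalCores")
}
```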