You will have to repartition after creating the DStream if you want to utilize all
cores: createDirectStream gives Spark exactly the same number of partitions as the Kafka topic has.
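A minimal sketch of that approach, assuming the Spark 1.x Kafka direct API, a local broker, and an illustrative topic name and partition count (all of those are placeholders, not values from this thread):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object RepartitionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("direct-stream-repartition")
    val ssc  = new StreamingContext(conf, Seconds(5))

    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics      = Set("mytopic") // hypothetical topic name

    // The direct stream has exactly as many partitions as the Kafka topic.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // Spread records across more partitions so every core gets work.
    // Note: repartition shuffles data, so weigh that cost against the
    // extra parallelism it buys you.
    val widened = stream.repartition(16) // e.g. total cores in the cluster

    widened.foreachRDD { rdd => println(s"records in batch: ${rdd.count()}") }

    ssc.start()
    ssc.awaitTermination()
  }
}
```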
Thanks
Best Regards
On Thu, Dec 3, 2015 at 9:42 AM, Charan Ganga Phani Adabala <
char...@eiqnetworks.com> wrote:
> Hi,
>
> We have* 1 kafka
There's a 1:1 relationship between Kafka partitions and Spark partitions.
Have you read
https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md ?
A direct stream job will use up to spark.executor.cores cores.
If you have fewer partitions than cores, there probably won't be enough
tasks to keep all of those cores busy.
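One way to see that concretely is to log the partition count of each micro-batch and compare it to the cores you have; a sketch assuming a `stream` DStream created as above, with `totalCores` as an illustrative value you would set for your cluster:

```scala
// With a direct stream, each batch RDD has one partition per Kafka
// partition, and each partition becomes one task. If this number is
// smaller than the cores available, some cores will sit idle.
val totalCores = 16 // hypothetical cluster-wide core count
stream.foreachRDD { rdd =>
  val parts = rdd.partitions.length
  println(s"batch partitions: $parts, cores available: $totalCores")
}
```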