If you want to change topics from batch to batch, you can always just create a KafkaRDD repeatedly.
The streaming code as it stands assumes a consistent set of topics, though. The
implementation is private, so you can't subclass it without building your own
Spark.

On Wed, Apr 1, 2015 at 1:09 PM, Neelesh <neele...@gmail.com> wrote:

> Thanks Cody, that was really helpful. I have a much better understanding
> now. One last question - Kafka topics are initialized once in the driver;
> is there an easy way of adding/removing topics on the fly?
> KafkaRDD#getPartitions() seems to be computed only once, with no way of
> refreshing them.
>
> Thanks again!
>
> On Wed, Apr 1, 2015 at 10:01 AM, Cody Koeninger <c...@koeninger.org> wrote:
>
>> https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md
>>
>> The Kafka consumers run in the executors.
>>
>> On Wed, Apr 1, 2015 at 11:18 AM, Neelesh <neele...@gmail.com> wrote:
>>
>>> With receivers, it was pretty obvious which code ran where - each
>>> receiver occupied a core and ran on the workers. However, with the new
>>> Kafka direct input streams, it's hard for me to understand where the
>>> code that reads from the Kafka brokers runs. Does it run on the driver
>>> (I hope not), or does it run on the workers?
>>>
>>> Any help appreciated.
>>> Thanks!
>>> -neelesh
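
For reference, "creating a KafkaRDD repeatedly" as suggested at the top of this
message might look roughly like the sketch below. It assumes Spark 1.3 with the
spark-streaming-kafka artifact on the classpath and a reachable broker; the
broker address, topic names, fixed offsets, and the `processBatch` helper are
all hypothetical placeholders, not working production code:

```scala
// Sketch only: broker address, offsets, and helper names are made up for
// illustration. In a real job the from/until offsets would come from your
// own offset bookkeeping (e.g. ZooKeeper or a database).
import kafka.serializer.StringDecoder
import org.apache.spark.SparkContext
import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

def processBatch(sc: SparkContext, topics: Seq[String]): Unit = {
  val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

  // Recompute the offset ranges on every call, so the set of topics (and
  // partitions) can differ from batch to batch -- unlike the direct stream,
  // which fixes its topic set when it is created.
  val offsetRanges: Array[OffsetRange] = topics.toArray.map { topic =>
    OffsetRange(topic, partition = 0, fromOffset = 0L, untilOffset = 100L)
  }

  val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
    sc, kafkaParams, offsetRanges)

  rdd.foreach { case (_, value) => println(value) }
}
```

Driving this from your own loop (or from inside `foreachRDD` of some other
stream) sidesteps the fixed-topic assumption baked into the direct stream
implementation, at the cost of managing offsets yourself.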