If you want to change topics from batch to batch, you can always just create a KafkaRDD repeatedly.
The streaming code as it stands assumes a consistent set of topics, though. The
implementation is private, so you can't subclass it without building your own
Spark.

On Wed, Apr 1, 2015 at 1:09 PM, Neelesh <neele...@gmail.com> wrote:

> Thanks Cody, that was really helpful. I have a much better understanding
> now. One last question - Kafka topics are initialized once in the driver;
> is there an easy way of adding/removing topics on the fly?
> KafkaRDD#getPartitions() seems to be computed only once, with no way of
> refreshing them.
>
> Thanks again!
>
> On Wed, Apr 1, 2015 at 10:01 AM, Cody Koeninger <c...@koeninger.org> wrote:
>
>> https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md
>>
>> The Kafka consumers run in the executors.
>>
>> On Wed, Apr 1, 2015 at 11:18 AM, Neelesh <neele...@gmail.com> wrote:
>>
>>> With receivers, it was pretty obvious which code ran where - each
>>> receiver occupied a core and ran on the workers. However, with the new
>>> Kafka direct input streams, it's hard for me to understand where the
>>> code that reads from the Kafka brokers runs. Does it run on the driver
>>> (I hope not), or does it run on the workers?
>>>
>>> Any help appreciated.
>>> Thanks!
>>> -neelesh
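
For reference, "creating a KafkaRDD repeatedly" as suggested at the top of this
message might look roughly like the sketch below. It assumes Spark 1.3 with the
spark-streaming-kafka artifact on the classpath and a reachable broker; the
broker address, topic names, fixed offsets, and the `processBatch` helper are
all hypothetical placeholders, not working production code:

```scala
// Sketch only: broker address, offsets, and helper names are made up for
// illustration. In a real job the from/until offsets would come from your
// own offset bookkeeping (e.g. ZooKeeper or a database).
import kafka.serializer.StringDecoder
import org.apache.spark.SparkContext
import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

def processBatch(sc: SparkContext, topics: Seq[String]): Unit = {
  val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

  // Recompute the offset ranges on every call, so the set of topics (and
  // partitions) can differ from batch to batch -- unlike the direct stream,
  // which fixes its topic set when it is created.
  val offsetRanges: Array[OffsetRange] = topics.toArray.map { topic =>
    OffsetRange(topic, partition = 0, fromOffset = 0L, untilOffset = 100L)
  }

  val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
    sc, kafkaParams, offsetRanges)

  rdd.foreach { case (_, value) => println(value) }
}
```

Driving this from your own loop (or from inside `foreachRDD` of some other
stream) sidesteps the fixed-topic assumption baked into the direct stream
implementation, at the cost of managing offsets yourself.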