Re: Kafka partition increased while Spark Streaming is running

2016-05-13 Thread chandan prakash
Makes sense. Thank you, Cody. Regards, Chandan. On Fri, May 13, 2016 at 8:10 PM, Cody Koeninger wrote: > No, I wouldn't expect it to; once the stream is defined (at least for the direct stream integration for Kafka 0.8), the topicpartitions are fixed. My answer to any

Re: Kafka partition increased while Spark Streaming is running

2016-05-13 Thread Cody Koeninger
No, I wouldn't expect it to; once the stream is defined (at least for the direct stream integration for Kafka 0.8), the topicpartitions are fixed. My answer to any question about "but what if checkpoints don't let me do this" is always going to be "well, don't rely on checkpoints." If you want
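One way to act on that advice, sketched under assumptions (the Kafka 0.8 direct stream API from spark-streaming-kafka, a hypothetical broker address and topic name, and a hypothetical loadOffsets helper standing in for whatever offset store you use): instead of relying on checkpoints, persist offsets yourself and rebuild the stream from an explicit fromOffsets map on restart, which is where newly added partitions can be included. This is not necessarily Cody's exact suggestion (his message is truncated above), just one common pattern.

import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object RestartFromStoredOffsets {
  // Hypothetical helper: read the last committed offsets from your own store
  // (ZooKeeper, a database, ...), including any partitions added since the
  // previous run.
  def loadOffsets(): Map[TopicAndPartition, Long] = Map(
    TopicAndPartition("myTopic", 0) -> 1000L,
    TopicAndPartition("myTopic", 1) -> 0L // partition added after the last run
  )

  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("restart"), Seconds(10))
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

    // Explicit overload: the stream starts from exactly the offsets passed in,
    // so the set of partitions is under your control on every (re)start.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder, (String, String)](
      ssc, kafkaParams, loadOffsets(),
      (mmd: MessageAndMetadata[String, String]) => (mmd.key, mmd.message))

    stream.print()
    ssc.start()
    ssc.awaitTermination()
  }
}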

Re: Kafka partition increased while Spark Streaming is running

2016-05-13 Thread chandan prakash
Follow-up question: if Spark Streaming is using checkpointing (/tmp/checkpointDir) for at-least-once semantics and the number of topics and/or partitions has increased, will gracefully shutting down and restarting from the checkpoint consider the new topics and/or partitions? If the answer is NO
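For context, the restart-from-checkpoint flow being asked about is typically the getOrCreate idiom below (a minimal sketch, assuming the /tmp/checkpointDir path from the question and a hypothetical app name). Per Cody's reply above, a context restored this way keeps the topicpartitions the stream was originally defined with.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointRestart {
  val checkpointDir = "/tmp/checkpointDir"

  // Builds a fresh context; only invoked when no usable checkpoint exists.
  def createContext(): StreamingContext = {
    val ssc = new StreamingContext(new SparkConf().setAppName("app"), Seconds(10))
    ssc.checkpoint(checkpointDir)
    // ... define the Kafka direct stream and the processing graph here ...
    ssc
  }

  def main(args: Array[String]): Unit = {
    // After a graceful shutdown this restores the previous context (and the
    // fixed set of partitions its stream was created with) from the checkpoint.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}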

Re: Kafka partition increased while Spark Streaming is running

2016-02-24 Thread Cody Koeninger
That's correct: when you create a direct stream, you specify the topicpartitions you want to be a part of the stream (the other method for creating a direct stream is just a convenience wrapper). On Wed, Feb 24, 2016 at 2:15 AM, 陈宇航 wrote: > Here I use the

Kafka partition increased while Spark Streaming is running

2016-02-24 Thread 陈宇航
Here I use 'KafkaUtils.createDirectStream' to integrate Kafka with Spark Streaming. I submitted the app, then increased Kafka's partition count after it had been running for a while. Then I checked the input offsets with 'rdd.asInstanceOf[HasOffsetRanges].offsetRanges', seeing that only
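A minimal sketch of the setup being described, under assumptions (Kafka 0.8, a hypothetical broker address and topic name): the convenience overload of createDirectStream resolves the topic's partitions once when the stream is defined, and each batch's offset ranges can then be inspected as the poster describes.

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

object OffsetRangeCheck {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("offset-check"), Seconds(10))
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

    // Convenience overload: the partitions of "myTopic" are looked up here,
    // once, when the stream is defined.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("myTopic"))

    stream.foreachRDD { rdd =>
      // Per-partition offset ranges consumed by this batch; partitions added
      // to the topic after the stream was defined will not show up here.
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      ranges.foreach { r =>
        println(s"${r.topic} partition ${r.partition}: ${r.fromOffset} -> ${r.untilOffset}")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}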