Makes sense, thank you Cody.

Regards,
Chandan
On Fri, May 13, 2016 at 8:10 PM, Cody Koeninger <c...@koeninger.org> wrote:
> No, I wouldn't expect it to. Once the stream is defined (at least for
> the direct stream integration for Kafka 0.8), the topicpartitions are
> fixed.
>
> My answer to any question about "but what if checkpoints don't let me
> do this" is always going to be "well, don't rely on checkpoints."
>
> If you want dynamic topicpartitions, see
> https://issues.apache.org/jira/browse/SPARK-12177
>
> On Fri, May 13, 2016 at 4:24 AM, chandan prakash
> <chandanbaran...@gmail.com> wrote:
> > Follow-up question:
> >
> > If Spark Streaming is using checkpointing (/tmp/checkpointDir) for
> > at-least-once delivery and the number of topics and/or partitions has
> > increased, will gracefully shutting down and restarting from the
> > checkpoint pick up the new topics and/or partitions?
> > If the answer is no, then how do I start from the same checkpoint with
> > the new partitions/topics included?
> >
> > Thanks,
> > Chandan
> >
> > On Wed, Feb 24, 2016 at 9:30 PM, Cody Koeninger <c...@koeninger.org> wrote:
> >> That's correct: when you create a direct stream, you specify the
> >> topicpartitions you want to be part of the stream (the other method
> >> for creating a direct stream is just a convenience wrapper).
> >>
> >> On Wed, Feb 24, 2016 at 2:15 AM, 陈宇航 <yuhang.c...@foxmail.com> wrote:
> >>> Here I use 'KafkaUtils.createDirectStream' to integrate Kafka with
> >>> Spark Streaming. I submitted the app, then changed (increased)
> >>> Kafka's partition number after it had been running for a while. When
> >>> I then check the input offsets with
> >>> 'rdd.asInstanceOf[HasOffsetRanges].offsetRanges', I see that only
> >>> the offsets of the initial partitions are returned.
> >>>
> >>> Does this mean Spark Streaming's Kafka integration can't update its
> >>> parallelism when Kafka's partition number is changed?
> >
> > --
> > Chandan Prakash

--
Chandan Prakash
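
[Editor's note: Cody's "don't rely on checkpoints" advice implies managing offsets yourself and recreating the stream on restart. A minimal sketch of what that can look like with the spark-streaming-kafka 0.8 direct API follows: the `createDirectStream` overload that takes an explicit `fromOffsets` map lets you include newly added partitions when you restart. The topic name, offset values, and the `ssc`/`kafkaParams` bindings are placeholders; in practice the stored offsets would come from your own offset store, not a checkpoint.]

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

// Offsets recovered from your own store on restart (values are illustrative).
// Because the map is built by you, a partition added since the last run can
// simply be included here, starting from offset 0.
val fromOffsets: Map[TopicAndPartition, Long] = Map(
  TopicAndPartition("myTopic", 0) -> 1234L,
  TopicAndPartition("myTopic", 1) -> 5678L,
  TopicAndPartition("myTopic", 2) -> 0L    // newly added partition
)

// ssc: StreamingContext and kafkaParams: Map[String, String] assumed in scope.
val stream = KafkaUtils.createDirectStream[
    String, String, StringDecoder, StringDecoder, (String, String)](
  ssc, kafkaParams, fromOffsets,
  (mm: MessageAndMetadata[String, String]) => (mm.key, mm.message))
```

The topicpartition set is still fixed once this stream is defined; the point is that on each restart you rebuild `fromOffsets` yourself (after querying Kafka for the current partitions), rather than letting a checkpoint replay the old, smaller set.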