Would you mind sharing the code leading to your createStream? Are you also setting group.id?
Thanks,
Sean

On Oct 10, 2014, at 4:31 PM, Abraham Jacob <abe.jac...@gmail.com> wrote:

> Hi Folks,
>
> I am seeing some strange behavior when using the Spark Kafka connector in
> Spark Streaming.
>
> I have a Kafka topic with 8 partitions, and a Kafka producer that pumps
> messages into this topic.
>
> On the consumer side I have a Spark Streaming application with 8 executors
> on 8 worker nodes and 8 ReceiverInputDStreams, all using the same Kafka
> group id, connected to the 8 partitions of the topic. The Kafka consumer
> property "auto.offset.reset" is set to "smallest".
>
> Here is the sequence of steps:
>
> (1) Start the Spark Streaming app.
> (2) Start the producer.
>
> At this point I see the messages pumped by the producer arriving in Spark
> Streaming. Then I:
>
> (1) Stopped the producer.
> (2) Waited for all the messages to be consumed.
> (3) Stopped the Spark Streaming app.
>
> Now when I restart the Spark Streaming app (note: the producer is still
> down and no messages are being pumped into the topic), I observe the
> following:
>
> (1) Spark Streaming starts reading each partition from the very beginning.
>
> This is not what I was expecting. I was expecting the consumers started by
> Spark Streaming to resume from where they left off.
>
> Is my assumption incorrect that the consumers (the Kafka/Spark connector)
> should start reading the topic from where they last left off?
>
> Has anyone else seen this behavior? Is there a way to make it resume from
> where it left off?
>
> Regards,
> - Abraham
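For context, a setup along the lines Abraham describes (8 receiver streams sharing one group id, "auto.offset.reset" set to "smallest") would look roughly like the sketch below, using the Spark 1.x KafkaUtils receiver API. The topic name, ZooKeeper quorum, app name, and group id are placeholders, not values from the thread. Note that with the ZooKeeper-based high-level consumer, "auto.offset.reset" only takes effect when the group has no committed offset; if offsets were committed for the group id, the consumer should resume from them on restart.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaStreamSketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder values -- substitute your own ZK quorum, group id, and topic.
    val kafkaParams = Map[String, String](
      "zookeeper.connect" -> "zkhost:2181",
      "group.id"          -> "my-consumer-group", // same group id across all receivers
      "auto.offset.reset" -> "smallest"           // used only when the group has NO committed offset
    )

    // One receiver per partition (8 partitions -> 8 streams), then union them.
    val streams = (1 to 8).map { _ =>
      KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
        ssc, kafkaParams, Map("my-topic" -> 1), StorageLevel.MEMORY_AND_DISK_SER)
    }
    val unified = ssc.union(streams)
    unified.map(_._2).print() // print message values

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If each run uses a different group id (or offsets are never committed, e.g. "auto.commit.enable" is false), every restart will fall back to "auto.offset.reset" and read from the smallest offset, which would explain the behavior described.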