Hi Alon,

No, this does not guarantee that the same set of messages will end up in the same RDD. The fix simply re-plays messages from the last processed offset, in the same order. It is only an interim fix we needed to solve our own use case. If you do not need the message re-play feature, just skip the ack (acknowledgement) call in the driver code; the processed offsets will then never be written to ZooKeeper, and hence no replay will happen.
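To make that switch concrete, here is a minimal driver sketch. It only assumes a receiver-based DStream; the socket source stands in for the custom Kafka receiver, and the consumer.ack(...) call is an illustrative placeholder, not this consumer's exact API:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object NoReplayDriver {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("LowLevelKafkaConsumer"), Seconds(10))

    // Stand-in for the consumer's receiver: with the real library you would
    // launch the custom Kafka receiver here and get back a DStream of
    // messages carrying their Kafka offsets.
    val stream = ssc.socketTextStream("localhost", 9999) // placeholder source

    stream.foreachRDD { rdd =>
      rdd.foreach(record => println(record)) // your per-message processing

      // Replay ON: acknowledge after processing, so the last processed
      // offset is persisted to ZooKeeper and a restarted driver re-plays
      // from that offset, in order:
      //   consumer.ack(lastProcessedOffset)   // hypothetical call
      //
      // Replay OFF: simply omit the ack. Nothing is written to ZK, so the
      // receiver does not re-deliver messages after a driver restart.
    }

    ssc.start()
    ssc.awaitTermination()
  }
}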
Regards,
Dibyendu

On Mon, Sep 15, 2014 at 4:48 PM, Alon Pe'er <alo...@supersonicads.com> wrote:
> Hi Dibyendu,
>
> Thanks for your great work!
>
> I'm new to Spark Streaming, so I just want to make sure I understand the
> Driver failure issue correctly.
>
> In my use case, I want to make sure that messages coming in from Kafka are
> always broken into the same set of RDDs, meaning that if a set of messages
> is assigned to one RDD, and the Driver dies before this RDD is processed,
> then once the Driver recovers, the same set of messages is assigned to a
> single RDD, instead of arbitrarily repartitioning the messages across
> different RDDs.
>
> Does your Receiver guarantee this behavior, until the problem is fixed in
> Spark 1.2?
>
> Regards,
> Alon