Re: Spark Streaming Checkpoint and Exactly Once Guarantee on Kafka Direct Stream

2017-06-06 Thread ALunar Beach
Thanks, TD. Before Structured Streaming, exactly-once processing of input is not guaranteed, is it?

On Tue, Jun 6, 2017 at 4:30 AM, Tathagata Das wrote:
> This is the expected behavior. There are some confusing corner cases.
> If you are starting to play with Spark

Spark Streaming Checkpoint and Exactly Once Guarantee on Kafka Direct Stream

2017-06-05 Thread ALunar Beach
I am using Spark Streaming checkpointing with the Kafka direct stream. The job uses a 30-second batch duration, and a batch normally completes in 15-20 seconds. If the Spark application fails after a batch has completed successfully (149668428ms in the log below) and is then restarted, it processes that last batch again, so its output is duplicated.
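
One common way to tolerate a replayed batch like this is to make the output idempotent, so that writing the same records twice leaves the store unchanged. The following is only a sketch in plain Python, not Spark code: the `IdempotentSink` class, the record layout, and the `events` topic name are all hypothetical, and the in-memory dict stands in for a real store with a unique key on (topic, partition, offset).

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (topic, partition, offset, value).
Record = Tuple[str, int, int, str]
Key = Tuple[str, int, int]


class IdempotentSink:
    """In-memory stand-in for a store keyed uniquely on (topic, partition, offset)."""

    def __init__(self) -> None:
        self.rows: Dict[Key, str] = {}

    def write_batch(self, records: List[Record]) -> None:
        for topic, partition, offset, value in records:
            # Upsert by Kafka coordinates: rewriting an offset overwrites
            # the existing row instead of appending a duplicate.
            self.rows[(topic, partition, offset)] = value


# A batch covering offsets 100-101 of partition 0 (made-up data).
batch: List[Record] = [("events", 0, 100, "a"), ("events", 0, 101, "b")]

sink = IdempotentSink()
sink.write_batch(batch)  # batch completes, then the application fails
sink.write_batch(batch)  # after restart, the same offset range is replayed

print(len(sink.rows))  # 2
```

With an append-only sink the second call would leave four rows. Keying on the Kafka coordinates means the replay is harmless, at the cost of needing a store that supports upserts or unique keys.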