> 17/06/05 13:42:31 INFO JobGenerator: Batches pending processing (0 batches):
> 17/06/05 13:42:31 INFO JobGenerator: Batches to reschedule (10 batches):
> 1496684280000 ms, 1496684310000 ms, 1496684340000 ms, 1496684370000 ms,
> 1496684400000 ms, 1496684430000 ms, 1496684460000 ms, 1496684490000 ms,
> 1496684520000 ms, 1496684550000 ms
> 17/06/05 13:42:31 INFO JobScheduler: Added jobs for time 1496684280000 ms
> 17/06/05 13:42:31 INFO JobScheduler: Starting job streaming job
> 1496684280000 ms.0 from job set of time 1496684280000 ms
> ------
-Spark-Streaming-Checkpoint-and-Exactly-Once-Guarantee-on-Kafka-Direct-Stream-tp28743.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
I am using Spark Streaming checkpointing with a Kafka direct stream.
The batch duration is 30 seconds, and a batch normally completes in 15-20
seconds.
If the Spark application fails after a batch has completed successfully
(1496684280000 ms in the log above) and then restarts, it reprocesses that
last batch, duplicating its output.
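
For reference, a minimal sketch of the setup being described, using the
spark-streaming-kafka-0-10 integration (checkpoint directory, broker address,
topic, and group id below are placeholder values, not taken from the post):

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

object CheckpointedDirectStream {
  val checkpointDir = "hdfs:///tmp/app-checkpoint"  // placeholder path

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("checkpoint-demo")
    val ssc = new StreamingContext(conf, Seconds(30))  // 30-sec batches, as in the post
    ssc.checkpoint(checkpointDir)

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",         // placeholder broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "checkpoint-demo",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams))

    // Checkpoint recovery is at-least-once with respect to output actions:
    // a batch whose output ran but whose completion was not yet checkpointed
    // is rescheduled on restart, so the output must be idempotent (or
    // transactional) to get end-to-end exactly-once behavior.
    stream.foreachRDD { rdd => rdd.foreach(record => println(record.value)) }
    ssc
  }

  def main(args: Array[String]): Unit = {
    // On restart this recovers the context from the checkpoint instead of
    // creating a new one, and reschedules batches that were generated but
    // not recorded as completed -- which matches the duplicated batch seen
    // in the log above.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```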