Hi, how can I avoid duplicate processing of Kafka messages in Spark Streaming 1.3 when an executor fails?
1. Can I somehow access the accumulators of the failed task from its retry, so the retry can skip however many events the failed task already processed on that partition?
2. Or will I have to persist every processed message, check before processing each message whether the failed task already handled it, and delete that persisted information at the end of each batch? (See the sketch after this list.)
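To frame option 2: with the direct (receiver-less) Kafka API in Spark 1.3 (`KafkaUtils.createDirectStream`), each RDD partition corresponds to a fixed Kafka offset range, so a retried task re-reads exactly the records the failed attempt saw. One way to exploit that is to key every output write by its Kafka offset, so a retry overwrites the same keys instead of appending duplicates, and no separate "already processed" bookkeeping is needed. Below is a minimal sketch of that idea, not a definitive implementation; the broker address, topic name, and the `upsert` helper are placeholders you would replace with your own environment and an idempotent sink.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.{SparkConf, TaskContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

object OffsetKeyedOutput {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("OffsetKeyedOutput"), Seconds(10))

    // Direct stream: each RDD partition maps 1:1 to a Kafka
    // (topic, partition, offset range), so a retried task re-reads
    // exactly the same records as the failed attempt.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // assumed broker
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events")) // assumed topic name

    stream.foreachRDD { rdd =>
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      rdd.foreachPartition { iter =>
        val o = offsetRanges(TaskContext.get.partitionId)
        iter.zipWithIndex.foreach { case ((_, value), i) =>
          // Key each write by its Kafka offset: a retry writes the same
          // keys again, so the sink ends up with one copy per record.
          val recordKey = s"${o.topic}/${o.partition}/${o.fromOffset + i}"
          upsert(recordKey, value)
        }
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }

  // Hypothetical stand-in for an idempotent store write keyed by recordKey
  // (e.g. an HBase put, Cassandra INSERT, or relational UPSERT).
  def upsert(key: String, value: String): Unit = ()
}
```

If the sink cannot do idempotent upserts, the usual alternative is a transactional write that commits the results together with the offset range and skips any range that is already committed.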