Hi,

We are using DirectKafkaInputDStream and storing consumer offsets in Kafka (0.8.2) after processing completes. However, some of our use cases require that offsets not be written if processing of a partition fails with certain exceptions. This would let us build per-partition backoff strategies, instead of either blindly committing consumer offsets regardless of errors (since KafkaRDD as HasOffsetRanges is available only on the driver) or relying on Spark's retry logic and continuing without any remedial action.
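To make the constraint concrete, here is a minimal sketch of the pattern described above, assuming the Spark 1.x Kafka 0.8 direct API; the broker address, topic name, and the commitOffsets helper are placeholders, not our actual code:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils, OffsetRange}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.SparkConf

val ssc = new StreamingContext(new SparkConf().setAppName("offsets"), Seconds(10))

// Hypothetical broker/topic names for illustration.
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc,
  Map("metadata.broker.list" -> "broker:9092"),
  Set("events"))

stream.foreachRDD { rdd =>
  // HasOffsetRanges is only visible here, on the driver, while the RDD
  // is still the underlying KafkaRDD (before any transformation).
  val offsetRanges: Array[OffsetRange] = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  rdd.foreachPartition { records =>
    // Process records on the executor. If this partition fails, we would
    // like to skip committing just its offset range -- but the executor
    // code here does not know which OffsetRange it is processing.
    records.foreach(record => ())
  }

  // Blind commit: offsets are written for all partitions regardless of
  // per-partition errors. commitOffsets is a hypothetical helper that
  // writes the ranges back to Kafka.
  // commitOffsets(offsetRanges)
}
```

The gap we are trying to close is between the driver-side offsetRanges array and the per-partition (or per-task) failure information.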
I was playing with SparkListener and found that while one can listen for task-end events (SparkListenerTaskEnd) on the driver, and even tell that a task failed, there is no way to map that task back to its partition and retrieve the offset range, topic, Kafka partition number, etc.

Any pointers appreciated. Thanks!
-neelesh