Hi,
   We are using DirectKafkaInputDStream and store completed consumer
offsets in Kafka (0.8.2). However, some of our use cases require that
offsets not be committed if processing of a partition fails with certain
exceptions. This would let us build various backoff strategies for that
partition, instead of either blindly committing consumer offsets regardless
of errors (since KafkaRDD's HasOffsetRanges is available only on the
driver) or relying on Spark's retry logic and continuing without remedial
action.
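
For context, the pattern we are trying to extend looks roughly like the
sketch below: capture the batch's offset ranges on the driver (where the
HasOffsetRanges cast works), then use the 1:1 mapping between KafkaRDD
partitions and Kafka partitions to look up each task's range via
TaskContext inside foreachPartition. Here `process` and `commitOffset`
(and the exception type) are hypothetical placeholders for our own
handler and Kafka 0.8 offset-commit call:

```scala
import org.apache.spark.TaskContext
import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}

stream.foreachRDD { rdd =>
  // Must be the first operation on the direct stream's RDD for the cast
  // to succeed; this runs on the driver before tasks are launched.
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  rdd.foreachPartition { iter =>
    // KafkaRDD partitions map one-to-one to Kafka partitions, so the
    // task's partition id indexes into offsetRanges.
    val range: OffsetRange = offsetRanges(TaskContext.get.partitionId)
    try {
      iter.foreach(process)   // hypothetical per-record handler
      commitOffset(range)     // hypothetical: write range.untilOffset to Kafka
    } catch {
      case e: RecoverableProcessingError =>  // hypothetical exception type
        // Skip the commit: offsets stay at range.fromOffset, so this
        // partition can be retried or backed off on a later batch.
    }
  }
}
```

The catch is that this pushes the commit decision onto the executors; what
we'd like is to make that decision on the driver, which is where the
SparkListener idea below comes in.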

I was playing with SparkListener and found that while one can listen for
task-end events (SparkListenerTaskEnd) on the driver and even figure out
that there was an error, there is no way to map that task back to its
partition and retrieve the offset range, topic, Kafka partition number, etc.

Any pointers appreciated!

Thanks!
-neelesh