Hi Kafka Devs,

Need your thoughts on retriable exceptions:

If a user configures Kafka with min.isr > 1 and there are not enough
replicas to safely store the data, there are two possibilities:

1. The lack of replicas was discovered before the message was written. In
this case, we throw NotEnoughReplicas.
2. The lack of replicas was discovered after the message was written to
the leader. In this case, we throw NotEnoughReplicasAfterAppend.
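For reference, a setup that can surface these errors would look something
like this (these are the standard Kafka configuration keys; the values are
just illustrative):

```properties
# broker or topic config: require at least 2 in-sync replicas per write
min.insync.replicas=2

# producer config: wait for acknowledgment from all in-sync replicas;
# with acks=1 or acks=0 the min.insync.replicas check is not applied
acks=all
```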

Currently, both errors are retriable, which means that the new producer
will retry multiple times.
In the case of the second exception, these retries will cause duplicates.
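To make the duplicate scenario concrete, here is a toy sketch (not Kafka's
actual code; the class and message names are made up) of why retrying after
the second error duplicates the message: the leader has already appended it
by the time the error is reported, so every retry appends it again.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a partition leader with min.insync.replicas=2
// but only the leader itself in sync.
class LeaderLog {
    final List<String> log = new ArrayList<>();
    final int inSyncReplicas = 1;
    final int minIsr = 2;

    // Models scenario 2 above: append first, then discover
    // there are too few in-sync replicas.
    void send(String msg) {
        log.add(msg); // message is already on the leader...
        if (inSyncReplicas < minIsr) {
            // ...when the error is raised
            throw new RuntimeException("NotEnoughReplicasAfterAppend");
        }
    }
}

public class DuplicateDemo {
    public static void main(String[] args) {
        LeaderLog leader = new LeaderLog();
        int attempts = 0;
        while (attempts < 3) { // producer treats the error as retriable
            try {
                leader.send("payment-123");
                break;
            } catch (RuntimeException e) {
                attempts++; // retry re-sends the whole message
            }
        }
        // Each retry re-appended the same message to the leader:
        System.out.println(leader.log); // [payment-123, payment-123, payment-123]
    }
}
```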

KAFKA-1697 suggests:
"we probably want to make NotEnoughReplicasAfterAppend a non-retriable
exception and let the client decide what to do."

I agreed that the client (the one using the Producer) should weigh the
problems duplicates will cause against the probability of losing the
message and do something sensible, so I made the exception non-retriable.

In the RB (https://reviews.apache.org/r/29647/) Joel raised a good point:
(Joel, feel free to correct me if I misrepresented your point)

"I think our interpretation of retriable is as follows (but we can discuss
on the list if that needs to change): if the produce request hits an error,
and there is absolutely no point in retrying then that is a non-retriable
error. MessageSizeTooLarge is an example - since unless the producer
changes the request to make the messages smaller there is no point in
retrying.

...
Duplicates can arise even for other errors (e.g., request timed out). So
that side-effect is not compelling enough to warrant a change to make this
non-retriable. "

*(TL;DR) Should exceptions where retries can cause duplicates still be
retriable?*

Gwen
