[ 
https://issues.apache.org/jira/browse/KAFKA-10829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-10829:
----------------------------------
    Labels: new-streams-runtime-should-fix  (was: )

> Kafka Streams handle produce exception improvement
> --------------------------------------------------
>
>                 Key: KAFKA-10829
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10829
>             Project: Kafka
>          Issue Type: Improvement
>          Components: producer , streams
>            Reporter: Guozhang Wang
>            Priority: Major
>              Labels: new-streams-runtime-should-fix
>
> A summary of some recent discussions on how we should improve on embedded 
> producer exception handling.
> Note that below the basline logic would guarantee that our correctness 
> semantics is not violated; and optimization are on top of the baseline to 
> reduce the user's burden by letting the library auto-handle certain types of 
> exception.
> 1) ``Producer.send()`` throw exception directly: 
> 1.a) baseline (to make sure correctness) logic is to always wrap them as 
> StreamsException, it would cause the thread to shutdown and exception handler 
> triggered. The handler could look into the wrapped exception and decide 
> whether the shutdown thread can be restarted.
> 1.b) optimization is to look at the exception, and decide if they can be 
> wrapped as TaskMigratedException instead (e.g. ProducerFenced). This would 
> then be auto-handled by lost-all-tasks and re-join.
> 2) ``Producer.send()`` Callback has an exception:
> 2.a) baseline is first to check if the exception is instanceof 
> RetriableException.
> If not retriable, pass it to the producer exception handler to decide whether 
> to throw or to continue with record dropped. If decide to throw, always warp 
> it as StreamsException and keep it locally; at the same time do not send more 
> records from the caller. In the next send call, check the remembered 
> exception and throw. It would cause the thread to shutdown and exception 
> handler triggered.
> If the exception is not Retriable, always throw it as a fatal 
> StreamsException.
> 2.b) optimization one: if the non-retriable exception can be translated as a 
> TaskMigratedException, then do not wrap it as StreamsException to let the 
> library handle internally.
> 2.c) optimization two: if the retriable exception is a timeout exception, 
> then do not pass to the produce exception handler and treat it as 
> TaskMigrated.
> 3) ``Producer.XXXTxn`` APIs except ``AbortTxn`` throw exception directly:
> 3.a) baseline logic is to capture all KafkaException except TimeoutException, 
> and handle them as *TaskCorrupted* (which include abort the transaction, 
> reset the state, and re-join the group). TimeoutException would be rethrown.
> 3.b) optimization: some exceptions can be handled as TaskMigrated, which 
> would be handled in a lighter way.
> 4) ``Producer.abortTxn`` throw exception:
> 3.a) baseline logic is to capture all KafkaException  except TimeoutException 
> as fatal StreamsException. TimeoutException would be rethrown.
> 3.b) optimization: some exceptions can be ignored (e.g. invalidTxnTransition 
> means the abort did not succeeded).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to