[ https://issues.apache.org/jira/browse/KAFKA-13768 ]


    xuexiaoyue deleted comment on KAFKA-13768:
    ------------------------------------

was (Author: ddrid):
I found that KAFKA-8805 (Bump producer epoch on recoverable errors, #7389) 
fixes this by automatically aborting the transaction and bumping the producer 
epoch, but only when the 'coordinatorSupportsBumpingEpoch' condition is 
satisfied. If it is not, the producer still transitions to the fatal error 
state. I think it may be better to transitionToAbortableError and let the user 
abort the transaction?
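
To make the proposal concrete, here is a minimal sketch of the two behaviours being compared. It is a toy model under stated assumptions, not Kafka's TransactionManager: the class, enum, and method names are hypothetical, and only the flag name is borrowed from 'coordinatorSupportsBumpingEpoch' above.

```java
// Toy model only: none of these types exist in Kafka. The flag name mirrors
// 'coordinatorSupportsBumpingEpoch' from the comment above.
final class TxnExpirySketch {

    /** Hypothetical outcomes for a transactional producer whose batch expired. */
    enum Outcome { ABORT_AND_BUMP_EPOCH, ABORTABLE_ERROR, FATAL_ERROR }

    /** Behaviour as described above: fatal unless the epoch can be bumped. */
    static Outcome currentBehaviour(boolean coordinatorSupportsBumpingEpoch) {
        return coordinatorSupportsBumpingEpoch
                ? Outcome.ABORT_AND_BUMP_EPOCH
                : Outcome.FATAL_ERROR;
    }

    /** Proposed behaviour: fall back to an abortable error instead of a fatal one. */
    static Outcome proposedBehaviour(boolean coordinatorSupportsBumpingEpoch) {
        return coordinatorSupportsBumpingEpoch
                ? Outcome.ABORT_AND_BUMP_EPOCH
                : Outcome.ABORTABLE_ERROR;
    }

    /** In this model, only a fatal error forces the application to drop the producer. */
    static boolean producerSurvives(Outcome outcome) {
        return outcome != Outcome.FATAL_ERROR;
    }

    public static void main(String[] args) {
        for (boolean supported : new boolean[] {true, false}) {
            System.out.println("epoch bump supported=" + supported
                    + " current=" + currentBehaviour(supported)
                    + " proposed=" + proposedBehaviour(supported));
        }
    }
}
```

The two methods differ only in the branch where the coordinator cannot bump the epoch, which is the case this comment is asking about.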

> Transactional producer exits because of expiration in RecordAccumulator
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-13768
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13768
>             Project: Kafka
>          Issue Type: Improvement
>          Components: producer 
>    Affects Versions: 2.0.0
>            Reporter: xuexiaoyue
>            Priority: Major
>
> Hi team, I'm using a transactional producer with request.timeout.ms set to a 
> rather small value such as 10s, while zookeeper.session.timeout.ms is set 
> longer, such as 30s. 
> While the producer was sending records, one broker shut down unexpectedly, 
> and the producer threw 'org.apache.kafka.common.KafkaException: The 
> client hasn't received acknowledgment for some previously sent messages and 
> can no longer retry them. It isn't safe to continue' and exited.
> Looking into the code, I found that when a batch expires in 
> RecordAccumulator, it is marked as unresolved in Sender#sendProducerData. 
> If the producer is transactional, it is then doomed to 
> transitionToFatalError later.
> I'm wondering why we need to transitionToFatalError here. Would it be better 
> to abort the transaction instead? I know it's necessary to bump the epoch 
> when sending idempotently, but why do we let the producer crash in this 
> case?
> I found that KAFKA-8805 (Bump producer epoch on recoverable errors, #7389) 
> fixes this by automatically aborting the transaction and bumping the producer 
> epoch, but only when the 'coordinatorSupportsBumpingEpoch' condition is 
> satisfied. If it is not, the producer still transitions to the fatal error 
> state. I think it may be better to transitionToAbortableError and let the 
> user abort the transaction?
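
For anyone trying to reproduce this, the producer side of the setup in the report might look roughly like the sketch below. Only request.timeout.ms (10s) comes from the report; the broker address and transactional id are placeholders, the class and method names are hypothetical, and serializers and other required settings are left out. zookeeper.session.timeout.ms is a broker-side setting, so it does not appear in the producer properties.

```java
import java.util.Properties;

// Sketch of the producer settings relevant to the report; not a complete config.
final class ReportedProducerConfigSketch {

    static Properties reportedSettings() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker1:9092");  // placeholder address
        props.setProperty("transactional.id", "example-txn-id"); // placeholder id
        props.setProperty("request.timeout.ms", "10000");        // the 10s from the report
        return props;
    }

    public static void main(String[] args) {
        // Print the settings in a stable order for inspection.
        Properties props = reportedSettings();
        for (String key : new java.util.TreeSet<>(props.stringPropertyNames())) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```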



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
