rreddy-22 opened a new pull request, #20534:
URL: https://github.com/apache/kafka/pull/20534

   We are seeing cases where a Kafka Streams (KS) thread stalls for ~20 
seconds. During the stall, the broker correctly aborts the open transaction 
because the 10-second transaction timeout expires. 
   However, when the KS thread resumes, it does not receive the expected 
InvalidProducerEpochException (which we already handle gracefully as part of 
the transaction abort). The client receives an InvalidTxnStateException instead. 
KS currently treats this as a fatal error, causing the application to fail.
   
   To fix this, we've added an epoch check before the verification check, so 
that the recoverable InvalidProducerEpochException is sent instead of the fatal 
InvalidTxnStateException. This helps safeguard both TV1 and TV2 clients.
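
   The effect of the reordering can be illustrated with a minimal sketch. 
This is not the code changed in this PR: the class, enum, method, and 
parameter names below are hypothetical, and the sketch assumes that the 
producer epoch known to the broker is higher than the epoch the stalled 
client still uses once the transaction has been aborted on timeout.

```java
// Hypothetical illustration only; none of these names come from the Kafka code base.
final class AppendValidationSketch {

    enum Result { OK, INVALID_PRODUCER_EPOCH, INVALID_TXN_STATE }

    /**
     * @param currentEpoch epoch the broker currently holds for the producer
     *                     (assumed to be bumped after the timeout-triggered abort)
     * @param requestEpoch epoch carried by the incoming request
     * @param txnVerified  whether an ongoing transaction was verified for this producer
     */
    static Result validate(short currentEpoch, short requestEpoch, boolean txnVerified) {
        // 1. Epoch check first: a stale epoch maps to the recoverable error.
        if (requestEpoch < currentEpoch) {
            return Result.INVALID_PRODUCER_EPOCH;
        }
        // 2. Verification check second: only a producer with a current epoch
        //    and no verified transaction gets the fatal error.
        if (!txnVerified) {
            return Result.INVALID_TXN_STATE;
        }
        return Result.OK;
    }
}
```

   With this ordering, a request from the stalled thread (stale epoch, no 
verified transaction) yields INVALID_PRODUCER_EPOCH. If the two checks were 
swapped, the same request would yield INVALID_TXN_STATE.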

