rreddy-22 opened a new pull request, #20534: URL: https://github.com/apache/kafka/pull/20534
We are seeing cases where a Kafka Streams (KS) thread stalls for ~20 seconds. During this stall, the broker correctly aborts the open transaction (triggered by the 10-second transaction timeout). However, when the KS thread resumes, instead of receiving the expected InvalidProducerEpochException (which we already handle gracefully as part of transaction abort), the client is instead hit with an InvalidTxnStateException. KS currently treats this as a fatal error, causing the application to fail. To fix this, we've added an epoch check before the verification check to send the recoverable InvalidProducerEpochException instead of the fatal InvalidTxnStateException. This helps safeguard both tv1 and tv2 clients -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
