m1a2st opened a new pull request, #21176: URL: https://github.com/apache/kafka/pull/21176
In Transaction Version 2, strict epoch validation (`markerEpoch > currentEpoch`) causes hanging transactions in two scenarios:

1. **Coordinator recovery**: When reloading PREPARE_COMMIT/ABORT from transaction log, retried markers are rejected with `InvalidProducerEpochException` because they use the same epoch.
2. **Network retry**: When marker write succeeds but response is lost, coordinator retries are rejected for the same reason.

Both cases leave transactions permanently hanging in PREPARE state, causing clients to fail with `CONCURRENT_TRANSACTIONS`.

Detect idempotent marker retries in `ProducerStateManager.checkProducerEpoch()` by checking three conditions:

1. Transaction Version ≥ 2
2. `markerEpoch == currentEpoch` (same epoch)
3. `currentTxnFirstOffset` is empty (transaction already completed)

When all conditions are met, treat the marker as a successful idempotent retry instead of throwing an error.
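
The three-condition check described above can be sketched roughly as follows. This is a standalone illustration, not the code from the PR: the class `MarkerEpochCheck`, the `Result` enum, and the method signature are hypothetical, epochs are modelled as plain `int` for brevity, and the pre-V2 branch (accepting `markerEpoch >= currentEpoch`) is an assumption about the legacy behavior rather than something stated in the description.

```java
import java.util.OptionalLong;

// Hypothetical, simplified model of the marker epoch check; names are illustrative only.
public class MarkerEpochCheck {

    enum Result { ACCEPT, IDEMPOTENT_RETRY, REJECT }

    static Result checkMarkerEpoch(int transactionVersion,
                                   int markerEpoch,
                                   int currentEpoch,
                                   OptionalLong currentTxnFirstOffset) {
        if (transactionVersion < 2) {
            // Assumption: before Transaction Version 2, an equal epoch is accepted.
            return markerEpoch >= currentEpoch ? Result.ACCEPT : Result.REJECT;
        }
        // Transaction Version 2: a marker normally carries a strictly newer epoch.
        if (markerEpoch > currentEpoch) {
            return Result.ACCEPT;
        }
        // Same epoch and no open transaction: the marker was already applied,
        // so treat this write as an idempotent retry instead of an error.
        boolean sameEpoch = markerEpoch == currentEpoch;
        boolean noOpenTransaction = !currentTxnFirstOffset.isPresent();
        if (sameEpoch && noOpenTransaction) {
            return Result.IDEMPOTENT_RETRY;
        }
        // Older epoch, or same epoch with a transaction still open.
        // In the broker this case surfaces as InvalidProducerEpochException.
        return Result.REJECT;
    }

    public static void main(String[] args) {
        // Retried marker after the transaction has already completed.
        System.out.println(checkMarkerEpoch(2, 5, 5, OptionalLong.empty()));
        // Same epoch, but a transaction is still open at offset 100.
        System.out.println(checkMarkerEpoch(2, 5, 5, OptionalLong.of(100L)));
    }
}
```

In this sketch, a same-epoch marker is only tolerated when no transaction is open, so a stale marker arriving while a transaction is in progress is still rejected.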
