coltmcnealy-lh opened a new pull request, #20833: URL: https://github.com/apache/kafka/pull/20833
Fixes [KAFKA-19853](https://issues.apache.org/jira/browse/KAFKA-19853).

Under stress with active restorations in progress, the `StateUpdater#runOnce()` method can block on RocksDB write stalls. This in turn causes the `StreamThread` to block in `TaskManager#handleAssignment()` inside the consumer rebalance callback, because `TaskManager#handleAssignment()` waits on a future from the State Updater. If RocksDB is stalling, which is very common during restoration or when processing a warmup task, that future can take some time to complete.

This blocking can cause transaction timeouts under exactly-once semantics (EOS), which is disruptive. This commit mitigates the issue by committing any open transactions before blocking on the State Updater.
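
To illustrate the ordering the mitigation relies on, here is a minimal, self-contained sketch. It is not the code in this PR and does not use any Kafka classes: `FakeTask`, `CommitBeforeBlocking`, and this standalone `handleAssignment` method are hypothetical stand-ins, and the 5-second wait bound is an arbitrary assumption. The only point is that open transactions are committed before the thread waits on a future that may be slow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class CommitBeforeBlocking {

    /** Hypothetical stand-in for a task that holds an open EOS transaction. */
    static final class FakeTask {
        final String id;
        boolean transactionOpen = true;

        FakeTask(String id) {
            this.id = id;
        }

        void commit() {
            transactionOpen = false;
        }
    }

    /**
     * Commits every open transaction first, then waits on the (possibly slow) future.
     * Returns the events in the order they happened so the ordering can be inspected.
     */
    static List<String> handleAssignment(List<FakeTask> tasks,
                                         CompletableFuture<Void> stateUpdaterFuture) throws Exception {
        List<String> events = new ArrayList<>();

        // Step 1: close out open transactions so none can time out while we wait.
        for (FakeTask task : tasks) {
            if (task.transactionOpen) {
                task.commit();
                events.add("commit:" + task.id);
            }
        }

        // Step 2: only now block on the future (assumed bound: 5 seconds).
        stateUpdaterFuture.get(5, TimeUnit.SECONDS);
        events.add("state-updater-done");
        return events;
    }

    public static void main(String[] args) throws Exception {
        List<FakeTask> tasks = List.of(new FakeTask("0_0"), new FakeTask("0_1"));

        // Simulate a state updater that is delayed, e.g. by a write stall.
        CompletableFuture<Void> slowFuture = CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        System.out.println(handleAssignment(tasks, slowFuture));
    }
}
```

In this sketch the commit events are always recorded before the wait finishes, however long the simulated stall lasts, so no transaction stays open across the blocking call.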
