ableegoldman opened a new pull request #9173: URL: https://github.com/apache/kafka/pull/9173
We launched a Streams application reading from a single 3000-partition topic and saw continuous rebalancing. Digging into the logs, we found that every time the leader sent a SyncGroup request, it discovered that it had dropped out of the group and needed to rejoin. The assignment seemed to take slightly longer than 10s, the session timeout, so the leader appeared to be getting kicked out due to heartbeat expiration while the heartbeat thread was disabled. I redeployed the app with this exact patch and saw it finally stabilize. The application was left running for ~20 hours and never rebalanced again after the last pod was rolled.
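
To make the suspected failure mode concrete, here is a minimal sketch of the timing argument. It is a toy model, not Kafka code: the class `SessionExpirySketch`, the method `expiresDuringAssignment`, and the 11s assignment duration are hypothetical, chosen only to mirror "slightly longer than 10s". It also assumes that, when heartbeats are enabled, they reach the coordinator well within the session timeout.

```java
import java.time.Duration;

// Toy model (hypothetical, not part of Kafka): would the group coordinator
// expire the leader's session while the leader is computing the assignment?
final class SessionExpirySketch {

    /**
     * Returns true if the member would be expired during the assignment.
     * Assumption: with heartbeats enabled, they always arrive in time.
     */
    static boolean expiresDuringAssignment(Duration sessionTimeout,
                                           Duration assignmentTime,
                                           boolean heartbeatsDuringAssignment) {
        if (heartbeatsDuringAssignment) {
            // The session is refreshed continuously, so it never lapses.
            return false;
        }
        // No heartbeats are sent: the session lapses once the assignment
        // runs longer than the session timeout.
        return assignmentTime.compareTo(sessionTimeout) > 0;
    }

    public static void main(String[] args) {
        Duration sessionTimeout = Duration.ofSeconds(10); // value from the description
        Duration assignmentTime = Duration.ofSeconds(11); // illustrative assumption

        // Heartbeat thread disabled during the assignment: the leader is kicked out.
        System.out.println(expiresDuringAssignment(sessionTimeout, assignmentTime, false));

        // Heartbeats kept alive during the assignment: the leader stays in the group.
        System.out.println(expiresDuringAssignment(sessionTimeout, assignmentTime, true));
    }
}
```

Under this model, any assignment that outlasts the session timeout without heartbeats leads to a rejoin, which would match the rebalance loop seen in the logs.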