philipnee opened a new pull request, #14179: URL: https://github.com/apache/kafka/pull/14179
As title, we discovered a flaky bug during testing that it would seldomly thrown a NOT_COORDINATOR exception, which means the request was routed to a non-coordinator node. We discovered that if we don't check the coordinator node in the commitRequestManager, the request manager will pass on an empty node to the NetworkClientDelegate, which implies the request can be sent to any node in the cluster. This behavior is incorrect as the commit requests need to be routed to a coordinator node. Because the timing coordinator's discovery during integration testing isn't entirely deterministic; therefore, the test became extremely flaky. After fixing this: The coordinator node is mandatory before attempt to enqueue these commit request to the NetworkClient. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org