dimitarndimitrov opened a new pull request, #13804: URL: https://github.com/apache/kafka/pull/13804
In this test, the broker session timeout is configured aggressively low (1 second) so that fencing can happen without much waiting. In the final portion of the test, when brokers should not be fenced, heartbeats are sent roughly twice per session timeout window. However, the first time this is done there is other code between sending the heartbeat and taking the timestamp, and in local tests that code can take up to 0.5 seconds (half of the session timeout). This can result in all brokers being fenced again, which fails the test.

This change sends a heartbeat right when the timestamp is taken, which in local tests reduces flaky failures from 4 out of 50 to 0 out of 50.
- In local tests, increasing the session timeout from 1 second to 2 seconds also reduced the flaky failures to 0 out of 50, but consistently increased the test running time by 1 second (which seems expected).

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
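The timing problem described in this PR can be illustrated with a small deterministic model. This is only a sketch, not the actual test code: the class `HeartbeatTimingSketch`, its methods, and the rule that a broker is fenced once the gap between heartbeats reaches the session timeout are all assumptions made for illustration. The numbers (1 second session timeout, heartbeats every 0.5 seconds, up to 0.5 seconds of intervening code) come from the description above.

```java
// Simplified model of the heartbeat timing in this PR; not the actual Kafka test code.
// All names here are hypothetical and exist only for illustration.
final class HeartbeatTimingSketch {

    // Assumption of this sketch: a broker is fenced once the gap between
    // two consecutive heartbeats reaches the session timeout.
    static boolean wouldFence(long gapMs, long sessionTimeoutMs) {
        return gapMs >= sessionTimeoutMs;
    }

    // Returns the largest gap between consecutive heartbeats on a simulated clock.
    // An initial heartbeat is sent at t = 0, other code then runs for setupDelayMs,
    // a timestamp is taken, and heartbeats follow every intervalMs after that timestamp.
    static long maxHeartbeatGapMs(long setupDelayMs, long intervalMs, int rounds,
                                  boolean heartbeatAtTimestamp) {
        long lastHeartbeatMs = 0L;      // initial heartbeat at t = 0
        long startMs = setupDelayMs;    // timestamp taken after the intervening code
        long maxGapMs = 0L;

        if (heartbeatAtTimestamp) {
            // The change in this PR: send a heartbeat right when the timestamp is taken.
            maxGapMs = Math.max(maxGapMs, startMs - lastHeartbeatMs);
            lastHeartbeatMs = startMs;
        }

        for (int i = 1; i <= rounds; i++) {
            long nowMs = startMs + i * intervalMs;
            maxGapMs = Math.max(maxGapMs, nowMs - lastHeartbeatMs);
            lastHeartbeatMs = nowMs;
        }
        return maxGapMs;
    }

    public static void main(String[] args) {
        long sessionTimeoutMs = 1000L;
        long before = maxHeartbeatGapMs(500L, 500L, 3, false);
        long after = maxHeartbeatGapMs(500L, 500L, 3, true);
        System.out.println("before: maxGapMs=" + before
                + " fenced=" + wouldFence(before, sessionTimeoutMs));
        System.out.println("after:  maxGapMs=" + after
                + " fenced=" + wouldFence(after, sessionTimeoutMs));
    }
}
```

In this model, the worst-case gap without the extra heartbeat equals the full 1 second session timeout, while with it the gap stays at 0.5 seconds. Raising the session timeout to 2 seconds also keeps the 1 second gap below the limit, which matches the alternative noted above.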