arunsarin85 opened a new pull request, #10258:
URL: https://github.com/apache/ozone/pull/10258
## What changes were proposed in this pull request?
- The test now waits longer for Replication Manager and Recon to report the
same unhealthy container numbers.
- It only passes when those numbers match on two polls in a row, not a
single check.
- The @Flaky("HDDS-15223") tag was removed from testMissingContainer.
Please describe your PR in detail:
Replication Manager and Recon don’t always update at the exact same moment.
Under CI load, the test could see them briefly disagree (or “cross” each other)
while both sides are still catching up. The old wait was short (40 seconds) and
treated one matching snapshot as success, so the test could either time out or
pass misleadingly on a momentary match.
This change only hardens the test:
- 90-second cap on waiting (still polling every second).
- Two consecutive successful full comparisons before the test continues.
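
For reviewers who want the idea in isolation, the wait logic described above can be sketched roughly as follows. This is only an illustrative sketch, not the code in this patch: the `StableWait` class and its `waitForStable` method are hypothetical names, and the real test may use different helpers.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of Ozone's test utilities. */
final class StableWait {
  private StableWait() { }

  /**
   * Polls {@code check} until it returns true on {@code required} consecutive
   * polls. Throws IllegalStateException if that does not happen before
   * {@code timeoutMillis} has elapsed.
   */
  static void waitForStable(BooleanSupplier check, int required,
      long pollMillis, long timeoutMillis) {
    final long start = System.nanoTime();
    final long timeoutNanos = timeoutMillis * 1_000_000L;
    int streak = 0;
    while (true) {
      // A failed poll resets the streak, so one coincidental match
      // while both sides are still catching up does not count.
      streak = check.getAsBoolean() ? streak + 1 : 0;
      if (streak >= required) {
        return;
      }
      if (System.nanoTime() - start >= timeoutNanos) {
        throw new IllegalStateException(
            "Condition was not stable within " + timeoutMillis + " ms");
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and stop waiting.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting", e);
      }
    }
  }
}
```

With the values from this PR, a call would look like `StableWait.waitForStable(countsMatch, 2, 1000, 90_000)`, where `countsMatch` is a placeholder for the full comparison of Replication Manager and Recon unhealthy container numbers.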
## What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-15223
## How was this patch tested?
1. Triggered flaky-test-check
https://github.com/arunsarin85/ozone/actions/runs/25793920652