YutaLin opened a new pull request, #10218:
URL: https://github.com/apache/ozone/pull/10218
## What changes were proposed in this pull request?
1. Move resetInFlightSnapshotCount() from notifyLeaderReady() to notifyNotLeader()
The counter tracks requests where preExecute() ran but
validateAndUpdateCache() hasn't completed. When an OM steps down as
leader, it should reset its counter because those tracked requests are
no longer its responsibility. Previously, resetting in
notifyLeaderReady() was incorrect: the new leader never ran
preExecute() for pending requests, so its counter should start at 0.
2. Override handleRequestFailure() in OMSnapshotCreateRequest
When a request fails after preExecute() (e.g., PrepareState rejection),
the counter was never decremented. This could cause the counter to grow
unbounded during OM prepare mode for upgrades.
3. Add safety net check in validateAndUpdateCache()
Call assertSnapshotLimitNotExceeded() as a hard guarantee that the
snapshot limit is never exceeded, even if in-flight tracking has bugs
during leader transitions or other edge cases.
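
The three changes above can be sketched as a small, self-contained model. This is only an illustration under stated assumptions, not the actual Ozone code: the class `InFlightSnapshotTracker`, its fields, and the clamp-at-zero behaviour in `release()` are hypothetical, and only the method names preExecute(), validateAndUpdateCache(), handleRequestFailure(), notifyNotLeader() and assertSnapshotLimitNotExceeded() are borrowed from the description.

```java
/**
 * Hypothetical model of the in-flight snapshot accounting described above.
 * It is NOT the Ozone implementation; it only mirrors the method names
 * from the PR description to show how the three changes fit together.
 */
final class InFlightSnapshotTracker {
  private final int maxSnapshots; // configured snapshot limit (assumed)
  private int committed;          // snapshots already created
  private int inFlight;           // preExecute() ran, validateAndUpdateCache() pending

  InFlightSnapshotTracker(int maxSnapshots, int committed) {
    this.maxSnapshots = maxSnapshots;
    this.committed = committed;
  }

  /** Leader-side admission: reserve a slot before the request is submitted. */
  synchronized void preExecute() {
    if (committed + inFlight >= maxSnapshots) {
      throw new IllegalStateException("Snapshot limit reached");
    }
    inFlight++;
  }

  /** Apply step: hard limit check (change 3), then commit and release the slot. */
  synchronized void validateAndUpdateCache() {
    try {
      assertSnapshotLimitNotExceeded();
      committed++;
    } finally {
      release();
    }
  }

  /** Change 2: a request that fails after preExecute() gives its slot back. */
  synchronized void handleRequestFailure() {
    release();
  }

  /** Change 1: on stepping down, drop all reservations made as leader. */
  synchronized void notifyNotLeader() {
    inFlight = 0;
  }

  /** Safety net that ignores the in-flight counter entirely. */
  private void assertSnapshotLimitNotExceeded() {
    if (committed >= maxSnapshots) {
      throw new IllegalStateException("Snapshot limit exceeded");
    }
  }

  /** Assumption of this sketch: never go below zero (e.g. after a reset). */
  private void release() {
    if (inFlight > 0) {
      inFlight--;
    }
  }

  synchronized int inFlightCount() {
    return inFlight;
  }

  synchronized int committedCount() {
    return committed;
  }
}
```

In this model, a request rejected after preExecute() returns its reservation through handleRequestFailure(), a step-down clears reservations through notifyNotLeader(), and validateAndUpdateCache() still refuses to exceed the limit even when the in-flight counter is wrong.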
## What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-13357
## How was this patch tested?
Added tests; CI run: https://github.com/YutaLin/ozone/actions/runs/25562315743