chungen0126 opened a new pull request, #10481:
URL: https://github.com/apache/ozone/pull/10481

   ## What changes were proposed in this pull request?
   Currently, in ContainerSet.java, recoveringContainerMap tracks recovering 
containers keyed by their timeout values. This introduces an issue: if two or 
more containers start recovering at exactly the same time, they get identical 
timeout values. Because the structure is a map, the newer entry overwrites the 
older one, and the overwritten container is silently dropped from tracking. If 
the recovery of that untracked container then gets stuck, the 
StaleRecoveringContainerScrubbingService is unaware of it and cannot trigger 
the timeout cleanup. Consequently, the container is permanently orphaned and 
stuck in the 'recovering' state.
   
   Solution:
   This PR addresses the issue by refactoring how we track recovering 
containers:
   
   - Introduced RecoveringContainer class: Created a new object to encapsulate 
both timeout and containerId.
   - Custom Comparable Logic: The new RecoveringContainer compares by timeout 
first. If timeouts are identical, it falls back to comparing containerId. This 
ensures that containers with the exact same timeout are treated as distinct 
entries.
   - Replaced Map with Set: Replaced recoveringContainerMap with 
recoveringContainerSet to store these new objects.
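
   The three points above could look roughly like the following. This is a 
sketch under stated assumptions rather than the code in this PR: the field 
types, the outer class name, the helper method, and the use of 
ConcurrentSkipListSet are illustrative choices; only the RecoveringContainer 
name and the timeout-then-containerId ordering come from the description.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentSkipListSet;

// Illustrative wrapper class; the name is hypothetical.
class RecoveringContainerSketch {

  // Pairs a timeout with a container ID so equal timeouts stay distinct.
  static final class RecoveringContainer
      implements Comparable<RecoveringContainer> {
    private final long timeout;
    private final long containerId;

    RecoveringContainer(long timeout, long containerId) {
      this.timeout = timeout;
      this.containerId = containerId;
    }

    @Override
    public int compareTo(RecoveringContainer other) {
      // Order by timeout first, then fall back to containerId on a tie.
      int byTimeout = Long.compare(timeout, other.timeout);
      return byTimeout != 0
          ? byTimeout
          : Long.compare(containerId, other.containerId);
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof RecoveringContainer)) {
        return false;
      }
      RecoveringContainer that = (RecoveringContainer) o;
      return timeout == that.timeout && containerId == that.containerId;
    }

    @Override
    public int hashCode() {
      return Objects.hash(timeout, containerId);
    }
  }

  // Adds two containers with the same timeout and returns the set size.
  static int trackedWithSameTimeout() {
    ConcurrentSkipListSet<RecoveringContainer> recoveringContainerSet =
        new ConcurrentSkipListSet<>();
    recoveringContainerSet.add(new RecoveringContainer(1_000L, 1L));
    recoveringContainerSet.add(new RecoveringContainer(1_000L, 2L));
    return recoveringContainerSet.size();
  }

  public static void main(String[] args) {
    // Both containers remain tracked despite the identical timeout.
    System.out.println(trackedWithSameTimeout());
  }
}
```

   Because the sorted set still orders entries by timeout first, a scrubber 
can keep reading the earliest deadlines from the head of the set, while the 
containerId tie-break keeps equal-timeout entries from collapsing into one.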
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-12669
   
   ## How was this patch tested?
   
   Before changes: TestECContainerRecovery failed 2 times in 20 * 10 
iterations. https://github.com/chungen0126/ozone/actions/runs/26313957603
   
   After changes: TestECContainerRecovery passed all 20 * 10 iterations. 
https://github.com/chungen0126/ozone/actions/runs/27263362058
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

