balodesecurity commented on PR #8325:
URL: https://github.com/apache/hadoop/pull/8325#issuecomment-4229268852

   Thanks @ZanderXu for the review!
   
   The `ConcurrentHashMap + synchronized(set)` approach is cleaner for pure 
read/write separation, but I ran into a subtle race in `remove()`: when the 
last block is removed from a set, we remove the empty set from the map. With 
`ConcurrentHashMap`, a concurrent `add()` could:
   1. Get a reference to the set via `computeIfAbsent` (set still in map)
   2. Get preempted while another thread removes the last element and calls 
`map.remove(uuid, set)`
   3. Resume and add a block to a set that is no longer in the map — silently 
losing the block
   
   To fix this, `add()` would need to verify the set is still in the map after 
acquiring `synchronized(set)` and retry if not — which adds a loop.
   
   The `ReentrantReadWriteLock` approach avoids this entirely since the whole 
map + set operation is atomic under the write lock.
   
   Happy to switch to `ConcurrentHashMap + synchronized(set)` if you have a 
preferred pattern for handling the empty-set removal race — or if it's 
acceptable to not remove empty sets from the map. Please let me know!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to