Gargi-jais11 commented on PR #10109:
URL: https://github.com/apache/ozone/pull/10109#issuecomment-4341547514
The same applies to markContainerUnhealthy when DiskBalancer is running in
parallel. I believe it also needs to re-fetch the container after acquiring
the writeLock. This path is not limited to one state: a container in CLOSED,
QUASI_CLOSED, OPEN, or RECOVERING can be marked UNHEALTHY.
```
Case 1: Container CLOSED/QUASI_CLOSED + DiskBalancer + Scanner in parallel

DiskBalancer (DN1)                          Container Scanner (DN1)
──────────────────────────────              ──────────────────────────────────────────
T1: selects C (CLOSED, Disk1)
    added to inProgressContainers
T2: container.readLock() on OLD
T3: copy Disk1 → Disk2                      scanner reads OLD files on Disk1
    (I/O in progress)                       finds checksum failure
                                            controller.markContainerUnhealthy(id, reason)
                                              containerSet.getContainer(id)
                                                → OLD container (Disk1)  ← stale ref
                                              handler.markContainerUnhealthy(OLD, reason)
                                                → writeLock() → BLOCKED (readLock held)
T4: copy done, checksum verified
T5: importContainer → NEW (CLOSED, Disk2)
T6: containerSet.updateContainer(NEW)
    ← ContainerSet maps C → NEW (Disk2)
T7: readUnlock() on OLD
                                            writeLock ACQUIRED on OLD
                                            state = CLOSED, not UNHEALTHY, volume not failed
                                            OLD: CLOSED → UNHEALTHY  ← wrong container
                                            writeUnlock
                                            sendICR: C on DN1 = UNHEALTHY  ← stale, wrong
T8: markContainerForDelete(OLD) → DELETED

Final state:
  OLD (Disk1): DELETED
  NEW (Disk2): CLOSED  ← healthy, valid
  SCM view of container C: C on DN1 = UNHEALTHY  ← wrong
```
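To make the interleaving above easier to reason about, here is a minimal single-threaded replay of it. This is only a sketch: `StaleReferenceReplay`, `Replica`, and the plain `HashMap` standing in for ContainerSet are hypothetical names for illustration, not the Ozone classes, and locking is left out because the steps are simply executed in the T1..T8 order.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the timeline; not Ozone code.
final class StaleReferenceReplay {
  enum State { CLOSED, UNHEALTHY, DELETED }

  // Stand-in for one on-disk replica of a container.
  static final class Replica {
    final long id;
    final String disk;
    State state = State.CLOSED;
    Replica(long id, String disk) { this.id = id; this.disk = disk; }
  }

  /** Replays the steps in order; returns {state the scanner reports, state of the mapped replica}. */
  static State[] replay() {
    Map<Long, Replica> containerSet = new HashMap<>();
    Replica old = new Replica(1L, "Disk1");
    containerSet.put(1L, old);

    // T3 (scanner): looks up the container while the copy is still running.
    Replica scannerRef = containerSet.get(1L);

    // T5-T6 (balancer): imports the copy on Disk2 and swaps the mapping.
    Replica fresh = new Replica(1L, "Disk2");
    containerSet.put(1L, fresh);

    // T7 (scanner): lock acquired, but the reference still points at OLD.
    scannerRef.state = State.UNHEALTHY;
    State reported = scannerRef.state; // what the stale ICR would carry

    // T8 (balancer): the old replica is marked for delete.
    old.state = State.DELETED;

    return new State[] { reported, containerSet.get(1L).state };
  }

  public static void main(String[] args) {
    State[] result = replay();
    // prints "reported=UNHEALTHY actual=CLOSED"
    System.out.println("reported=" + result[0] + " actual=" + result[1]);
  }
}
```

The mismatch between the two values is the problem: the report says UNHEALTHY while the replica that ContainerSet actually maps to is CLOSED.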
**What SCM/RM does in response:**
```
T9:  RM next cycle:
       ECUnderReplicationHandler.checkAndRemoveUnhealthyReplica()
       SCM replica record: DN1 has UNHEALTHY replica of C
       checks: is there a CLOSED replica for same index on another DN?
         → if YES: "prefer deleting the UNHEALTHY over CLOSED"
                   → sendThrottledDeleteCommand(DN1)
         → if NO CLOSED elsewhere: "delete any UNHEALTHY"
                   → sendThrottledDeleteCommand(DN1)
T10: DN1 receives DeleteContainerCommand for C
       containerSet.getContainer(id) → NEW container (Disk2, CLOSED)
       NEW container DELETED  ← healthy valid replica gone

Now container C is genuinely under-replicated.
RM tries to fix it by replicating, but the replica it just deleted was
the source.
```
**Outcome:** A healthy replica on Disk2 gets deleted, and container C becomes
genuinely under-replicated.
The window between readUnlock (T7) and markContainerForDelete (T8) is the
critical period. If the scanner's sendICR (incremental container report)
reaches SCM and RM processes it before an FCR (full container report) corrects
the state, the delete command lands on DN1 and hits the healthy NEW container.
This is why markContainerUnhealthy needs the re-fetch: the consequences of
operating on the wrong container are irreversible.
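A rough sketch of the re-fetch I have in mind, assuming a per-replica read/write lock and a map standing in for ContainerSet. `RefetchAfterLock`, `Replica`, and `markUnhealthyIfCurrent` are hypothetical names for illustration, not the existing handler API. Skipping the stale reference (rather than retrying against the new replica) is also just an assumption here, since the scan result was computed from the old files.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified model of "re-fetch after writeLock"; not Ozone code.
final class RefetchAfterLock {
  enum State { CLOSED, UNHEALTHY }

  // Stand-in for one replica with its own read/write lock.
  static final class Replica {
    final long id;
    final String disk;
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    State state = State.CLOSED;
    Replica(long id, String disk) { this.id = id; this.disk = disk; }
  }

  /**
   * Marks the replica UNHEALTHY only if the container set still maps its id
   * to this exact object once the write lock is held. Returns false when the
   * reference went stale while waiting for the lock.
   */
  static boolean markUnhealthyIfCurrent(Map<Long, Replica> containerSet, Replica fetched) {
    fetched.lock.writeLock().lock();
    try {
      // Re-fetch under the lock: the balancer may have swapped the mapping.
      Replica current = containerSet.get(fetched.id);
      if (current != fetched) {
        return false; // stale reference: do not mark, do not report
      }
      fetched.state = State.UNHEALTHY;
      return true;
    } finally {
      fetched.lock.writeLock().unlock();
    }
  }

  public static void main(String[] args) {
    Map<Long, Replica> containerSet = new ConcurrentHashMap<>();
    Replica old = new Replica(7L, "Disk1");
    containerSet.put(7L, old);

    Replica scannerRef = containerSet.get(7L);   // scanner fetches OLD
    Replica fresh = new Replica(7L, "Disk2");
    containerSet.put(7L, fresh);                 // balancer swaps in NEW

    boolean marked = markUnhealthyIfCurrent(containerSet, scannerRef);
    // prints "marked=false old=CLOSED new=CLOSED"
    System.out.println("marked=" + marked + " old=" + old.state + " new=" + fresh.state);
  }
}
```

With this check in place the stale ICR is never sent, so SCM never sees DN1 as UNHEALTHY for a replica that is actually healthy.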
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]