Gargi-jais11 commented on PR #10109:
URL: https://github.com/apache/ozone/pull/10109#issuecomment-4340923770

   **Second finding: SCM force-closes a QUASI_CLOSED container while DiskBalancer is moving the same container**
   **QUASI_CLOSED** containers can be force-closed by SCM (`CloseContainerCommand` with `force=true`). That path goes through `controller.closeContainer()`.
   
   Here is the exact timeline:
   ```
   DiskBalancer (DN1)                               CloseContainerCommandHandler
   ─────────────────────────────────────────        ─────────────────────────────────────────
   T1: container = containerSet.getContainer(C)
       → OLD container (QUASI_CLOSED, Disk1)

   T2: container.readLock() on OLD

   T3: copy Disk1 → Disk2 ...                       T3a: container = containerSet.getContainer(C)
                                                         → OLD container (before updateContainer)
                                                    T3b: switch(container.getContainerState())
                                                         → QUASI_CLOSED + force=true
                                                         → controller.closeContainer(id)
                                                         → containerSet.getContainer(id)
                                                         → OLD container (still, before T5)
                                                         → container.close()
                                                         → writeLock() → BLOCKED     ←-------   readLock held

   T4: copy done, atomic move to Disk2
   T5: importContainer →
       newContainer (QUASI_CLOSED, Disk2)
   T6: containerSet.updateContainer(newContainer)
       ← ContainerSet now maps C → newContainer
   T7: container.readUnlock()  ← releases
       OLD readLock
                                                    T7a: writeLock ACQUIRED on OLD container
                                                         → OLD: QUASI_CLOSED → CLOSED
                                                         → sendICR(OLD=CLOSED) → SCM told C is CLOSED

   T8: container.markContainerForDelete(OLD)
       → writeLock → OLD: CLOSED → DELETED
   ```
   **After T8:**
   ```
                                 State           In ContainerSet?
   ---------------------------------------------------------------------------------------
   OLD container (Disk1)         DELETED         No (updateContainer removed it)
   NEW container (Disk2)         QUASI_CLOSED    Yes (this is the live replica)
   ```
   **SCM's view:** container C on **DN1 = CLOSED** (from the ICR sent at T7a).
   **Reality:** container C on DN1 is **QUASI_CLOSED** (`newContainer`).
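
To make the divergence concrete, here is a minimal single-threaded replay of the timeline above. It is only a sketch: `Replica`, `StaleCloseReplay`, and the plain `HashMap` standing in for ContainerSet are hypothetical names for illustration, not Ozone classes, and the locking is replaced by simply executing the steps in the order T1 → T8.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleCloseReplay {
    public enum State { QUASI_CLOSED, CLOSED, DELETED }

    // Hypothetical stand-in for a container replica; not the Ozone Container class.
    static final class Replica {
        final String disk;
        State state;

        Replica(String disk, State state) {
            this.disk = disk;
            this.state = state;
        }
    }

    /**
     * Replays the timeline in order and returns
     * { state reported to SCM, state of the replica still in the map }.
     */
    public static State[] replay() {
        Map<Long, Replica> containerSet = new HashMap<>();
        long id = 1L;
        containerSet.put(id, new Replica("Disk1", State.QUASI_CLOSED));

        Replica balancerRef = containerSet.get(id);  // T1: DiskBalancer fetches OLD
        Replica closeRef = containerSet.get(id);     // T3a: close handler fetches OLD too

        Replica moved = new Replica("Disk2", balancerRef.state);  // T5: import on Disk2
        containerSet.put(id, moved);                 // T6: map now points at NEW

        closeRef.state = State.CLOSED;               // T7a: close applied to the stale OLD object
        State reportedToScm = closeRef.state;        // ICR is built from OLD

        balancerRef.state = State.DELETED;           // T8: OLD is marked for delete

        return new State[] { reportedToScm, containerSet.get(id).state };
    }

    public static void main(String[] args) {
        State[] result = replay();
        System.out.println("SCM was told: " + result[0]
            + ", live replica on DN1: " + result[1]);
    }
}
```

Running `main` prints `SCM was told: CLOSED, live replica on DN1: QUASI_CLOSED`, which is the mismatch described above: the close landed on an object that is no longer in the map.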
   
   **This looks like a state regression to SCM:**
   SCM thinks the replica is **CLOSED**, but DN1's next container report says **QUASI_CLOSED**, so SCM sees the state go backwards (**CLOSED → QUASI_CLOSED**). Depending on the FCR sent to SCM, SCM may:

   - Re-send a force close command. `controller.closeContainer(id)` now re-fetches from ContainerSet, gets the NEW container, and closes it correctly (→ CLOSED), so the state eventually converges.
   - Or treat it as an unhealthy/inconsistent replica.

   No data loss: the data is intact on Disk2. But there is a window of state inconsistency in which SCM's cached state (CLOSED) differs from reality (QUASI_CLOSED on the new disk).
   
   I think we need to re-fetch the container here as well, so that the close is applied to the replica that ContainerSet currently maps to.
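
One possible shape for that re-fetch, as a rough sketch only: after acquiring the write lock, look the container up again and retry on the new object if the mapping changed in the meantime. Everything here (`Replica`, `closeWithRefetch`, the `ConcurrentHashMap` standing in for ContainerSet) is a hypothetical simplification, not the Ozone API, and the actual change in the PR may look different.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RefetchOnCloseSketch {
    public enum State { QUASI_CLOSED, CLOSED, DELETED }

    // Hypothetical stand-in for a container replica with its own read/write lock.
    static final class Replica {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        final String disk;
        volatile State state;

        Replica(String disk, State state) {
            this.disk = disk;
            this.state = state;
        }
    }

    /** Closes the replica the map points to at the moment the write lock is held. */
    static Replica closeWithRefetch(Map<Long, Replica> set, long id, CountDownLatch fetched) {
        Replica candidate = set.get(id);   // T3a: may be the OLD replica
        fetched.countDown();               // test hook: tell the balancer we hold a reference
        while (true) {
            Replica locked = candidate;
            locked.lock.writeLock().lock();
            try {
                Replica current = set.get(id);   // re-fetch under the lock
                if (current == locked) {
                    locked.state = State.CLOSED;
                    return locked;
                }
                candidate = current;             // mapping changed: retry on the new replica
            } finally {
                locked.lock.writeLock().unlock();
            }
        }
    }

    /** Returns { state of the replica that was closed, live state in the map, state of OLD }. */
    public static State[] run() {
        Map<Long, Replica> set = new ConcurrentHashMap<>();
        long id = 1L;
        Replica old = new Replica("Disk1", State.QUASI_CLOSED);
        set.put(id, old);
        CountDownLatch fetched = new CountDownLatch(1);
        Replica[] closed = new Replica[1];
        try {
            old.lock.readLock().lock();                        // T2: balancer read-locks OLD
            Thread closer = new Thread(
                () -> closed[0] = closeWithRefetch(set, id, fetched));
            closer.start();
            fetched.await();                                   // closer has fetched OLD
            Replica moved = new Replica("Disk2", old.state);   // T5: import on Disk2
            set.put(id, moved);                                // T6: map now points at NEW
            old.lock.readLock().unlock();                      // T7: closer can now proceed
            closer.join();
            old.lock.writeLock().lock();                       // T8: mark OLD for delete
            try {
                old.state = State.DELETED;
            } finally {
                old.lock.writeLock().unlock();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new State[] { closed[0].state, set.get(id).state, old.state };
    }

    public static void main(String[] args) {
        State[] r = run();
        System.out.println("closed: " + r[0] + ", live: " + r[1] + ", old: " + r[2]);
    }
}
```

In this sketch the closer cannot get the write lock on OLD until the balancer has swapped the mapping and released its read lock, so the re-fetch always sees the Disk2 replica and the close lands there. The state reported to SCM would then match the live replica.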
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

