devmadhuu commented on PR #10074:
URL: https://github.com/apache/ozone/pull/10074#issuecomment-4253694369

   > @devmadhuu Thanks for working over this, few points to be considered,
   > 
   >     1. We support moving back container from deleted to closed / 
quasi-closed based on state from DN
   > 
   >     2. Open containers can be synced incrementally from the last synced 
container ID, as new containers are created in increasing ID order. But closed / 
quasi-closed needs a full sync. So the full sync can run at a gap of 3 hours or more.
   > 
   >     3. Syncing the closing state may not be required, as it is a temporary 
state lasting a few minutes; DN sync can cover this.
   > 
   >     4. For a stale DN or a volume failure, there can be a sudden spike in 
container mismatches, such as containers moving from the open state to closing. 
We need to consider a full DB sync for open container differences, though it may 
not be required. For the quasi-closed/closed states, container IDs are synced anyway.
   > 
   > 
   > Do we really need full db sync for quasi-closed/closed only OR for Open 
container state ?
   
   1. Open container sync works much like the earlier sync for `CLOSED` 
containers. The difference is that it first checks whether an OPEN container is 
missing in Recon, and only then applies the logic to add it, so it is 
incremental in that sense. It still fetches all container IDs in the OPEN state 
from SCM in batches (the batch size is determined dynamically from the RPC 
message size). That part was already optimized in another PR, which now fetches 
only the list of container IDs in batches instead of full `containerInfo` 
objects, which were heavy. The RPC call is now lightweight and fetches 4-5 
million container IDs in under ~20 seconds, using at most 8 RPC calls.
   2. CLOSED container sync does the same in batches, with extra checks in 
`syncClosedContainers` -> `processSyncedClosedContainer`.
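
To make the flow in the two points above concrete, here is a minimal sketch of paging through container IDs in batches and collecting only the ones Recon does not yet know about. It is not the code from this PR: the class `OpenContainerSyncSketch`, the methods `findMissingIds` and `inMemoryScm`, and the `(startId, batchSize)` shape of the fetch call are all hypothetical stand-ins, and the batch size is passed in as a fixed number rather than derived from the RPC message size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.BiFunction;

/** Hypothetical illustration only; not the Recon implementation. */
public class OpenContainerSyncSketch {

  /**
   * Pages through container IDs and returns those missing locally.
   *
   * @param fetchBatch   assumed stand-in for the SCM RPC: given (startId, batchSize),
   *                     returns up to batchSize IDs >= startId in ascending order
   * @param knownToRecon IDs that Recon already tracks
   * @param batchSize    fixed here; the real batch size is described as dynamic
   */
  static List<Long> findMissingIds(
      BiFunction<Long, Integer, List<Long>> fetchBatch,
      Set<Long> knownToRecon,
      int batchSize) {
    List<Long> missing = new ArrayList<>();
    long startId = 0L;
    while (true) {
      List<Long> batch = fetchBatch.apply(startId, batchSize);
      if (batch.isEmpty()) {
        break; // nothing left to fetch
      }
      for (long id : batch) {
        // Only IDs absent from Recon need further handling.
        if (!knownToRecon.contains(id)) {
          missing.add(id);
        }
      }
      if (batch.size() < batchSize) {
        break; // a short batch means this was the last page
      }
      // Resume right after the last ID seen in this batch.
      startId = batch.get(batch.size() - 1) + 1;
    }
    return missing;
  }

  /** In-memory stand-in for SCM, used only to exercise the sketch. */
  static BiFunction<Long, Integer, List<Long>> inMemoryScm(TreeSet<Long> ids) {
    return (start, count) -> {
      List<Long> out = new ArrayList<>();
      for (Long id : ids.tailSet(start, true)) {
        if (out.size() >= count) {
          break;
        }
        out.add(id);
      }
      return out;
    };
  }
}
```

Because only IDs cross the wire, each page stays small, and the per-ID work is a set lookup; the heavier per-container handling would run only for the IDs returned as missing.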
   
   Please have a look at this method to understand the flow: 
https://github.com/devmadhuu/ozone/blob/c900adde6560660def18a68b748b45a6e21daace/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconStorageContainerSyncHelper.java#L244
   
   The whole sync is scheduled to run every hour. The type of sync is decided 
at 
https://github.com/devmadhuu/ozone/blob/c900adde6560660def18a68b748b45a6e21daace/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconStorageContainerSyncHelper.java#L171
 , based on the non-OPEN and OPEN container drift counts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

