Re: [PR] HDDS-14758. Mismatch in Recon container count between Cluster State and Container Summary [ozone]

via GitHub Mon, 04 May 2026 03:37:57 -0700


devmadhuu commented on code in PR #10074:
URL: https://github.com/apache/ozone/pull/10074#discussion_r3180719831



##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconStorageContainerManagerFacade.java:
##########
@@ -432,34 +437,76 @@ public void start() {
     } else {
       initializePipelinesFromScm();
     }
-    LOG.debug("Started the SCM Container Info sync scheduler.");
-    long interval = ozoneConfiguration.getTimeDuration(
-        OZONE_RECON_SCM_SNAPSHOT_TASK_INTERVAL_DELAY,
-        OZONE_RECON_SCM_SNAPSHOT_TASK_INTERVAL_DEFAULT, TimeUnit.MILLISECONDS);
-    long initialDelay = ozoneConfiguration.getTimeDuration(
-        OZONE_RECON_SCM_SNAPSHOT_TASK_INITIAL_DELAY,
-        OZONE_RECON_SCM_SNAPSHOT_TASK_INITIAL_DELAY_DEFAULT,
+    // -----------------------------------------------------------------------
+    // Scheduler (incremental/targeted sync): runs every 1h (default).
+    //
+    // Each cycle calls decideSyncAction() — two lightweight count RPCs to SCM
+    // — and then:
+    //
+    //   |total drift| > threshold (default 100,000)
+    //       → full snapshot: replace Recon's entire SCM DB from SCM checkpoint
+    //
+    //   0 < |total drift| <= threshold
+    //       → targeted sync: 4-pass incremental repair
+    //
+    //   total drift = 0 but per-state drift (OPEN or QUASI_CLOSED) > 
threshold (default 5)
+    //       → targeted sync: corrects containers stuck in a stale lifecycle 
state
+    //
+    //   no drift detected
+    //       → no action this cycle
+    //
+    // Running this on a 1h cadence means container state discrepancies are
+    // detected and corrected without an unconditional periodic full snapshot.
+    // -----------------------------------------------------------------------
+    long syncInterval = ozoneConfiguration.getTimeDuration(
+        OZONE_RECON_SCM_CONTAINER_SYNC_TASK_INTERVAL_DELAY,
+        OZONE_RECON_SCM_CONTAINER_SYNC_TASK_INTERVAL_DEFAULT, 
TimeUnit.MILLISECONDS);
+    long syncInitialDelay = ozoneConfiguration.getTimeDuration(
+        OZONE_RECON_SCM_CONTAINER_SYNC_TASK_INITIAL_DELAY,
+        OZONE_RECON_SCM_CONTAINER_SYNC_TASK_INITIAL_DELAY_DEFAULT,
         TimeUnit.MILLISECONDS);
-    // This periodic sync with SCM container cache is needed because during
-    // the window when recon will be down and any container being added
-    // newly and went missing, that container will not be reported as missing 
by
-    // recon till there is a difference of container count equivalent to
-    // threshold value defined in "ozone.recon.scm.container.threshold"
-    // between SCM container cache and recon container cache.
+    LOG.debug("Started the SCM Container Info sync scheduler (interval={}ms, 
initialDelay={}ms).",
+        syncInterval, syncInitialDelay);
     scheduler.scheduleWithFixedDelay(() -> {
+      if (!isSyncDataFromSCMRunning.compareAndSet(false, true)) {
+        LOG.debug("SCM container info sync is already running; skipping this 
cycle.");
+        return;
+      }
       try {
-        boolean isSuccess = syncWithSCMContainerInfo();
-        if (!isSuccess) {
-          LOG.debug("SCM container info sync is already running.");
+        ReconStorageContainerSyncHelper.SyncAction action =
+            containerSyncHelper.decideSyncAction();
+        switch (action) {
+        case FULL_SNAPSHOT:
+          LOG.info("Tiered sync decision: FULL_SNAPSHOT. "
+              + "Replacing Recon SCM DB with fresh SCM checkpoint.");

Review Comment:
   @rakeshadr One refinement I can think of is to make the automatic 
`FULL_SNAPSHOT` escalation configurable/opt-in. When the non-OPEN drift exceeds 
the configured threshold, the default behavior can be to log a warning and 
expose it through metrics instead of downloading the SCM checkpoint 
automatically. This PR already adds Recon SCM container sync metrics such as 
full snapshot trigger count, last non-OPEN drift, interval since last full 
snapshot event, per-state drift, targeted sync status, and targeted sync 
duration, so operators can  alert on the condition.
   
     For the explicit `trigger/status/cancel `flow, instead of adding CLI 
support directly, I think it fits better as Recon admin REST APIs because Recon 
already has
     TriggerDBSyncEndpoint for OM sync and targeted SCM sync:
   
   ```
     - `POST /api/v1/triggerdbsync/scm/snapshot` to trigger full SCM DB 
snapshot download
     - `GET /api/v1/triggerdbsync/scm/snapshot/status` to return 
status/phase/start time/duration
     - `POST /api/v1/triggerdbsync/scm/snapshot/cancel` to cancel while the 
checkpoint download is still in progress
   ```
   
     A CLI command can later be a thin wrapper over these REST APIs if needed.
   
    If you think, it makes sense, then I can update this PR to avoid automatic 
full snapshot by default and keep the large-drift condition visible through 
logs + metrics. The explicit trigger/status/cancel REST API  may be better 
handled as a follow-up JIRA unless you feel it should be part of this change.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] HDDS-14758. Mismatch in Recon container count between Cluster State and Container Summary [ozone]

Reply via email to