devmadhuu commented on code in PR #10074:
URL: https://github.com/apache/ozone/pull/10074#discussion_r3258668514


##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerManager.java:
##########
@@ -179,33 +185,175 @@ public void checkAndAddNewContainerBatch(
   }
 
   /**
-   *  Check if container state is not open. In SCM, container state
-   *  changes to CLOSING first, and then the close command is pushed down
-   *  to Datanodes. Recon 'learns' this from DN, and hence replica state
-   *  will move container state to 'CLOSING'.
+   * Returns the total number of containers across all lifecycle states.
    *
-   * @param containerID containerID to check
-   * @param state  state to be compared
+   * <p>This mirrors SCM's {@code ContainerManager#getTotalContainerCount()}
+   * implementation: it sums per-state counts, where each lookup is O(1),
+   * instead of loading container records.
+   *
+   * @return total container count
    */
+  @Override
+  public long getTotalContainerCount() {
+    long total = 0;
+    for (HddsProtos.LifeCycleState state : HddsProtos.LifeCycleState.values()) 
{
+      total += getContainerStateCount(state);
+    }
+    return total;
+  }
 
-  private void checkContainerStateAndUpdate(ContainerID containerID,
-                                            ContainerReplicaProto.State state)
-          throws IOException, InvalidStateTransitionException {
-    ContainerInfo containerInfo = getContainer(containerID);
-    if (containerInfo.getState().equals(HddsProtos.LifeCycleState.OPEN)
-        && !state.equals(ContainerReplicaProto.State.OPEN)
-        && isHealthy(state)) {
-      LOG.info("Container {} has state OPEN, but given state is {}.",
-          containerID, state);
-      final PipelineID pipelineID = containerInfo.getPipelineID();
-      // subtract open container count from the map
+  /**
+   * Transitions a container from OPEN to CLOSING, keeping the per-pipeline
+   * open-container count in {@link #pipelineToOpenContainer} accurate.
+   *
+   * <p>Must be called whenever an OPEN container is moved to CLOSING so that
+   * the pipeline's open-container count stays consistent.  Both the DN-report
+   * driven path ({@link #checkContainerStateAndUpdate}) and the periodic sync
+   * passes ({@code processSyncedClosedContainer}, {@code 
syncQuasiClosedContainers})
+   * use this method to avoid divergence in the count exposed to the Recon 
Node API.
+   *
+   * <p>If the container was recorded without a pipeline (null pipeline at
+   * {@code addNewContainer} time) the count decrement is safely skipped.
+   *
+   * @param containerID   container to advance from OPEN to CLOSING
+   * @param containerInfo already-fetched {@code ContainerInfo} for the 
container
+   *                      (avoids a redundant lookup inside this method)
+   * @throws IOException                     if the state update fails
+   * @throws InvalidStateTransitionException if the container is not in OPEN 
state
+   */
+  void transitionOpenToClosing(ContainerID containerID, ContainerInfo 
containerInfo)
+      throws IOException, InvalidStateTransitionException {
+    PipelineID pipelineID = containerInfo.getPipelineID();
+    if (pipelineID != null) {
       int curCnt = pipelineToOpenContainer.getOrDefault(pipelineID, 0);
       if (curCnt == 1) {
         pipelineToOpenContainer.remove(pipelineID);
       } else if (curCnt > 0) {
         pipelineToOpenContainer.put(pipelineID, curCnt - 1);
       }
-      updateContainerState(containerID, FINALIZE);
+    }
+    updateContainerState(containerID, FINALIZE);  // OPEN → CLOSING
+  }
+
+  /**
+   * Check if container state needs to advance based on a DN replica report and
+   * SCM's authoritative lifecycle state.
+   *
+   * <p>Two scenarios handled:
+   * <ol>
+   *   <li>OPEN in Recon + non-OPEN healthy replica → FINALIZE (OPEN→CLOSING),
+   *       then query SCM to advance further if possible.</li>
+   *   <li>CLOSING in Recon + any report → query SCM to advance to
+   *       QUASI_CLOSED or CLOSED if SCM has already moved there.</li>
+   *   <li>DELETED in Recon + live replica report → rehydrate the container 
from
+   *       SCM if SCM still records it in a live state such as QUASI_CLOSED or
+   *       CLOSED.</li>
+   * </ol>
+   *
+   * <p>Querying SCM for the authoritative state prevents containers from 
getting
+   * permanently stuck at CLOSING when the DN report that would normally
+   * trigger the next transition was missed (e.g., Recon downtime).
+   *
+   * @param containerID containerID to check
+   * @param replicaState replica state reported by DataNode
+   */
+  private void checkContainerStateAndUpdate(ContainerID containerID,
+                                            ContainerReplicaProto.State 
replicaState)
+      throws IOException, InvalidStateTransitionException {
+    ContainerInfo containerInfo = getContainer(containerID);

Review Comment:
   Okay, I would retain only the small `transitionOpenToClosing()` helper 
extraction here, because the targeted sync path also needs it to preserve the 
per-pipeline open-container count  when advancing an `OPEN` container.  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to