devmadhuu commented on code in PR #10074:
URL: https://github.com/apache/ozone/pull/10074#discussion_r3258668514
##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerManager.java:
##########
@@ -179,33 +185,175 @@ public void checkAndAddNewContainerBatch(
}
/**
- * Check if container state is not open. In SCM, container state
- * changes to CLOSING first, and then the close command is pushed down
- * to Datanodes. Recon 'learns' this from DN, and hence replica state
- * will move container state to 'CLOSING'.
+ * Returns the total number of containers across all lifecycle states.
*
- * @param containerID containerID to check
- * @param state state to be compared
+ * <p>This mirrors SCM's {@code ContainerManager#getTotalContainerCount()}
+ * implementation: it sums per-state counts, where each lookup is O(1),
+ * instead of loading container records.
+ *
+ * @return total container count
*/
+ @Override
+ public long getTotalContainerCount() {
+ long total = 0;
+ for (HddsProtos.LifeCycleState state : HddsProtos.LifeCycleState.values())
{
+ total += getContainerStateCount(state);
+ }
+ return total;
+ }
- private void checkContainerStateAndUpdate(ContainerID containerID,
- ContainerReplicaProto.State state)
- throws IOException, InvalidStateTransitionException {
- ContainerInfo containerInfo = getContainer(containerID);
- if (containerInfo.getState().equals(HddsProtos.LifeCycleState.OPEN)
- && !state.equals(ContainerReplicaProto.State.OPEN)
- && isHealthy(state)) {
- LOG.info("Container {} has state OPEN, but given state is {}.",
- containerID, state);
- final PipelineID pipelineID = containerInfo.getPipelineID();
- // subtract open container count from the map
+ /**
+ * Transitions a container from OPEN to CLOSING, keeping the per-pipeline
+ * open-container count in {@link #pipelineToOpenContainer} accurate.
+ *
+ * <p>Must be called whenever an OPEN container is moved to CLOSING so that
+ * the pipeline's open-container count stays consistent. Both the DN-report
+ * driven path ({@link #checkContainerStateAndUpdate}) and the periodic sync
+ * passes ({@code processSyncedClosedContainer}, {@code
syncQuasiClosedContainers})
+ * use this method to avoid divergence in the count exposed to the Recon
Node API.
+ *
+ * <p>If the container was recorded without a pipeline (null pipeline at
+ * {@code addNewContainer} time) the count decrement is safely skipped.
+ *
+ * @param containerID container to advance from OPEN to CLOSING
+ * @param containerInfo already-fetched {@code ContainerInfo} for the
container
+ * (avoids a redundant lookup inside this method)
+ * @throws IOException if the state update fails
+ * @throws InvalidStateTransitionException if the container is not in OPEN
state
+ */
+ void transitionOpenToClosing(ContainerID containerID, ContainerInfo
containerInfo)
+ throws IOException, InvalidStateTransitionException {
+ PipelineID pipelineID = containerInfo.getPipelineID();
+ if (pipelineID != null) {
int curCnt = pipelineToOpenContainer.getOrDefault(pipelineID, 0);
if (curCnt == 1) {
pipelineToOpenContainer.remove(pipelineID);
} else if (curCnt > 0) {
pipelineToOpenContainer.put(pipelineID, curCnt - 1);
}
- updateContainerState(containerID, FINALIZE);
+ }
+ updateContainerState(containerID, FINALIZE); // OPEN → CLOSING
+ }
+
+ /**
+ * Check if container state needs to advance based on a DN replica report and
+ * SCM's authoritative lifecycle state.
+ *
+ * <p>Two scenarios handled:
+ * <ol>
+ * <li>OPEN in Recon + non-OPEN healthy replica → FINALIZE (OPEN→CLOSING),
+ * then query SCM to advance further if possible.</li>
+ * <li>CLOSING in Recon + any report → query SCM to advance to
+ * QUASI_CLOSED or CLOSED if SCM has already moved there.</li>
+ * <li>DELETED in Recon + live replica report → rehydrate the container
from
+ * SCM if SCM still records it in a live state such as QUASI_CLOSED or
+ * CLOSED.</li>
+ * </ol>
+ *
+ * <p>Querying SCM for the authoritative state prevents containers from
getting
+ * permanently stuck at CLOSING when the DN report that would normally
+ * trigger the next transition was missed (e.g., Recon downtime).
+ *
+ * @param containerID containerID to check
+ * @param replicaState replica state reported by DataNode
+ */
+ private void checkContainerStateAndUpdate(ContainerID containerID,
+ ContainerReplicaProto.State
replicaState)
+ throws IOException, InvalidStateTransitionException {
+ ContainerInfo containerInfo = getContainer(containerID);
Review Comment:
Okay, I would retain only the small `transitionOpenToClosing()` helper
extraction here, because the targeted sync path also needs it to preserve the
per-pipeline open-container count when advancing an `OPEN` container.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]