devmadhuu commented on code in PR #10547:
URL: https://github.com/apache/ozone/pull/10547#discussion_r3457667667


##########
hadoop-ozone/cli-debug/src/main/java/org/apache/hadoop/ozone/debug/datanode/container/analyze/AnalyzeSubcommand.java:
##########
@@ -34,65 +41,123 @@
  */
 @Command(
     name = "analyze",
-    description = "Analyze container consistency between on-disk container " +
-            "directories on this DataNode and SCM metadata. Must be run 
locally on a DataNode.")
+    description = {
+        "Analyze container consistency between on-disk container directories 
on this DataNode and SCM metadata.",
+        "Must be run locally on a DataNode.",
+        "",
+        "Each reported container occurrence includes a status:",
+        "  MISSING_METADATA: metadata/{containerId}.container does not exist.",
+        "  INVALID_METADATA: metadata file exists but cannot be parsed, or the 
container ID in the metadata",
+        "      does not match the directory name.",
+        "  VALID: metadata file is present and consistent with the directory."

Review Comment:
   Why we want to return VALID ones ? These may run into millions. In #10414 , 
are we returning VALID ones as well ? And as per #10414 PR, we are already 
detecting duplicate container dirs as well as above discrepancies mentioned in 
above command description.
   
   Also this command description is not complete. As per this PR we are 
identifying orphan containers only right ?



##########
hadoop-ozone/cli-debug/src/main/java/org/apache/hadoop/ozone/debug/datanode/container/analyze/AnalyzeSubcommand.java:
##########
@@ -34,65 +41,123 @@
  */
 @Command(
     name = "analyze",
-    description = "Analyze container consistency between on-disk container " +
-            "directories on this DataNode and SCM metadata. Must be run 
locally on a DataNode.")
+    description = {
+        "Analyze container consistency between on-disk container directories 
on this DataNode and SCM metadata.",
+        "Must be run locally on a DataNode.",
+        "",
+        "Each reported container occurrence includes a status:",
+        "  MISSING_METADATA: metadata/{containerId}.container does not exist.",
+        "  INVALID_METADATA: metadata file exists but cannot be parsed, or the 
container ID in the metadata",
+        "      does not match the directory name.",
+        "  VALID: metadata file is present and consistent with the directory."
+    })
 public class AnalyzeSubcommand extends AbstractSubcommand implements 
Callable<Void> {
-  @CommandLine.Option(names = {"--count"},
-          defaultValue = "20",
-          description = "Number of containers to display")
-  private int count;
+  @CommandLine.Mixin
+  private ListLimitOptions listOptions;
+
+  @CommandLine.Option(names = {"--scm-db"},
+      description = "Path to an offline scm.db directory, or its parent 
metadata directory.")
+  private File scmDb;
 
   @Override
   public Void call() throws Exception {
-    if (count < 1) {
-      throw new IOException("Count must be an integer greater than 0.");
-    }
+    listOptions.getLimit(); //This triggers ListLimitOptions validation

Review Comment:
   I think this validation is already being called at 
`printContainerOccurrenceReport` . May be wrap it in an explicit 
`validateOptions() `method with a comment that says why it must run before 
scanning.



##########
hadoop-ozone/cli-debug/src/main/java/org/apache/hadoop/ozone/debug/datanode/container/analyze/ScmContainerMetadataReader.java:
##########
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.debug.datanode.container.analyze;
+
+import static org.apache.hadoop.hdds.scm.metadata.SCMDBDefinition.CONTAINERS;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.Objects;
+import java.util.Optional;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.scm.container.ContainerID;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.metadata.SCMDBDefinition;
+import org.apache.hadoop.hdds.utils.db.CodecException;
+import org.apache.hadoop.hdds.utils.db.DBStore;
+import org.apache.hadoop.hdds.utils.db.DBStoreBuilder;
+import org.apache.hadoop.hdds.utils.db.RocksDatabaseException;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.hadoop.hdds.utils.db.cache.TableCache.CacheType;
+import org.apache.hadoop.ozone.OzoneConsts;
+
+/**
+ * Read-only lookup of container metadata from {@code scm.db}.
+ */
+public final class ScmContainerMetadataReader implements AutoCloseable {
+
+  private final DBStore dbStore;
+  private final Table<ContainerID, ContainerInfo> containerTable;
+
+  public ScmContainerMetadataReader(ConfigurationSource conf, File scmDbPath)
+      throws IOException {
+    File scmDbDir = resolveScmDbDirectory(scmDbPath);
+    try {
+      this.dbStore = DBStoreBuilder.newBuilder(conf, SCMDBDefinition.get(), 
scmDbDir.getName(),
+          scmDbDir.getParentFile().toPath())

Review Comment:
   This might throw `NPE` if user passes just `scm.db` without any parent. 
Convert to absolute path ?



##########
hadoop-ozone/cli-debug/src/main/java/org/apache/hadoop/ozone/debug/datanode/container/analyze/AnalyzeSubcommand.java:
##########
@@ -34,65 +41,123 @@
  */
 @Command(
     name = "analyze",
-    description = "Analyze container consistency between on-disk container " +
-            "directories on this DataNode and SCM metadata. Must be run 
locally on a DataNode.")
+    description = {
+        "Analyze container consistency between on-disk container directories 
on this DataNode and SCM metadata.",
+        "Must be run locally on a DataNode.",
+        "",
+        "Each reported container occurrence includes a status:",
+        "  MISSING_METADATA: metadata/{containerId}.container does not exist.",
+        "  INVALID_METADATA: metadata file exists but cannot be parsed, or the 
container ID in the metadata",
+        "      does not match the directory name.",
+        "  VALID: metadata file is present and consistent with the directory."
+    })
 public class AnalyzeSubcommand extends AbstractSubcommand implements 
Callable<Void> {
-  @CommandLine.Option(names = {"--count"},
-          defaultValue = "20",
-          description = "Number of containers to display")
-  private int count;
+  @CommandLine.Mixin
+  private ListLimitOptions listOptions;
+
+  @CommandLine.Option(names = {"--scm-db"},
+      description = "Path to an offline scm.db directory, or its parent 
metadata directory.")
+  private File scmDb;
 
   @Override
   public Void call() throws Exception {
-    if (count < 1) {
-      throw new IOException("Count must be an integer greater than 0.");
-    }
+    listOptions.getLimit(); //This triggers ListLimitOptions validation
     OzoneConfiguration conf = getOzoneConf();
     ContainerScanResult scanResult = ContainerDirectoryScanner.scan(conf);
     Map<Long, List<ContainerDiskOccurrence>> enrichedDuplicates =
         ContainerDirectoryScanner.enrichDuplicates(scanResult.getDuplicates());
 
-    // TODO: SCM metadata lookup from --scm-db when provided.
-    // TODO: For each id in scanResult.getSingles().keySet() classified 
NOT_IN_SCM or DELETED:
-    //   enrichOccurrence(id, scanResult.getSingles().get(id)) and report.
-    // TODO: For each id in enrichedDuplicates.keySet() classified NOT_IN_SCM 
or DELETED:
-    //   enrichedDuplicates.get(id) is already enriched — just report.
+    if (scmDb != null) {
+      findOrphanAndDeletedButPresentContainers(conf, scanResult, 
enrichedDuplicates);
+    } else {
+      out().println("To identify orphan containers (wrt SCM) and containers 
that are marked as DELETED in SCM but"
+          + " exist in the datanode's current directory, provide the SCM 
database path using the --scm-db option."
+      );
+    }
 
     printDuplicates(enrichedDuplicates);
     printVolumeScanErrors(scanResult.getVolumeScanErrors());
     return null;
   }
 
-  private void printDuplicates(Map<Long, List<ContainerDiskOccurrence>> 
duplicates) {
-    long totalDuplicateIds = duplicates.size();
-    out().printf("Number of containers with duplicate container directories on 
this DataNode: %d%n", totalDuplicateIds);
+  private void findOrphanAndDeletedButPresentContainers(OzoneConfiguration 
conf, ContainerScanResult scanResult,
+      Map<Long, List<ContainerDiskOccurrence>> enrichedDuplicates) throws 
IOException {
+    Map<Long, List<ContainerDiskOccurrence>> enrichedOrphanContainers = new 
HashMap<>();
+    Map<Long, List<ContainerDiskOccurrence>> enrichedDeletedButPresent = new 
HashMap<>();
 
-    if (totalDuplicateIds == 0) {
+    try (ScmContainerMetadataReader reader = new 
ScmContainerMetadataReader(conf, scmDb)) {
+      Set<Long> containerIds = new HashSet<>(scanResult.getSingles().keySet());
+      containerIds.addAll(enrichedDuplicates.keySet());
+
+      for (long containerId : containerIds) {
+        Optional<ScmContainerMetadataReader.ScmContainerClassification> 
classification = reader.classify(containerId);
+        if (!classification.isPresent()) {
+          continue;
+        }
+        List<ContainerDiskOccurrence> occurrences = 
enrichedDuplicates.get(containerId);
+        if (occurrences == null) {
+          String path = scanResult.getSingles().get(containerId);
+          occurrences = 
Collections.singletonList(ContainerDirectoryScanner.enrichOccurrence(containerId,
 path));
+        }
+        if (classification.get() == 
ScmContainerMetadataReader.ScmContainerClassification.NOT_IN_SCM) {
+          enrichedOrphanContainers.put(containerId, occurrences);
+        } else {
+          enrichedDeletedButPresent.put(containerId, occurrences);
+        }
+      }
+    }
+
+    printContainerOccurrenceReport("Number of orphan containers(wrt SCM) on 
this DataNode: %d%n",
+        enrichedOrphanContainers);
+    printContainerOccurrenceReport("Number of deleted but present containers 
on this DataNode: %d%n",

Review Comment:
   ```suggestion
       printContainerOccurrenceReport("Containers marked DELETED in SCM but 
present on disk on this DataNode: %d%n",
   ```



##########
hadoop-ozone/cli-debug/src/main/java/org/apache/hadoop/ozone/debug/datanode/container/analyze/ScmContainerMetadataReader.java:
##########
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.debug.datanode.container.analyze;
+
+import static org.apache.hadoop.hdds.scm.metadata.SCMDBDefinition.CONTAINERS;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.Objects;
+import java.util.Optional;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.scm.container.ContainerID;
+import org.apache.hadoop.hdds.scm.container.ContainerInfo;
+import org.apache.hadoop.hdds.scm.metadata.SCMDBDefinition;
+import org.apache.hadoop.hdds.utils.db.CodecException;
+import org.apache.hadoop.hdds.utils.db.DBStore;
+import org.apache.hadoop.hdds.utils.db.DBStoreBuilder;
+import org.apache.hadoop.hdds.utils.db.RocksDatabaseException;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.hadoop.hdds.utils.db.cache.TableCache.CacheType;
+import org.apache.hadoop.ozone.OzoneConsts;
+
+/**
+ * Read-only lookup of container metadata from {@code scm.db}.
+ */
+public final class ScmContainerMetadataReader implements AutoCloseable {
+
+  private final DBStore dbStore;
+  private final Table<ContainerID, ContainerInfo> containerTable;
+
+  public ScmContainerMetadataReader(ConfigurationSource conf, File scmDbPath)
+      throws IOException {
+    File scmDbDir = resolveScmDbDirectory(scmDbPath);
+    try {
+      this.dbStore = DBStoreBuilder.newBuilder(conf, SCMDBDefinition.get(), 
scmDbDir.getName(),
+          scmDbDir.getParentFile().toPath())
+          .setOpenReadOnly(true)

Review Comment:
   What if someone points to online live DB, it will fail, if another process 
holds the write lock.



##########
hadoop-ozone/cli-debug/src/main/java/org/apache/hadoop/ozone/debug/datanode/container/analyze/AnalyzeSubcommand.java:
##########
@@ -34,65 +41,123 @@
  */
 @Command(
     name = "analyze",
-    description = "Analyze container consistency between on-disk container " +
-            "directories on this DataNode and SCM metadata. Must be run 
locally on a DataNode.")
+    description = {
+        "Analyze container consistency between on-disk container directories 
on this DataNode and SCM metadata.",
+        "Must be run locally on a DataNode.",
+        "",
+        "Each reported container occurrence includes a status:",
+        "  MISSING_METADATA: metadata/{containerId}.container does not exist.",
+        "  INVALID_METADATA: metadata file exists but cannot be parsed, or the 
container ID in the metadata",
+        "      does not match the directory name.",
+        "  VALID: metadata file is present and consistent with the directory."
+    })
 public class AnalyzeSubcommand extends AbstractSubcommand implements 
Callable<Void> {
-  @CommandLine.Option(names = {"--count"},
-          defaultValue = "20",
-          description = "Number of containers to display")
-  private int count;
+  @CommandLine.Mixin
+  private ListLimitOptions listOptions;
+
+  @CommandLine.Option(names = {"--scm-db"},
+      description = "Path to an offline scm.db directory, or its parent 
metadata directory.")
+  private File scmDb;
 
   @Override
   public Void call() throws Exception {
-    if (count < 1) {
-      throw new IOException("Count must be an integer greater than 0.");
-    }
+    listOptions.getLimit(); //This triggers ListLimitOptions validation
     OzoneConfiguration conf = getOzoneConf();
     ContainerScanResult scanResult = ContainerDirectoryScanner.scan(conf);
     Map<Long, List<ContainerDiskOccurrence>> enrichedDuplicates =
         ContainerDirectoryScanner.enrichDuplicates(scanResult.getDuplicates());
 
-    // TODO: SCM metadata lookup from --scm-db when provided.
-    // TODO: For each id in scanResult.getSingles().keySet() classified 
NOT_IN_SCM or DELETED:
-    //   enrichOccurrence(id, scanResult.getSingles().get(id)) and report.
-    // TODO: For each id in enrichedDuplicates.keySet() classified NOT_IN_SCM 
or DELETED:
-    //   enrichedDuplicates.get(id) is already enriched — just report.
+    if (scmDb != null) {
+      findOrphanAndDeletedButPresentContainers(conf, scanResult, 
enrichedDuplicates);
+    } else {
+      out().println("To identify orphan containers (wrt SCM) and containers 
that are marked as DELETED in SCM but"
+          + " exist in the datanode's current directory, provide the SCM 
database path using the --scm-db option."
+      );
+    }
 
     printDuplicates(enrichedDuplicates);
     printVolumeScanErrors(scanResult.getVolumeScanErrors());
     return null;
   }
 
-  private void printDuplicates(Map<Long, List<ContainerDiskOccurrence>> 
duplicates) {
-    long totalDuplicateIds = duplicates.size();
-    out().printf("Number of containers with duplicate container directories on 
this DataNode: %d%n", totalDuplicateIds);
+  private void findOrphanAndDeletedButPresentContainers(OzoneConfiguration 
conf, ContainerScanResult scanResult,
+      Map<Long, List<ContainerDiskOccurrence>> enrichedDuplicates) throws 
IOException {
+    Map<Long, List<ContainerDiskOccurrence>> enrichedOrphanContainers = new 
HashMap<>();
+    Map<Long, List<ContainerDiskOccurrence>> enrichedDeletedButPresent = new 
HashMap<>();
 
-    if (totalDuplicateIds == 0) {
+    try (ScmContainerMetadataReader reader = new 
ScmContainerMetadataReader(conf, scmDb)) {
+      Set<Long> containerIds = new HashSet<>(scanResult.getSingles().keySet());
+      containerIds.addAll(enrichedDuplicates.keySet());
+
+      for (long containerId : containerIds) {
+        Optional<ScmContainerMetadataReader.ScmContainerClassification> 
classification = reader.classify(containerId);
+        if (!classification.isPresent()) {
+          continue;
+        }
+        List<ContainerDiskOccurrence> occurrences = 
enrichedDuplicates.get(containerId);
+        if (occurrences == null) {
+          String path = scanResult.getSingles().get(containerId);
+          occurrences = 
Collections.singletonList(ContainerDirectoryScanner.enrichOccurrence(containerId,
 path));
+        }
+        if (classification.get() == 
ScmContainerMetadataReader.ScmContainerClassification.NOT_IN_SCM) {
+          enrichedOrphanContainers.put(containerId, occurrences);
+        } else {
+          enrichedDeletedButPresent.put(containerId, occurrences);
+        }
+      }
+    }
+
+    printContainerOccurrenceReport("Number of orphan containers(wrt SCM) on 
this DataNode: %d%n",
+        enrichedOrphanContainers);
+    printContainerOccurrenceReport("Number of deleted but present containers 
on this DataNode: %d%n",
+        enrichedDeletedButPresent);
+  }
+
+  private void printContainerOccurrenceReport(String countFormat, 
+      Map<Long, List<ContainerDiskOccurrence>> containersById) {
+    long total = containersById.size();
+    out().printf(countFormat, total);
+    if (total == 0) {
       return;
     }
 
-    if (totalDuplicateIds > count) {
-      out().printf("Showing first %d:%n", count);
+    if (!listOptions.isAll()) {
+      int limit = listOptions.getLimit();
+      if (total > limit) {
+        out().printf("Showing first %d:%n", limit);
+      }
+
+      containersById.entrySet().stream()
+          .sorted(Map.Entry.comparingByKey())
+          .limit(limit)
+          .forEach(entry -> printContainerEntry(entry.getKey(), 
entry.getValue()));
+    } else {

Review Comment:
   This else block code and above stream logic seems duplicate. Pls check and 
refactor.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to