guihecheng commented on code in PR #3292:
URL: https://github.com/apache/ozone/pull/3292#discussion_r850322484
##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/utils/HddsVolumeUtil.java:
##########
@@ -234,4 +257,154 @@ public static boolean checkVolume(HddsVolume hddsVolume,
String scmId,
return true;
}
}
+
+  /**
+   * Randomly pick a DbVolume for the given HddsVolume and init its db instance.
+   * Uses the HddsVolume itself if no DbVolume is found.
+   * @param hddsVolume the data volume to create a db instance for
+   * @param dbVolumeSet the set of dedicated db volumes, may be null or empty
+   * @param clusterID the cluster ID used as the parent directory name
+   * @param conf the configuration source
+   */
+ public static void formatDbStoreForHddsVolume(HddsVolume hddsVolume,
+ MutableVolumeSet dbVolumeSet, String clusterID,
+ ConfigurationSource conf) throws IOException {
+ DbVolume chosenDbVolume = null;
+ File dbParentDir;
+
+ if (dbVolumeSet == null || dbVolumeSet.getVolumesList().isEmpty()) {
+ // No extra db volumes specified, just create db under the HddsVolume.
+ dbParentDir = new File(hddsVolume.getStorageDir(), clusterID);
+ } else {
+ // Randomly choose a DbVolume for simplicity.
+ List<DbVolume> dbVolumeList = StorageVolumeUtil.getDbVolumesList(
+ dbVolumeSet.getVolumesList());
+ chosenDbVolume = dbVolumeList.get(
+ ThreadLocalRandom.current().nextInt(dbVolumeList.size()));
+ dbParentDir = new File(chosenDbVolume.getStorageDir(), clusterID);
+ }
+
+ // Init subdir with the storageID of HddsVolume.
+ File storageIdDir = new File(dbParentDir, hddsVolume.getStorageID());
+ if (!storageIdDir.mkdirs() && !storageIdDir.exists()) {
+ throw new IOException("Can't make subdir under "
+ + dbParentDir.getAbsolutePath() + " for volume "
+ + hddsVolume.getStorageID());
+ }
+
+ // Init the db instance for HddsVolume under the subdir above.
+ String containerDBPath = new File(storageIdDir, CONTAINER_DB_NAME)
+ .getAbsolutePath();
+ try {
+ initPerDiskDBStore(containerDBPath, conf);
+    } catch (IOException e) {
+      // Preserve the original exception as the cause so the stack trace
+      // of the underlying failure is not lost.
+      throw new IOException("Can't init db instance under path "
+          + containerDBPath + " for volume " + hddsVolume.getStorageID(), e);
+    }
+
+ // Set the dbVolume and dbParentDir of the HddsVolume for db path lookup.
+ hddsVolume.setDbVolume(chosenDbVolume);
+ hddsVolume.setDbParentDir(storageIdDir);
+ }
+
+  /**
+   * Load already formatted db instances for all HddsVolumes.
+   * @param hddsVolumeSet the set of data volumes to load db instances for
+   * @param dbVolumeSet the set of dedicated db volumes, may be null or empty
+   * @param conf the configuration source
+   * @param logger the logger used to report load failures, may be null
+   */
+ public static void loadAllHddsVolumeDbStore(MutableVolumeSet hddsVolumeSet,
+ MutableVolumeSet dbVolumeSet, ConfigurationSource conf, Logger logger) {
+ // Scan subdirs under the db volumes and build a one-to-one map
+ // between each HddsVolume -> DbVolume.
+ mapDbVolumesToDataVolumesIfNeeded(hddsVolumeSet, dbVolumeSet);
+
+ List<HddsVolume> hddsVolumeList = StorageVolumeUtil.getHddsVolumesList(
+ hddsVolumeSet.getVolumesList());
+
+ for (HddsVolume volume : hddsVolumeList) {
+ String clusterID = volume.getClusterID();
+
+ // DN startup for the first time, not registered yet,
+ // so the DbVolume is not formatted.
+ if (clusterID == null) {
+ continue;
+ }
+
+ File clusterIdDir = new File(volume.getDbVolume() == null ?
+ volume.getStorageDir() : volume.getDbVolume().getStorageDir(),
+ clusterID);
+
+ if (!clusterIdDir.exists()) {
+ if (logger != null) {
+ logger.error("HddsVolume {} db instance not formatted, " +
+ "clusterID {} directory not found",
+ volume.getStorageDir().getAbsolutePath(),
+ clusterIdDir.getAbsolutePath());
+ }
+ continue;
Review Comment:
For 2, the failed-volume check doesn't belong here: it is an async check, and
DbVolumes are handled the same way as other volumes; please see
`StorageVolumeChecker.java`. I'd still like a clear choice between (A) "throw
and bail out" and (B) "continue".
For 3.1, let me explain `hddsFiles.length == 2`: one file is the VERSION file;
the other is either a clusterIDDir or an scmIDDir. SCM HA checks whether the
dir is (A) a clusterIDDir or (B) an scmIDDir. If A, it does nothing, because
SCM HA is already finalized; if B, it upgrades the volume for SCM HA by
linking. But whether A or B, we have an IDDir, which implies this HddsVolume
was formatted on the first DN register, and we won't format the db again for
it, neither on DN register nor on DN restart. Please check the code inside
`VersionedDatanodeFeatures.ScmHA.upgradeVolumeIfNeeded` for the details above.
So my point is that I don't actually depend on SCM HA to decide whether we
have to format the db instance or not. Is that clear?
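To illustrate, here is a minimal sketch of how `hddsFiles.length` maps to the
volume states described above. The class and helper names (`HddsFilesSketch`,
`describe`) are hypothetical, for illustration only; they are not part of the
actual Ozone code.

```java
import java.util.Arrays;

public class HddsFilesSketch {
  /**
   * Interprets the top-level listing of an HddsVolume root directory,
   * mirroring the length checks discussed in this review.
   */
  static String describe(String[] hddsFiles) {
    int n = (hddsFiles == null) ? 0 : hddsFiles.length;
    if (n == 0) {
      // Nothing on the volume yet: not formatted at all.
      return "empty volume: not yet formatted";
    } else if (n == 1) {
      // Only the VERSION file: first DN register, so format the db
      // and create the clusterIDDir/scmIDDir now.
      return "VERSION only: format db on first DN register";
    } else if (n == 2) {
      // VERSION plus clusterIDDir or scmIDDir: already formatted on the
      // first register; SCM HA may still upgrade scmIDDir by linking,
      // but the db is never formatted again.
      return "VERSION + IDDir: already formatted, skip db format";
    }
    return "unexpected layout: " + Arrays.toString(hddsFiles);
  }

  public static void main(String[] args) {
    System.out.println(describe(new String[] {"VERSION"}));
    System.out.println(describe(new String[] {"VERSION", "CID-1234"}));
  }
}
```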
For 3.2, after 3.1 you should know the meaning of `hddsFiles.length`. Let me
say more about when we get the chance to create a db instance:
- Along with HddsVolume format on the first DN register: this happens only
when `hddsFiles.length == 1`, which means the HddsVolume has only a VERSION
file, so we create the clusterIDDir/scmIDDir only on the first DN register.
- Along with load on DN restart: I think this is where you actually suspect
we may create a duplicate db instance. But there is a check,
`!storageIdDir.exists()`, at line 350 of `HddsVolumeUtil.java` that tests
whether the subdirectory named with the `StorageID` exists; we only load the
db instance when it does. For now, I think I should add a stricter check for
the `container.db` directory specifically, so that we prevent duplicate db
instance creation on all paths. Does that make sense?
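A minimal sketch of the stricter load-time check proposed above: only treat
the per-StorageID subdirectory as holding a formatted db instance when the
`container.db` directory itself exists. The class and method names
(`DbLoadCheckSketch`, `shouldLoadDbStore`) are hypothetical illustrations,
not the actual Ozone implementation; only `CONTAINER_DB_NAME` mirrors the
constant used in the diff.

```java
import java.io.File;

public class DbLoadCheckSketch {
  static final String CONTAINER_DB_NAME = "container.db";

  /**
   * True only when an already-formatted db instance can be loaded.
   * The current code checks only that storageIdDir exists; the stricter
   * version additionally requires the container.db directory inside it.
   */
  static boolean shouldLoadDbStore(File storageIdDir) {
    if (!storageIdDir.exists()) {
      return false; // existing check: StorageID subdir is missing
    }
    // Stricter check: require the container.db directory itself,
    // so a half-created subdir is never mistaken for a formatted db.
    File containerDb = new File(storageIdDir, CONTAINER_DB_NAME);
    return containerDb.exists() && containerDb.isDirectory();
  }

  public static void main(String[] args) {
    File tmp = new File(System.getProperty("java.io.tmpdir"),
        "hdds-sketch-" + System.nanoTime());
    System.out.println(shouldLoadDbStore(tmp)); // nothing exists yet
    new File(tmp, CONTAINER_DB_NAME).mkdirs();
    System.out.println(shouldLoadDbStore(tmp)); // container.db now present
  }
}
```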
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]