[jira] [Commented] (HDFS-17496) DataNode supports more fine-grained dataset lock based on blockid

ASF GitHub Bot (Jira) Thu, 16 Jan 2025 01:51:05 -0800


    [ https://issues.apache.org/jira/browse/HDFS-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913639#comment-17913639 ]


ASF GitHub Bot commented on HDFS-17496:
---------------------------------------

kokon191 commented on code in PR #7280:
URL: https://github.com/apache/hadoop/pull/7280#discussion_r1918141263


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DatanodeUtil.java:
##########
@@ -112,6 +116,21 @@ public static boolean dirNoFilesRecursive(
     return true;
   }
 
+  /**
+   * Take an example.
+   * We hava a block with blockid mapping to:
+   * "/data1/hadoop/hdfs/datanode/current/BP-xxxx/current/finalized/subdir0/subdir0"
+   * We return "subdir0/subdir0".

Review Comment:
   Consider using `subdir0/subdir1` as the example; it is more generic than `subdir0/subdir0`.
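
   To illustrate the suggestion, here is a minimal sketch of the block-id-to-suffix mapping, mirroring the arithmetic that this PR removes from `idToBlockDir`. The class name is hypothetical, and the literal `"subdir"` prefix and `"/"` separator are assumptions standing in for `DataStorage.BLOCK_SUBDIR_PREFIX` and `SEP`; the PR's actual `idToBlockDirSuffix` body is not shown in this hunk.

```java
// Hypothetical stand-alone sketch; not the PR's actual code.
class BlockDirSuffixExample {
  // Assumed values for DataStorage.BLOCK_SUBDIR_PREFIX and SEP.
  static final String PREFIX = "subdir";
  static final String SEP = "/";

  // Derives the two-level directory suffix from a block id.
  static String idToBlockDirSuffix(long blockId) {
    int d1 = (int) ((blockId >> 16) & 0x1F); // first level, 0..31
    int d2 = (int) ((blockId >> 8) & 0x1F);  // second level, 0..31
    return PREFIX + d1 + SEP + PREFIX + d2;
  }

  public static void main(String[] args) {
    System.out.println(idToBlockDirSuffix(0L));   // subdir0/subdir0
    System.out.println(idToBlockDirSuffix(256L)); // subdir0/subdir1
  }
}
```

   With `subdir0/subdir0` a reader cannot tell which component comes from which bit range; a block id such as 256 yields `subdir0/subdir1`, which makes the order of the two levels visible.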



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DatanodeUtil.java:
##########
@@ -120,13 +139,21 @@ public static boolean dirNoFilesRecursive(
    * @return
    */
   public static File idToBlockDir(File root, long blockId) {
-    int d1 = (int) ((blockId >> 16) & 0x1F);
-    int d2 = (int) ((blockId >> 8) & 0x1F);
-    String path = DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP +
-        DataStorage.BLOCK_SUBDIR_PREFIX + d2;
+    String path = idToBlockDirSuffix(blockId);
     return new File(root, path);
   }
 
+  public static List<String> getAllSubDirNameForDataSetLock() {

Review Comment:
   `Name` -> `Names`



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataSetSubLockStrategy.java:
##########
@@ -0,0 +1,36 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdfs.server.datanode;
+
+import java.util.List;
+
+/**
+ * This interface is used to generate sub lock name for a blockid.
+ */
+public interface DataSetSubLockStrategy {
+
+  /**
+   * Generate sub lock name for the given blockid.
+   * @param blockid the block id.
+   * @return sub lock name for the input blockid.
+   */
+  String blockIdToSubLock(long blockid);
+
+  List<String> getAllSubLockName();

Review Comment:
   `Name` -> `Names`
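
   As a rough illustration of why the plural reads better, the sketch below shows one possible directory-based implementation of the interface. Everything here is an assumption made for illustration: the interface is redeclared locally with the renamed method, `DirBasedSubLockStrategy` is a hypothetical class name, and the `"subdir"` prefix and `"/"` separator stand in for the real constants. The PR's own implementation is not part of this hunk.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for the interface in the diff, using the plural method name.
interface DataSetSubLockStrategy {
  String blockIdToSubLock(long blockid);
  List<String> getAllSubLockNames();
}

// Hypothetical implementation; not the PR's actual class.
class DirBasedSubLockStrategy implements DataSetSubLockStrategy {
  private static final String PREFIX = "subdir"; // assumed prefix
  private static final String SEP = "/";         // assumed separator
  private static final int DIRS_PER_LEVEL = 32;  // the 0x1F mask yields 0..31

  @Override
  public String blockIdToSubLock(long blockid) {
    int d1 = (int) ((blockid >> 16) & 0x1F);
    int d2 = (int) ((blockid >> 8) & 0x1F);
    return PREFIX + d1 + SEP + PREFIX + d2;
  }

  @Override
  public List<String> getAllSubLockNames() {
    // Enumerates every subdirX/subdirY combination: 32 * 32 names.
    List<String> names = new ArrayList<>(DIRS_PER_LEVEL * DIRS_PER_LEVEL);
    for (int d1 = 0; d1 < DIRS_PER_LEVEL; d1++) {
      for (int d2 = 0; d2 < DIRS_PER_LEVEL; d2++) {
        names.add(PREFIX + d1 + SEP + PREFIX + d2);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    DataSetSubLockStrategy strategy = new DirBasedSubLockStrategy();
    System.out.println(strategy.getAllSubLockNames().size()); // 1024
    System.out.println(strategy.blockIdToSubLock(0x10200L));  // subdir1/subdir2
  }
}
```

   The method returns a whole list of names, so `getAllSubLockNames` describes the result more accurately than the singular form.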



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/DataNodeLockManager.java:
##########
@@ -29,7 +29,8 @@ public interface DataNodeLockManager<T extends AutoCloseDataSetLock> {
    */

Review Comment:
   Update this javadoc to describe all 3 lock levels. Other code comments and 
javadocs probably still refer to 2 levels of locks, so it is worth combing 
through the code to find them.





> DataNode supports more fine-grained dataset lock based on blockid
> -----------------------------------------------------------------
>
>                 Key: HDFS-17496
>                 URL: https://issues.apache.org/jira/browse/HDFS-17496
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: farmmamba
>            Assignee: farmmamba
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0
>
>         Attachments: image-2024-04-23-16-17-07-057.png
>
>
> Recently, we used NVMe SSDs as volumes in DataNodes and performed some stress 
> tests.
> We found that NVMe SSDs and HDDs achieve similar performance when creating 
> lots of small files, such as 10 KB files.
> This is counterintuitive. After analyzing the monitoring metrics, 
> we found that the fsdataset lock becomes the bottleneck in high-concurrency 
> scenarios.
>  
> Currently, there are two lock levels: BLOCK_POOL and VOLUME. We can 
> further split the volume lock into DIR locks.
> A DIR lock is defined as follows: given a blockid, we can determine which subdir 
> of the finalized dir the block will be placed in. We simply use 
> subdir[0-31]/subdir[0-31] as the
> name of the DIR lock.
> For more details, see the method DatanodeUtil#idToBlockDir:
> {code:java}
>   public static File idToBlockDir(File root, long blockId) {
>     int d1 = (int) ((blockId >> 16) & 0x1F);
>     int d2 = (int) ((blockId >> 8) & 0x1F);
>     String path = DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP +
>         DataStorage.BLOCK_SUBDIR_PREFIX + d2;
>     return new File(root, path);
>   } {code}
> The performance comparison is shown below.
> Experimental setup:
> 3 DataNodes, each with a single disk.
> 10 clients concurrently writing files and deleting them after writing.
> 550 threads per client.
> !image-2024-04-23-16-17-07-057.png!
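
The three-level idea in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Hadoop's lock manager: `ThreeLevelLockSketch`, its methods, the lock-name layout `bpid/volume/subdirX/subdirY`, and the choice to hold read locks on the two outer levels while write-locking the DIR level are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of BLOCK_POOL -> VOLUME -> DIR locking; not Hadoop code.
class ThreeLevelLockSketch {
  private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  private ReentrantReadWriteLock lockFor(String name) {
    return locks.computeIfAbsent(name, k -> new ReentrantReadWriteLock());
  }

  // Same arithmetic as DatanodeUtil#idToBlockDir; "subdir" and "/" are assumed.
  static String dirName(long blockId) {
    int d1 = (int) ((blockId >> 16) & 0x1F);
    int d2 = (int) ((blockId >> 8) & 0x1F);
    return "subdir" + d1 + "/" + "subdir" + d2;
  }

  // Returns the DIR-level lock that guards the given block.
  ReentrantReadWriteLock dirLock(String bpid, String volume, long blockId) {
    return lockFor(bpid + "/" + volume + "/" + dirName(blockId));
  }

  // Runs the action holding read locks on the pool and volume levels
  // and the write lock on the block's DIR level.
  void withDirWriteLock(String bpid, String volume, long blockId, Runnable action) {
    ReentrantReadWriteLock bp = lockFor(bpid);
    ReentrantReadWriteLock vol = lockFor(bpid + "/" + volume);
    ReentrantReadWriteLock dir = dirLock(bpid, volume, blockId);
    bp.readLock().lock();
    try {
      vol.readLock().lock();
      try {
        dir.writeLock().lock();
        try {
          action.run();
        } finally {
          dir.writeLock().unlock();
        }
      } finally {
        vol.readLock().unlock();
      }
    } finally {
      bp.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    ThreeLevelLockSketch sketch = new ThreeLevelLockSketch();
    sketch.withDirWriteLock("BP-xxxx", "vol1", 256L,
        () -> System.out.println("holding DIR lock for " + dirName(256L)));
  }
}
```

In this sketch, writers to blocks that map to different subdirectories of the same volume take different DIR locks and only share read locks on the outer levels, which is the reduction in contention the description is aiming for.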
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
