[ https://issues.apache.org/jira/browse/HDFS-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903871#comment-17903871 ]

ASF GitHub Bot commented on HDFS-17496:
---------------------------------------

hfutatzhanghb commented on code in PR #6764:
URL: https://github.com/apache/hadoop/pull/6764#discussion_r1874736387


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DatanodeUtil.java:
##########
@@ -127,6 +129,31 @@ public static File idToBlockDir(File root, long blockId) {
     return new File(root, path);
   }
 
+  /**
+   * For example, given a block whose blockId maps to
+   * "/data1/hadoop/hdfs/datanode/current/BP-xxxx/current/finalized/subdir0/subdir0",
+   * this method returns "subdir0/subdir0".
+   * @param blockId the block ID
+   * @return the two-level subdir name
+   */
+  public static String idToBlockDirSuffixName(long blockId) {
+    int d1 = (int) ((blockId >> 16) & 0x1F);
+    int d2 = (int) ((blockId >> 8) & 0x1F);
+    return DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP +
+        DataStorage.BLOCK_SUBDIR_PREFIX + d2;
+  }
+
+  public static List<String> getAllSubDirNameForDataSetLock() {

Review Comment:
   This method is used to generate all subdir lock names. We use 0x1F as the
end condition because of the DataNode layout: each block pool dir under a
volume has a two-level subdir structure, like subdir1/subdir31.




> DataNode supports more fine-grained dataset lock based on blockid
> -----------------------------------------------------------------
>
>                 Key: HDFS-17496
>                 URL: https://issues.apache.org/jira/browse/HDFS-17496
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: farmmamba
>            Assignee: farmmamba
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2024-04-23-16-17-07-057.png
>
>
> Recently, we used NVMe SSDs as volumes in DataNodes and performed some stress
> tests.
> We found that NVMe SSD and HDD volumes achieve similar performance when
> creating lots of small files, such as 10 KB ones.
> This phenomenon is counterintuitive. After analyzing the monitoring metrics,
> we found that the fsdataset lock became the bottleneck in high-concurrency
> scenarios.
>  
> Currently, we have two lock levels, BLOCK_POOL and VOLUME. We can further
> split the volume lock into DIR locks.
> A DIR lock is defined as follows: given a blockId, we can determine which
> subdir the block will be placed in under the finalized dir, and we use
> subdir[0-31]/subdir[0-31] as the name of the DIR lock.
> For more details, please refer to the method DatanodeUtil#idToBlockDir:
> {code:java}
>   public static File idToBlockDir(File root, long blockId) {
>     int d1 = (int) ((blockId >> 16) & 0x1F);
>     int d2 = (int) ((blockId >> 8) & 0x1F);
>     String path = DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP +
>         DataStorage.BLOCK_SUBDIR_PREFIX + d2;
>     return new File(root, path);
>   } {code}
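> For a concrete worked example (hypothetical blockId, not from the report):
> {code:java}
> long blockId = 74565L;                    // 0x12345
> int d1 = (int) ((blockId >> 16) & 0x1F);  // = 1
> int d2 = (int) ((blockId >> 8) & 0x1F);   // = 3
> // The block lands under subdir1/subdir3, so its DIR lock would be named
> // "subdir1/subdir3".
> {code}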
> The performance comparison is shown below.
> Experimental setup:
> 3 DataNodes, each with a single disk.
> 10 clients concurrently writing files and deleting them after writing.
> 550 threads per client.
> !image-2024-04-23-16-17-07-057.png!
>  



