[ https://issues.apache.org/jira/browse/HDFS-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913639#comment-17913639 ]
ASF GitHub Bot commented on HDFS-17496:
---------------------------------------

kokon191 commented on code in PR #7280:
URL: https://github.com/apache/hadoop/pull/7280#discussion_r1918141263


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DatanodeUtil.java:
##########
@@ -112,6 +116,21 @@ public static boolean dirNoFilesRecursive(
     return true;
   }
 
+  /**
+   * Take an example.
+   * We hava a block with blockid mapping to:
+   * "/data1/hadoop/hdfs/datanode/current/BP-xxxx/current/finalized/subdir0/subdir0"
+   * We return "subdir0/subdir0".

Review Comment:
   Can use `subdir0/subdir1` as the example to make it more generic than `subdir0/subdir0`



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DatanodeUtil.java:
##########
@@ -120,13 +139,21 @@ public static boolean dirNoFilesRecursive(
    * @return
    */
   public static File idToBlockDir(File root, long blockId) {
-    int d1 = (int) ((blockId >> 16) & 0x1F);
-    int d2 = (int) ((blockId >> 8) & 0x1F);
-    String path = DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP +
-        DataStorage.BLOCK_SUBDIR_PREFIX + d2;
+    String path = idToBlockDirSuffix(blockId);
     return new File(root, path);
   }
 
+  public static List<String> getAllSubDirNameForDataSetLock() {

Review Comment:
   `Name` -> `Names`



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataSetSubLockStrategy.java:
##########
@@ -0,0 +1,36 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdfs.server.datanode;
+
+import java.util.List;
+
+/**
+ * This interface is used to generate sub lock name for a blockid.
+ */
+public interface DataSetSubLockStrategy {
+
+  /**
+   * Generate sub lock name for the given blockid.
+   * @param blockid the block id.
+   * @return sub lock name for the input blockid.
+   */
+  String blockIdToSubLock(long blockid);
+
+  List<String> getAllSubLockName();

Review Comment:
   `Name` -> `Names`



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/DataNodeLockManager.java:
##########
@@ -29,7 +29,8 @@ public interface DataNodeLockManager<T extends AutoCloseDataSetLock> {
    */

Review Comment:
   Update this javadoc to describe all 3 levels of locks. There may be other
   code comments/javadocs that still mention 2 levels of locks; we might need
   to comb through the code to find them.



> DataNode supports more fine-grained dataset lock based on blockid
> -----------------------------------------------------------------
>
>                 Key: HDFS-17496
>                 URL: https://issues.apache.org/jira/browse/HDFS-17496
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: farmmamba
>            Assignee: farmmamba
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0
>
>         Attachments: image-2024-04-23-16-17-07-057.png
>
>
> Recently, we used NvmeSSD as volumes in datanodes and performed some stress
> tests.
> We found that NvmeSSD and HDD disks achieve similar performance when creating
> lots of small files, such as 10KB.
> This phenomenon is counterintuitive. After analyzing the metric monitoring,
> we found that the fsdataset lock became the bottleneck in high-concurrency
> scenarios.
>
> Currently, we have two levels of locks: BLOCK_POOL and VOLUME. We can
> further split the volume lock into DIR locks.
> A DIR lock is defined as follows: given a blockid, we can determine which
> subdir under the finalized dir the block will be placed in. We simply use
> subdir[0-31]/subdir[0-31] as the
> name of the DIR lock.
> For more details, please refer to the method DatanodeUtil#idToBlockDir:
> {code:java}
>   public static File idToBlockDir(File root, long blockId) {
>     int d1 = (int) ((blockId >> 16) & 0x1F);
>     int d2 = (int) ((blockId >> 8) & 0x1F);
>     String path = DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP +
>         DataStorage.BLOCK_SUBDIR_PREFIX + d2;
>     return new File(root, path);
>   } {code}
> The performance comparison is as below:
> experimental setup:
> 3 DataNodes, each with a single disk.
> 10 Clients write concurrently and delete the files after writing.
> 550 threads per Client.
> !image-2024-04-23-16-17-07-057.png!
>
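
For readers following the DIR lock naming described in the issue, here is a minimal, standalone sketch of how a blockid could map to a `subdirX/subdirY` lock name and how all 32 x 32 names could be enumerated. It is not the code from PR #7280: the class name `DirLockNameSketch` is hypothetical, the body of `idToBlockDirSuffix` is inferred from the lines removed in the diff, `getAllSubDirNames` follows the reviewer's suggested plural naming, and the `BLOCK_SUBDIR_PREFIX` and `SEP` constants are local stand-ins for the Hadoop ones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone sketch; not the DatanodeUtil code from the PR.
final class DirLockNameSketch {
  // Assumed stand-ins for DataStorage.BLOCK_SUBDIR_PREFIX and DatanodeUtil.SEP.
  static final String BLOCK_SUBDIR_PREFIX = "subdir";
  static final String SEP = "/";
  // Each directory level is selected by 5 bits of the blockid, so 32 per level.
  static final int DIRS_PER_LEVEL = 32;

  // Same bit arithmetic as idToBlockDir in the issue description, minus the root.
  static String idToBlockDirSuffix(long blockId) {
    int d1 = (int) ((blockId >> 16) & 0x1F);
    int d2 = (int) ((blockId >> 8) & 0x1F);
    return BLOCK_SUBDIR_PREFIX + d1 + SEP + BLOCK_SUBDIR_PREFIX + d2;
  }

  // Enumerates every possible DIR lock name, subdir0/subdir0 .. subdir31/subdir31.
  static List<String> getAllSubDirNames() {
    List<String> names = new ArrayList<>(DIRS_PER_LEVEL * DIRS_PER_LEVEL);
    for (int d1 = 0; d1 < DIRS_PER_LEVEL; d1++) {
      for (int d2 = 0; d2 < DIRS_PER_LEVEL; d2++) {
        names.add(BLOCK_SUBDIR_PREFIX + d1 + SEP + BLOCK_SUBDIR_PREFIX + d2);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // Bits 16-20 of 0x10203 are 1 and bits 8-12 are 2.
    System.out.println(idToBlockDirSuffix(0x10203L));
    System.out.println(getAllSubDirNames().size());
  }
}
```

Under these assumptions every blockid falls into exactly one of 1024 names, so writers to different subdirs of the same volume would no longer contend on a single volume-level lock.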