Hao-Nan Zhu created HDFS-17619:
----------------------------------
Summary: Use ConcurrentHashMap to avoid synchronized block
Key: HDFS-17619
URL: https://issues.apache.org/jira/browse/HDFS-17619
Project: Hadoop HDFS
Issue Type: Improvement
Components: server
Affects Versions: 3.3.6, 3.3.0
Reporter: Hao-Nan Zhu
Hi, I’ve encountered a performance bottleneck in _PendingReconstructionBlocks_
that looks like a candidate for optimization.
The
[decrement|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java#L112]
method in _hdfs.server.blockmanagement.PendingReconstructionBlocks_ contains a
synchronized block that locks _pendingReconstructions_. Within this
synchronized block, it calls the
[decrementReplicas|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java#L237]
method, which contains a loop that iterates over all the datanodes. This
could take a long time if the number of datanodes is large, and it could lead
to lock contention on the _pendingReconstructions_ object.
To mitigate this while maintaining thread safety, one option is to use
_ConcurrentHashMap_ for _pendingReconstructions_ and to ensure that access
to _target_ is thread-safe as well. A similar issue is also observed at
[pendingReconstructionCheck|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java#L277],
which can be addressed with the same strategy.
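To make the idea concrete, here is a minimal sketch of the proposed approach. It is not the actual Hadoop code: the class name _PendingReconstructionSketch_, the use of plain strings for blocks and datanodes, and the method signatures are all assumptions made for illustration. It keeps an immutable target list per block and updates it with _ConcurrentHashMap.computeIfPresent_, so each update is atomic for its own key instead of holding one lock over the whole map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for PendingReconstructionBlocks; blocks and
// datanodes are plain strings here, which is an assumption for brevity.
class PendingReconstructionSketch {

  // Values are immutable lists, so readers never see a half-updated list.
  private final ConcurrentHashMap<String, List<String>> pending =
      new ConcurrentHashMap<>();

  // Record additional pending targets for a block.
  void increment(String block, String... targets) {
    if (targets.length == 0) {
      return;
    }
    List<String> added =
        Collections.unmodifiableList(new ArrayList<>(Arrays.asList(targets)));
    pending.merge(block, added, (old, extra) -> {
      List<String> merged = new ArrayList<>(old);
      merged.addAll(extra);
      return Collections.unmodifiableList(merged);
    });
  }

  // Remove one target for a block; returns true if the target was pending.
  // The remapping function runs atomically for this key only.
  boolean decrement(String block, String datanode) {
    AtomicBoolean removed = new AtomicBoolean(false);
    pending.computeIfPresent(block, (key, old) -> {
      List<String> next = new ArrayList<>(old);
      removed.set(next.remove(datanode));
      // Returning null removes the entry once no targets remain.
      return next.isEmpty() ? null : Collections.unmodifiableList(next);
    });
    return removed.get();
  }

  // Number of pending targets for a block, or 0 if none are recorded.
  int getNumReplicas(String block) {
    List<String> targets = pending.get(block);
    return targets == null ? 0 : targets.size();
  }

  // Number of blocks that still have pending targets.
  int size() {
    return pending.size();
  }

  public static void main(String[] args) {
    PendingReconstructionSketch p = new PendingReconstructionSketch();
    p.increment("blk_1", "dn1", "dn2");
    p.decrement("blk_1", "dn1");
    System.out.println("pending replicas for blk_1: " + p.getNumReplicas("blk_1"));
  }
}
```

One caveat with this design: iterators over a _ConcurrentHashMap_ are weakly consistent, so a periodic timeout scan would not see a single frozen snapshot of the map the way a scan under one synchronized block does. Whether that is acceptable for the timeout check would need to be verified against the real code.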
I’m looking into creating a patch for this, but before that, I wonder if it is
worth optimizing. Also, please let me know if there is something wrong with my
understanding or analysis. Thanks!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)