Hao-Nan Zhu created HDFS-17598:
----------------------------------
Summary: Optimizations for DatanodeManager for large-scale cases
Key: HDFS-17598
URL: https://issues.apache.org/jira/browse/HDFS-17598
Project: Hadoop HDFS
Issue Type: Improvement
Components: performance
Affects Versions: 3.4.0
Reporter: Hao-Nan Zhu
Hello,
I wonder if there are opportunities to optimize {_}DatanodeManager{_} a bit, to improve its performance when the number of _datanodes_ is large:
* [_fetchDatanodes_|https://github.com/naver/hadoop/blob/0c0a80f96283b5a7be234663e815bc04bafc8be2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1144] calls [_removeDecomNodeFromList_|https://github.com/naver/hadoop/blob/0c0a80f96283b5a7be234663e815bc04bafc8be2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L817] on both the live and the dead datanode lists, and _removeDecomNodeFromList_ has to iterate over every datanode in the list. This could be avoided by checking whether a node is decommissioned via _node.isDecommissioned()_ before adding it to the live and dead lists in the first place (see the sketch after this list).
* [_getNumLiveDataNodes_|https://github.com/naver/hadoop/blob/0c0a80f96283b5a7be234663e815bc04bafc8be2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1055] iterates over all datanodes, whereas [_getNumDeadDataNodes_|https://github.com/naver/hadoop/blob/0c0a80f96283b5a7be234663e815bc04bafc8be2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1068] obtains the count in a different (presumably more efficient) way. Is there a reason _getNumLiveDataNodes_ has to iterate over the whole {_}datanodeMap{_}? Could it use the same approach as _getNumDeadDataNodes_? (A counting sketch follows the next paragraph.)
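For the first point, here is a minimal, self-contained sketch of the filtering idea. It is not the actual _fetchDatanodes_ code; the _Node_ type and the method signature below are hypothetical stand-ins for _DatanodeDescriptor_ and the real method. The point is only that skipping decommissioned nodes up front visits each node once, instead of building the lists and then re-scanning them in _removeDecomNodeFromList_:
{code:java}
import java.util.ArrayList;
import java.util.List;

public class FetchDatanodesSketch {

  /** Hypothetical stand-in for DatanodeDescriptor; not the real Hadoop class. */
  static class Node {
    final String name;
    final boolean alive;
    final boolean decommissioned;

    Node(String name, boolean alive, boolean decommissioned) {
      this.name = name;
      this.alive = alive;
      this.decommissioned = decommissioned;
    }

    boolean isAlive() { return alive; }
    boolean isDecommissioned() { return decommissioned; }
  }

  /**
   * Single pass: decommissioned nodes are filtered while the lists are built,
   * so no second scan over the live/dead lists is needed afterwards.
   */
  static void fetchDatanodes(Iterable<Node> datanodeMap,
                             List<Node> live, List<Node> dead,
                             boolean removeDecommissionedNode) {
    for (Node node : datanodeMap) {
      if (removeDecommissionedNode && node.isDecommissioned()) {
        continue;  // skip now instead of removing it from the lists later
      }
      if (node.isAlive()) {
        live.add(node);
      } else {
        dead.add(node);
      }
    }
  }

  public static void main(String[] args) {
    List<Node> map = List.of(
        new Node("dn1", true, false),
        new Node("dn2", false, false),
        new Node("dn3", true, true));   // decommissioned, should be skipped
    List<Node> live = new ArrayList<>();
    List<Node> dead = new ArrayList<>();
    fetchDatanodes(map, live, dead, true);
    System.out.println("live=" + live.size() + " dead=" + dead.size()); // live=1 dead=1
  }
}
{code}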
Similar observations apply to [_resetLastCachingDirectiveSentTime_|https://github.com/naver/hadoop/blob/0c0a80f96283b5a7be234663e815bc04bafc8be2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1560] and [_getDatanodeListForReport_|https://github.com/naver/hadoop/blob/0c0a80f96283b5a7be234663e815bc04bafc8be2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1253].
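For the counting question, one hypothetical alternative to scanning the whole {_}datanodeMap{_} on every call would be a counter that is maintained whenever membership or liveness changes. Again, this is purely illustrative and not how _DatanodeManager_ is implemented; whether such an invariant can be maintained cheaply under the existing locking is part of the question above:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Illustrative only: contrasts an O(n) per-call count with a counter that is
 * maintained on membership/state changes. Not the real DatanodeManager.
 */
public class LiveCountSketch {

  /** Hypothetical node record; stands in for DatanodeDescriptor. */
  record Node(String uuid, boolean alive) {}

  private final Map<String, Node> datanodeMap = new ConcurrentHashMap<>();
  private final AtomicInteger liveCount = new AtomicInteger();

  /** The pattern described in this issue: iterate the whole map on every call. */
  int getNumLiveDataNodesByScan() {
    int numLive = 0;
    for (Node node : datanodeMap.values()) {
      if (node.alive()) {
        numLive++;
      }
    }
    return numLive;
  }

  /** Alternative: O(1) read, paid for by bookkeeping on every update. */
  int getNumLiveDataNodesByCounter() {
    return liveCount.get();
  }

  void addNode(Node node) {
    // In real code the map update and the counter update would have to happen
    // under the same lock to stay consistent.
    Node previous = datanodeMap.put(node.uuid(), node);
    int delta = (node.alive() ? 1 : 0) - ((previous != null && previous.alive()) ? 1 : 0);
    liveCount.addAndGet(delta);
  }

  void removeNode(String uuid) {
    Node previous = datanodeMap.remove(uuid);
    if (previous != null && previous.alive()) {
      liveCount.decrementAndGet();
    }
  }

  public static void main(String[] args) {
    LiveCountSketch m = new LiveCountSketch();
    m.addNode(new Node("dn1", true));
    m.addNode(new Node("dn2", false));
    m.addNode(new Node("dn1", false));  // dn1 goes dead; counter drops back to 0
    System.out.println(m.getNumLiveDataNodesByScan() + " " + m.getNumLiveDataNodesByCounter()); // 0 0
  }
}
{code}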
Optimizing these methods could make such checks noticeably cheaper, especially when the number of datanodes is large. Are there any plans for this kind of large-scale (micro) optimization?
Please let me know if I need to provide more information. Thanks!