[jira] [Created] (HDFS-17090) Decommission will be stuck for long time when restart because overlapped process Register and BlockReport.

Xiaoqiao He (Jira) Sun, 16 Jul 2023 21:00:11 -0700

Xiaoqiao He created HDFS-17090:
----------------------------------

             Summary: Decommission will be stuck for long time when restart 
because overlapped process Register and BlockReport.
                 Key: HDFS-17090
                 URL: https://issues.apache.org/jira/browse/HDFS-17090
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
            Reporter: Xiaoqiao He
            Assignee: Xiaoqiao He



I recently hit a corner case in which decommissioning DataNodes impacts the 
performance of the NameNode. After digging into it, I have reproduced the case:
a. Add some DataNodes to the exclude file and prepare to decommission these 
DataNodes.
b. Execute bin/hdfs dfsadmin -refreshNodes (this step is optional).
c. Restart the NameNode, for an upgrade or another reason, before the 
decommission completes.
d. All DataNodes are triggered to register and send a full block report (FBR).
e. The load on the NameNode is very high at this point. In particular, the 
CallQueue of port 8040 stays full for a long time because of the flood of 
register/heartbeat/FBR RPCs from the DataNodes.
f. A decommission-in-progress node will not complete decommissioning until its 
next FBR, even if all replicas on the node have been processed. The requests 
arrive in the order register, heartbeat, (blockreport, register), where the 
second register could be a retried RPC request from the DataNode (there is no 
further log information from the DataNode to confirm this). For the 
(blockreport, register) pair, the NameNode could process one storage, then 
process the register, then process the remaining storages in order.
g. Because of the second register RPC, the affected DataNodes are marked 
unhealthy by BlockManager#isNodeHealthyForDecommissionOrMaintenance, so 
decommission is stuck for a long time, until the next FBR. Meanwhile, the 
NameNode has to scan these DataNodes in every round to check whether they can 
complete, which holds the global write lock and impacts NameNode performance.
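
Steps f and g can be illustrated with a toy model. This is only a sketch, not 
HDFS code: the class RegisterOverlapSketch and all of its methods are 
hypothetical. It assumes that processing a register request discards the 
per-storage block report state recorded so far, and that the health check 
requires a report from every storage of the node.

```java
import java.util.Arrays;

/** Toy model of one DataNode as seen by the NameNode; all names are hypothetical. */
final class RegisterOverlapSketch {
    /** Marker used in the event list for a register request. */
    static final int REGISTER = -1;

    /** One flag per storage: has a block report been processed since the last register? */
    private final boolean[] storageReported;

    RegisterOverlapSketch(int storageCount) {
        this.storageReported = new boolean[storageCount];
    }

    /** Assumption: a (re-)register resets the block report state of every storage. */
    void register() {
        Arrays.fill(storageReported, false);
    }

    /** Processes the part of a block report that covers one storage. */
    void reportStorage(int index) {
        storageReported[index] = true;
    }

    /** Assumption: the node counts as healthy only if every storage has reported. */
    boolean healthyForDecommission() {
        if (storageReported.length == 0) {
            return false;
        }
        for (boolean reported : storageReported) {
            if (!reported) {
                return false;
            }
        }
        return true;
    }

    /** Replays events in order; REGISTER is a register, any other value a storage index. */
    static boolean replay(int storageCount, int... events) {
        RegisterOverlapSketch node = new RegisterOverlapSketch(storageCount);
        for (int event : events) {
            if (event == REGISTER) {
                node.register();
            } else {
                node.reportStorage(event);
            }
        }
        return node.healthyForDecommission();
    }

    public static void main(String[] args) {
        // register, then a full block report over three storages: node is healthy.
        System.out.println(replay(3, REGISTER, 0, 1, 2));           // prints true
        // A second register lands after storage 0 was processed: storage 0 is
        // reset, so the node stays unhealthy until the next full block report.
        System.out.println(replay(3, REGISTER, 0, REGISTER, 1, 2)); // prints false
    }
}
```

In the second replay, only a later report that covers storage 0 again would 
make the node healthy, which corresponds to waiting for the next FBR.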

To improve this, I think we could filter out the repeated register RPC requests 
during startup. I have not yet thought carefully about whether filtering 
register requests directly would introduce other risks. Further discussion is 
welcome.
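
One possible shape of such a filter is sketched here, only to make the idea 
concrete. It is not a patch: the class StartupRegisterFilterSketch, its 
methods, and the use of a DataNode UUID string as the key are all assumptions 
for illustration. While the NameNode is in its startup phase, the first 
register from each DataNode is processed and later ones are skipped.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical filter for repeated register requests during NameNode startup. */
final class StartupRegisterFilterSketch {
    /** DataNodes (keyed by an assumed UUID string) that already registered during startup. */
    private final Set<String> registeredDuringStartup = ConcurrentHashMap.newKeySet();

    /** True while the NameNode is considered to be in its startup phase. */
    private volatile boolean inStartup = true;

    /**
     * Returns true if the register request should be processed, false if it is
     * treated as a repeat and skipped.
     */
    boolean shouldProcessRegister(String datanodeUuid) {
        if (!inStartup) {
            return true; // outside startup every register is processed as usual
        }
        // Set.add returns false when the UUID is already present, i.e. a repeat.
        return registeredDuringStartup.add(datanodeUuid);
    }

    /** Ends the startup phase and drops the bookkeeping. */
    void finishStartup() {
        inStartup = false;
        registeredDuringStartup.clear();
    }

    public static void main(String[] args) {
        StartupRegisterFilterSketch filter = new StartupRegisterFilterSketch();
        System.out.println(filter.shouldProcessRegister("dn-1")); // first register
        System.out.println(filter.shouldProcessRegister("dn-1")); // repeat during startup
        filter.finishStartup();
        System.out.println(filter.shouldProcessRegister("dn-1")); // after startup
    }
}
```

A filter like this cannot tell a retried RPC from a DataNode that really 
restarted while the NameNode was starting up, which may be one of the risks 
that needs discussion.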



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
