[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742785#comment-16742785 ]
Íñigo Goiri commented on HDFS-14186:
------------------------------------

For my own sanity, the path for the lifeline is the following:
# {{NameNodeRpcServer#sendLifeline()}} in a different RPC handler.
# {{FSNamesystem#handleLifeline()}} with no lock.
# {{DatanodeManager#handleLifeline()}} uses {{getDatanode()}}, and there are no locks or anything in these two.
# {{HeartbeatManager#updateLifeline()}} is synchronized within the object.
# {{BlockManager#updateHeartbeatState()}} is unlocked and uses {{DatanodeDescriptor#updateHeartbeatState()}}, which seems fine.

So the point of conflict is {{HeartbeatManager#updateLifeline()}}, which fights with {{register()}} and {{updateHeartbeat()}}. From your description, I'm guessing that these two functions are the ones taking a long time.

I'm not very familiar with {{synchronized}}, but it looks like it doesn't impose any particular acquisition order. Could we change the {{HeartbeatManager}} locking model there? There was some discussion about this in HDFS-9239, but it doesn't look like it made it very far. I have to say that it is very tempting to make part of {{HeartbeatManager#updateLifeline()}} not synchronized and just update the timestamp there when the load is high (see the sketches after the quoted description below).

> blockreport storm slow down namenode restart seriously in large cluster
> ------------------------------------------------------------------------
>
>                 Key: HDFS-14186
>                 URL: https://issues.apache.org/jira/browse/HDFS-14186
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Major
>         Attachments: HDFS-14186.001.patch
>
> In the current implementation, a datanode sends its block report immediately after registering with the namenode on restart, and the resulting block report storm puts the namenode under heavy load. One consequence is that some received RPCs have to be dropped because their queue time exceeds the timeout. If a datanode's heartbeat RPCs keep being dropped for long enough (the default heartbeatExpireInterval is 630s), the node is marked DEAD and has to re-register and send its block report again, which aggravates the storm and traps the cluster in a vicious circle, seriously slowing down namenode startup (by an hour or more), especially in a large (several thousand datanodes) and busy cluster. Although there has been a lot of work to optimize namenode startup, the issue still exists.
> I propose postponing the dead-datanode check until the namenode has finished startup.
> Any comments and suggestions are welcome.
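To make that last idea concrete, here is a minimal sketch, not a patch: {{LifelineSketch}}, {{highLoad}}, and {{lastSeenMs}} are simplified stand-ins I made up, not the real fields of {{HeartbeatManager}}. It only shows the shape of "refresh the timestamp lock-free when load is high, otherwise take the monitor as today":

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only -- not the real HeartbeatManager. The real class keeps
// stats and storage reports under its monitor; this shows only the
// "update just the timestamp without the lock" fast path.
class LifelineSketch {
  private final ConcurrentHashMap<String, AtomicLong> lastSeenMs =
      new ConcurrentHashMap<>();

  // Stand-in for whatever "load is high" signal the real patch would use.
  private volatile boolean highLoad;

  void updateLifeline(String datanodeUuid, long nowMs) {
    if (highLoad) {
      // Fast path: refresh liveness only, no monitor, so lifelines stop
      // queueing behind register() and updateHeartbeat().
      lastSeenMs.computeIfAbsent(datanodeUuid, k -> new AtomicLong(nowMs))
          .set(nowMs);
      return;
    }
    synchronized (this) {
      // Slow path: today's behavior -- the full update under the
      // HeartbeatManager monitor, contending with register() and
      // updateHeartbeat().
      lastSeenMs.computeIfAbsent(datanodeUuid, k -> new AtomicLong(nowMs))
          .set(nowMs);
      // ... plus the stats/storage updates the real code does here ...
    }
  }
}
{code}

The trade-off is that a fast-path lifeline can now race {{register()}}; since it only refreshes liveness, a slightly stale timestamp should be benign, but that ordering question is essentially what HDFS-9239 was circling.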
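And for the proposal in the description, a similar sketch of the "postpone the dead check until startup finishes" idea; {{startupComplete}} is a hypothetical signal (the real patch would presumably key off safe mode or startup state in {{DatanodeManager}}), not an existing API:

{code:java}
// Sketch only: skip declaring datanodes dead while the namenode is still
// starting up, so heartbeats dropped during the block report storm do not
// trigger re-registration and a second storm.
class DeadNodeCheckSketch {
  // 2 * recheck (300s) + 10 * heartbeat (3s) = 630s by default in HDFS.
  private static final long HEARTBEAT_EXPIRE_INTERVAL_MS = 630_000L;

  // Hypothetical flag flipped once namenode startup has finished.
  private volatile boolean startupComplete;

  boolean isDatanodeDead(long lastHeartbeatMs, long nowMs) {
    if (!startupComplete) {
      // Dropped heartbeats are expected during startup; defer the check.
      return false;
    }
    return nowMs - lastHeartbeatMs > HEARTBEAT_EXPIRE_INTERVAL_MS;
  }
}
{code}

Nodes that are genuinely gone would still be expired on the first monitor pass after startup completes.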