[ https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741318#comment-16741318 ]

He Xiaoqiao edited comment on HDFS-14186 at 1/12/19 3:31 PM:
-------------------------------------------------------------

Hi [~elgoiri],
{quote}I'm guessing that the lifeline doesn't help because the DN is not 
registered at all?{quote}
Yes, you are correct, the lifeline does not help at all. I have actually 
enabled that feature in my production cluster. In this case, the DN has 
registered and reported its blocks, but it is still lost and set DEAD just 
after the NN leaves safe mode, because the NN load is very high, so the DN has 
to register and send its block report again.
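
For reference, a minimal sketch (not the actual DatanodeManager code) of where 
that DEAD window comes from: the default expiry is derived from the two 
standard heartbeat settings, 2 * dfs.namenode.heartbeat.recheck-interval + 
10 * dfs.heartbeat.interval = 2*300s + 10*3s = 630s.
{code:java}
// Minimal sketch, not the real DatanodeManager: shows how the default 630s
// expiry that marks a DataNode DEAD follows from the stock HDFS settings.
public class HeartbeatExpiryExample {
  public static void main(String[] args) {
    long recheckIntervalMs = 5 * 60 * 1000; // dfs.namenode.heartbeat.recheck-interval (default 300s)
    long heartbeatIntervalMs = 3 * 1000;    // dfs.heartbeat.interval (default 3s)

    // Same shape as the NameNode's formula: 2 * recheck + 10 * heartbeat.
    long expireMs = 2 * recheckIntervalMs + 10 * heartbeatIntervalMs;
    System.out.println("heartbeatExpireInterval = " + expireMs / 1000 + "s"); // 630s

    // A DataNode whose heartbeats keep timing out in the RPC queue for longer
    // than this window is declared DEAD and must re-register and send a full
    // block report again.
    long lastHeartbeatAgeMs = 700 * 1000;
    System.out.println("dead? " + (lastHeartbeatAgeMs > expireMs));
  }
}
{code}
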
{quote}I'm also curious about the numbers, 300M blocks takes 8 hours? We see 
around 30 minutes with 60M block.{quote}
I have seen this issue to different degrees on clusters with more than 300M 
blocks. The worst case, which I met once and which took more than 8 hours, was 
a cluster with more than 15K nodes and more than 500M blocks.
Generally, I think both the number of slave nodes and the number of blocks 
affect the startup time for the same fsimage+editlogs, with the block count 
being the bigger factor of course. A restart with 60M blocks finishes in about 
30min; I see the same result as [~elgoiri] mentioned. With 100M blocks or fewer 
we can keep the startup time within ~40min. The problem really shows up at 
larger scale. More comments are welcome.
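
To make the proposal in the description below concrete, here is a 
self-contained sketch of the gating idea (class and method names are 
hypothetical; this is not the attached patch): while the namenode is still 
starting up, the periodic dead-node check simply refuses to mark nodes DEAD, 
so heartbeats dropped under startup load cannot trigger re-registration and a 
second block report storm.
{code:java}
// Hypothetical sketch of the proposed gating, not HDFS-14186.001.patch.
public class DeadNodeCheckGate {
  private volatile boolean startupComplete = false;
  private final long heartbeatExpireMs;

  public DeadNodeCheckGate(long heartbeatExpireMs) {
    this.heartbeatExpireMs = heartbeatExpireMs;
  }

  /** Called once the namenode has finished startup (e.g. left startup safe mode). */
  public void markStartupComplete() {
    startupComplete = true;
  }

  /** Periodic check: should this datanode be declared DEAD now? */
  public boolean shouldMarkDead(long lastHeartbeatTimeMs, long nowMs) {
    if (!startupComplete) {
      // Postpone the dead-node decision: heartbeats are being dropped because
      // the namenode itself is overloaded, not because the datanode is gone.
      return false;
    }
    return nowMs - lastHeartbeatTimeMs > heartbeatExpireMs;
  }
}
{code}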


was (Author: hexiaoqiao):
Hi [~elgoiri],
{quote}I'm guessing that the lifeline doesn't help because the DN is not 
registered at all?{quote}
Yes, you are correct, the lifeline does not help at all. I have actually 
enabled that feature in my production cluster.
{quote}I'm also curious about the numbers, 300M blocks takes 8 hours? We see 
around 30 minutes with 60M block.{quote}
I have seen this issue to different degrees on clusters with more than 300M 
blocks. The worst case I met, which took more than 8 hours, was a cluster with 
more than 15K nodes and more than 500M blocks.
Generally, I think both the number of slave nodes and the number of blocks 
affect the startup time for the same fsimage+editlogs, with the block count 
being the bigger factor of course. A restart with 60M blocks finishes in about 
30min; I see the same result as [~elgoiri] mentioned. With 100M blocks or fewer 
we can keep the startup time within ~40min. The problem really shows up at 
larger scale. More comments are welcome.

> blockreport storm slow down namenode restart seriously in large cluster
> -----------------------------------------------------------------------
>
>                 Key: HDFS-14186
>                 URL: https://issues.apache.org/jira/browse/HDFS-14186
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Major
>         Attachments: HDFS-14186.001.patch
>
>
> In the current implementation, a datanode sends its block report immediately 
> after it registers with the namenode on restart, and the resulting block 
> report storm puts the namenode under heavy load while processing them. One 
> consequence is that some received RPCs have to be dropped because their queue 
> time exceeds the timeout. If a datanode's heartbeat RPCs keep being dropped 
> for long enough (default heartbeatExpireInterval=630s), the datanode is set 
> DEAD and then has to re-register and send its block report again, which 
> aggravates the block report storm and traps the cluster in a vicious circle, 
> slowing down namenode startup seriously (by more than one hour, and sometimes 
> much more), especially in a large (several thousands of datanodes) and busy 
> cluster. Although there has been a lot of work to optimize namenode startup, 
> the issue still exists.
> I propose to postpone the dead datanode check until the namenode has finished 
> startup.
> Any comments and suggestions are welcome.



