[ 
https://issues.apache.org/jira/browse/HDFS-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843935#action_12843935
 ] 

dhruba borthakur commented on HDFS-854:
---------------------------------------

> Or probably keep last added/deleted blocks in memory and send block report 
> and doing disk scan once in a while?

Lohit: This is already part of HDFS trunk. I think Suresh improved this earlier 
so that block reports are generated from memory.

This patch improves the startup time of the cluster when the entire cluster is 
restarted. In this case, the datanodes start fresh and have to scan their disks 
to generate the first block report. The faster we can generate the block 
report, the sooner the NN can exit safemode!
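
To illustrate the idea only (this is not the code in HDFS-854.patch), here is a 
minimal sketch of scanning volumes in parallel, one task per disk. The class 
and method names (ParallelVolumeScan, scanVolume, scanAllVolumes) are 
hypothetical, and the sketch assumes that block files are named with a "blk_" 
prefix and that their metadata files end in ".meta".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical sketch, not the actual datanode code.
public class ParallelVolumeScan {

    // Walk one volume directory and collect the names of block files.
    // Assumption: block files start with "blk_"; ".meta" files are skipped.
    static List<String> scanVolume(Path volumeRoot) throws IOException {
        List<String> blocks = new ArrayList<>();
        try (Stream<Path> files = Files.walk(volumeRoot)) {
            files.filter(Files::isRegularFile)
                 .map(p -> p.getFileName().toString())
                 .filter(n -> n.startsWith("blk_") && !n.endsWith(".meta"))
                 .forEach(blocks::add);
        }
        return blocks;
    }

    // Scan every volume concurrently (one thread per volume) and merge the
    // per-volume results into a single list, in volume order.
    static List<String> scanAllVolumes(List<Path> volumes)
            throws InterruptedException, ExecutionException {
        if (volumes.isEmpty()) {
            return new ArrayList<>();
        }
        ExecutorService pool = Executors.newFixedThreadPool(volumes.size());
        try {
            List<Future<List<String>>> pending = new ArrayList<>();
            for (Path volume : volumes) {
                Callable<List<String>> task = () -> scanVolume(volume);
                pending.add(pool.submit(task));
            }
            List<String> report = new ArrayList<>();
            for (Future<List<String>> result : pending) {
                report.addAll(result.get()); // blocks until that volume is done
            }
            return report;
        } finally {
            pool.shutdown();
        }
    }
}
```

With one thread per volume, the scan is bounded roughly by the slowest disk 
rather than by the sum of all disks, provided the disks are independent devices.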


> Datanode should scan devices in parallel to generate block report
> -----------------------------------------------------------------
>
>                 Key: HDFS-854
>                 URL: https://issues.apache.org/jira/browse/HDFS-854
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HDFS-854.patch
>
>
> A Datanode should scan its disk devices in parallel so that the time to 
> generate a block report is reduced. This will reduce the startup time of a 
> cluster.
> A datanode has 12 disks (each of 1 TB) to store HDFS blocks. There is a total 
> of 150K blocks on these 12 disks. It takes the datanode up to 20 minutes to 
> scan these devices to generate the first block report.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
