[ https://issues.apache.org/jira/browse/HDFS-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843935#action_12843935 ]
dhruba borthakur commented on HDFS-854:
---------------------------------------
> Or probably keep last added/deleted blocks in memory and send block report
> and doing disk scan once in a while?
Lohit: This is already part of HDFS trunk. I think Suresh improved this
earlier to generate block reports from memory.
This patch improves the startup time of the cluster when the entire cluster is
restarted. In this case, the datanodes start fresh with nothing in memory and
have to scan their disks to generate the first block report. The faster we can
generate the block report, the sooner the NN can exit safemode!
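To make the idea concrete, here is a rough sketch of scanning volumes in
parallel, one thread per disk. This is not the code in HDFS-854.patch; the
class name ParallelVolumeScan, the helpers scanVolume/scanVolumes, and the
"blk_" / ".meta" file-name filter are all assumptions made for illustration,
and it uses only the plain JDK rather than any datanode classes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ParallelVolumeScan {

    // Hypothetical helper: walk one volume and collect block file names.
    // The "blk_" prefix and the ".meta" exclusion are illustrative assumptions.
    static List<String> scanVolume(Path volumeRoot) {
        try (Stream<Path> files = Files.walk(volumeRoot)) {
            return files.filter(Files::isRegularFile)
                        .map(p -> p.getFileName().toString())
                        .filter(n -> n.startsWith("blk_") && !n.endsWith(".meta"))
                        .sorted()
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Submit one scan task per volume so that each disk is read by its own
    // thread, then gather the per-volume results in submission order.
    static Map<Path, List<String>> scanVolumes(List<Path> volumes)
            throws InterruptedException, ExecutionException {
        ExecutorService pool =
            Executors.newFixedThreadPool(Math.max(1, volumes.size()));
        try {
            Map<Path, Future<List<String>>> pending = new LinkedHashMap<>();
            for (Path v : volumes) {
                pending.put(v, pool.submit(() -> scanVolume(v)));
            }
            Map<Path, List<String>> report = new LinkedHashMap<>();
            for (Map.Entry<Path, Future<List<String>>> e : pending.entrySet()) {
                report.put(e.getKey(), e.getValue().get()); // blocks until done
            }
            return report;
        } finally {
            pool.shutdown();
        }
    }
}
```

Since each volume sits on a separate device, the scans should not compete for
the same spindle, so the total time would be closer to that of the slowest
single disk than to the sum over all disks. How close it gets in practice
depends on the hardware and is not something this sketch measures.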
> Datanode should scan devices in parallel to generate block report
> -----------------------------------------------------------------
>
> Key: HDFS-854
> URL: https://issues.apache.org/jira/browse/HDFS-854
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node
> Reporter: dhruba borthakur
> Assignee: Dmytro Molkov
> Attachments: HDFS-854.patch
>
>
> A Datanode should scan its disk devices in parallel so that the time to
> generate a block report is reduced. This will reduce the startup time of a
> cluster.
> A datanode has 12 disks (1 TB each) to store HDFS blocks. There is a total
> of 150K blocks on these 12 disks. It takes the datanode up to 20 minutes to
> scan these devices to generate the first block report.