[ https://issues.apache.org/jira/browse/HADOOP-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated HADOOP-3369:
----------------------------------------
Attachment: fastBlockReports.patch
This patch implements the following approach:
While the name-node is in safe mode, block reports do not modify the queues of
over- and under-replicated blocks.
Instead, the replication of all blocks is verified once, right before exiting
safe mode.
Thus only those blocks that really have missing replicas appear in
neededReplications.
In my tests this approach completes block processing almost 5 times faster than
the existing one, which substantially improves the total name-node startup time.
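
To illustrate the difference between the two strategies, here is a minimal toy
model in Java. It is only a sketch and is not the code in fastBlockReports.patch:
the class BlockReportSketch, its methods, and the deferInSafeMode flag are all
hypothetical names invented for this example, and the real name-node data
structures are considerably more involved. The sketch counts how many times the
neededReplications set is modified when replicas are reported one at a time.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of block-report processing; hypothetical, not the name-node code. */
class BlockReportSketch {
    private final int targetReplication;
    private final boolean deferInSafeMode;   // assumption: true models the patched behaviour
    private boolean safeMode = true;
    private final Map<Long, Integer> replicaCounts = new HashMap<>();
    private final Set<Long> neededReplications = new HashSet<>();
    private int queueUpdates = 0;

    BlockReportSketch(int targetReplication, boolean deferInSafeMode) {
        this.targetReplication = targetReplication;
        this.deferInSafeMode = deferInSafeMode;
    }

    /** One data-node reports that it holds a replica of the given block. */
    void reportReplica(long blockId) {
        int count = replicaCounts.merge(blockId, 1, Integer::sum);
        if (safeMode && deferInSafeMode) {
            return; // only count replicas; leave the queue untouched
        }
        // Eager strategy: add on the first report, remove once fully replicated.
        boolean changed = count < targetReplication
                ? neededReplications.add(blockId)
                : neededReplications.remove(blockId);
        if (changed) {
            queueUpdates++;
        }
    }

    /** Leave safe mode; the deferred strategy checks every block exactly once here. */
    void leaveSafeMode() {
        if (deferInSafeMode) {
            for (Map.Entry<Long, Integer> e : replicaCounts.entrySet()) {
                if (e.getValue() < targetReplication && neededReplications.add(e.getKey())) {
                    queueUpdates++;
                }
            }
        }
        safeMode = false;
    }

    int getQueueUpdates() { return queueUpdates; }

    Set<Long> getNeededReplications() { return neededReplications; }

    /** Blocks 1-3 receive three reports each; block 4 receives only two. */
    static BlockReportSketch simulate(boolean deferInSafeMode) {
        BlockReportSketch s = new BlockReportSketch(3, deferInSafeMode);
        for (int node = 0; node < 3; node++) {
            for (long block = 1; block <= 4; block++) {
                if (block == 4 && node == 2) {
                    continue; // the third replica of block 4 is missing
                }
                s.reportReplica(block);
            }
        }
        s.leaveSafeMode();
        return s;
    }

    public static void main(String[] args) {
        BlockReportSketch eager = simulate(false);
        BlockReportSketch deferred = simulate(true);
        System.out.println("eager queue updates:    " + eager.getQueueUpdates());
        System.out.println("deferred queue updates: " + deferred.getQueueUpdates());
        System.out.println("needed replications:    " + deferred.getNeededReplications());
    }
}
```

In this toy scenario both strategies end with the same single under-replicated
block in neededReplications, but the eager strategy modifies the set 7 times and
the deferred one only once. The numbers say nothing about the real speedup; they
only show where the saved work comes from.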
> Fast block processing during name-node startup.
> -----------------------------------------------
>
> Key: HADOOP-3369
> URL: https://issues.apache.org/jira/browse/HADOOP-3369
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs
> Affects Versions: 0.17.0
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Fix For: 0.18.0
>
> Attachments: fastBlockReports.patch
>
>
> The block report processing during the startup period should be optimized.
> As noted in HADOOP-3022, during cluster startup all blocks are under-replicated
> because they have not been reported by data-nodes yet.
> Currently, we routinely move blocks to the neededReplications queue when they
> are first reported and then remove them from the queue when other nodes report
> them.
> In the ideal situation we end up adding all blocks to the neededReplications
> queue first, only to remove all of them in the end.