A DataNode is regarded as "dead" after 10 minutes of inactivity. If a
DataNode is down for less than 10 minutes and then returns, nothing
happens. After 10 minutes, its blocks are assumed to be lost forever, so the
NameNode begins scheduling those blocks for re-replication from the two
surviving copies (assuming the default replication factor of 3). There is no
fixed bound on how long this process takes; it is a function of your
available network bandwidth, server loads, disk speeds, amount of storage
used, etc. Blocks which are down to a single
surviving replica are prioritized over blocks which have two surviving
replicas (in the event of a simultaneous or near-simultaneous double fault),
since they are more vulnerable.
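
To make the ordering concrete, here is a rough Python sketch of the idea. It
is not the NameNode's actual code: the function names, the heartbeat map, and
the block map are all made up for illustration, and it assumes a replication
factor of 3 and the 10-minute threshold mentioned above.

```python
import heapq

DEAD_AFTER_SECONDS = 10 * 60  # threshold described above
TARGET_REPLICAS = 3           # assumed replication factor


def dead_nodes(last_heartbeat, now):
    """Return the nodes whose last heartbeat is older than the threshold."""
    return {node for node, seen in last_heartbeat.items()
            if now - seen > DEAD_AFTER_SECONDS}


def replication_queue(block_locations, dead):
    """Order under-replicated blocks, fewest surviving replicas first.

    Blocks with no surviving replica are skipped: there is nothing left
    to copy them from.
    """
    heap = []
    for block, nodes in block_locations.items():
        live = len(set(nodes) - dead)
        if 0 < live < TARGET_REPLICAS:
            heapq.heappush(heap, (live, block))
    return [heapq.heappop(heap) for _ in range(len(heap))]


# Hypothetical cluster state: dn1 and dn2 went silent long ago.
heartbeats = {"dn1": 0, "dn2": 0, "dn3": 900, "dn4": 950, "dn5": 990}
blocks = {
    "blk_a": ["dn1", "dn3", "dn4"],  # two survivors
    "blk_b": ["dn1", "dn2", "dn5"],  # one survivor, most vulnerable
    "blk_c": ["dn3", "dn4", "dn5"],  # fully replicated
}
dead = dead_nodes(heartbeats, now=1000)
queue = replication_queue(blocks, dead)
```

With this sample data, blk_b (one surviving replica) comes out of the queue
ahead of blk_a (two surviving replicas), and blk_c is never queued.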

If a DataNode does reappear, then re-replication of its blocks is cancelled,
and over-provisioned blocks are scaled back to the target number of replicas. If
that machine comes back with all blocks intact, then the node needs no
"rebuilding." (In fact, some of the over-provisioned replicas that get
removed might be from the original node, if they're available elsewhere
too!)
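
A small Python sketch of that scale-back step, for illustration only. The
function name and the free-space map are hypothetical, and the rule used to
pick which replica to drop (the holder with the least free space goes first)
is an assumption made for this example, not a statement of how the NameNode
chooses.

```python
def trim_excess(holders, free_space, target=3):
    """Split a block's replica holders into (kept, removed).

    Assumed policy for this sketch: keep the replicas on the nodes with
    the most free space, breaking ties by node name.
    """
    ranked = sorted(holders, key=lambda node: (-free_space[node], node))
    return ranked[:target], ranked[target:]


# dn1 was offline, so an extra copy was made on dn6; now dn1 is back.
holders = ["dn1", "dn3", "dn4", "dn6"]
free_space = {"dn1": 10, "dn3": 50, "dn4": 40, "dn6": 80}
kept, removed = trim_excess(holders, free_space)
```

In this example the replica that gets removed is the one on dn1, the node
that just came back, which is the situation described in the parenthetical
above.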

Don't forget that machines in Hadoop do not have strong notions of identity.
If a particular machine is taken offline and its disks are wiped, the blocks
which were there (which also existed in two other places) will be
re-replicated elsewhere from the live copies. When that same machine is then
brought back online, it has no incentive to "copy back" all the blocks that
it used to have, as there will be three replicas elsewhere in the cluster.
Blocks are never permanently bound to particular machines.

If you add recommissioned or new nodes to a cluster, you should run the
rebalancing script, which takes a random sampling of blocks from
heavily-laden nodes and moves them onto emptier nodes in an attempt to spread
the data as evenly as possible.
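
The general idea can be sketched in Python as below. This is a toy model, not
the balancer itself: it counts blocks per node rather than measuring disk
utilization, and the function name, the max_spread parameter, and the node
map are invented for the example.

```python
import random


def rebalance(nodes, max_spread=1, rng=None):
    """Move randomly chosen blocks from the fullest node to the emptiest.

    `nodes` maps a node name to the set of block ids it holds and is
    modified in place. Returns the (block, source, destination) moves.
    """
    rng = rng or random.Random()
    moves = []
    while True:
        fullest = max(nodes, key=lambda n: len(nodes[n]))
        emptiest = min(nodes, key=lambda n: len(nodes[n]))
        if len(nodes[fullest]) - len(nodes[emptiest]) <= max_spread:
            return moves
        # Never put a second replica of a block on the same node.
        candidates = sorted(nodes[fullest] - nodes[emptiest])
        if not candidates:
            return moves
        block = rng.choice(candidates)
        nodes[fullest].remove(block)
        nodes[emptiest].add(block)
        moves.append((block, fullest, emptiest))


# dn3 is a freshly added, empty node.
cluster = {
    "dn1": {"b1", "b2", "b3", "b4", "b5", "b6"},
    "dn2": {"b1", "b2", "b3"},
    "dn3": set(),
}
moves = rebalance(cluster, rng=random.Random(0))
```

Here three blocks move from dn1 to dn3, leaving three blocks on each node.
Which three blocks move depends on the random sampling.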

- Aaron


On Tue, May 26, 2009 at 3:08 PM, Joe Hammerman <jhammer...@videoegg.com> wrote:

> Hello Hadoop Users list:
>
>                We are running Hadoop version 0.18.2. My team lead has asked
> me to investigate the answer to a particular question regarding Hadoop's
> handling of offline DataNodes - specifically, we would like to know how long
> a node can be offline before it is totally rebuilt when it has been readded
> to the cluster.
>                From what I've been able to determine from the documentation
> it appears to me that the NameNode will simply begin scheduling block
> replication on its remaining cluster members. If the offline node comes back
> online, and it reports all its blocks as being uncorrupted, then the
> NameNode just cleans up the "extra" blocks.
>                In other words, there is no explicit handling based on the
> length of the outage - the behavior of the cluster will depend entirely on
> the outage duration.
>
>                Anyone care to shed some light on this?
>
>                Thanks!
> Regards,
>                Joseph Hammerman
>
