RE: dfs datanode heartbeats and getBlockwork requests

Hairong Kuang Tue, 04 Apr 2006 17:07:10 -0700

I think it is better to implement the start-up delay at the namenode. The
key, though, is that the namenode should be able to tell whether it is in a
steady state, both at start-up time and at runtime after a network
disruption. It should not instruct datanodes to replicate or delete any
blocks before it has reached a steady state.
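
To make the idea concrete, here is a minimal sketch of such a steady-state
check. It is not the actual namenode code: the class name, the notion of an
expected node count, the reporting fraction, and the quiet period are all
assumptions made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of a steady-state gate; not the real namenode code. */
class SteadyStateGate {
    private final int expectedNodes;        // assumed to be known, e.g. from configuration
    private final double requiredFraction;  // assumed threshold, e.g. 0.9 of expected nodes
    private final long quietPeriodMillis;   // assumed settle time once the threshold is met
    private final Set<String> reported = new HashSet<>();
    private long thresholdReachedAt = -1;   // -1 means the threshold has not been reached

    SteadyStateGate(int expectedNodes, double requiredFraction, long quietPeriodMillis) {
        this.expectedNodes = expectedNodes;
        this.requiredFraction = requiredFraction;
        this.quietPeriodMillis = quietPeriodMillis;
    }

    /** Record that a datanode has delivered its complete block report. */
    void blockReportReceived(String nodeId, long nowMillis) {
        reported.add(nodeId);
        if (thresholdReachedAt < 0
                && reported.size() >= requiredFraction * expectedNodes) {
            thresholdReachedAt = nowMillis;
        }
    }

    /** Call at start-up or when a network disruption is detected. */
    void reset() {
        reported.clear();
        thresholdReachedAt = -1;
    }

    /** True once enough nodes have reported and the quiet period has passed. */
    boolean isSteady(long nowMillis) {
        return thresholdReachedAt >= 0
                && nowMillis - thresholdReachedAt >= quietPeriodMillis;
    }
}
```

The namenode would consult isSteady() before handing out any replicate or
delete work, and call reset() whenever it decides it has left steady state.
How a network disruption is detected is left open here.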

Hairong 

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 04, 2006 9:58 AM
To: [email protected]
Subject: Re: dfs datanode heartbeats and getBlockwork requests

Eric Baldeschwieler wrote:
> If we moved to a scheme where the name node was just given a small 
> number of blocks with each heartbeat, there would be no reason to not 
> start reporting blocks immediately, would there?

There would still be a small storm of unneeded replications on startup.
Say it takes a minute at startup for all data nodes to report their
complete block lists to the name node.  If heartbeats are every 3 seconds,
then all but the last data node to report in would be handed 20 small lists
of blocks to start replicating.  The switches could be saturated doing a
lot of unneeded transfers, which would slow startup.  Then, for the next
minute after startup, the nodes would be told to delete blocks that are now
over-replicated.  We'd like startup to be as fast and painless as possible.
Waiting a bit before checking whether blocks are over- or under-replicated
seems a good way to achieve that.
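
The arithmetic behind the figure of 20 can be sketched as follows. This
assumes a node is handed one small work list per heartbeat from the moment
it has reported until the last node reports; the class and method names are
hypothetical, and the 60-second window and 3-second interval are the
numbers used above.

```java
/** Hypothetical back-of-the-envelope estimate; names are illustrative only. */
class StartupStormEstimate {
    /**
     * Number of work lists a node could be handed if it finished reporting
     * at reportedAtSeconds and the last node reports at windowSeconds,
     * with one list per heartbeat.
     */
    static long listsHandedTo(long reportedAtSeconds,
                              long windowSeconds,
                              long heartbeatIntervalSeconds) {
        long remaining = Math.max(0, windowSeconds - reportedAtSeconds);
        return remaining / heartbeatIntervalSeconds;
    }
}
```

Under these assumptions, listsHandedTo(0, 60, 3) gives 20 for a node that
reports right away, 10 for one that reports halfway through, and 0 for the
last node to report, so 20 is the upper bound per node.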

Doug
