Re: How HDFS decides where to put the block

Harsh J Mon, 18 Apr 2011 07:17:14 -0700

Hello,

On Mon, Apr 18, 2011 at 7:16 PM, Nan Zhu <zhunans...@gmail.com> wrote:
> Hi, all
>
> I'm confused by a question that "how does the HDFS decide where to put the
> data blocks "
>
> I mean that the user invokes some commands like "./hadoop put ***", we
> assume that this file consists of 3 blocks, but how HDFS decides where
> these 3 blocks to be put?
>
> Most of the materials don't cover this issue, but just introduce data
> replication when talking about blocks in HDFS,
>


I'm guessing you're looking for the BlockPlacementPolicy
implementations [1] and how they are applied in HDFS.

Basically, the NameNode uses this policy to choose the set of DNs for
every new-block request from a client, and the DFSClient gets back the
list of chosen nodes. It writes the data to the first node in that
list, and each DN forwards the data to the next one, so the replicas
are written through a pipeline while the block itself is being written.

[1] - BlockPlacementPolicyDefault is the default implementation in
use. Its source is available in the o.a.h.hdfs.server.namenode package.
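
To make the idea concrete, here is a rough Java sketch of the documented
default behaviour for 3 replicas: the first replica goes on the writer's
own node if it is a DN (otherwise a random node), the second on a node
in a different rack, and the third on another node in the same rack as
the second. This is a toy model, not the Hadoop source. PlacementSketch,
Node, chooseTargets and pick are names made up for illustration, and the
real policy also weighs things like free space and node load.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Toy model only: none of these names exist in Hadoop.
final class PlacementSketch {
    static final class Node {
        final String name;
        final String rack;
        Node(String name, String rack) { this.name = name; this.rack = rack; }
        @Override public String toString() { return name + "@" + rack; }
    }

    // Pick a random node that is not already chosen and satisfies 'ok'.
    static Node pick(List<Node> cluster, List<Node> chosen,
                     Predicate<Node> ok, Random rnd) {
        List<Node> candidates = new ArrayList<>();
        for (Node n : cluster) {
            if (!chosen.contains(n) && ok.test(n)) candidates.add(n);
        }
        if (candidates.isEmpty()) return null;
        return candidates.get(rnd.nextInt(candidates.size()));
    }

    // Returns up to 3 targets, in pipeline order.
    static List<Node> chooseTargets(Node writer, List<Node> cluster, Random rnd) {
        List<Node> chosen = new ArrayList<>();

        // Replica 1: the writer's node if it is a DN, else any node.
        Node first = (writer != null && cluster.contains(writer))
                ? writer : pick(cluster, chosen, n -> true, rnd);
        if (first == null) return chosen;
        chosen.add(first);

        // Replica 2: a node on a different rack (any node if only one rack).
        Node second = pick(cluster, chosen, n -> !n.rack.equals(first.rack), rnd);
        if (second == null) second = pick(cluster, chosen, n -> true, rnd);
        if (second == null) return chosen;
        chosen.add(second);

        // Replica 3: another node on the same rack as replica 2.
        final Node s = second;
        Node third = pick(cluster, chosen, n -> n.rack.equals(s.rack), rnd);
        if (third == null) third = pick(cluster, chosen, n -> true, rnd);
        if (third != null) chosen.add(third);
        return chosen;
    }

    public static void main(String[] args) {
        List<Node> cluster = Arrays.asList(
                new Node("dn1", "rackA"), new Node("dn2", "rackA"),
                new Node("dn3", "rackB"), new Node("dn4", "rackB"));
        // The writer runs on dn1, so dn1 heads the pipeline.
        System.out.println(chooseTargets(cluster.get(0), cluster, new Random()));
    }
}
```

With the 4-node cluster in main, the list always starts with dn1 and is
followed by dn3 and dn4 in either order, since rackB is the only remote
rack.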

-- 
Harsh J
