Yes, that's it. Thanks,
Nan

On Mon, Apr 18, 2011 at 10:16 PM, Harsh J <ha...@cloudera.com> wrote:

> Hello,
>
> On Mon, Apr 18, 2011 at 7:16 PM, Nan Zhu <zhunans...@gmail.com> wrote:
> > Hi, all
> >
> > I'm confused by a question: how does HDFS decide where to put the
> > data blocks?
> >
> > I mean, when the user invokes a command like "./hadoop put ***", and we
> > assume that this file consists of 3 blocks, how does HDFS decide where
> > these 3 blocks are put?
> >
> > Most of the materials don't cover this issue; they only introduce data
> > replicas when talking about blocks in HDFS.
> >
>
> I'm guessing you're looking for the BlockPlacementPolicy
> implementations [1] and how they are applied in HDFS.
>
> Basically, the NameNode chooses the set of DNs for every new-block
> request (from a client) using this policy, and the DFSClient gets a
> list of all the nodes. It goes on to pick the first one among them to
> write the data to. The replication happens async, later.
>
> [1] - BlockPlacementPolicyDefault is the default implementation in
> use. Its source is available in the o.a.h.hdfs.server.namenode package.
>
> --
> Harsh J
>

--
Nan Zhu
School of Software,5501
Shanghai Jiao Tong University
800,Dongchuan Road,Shanghai,China
E-Mail: zhunans...@gmail.com
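
P.S. For anyone finding this thread in the archives: below is a minimal
Python sketch of the kind of rack-aware choice the NameNode makes for each
new block. It is not the Hadoop code. It assumes the commonly documented
default behaviour (first replica on the writer's node if the writer is a
DataNode, second on a node in a different rack, third on another node in
the same rack as the second), which is not stated in the thread above. The
function name choose_targets, the rack paths, and the node names are all
hypothetical.

```python
import random


def choose_targets(nodes_by_rack, writer=None, replication=3, rng=None):
    """Return an ordered list of datanodes for one new block.

    nodes_by_rack maps a rack name to its node names (assumed unique).
    This is an illustrative sketch, not BlockPlacementPolicyDefault itself.
    """
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in nodes_by_rack.items() for node in nodes}
    chosen = []

    def pick(preferred):
        # Try the preferred candidates first, then fall back to any unused node.
        for pool in (preferred, rack_of):
            free = sorted(n for n in pool if n not in chosen)
            if free:
                chosen.append(rng.choice(free))
                return True
        return False

    # Replica 1: the writer's own node when it is a datanode, else a random node.
    if replication >= 1:
        pick([writer] if writer in rack_of else [])

    # Replica 2: a node on a different rack than the first replica, if one exists.
    if replication >= 2 and chosen:
        first_rack = rack_of[chosen[0]]
        pick([n for n in rack_of if rack_of[n] != first_rack])

    # Replica 3: another node on the same rack as the second replica.
    if replication >= 3 and len(chosen) == 2:
        second_rack = rack_of[chosen[1]]
        pick([n for n in rack_of if rack_of[n] == second_rack])

    # Any further replicas: random unused nodes, as long as some remain.
    while len(chosen) < min(replication, len(rack_of)):
        pick([])

    return chosen


# Hypothetical two-rack cluster; the client runs on dn1.
cluster = {"/rack1": ["dn1", "dn2"], "/rack2": ["dn3", "dn4"]}
targets = choose_targets(cluster, writer="dn1")
print(targets)  # first entry is dn1; the other two come from /rack2
```

The client receives the returned list in order and, as Harsh describes,
writes to the first node in it. The sketch is called once per block, so a
3-block file can end up with a different set of nodes for each block.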