Re: How HDFS decides where to put the block

real great.. Tue, 19 Apr 2011 03:01:55 -0700

In this context I would like to ask, can we actually place the data where we
wish instead of allowing Hadoop's intelligence to take care of this?


On Tue, Apr 19, 2011 at 10:52 AM, Kai Voigt <k...@123.org> wrote:

> Hi,
>
> I found
> http://hadoopblog.blogspot.com/2009/09/hdfs-block-replica-placement-in-your.html
> explains the process nicely.
>
> The first replica of each block will be stored on the client machine, if
> it's a datanode itself. Makes sense, as it doesn't require a network
> transfer. Otherwise, a random datanode will be picked for the first replica.
>
> The second replica will be written to a random datanode on a random rack
> other than the rack where the first replica is stored. Here, HDFS's rack
> awareness will be utilized. So HDFS would survive a rack failure.
>
> The third replica will be written to the same rack as the second replica,
> but to a different random datanode in that rack. That keeps the pipeline
> between the second and third replica fast.
>
> Does that make sense to you? Note that this is the current hard-coded
> policy; there are ideas to make that policy customizable (
> https://issues.apache.org/jira/browse/HDFS-385).
>
> Kai
>
> Am 18.04.2011 um 15:46 schrieb Nan Zhu:
>
> > Hi, all
> >
> > I'm confused by the question of how HDFS decides where to put the
> > data blocks.
> >
> > I mean, when the user invokes a command like "./hadoop put ***" and we
> > assume that the file consists of 3 blocks, how does HDFS decide where
> > to put these 3 blocks?
> >
> > Most of the materials don't cover this issue; they just introduce data
> > replicas when talking about blocks in HDFS.
> >
> > Can anyone give me some pointers?
> >
> > Thanks
> >
> > Nan
> >
> > --
> > Nan Zhu
> > School of Software,5501
> > Shanghai Jiao Tong University
> > 800,Dongchuan Road,Shanghai,China
> > E-Mail: zhunans...@gmail.com
>
> --
> Kai Voigt
> k...@123.org
>
>
>
>
>
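
For what it's worth, the default policy Kai describes above can be sketched
as a small simulation. This is only an illustrative Python sketch under
stated assumptions, not Hadoop's actual code: the function name
choose_replica_nodes, the rack and datanode names, and the topology dict are
all made up for this example, and the real NameNode also handles fallbacks
(a single rack, full or dead nodes, and so on) that are ignored here.

```python
import random


def choose_replica_nodes(topology, client=None, rng=random):
    """Pick datanodes for the 3 replicas of one block (toy model).

    topology: dict mapping rack name -> list of datanode names (hypothetical).
    client:   name of the writing machine, or None if it is outside the cluster.
    rng:      random source; pass random.Random(seed) for repeatable runs.
    """
    node_rack = {node: rack for rack, nodes in topology.items() for node in nodes}

    # First replica: the client itself if it is a datanode, else a random node.
    if client in node_rack:
        first = client
    else:
        first = rng.choice(sorted(node_rack))
    first_rack = node_rack[first]

    # Second replica: a random node on a random rack other than the first one.
    # Only racks with at least two nodes qualify in this sketch, so that the
    # third replica can still be placed next to the second (a simplification).
    other_racks = sorted(
        rack for rack, nodes in topology.items()
        if rack != first_rack and len(nodes) >= 2
    )
    if not other_racks:
        raise ValueError("need a second rack with at least two datanodes")
    second_rack = rng.choice(other_racks)
    second = rng.choice(topology[second_rack])

    # Third replica: same rack as the second, but a different datanode.
    third = rng.choice([n for n in topology[second_rack] if n != second])

    return [first, second, third]


if __name__ == "__main__":
    # Hypothetical three-rack cluster; the client runs on datanode dn1.
    topology = {
        "rack1": ["dn1", "dn2"],
        "rack2": ["dn3", "dn4"],
        "rack3": ["dn5", "dn6"],
    }
    rng = random.Random(42)
    for block in range(3):
        print("block", block, "->", choose_replica_nodes(topology, "dn1", rng))
```

Each block is placed independently, so a file of 3 blocks gets its first
replicas on the writing datanode while the remote rack may differ from block
to block.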


-- 
Regards,
R.V.
