Yes, if the machine where you issue the "dfs -put" command is running a
datanode, the file will be kept entirely on that node: the local datanode is
chosen for the (single) replica of every block. Otherwise, a random datanode
is picked for each block, so the file may end up spread across several nodes.
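
If you want to verify where a file's blocks actually landed, fsck can report
block locations; the path below is just a placeholder:

  bin/hadoop fsck /user/you/bigfile -files -blocks -locations

That prints each block of the file together with the datanodes holding its
replicas, so you can confirm whether everything sits on a single node.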


On Fri, Jun 19, 2009 at 10:41 AM, Rajeev Gupta <graj...@in.ibm.com> wrote:

> "If you're inserting
> into HDFS from a machine running a DataNode, the local datanode will always
> be chosen as one of the three replica targets."
> Does that mean that if the replication factor is 1, the whole file will be
> kept on one node only?
>
> Thanks and regards.
> -Rajeev Gupta
>
> From: Aaron Kimball <aa...@cloudera.com>
> To: core-user@hadoop.apache.org
> Date: 06/19/2009 01:56 AM
> Subject: Re: HDFS is not loading evenly across all nodes.
> Reply-To: core-u...@hadoop.apache.org
>
> Did you run the dfs put commands from the master node?  If you're inserting
> into HDFS from a machine running a DataNode, the local datanode will always
> be chosen as one of the three replica targets. For more balanced loading,
> you should use an off-cluster machine as the point of origin.
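
(As a concrete sketch of the off-cluster approach: copy the cluster's
hadoop-site.xml into the conf/ directory of a Hadoop install on a machine
that is not running a datanode, then issue the put from there; the paths
below are just placeholders:

  bin/hadoop dfs -put /local/path/bigfile /data/bigfile

Since the client is not a datanode, the first replica of each block lands on
a randomly chosen node instead of piling up locally.)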
>
> If you experience uneven block distribution, you should also rebalance your
> cluster periodically by running bin/start-balancer.sh. It works in the
> background, moving blocks from heavily-laden nodes to underutilized ones.
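
(A small usage note: the balancer also accepts a threshold argument, the
disk-usage spread in percentage points that it will try to bring the nodes
within; the value 5 below is just an illustration:

  bin/start-balancer.sh -threshold 5
  bin/stop-balancer.sh

The second command stops a running balancer early if you need the
bandwidth back.)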
>
> - Aaron
>
> On Thu, Jun 18, 2009 at 12:57 PM, openresearch
> <qiming...@openresearchinc.com> wrote:
>
> >
> > Hi all
> >
> > I "dfs put" a large dataset onto a 10-node cluster.
> >
> > When I observe the Hadoop progress (via the web UI on port 50070) and each
> > local file system (via df -k), I notice that my master node is hit 5-10
> > times harder than the others, so its hard drive fills up faster. During
> > last night's load, the node actually crashed when its hard drive was full.
> >
> > To my understanding, the data should be spread across all nodes evenly (in
> > a round-robin fashion, using 64 MB blocks as the unit).
> >
> > Is this the expected behavior of Hadoop? Can anyone suggest a good way to
> > troubleshoot it?
> >
> > Thanks
> >
