Hi, I also tried the command $ bin/hadoop balancer, but I still have the same problem.
Aseem

-----Original Message-----
From: Puri, Aseem [mailto:aseem.p...@honeywell.com]
Sent: Friday, April 10, 2009 11:18 AM
To: core-user@hadoop.apache.org
Subject: RE: More Replication on dfs

Hi Alex,

Thanks for sharing your knowledge. So far I have three machines, and since I need to check the behavior of Hadoop, I want the replication factor to be 2. I first started my Hadoop server with replication factor 3 and then uploaded 3 files to run the word count program. But because all my files are stored on one machine and also replicated to the other datanodes, my MapReduce program takes its input from one datanode only. I want my files to be on different datanodes so that I can check the functionality of MapReduce properly.

Also, before starting my Hadoop server again with replication factor 2, I formatted all datanodes and deleted all old data manually.

Please suggest what I should do now.

Regards,
Aseem Puri

-----Original Message-----
From: Mithila Nagendra [mailto:mnage...@asu.edu]
Sent: Friday, April 10, 2009 10:56 AM
To: core-user@hadoop.apache.org
Subject: Re: More Replication on dfs

To add to the question, how does one decide what the optimal replication factor for a cluster is? For instance, what would be the appropriate replication factor for a cluster consisting of 5 nodes?

Mithila

On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard <a...@cloudera.com> wrote:

> Did you load any files when replication was set to 3? If so, you'll have to
> rebalance:
>
> <http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balancer>
> <http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalancer>
>
> Note that most people run HDFS with a replication factor of 3. There have
> been cases when clusters running with a replication of 2 discovered new
> bugs, because replication is so often set to 3. That said, if you can do
> it, it's probably advisable to run with a replication factor of 3 instead
> of 2.
>
> Alex
>
> On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem <aseem.p...@honeywell.com> wrote:
>
> > Hi
> >
> > I am a new Hadoop user. I have a small cluster with 3 Datanodes. In
> > hadoop-site.xml the value of the dfs.replication property is 2, but it
> > is still replicating data on 3 machines.
> >
> > Please tell me why this is happening.
> >
> > Regards,
> > Aseem Puri
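
For reference on the original question, here is a minimal sketch of how the setting under discussion would typically look in hadoop-site.xml. The property name and the value 2 come from the thread. The surrounding <configuration> element is assumed from the standard Hadoop configuration layout and is not shown anywhere in these messages.

```xml
<!-- hadoop-site.xml: sketch only; layout assumed, not copied from the poster's file -->
<configuration>
  <property>
    <!-- Default number of replicas requested for newly created files -->
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```

Note that dfs.replication is applied when a file is created, and it is read from the configuration of the client doing the upload. Files already written with a factor of 3 keep three replicas, and the balancer only moves blocks between datanodes without changing the replica count. Assuming the 0.19 command set, something like "bin/hadoop fs -setrep -R 2 /" should lower the factor on existing files (the path / is only an illustration), and "bin/hadoop fsck / -files -blocks -locations" should show which datanodes hold each block.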