Aseem,

How are you verifying that blocks are not being replicated? Have you run fsck?

    bin/hadoop fsck /

I'd be surprised if replication really wasn't happening. Can you run fsck
and pay attention to "Under-replicated blocks" and "Mis-replicated blocks?"
In fact, can you just copy-paste the output of fsck?

Alex

On Thu, Apr 9, 2009 at 11:23 PM, Puri, Aseem <aseem.p...@honeywell.com> wrote:

> Hi
>
> I also tried the command $ bin/hadoop balancer. But still the same
> problem.
>
> Aseem
>
> -----Original Message-----
> From: Puri, Aseem [mailto:aseem.p...@honeywell.com]
> Sent: Friday, April 10, 2009 11:18 AM
> To: core-user@hadoop.apache.org
> Subject: RE: More Replication on dfs
>
> Hi Alex,
>
> Thanks for sharing your knowledge. Till now I have three machines and I
> have to check the behavior of Hadoop so I want replication factor should
> be 2. I started my Hadoop server with replication factor 3. After that I
> upload 3 files to implement word count program. But as my all files are
> stored on one machine and replicated to other datanodes also, so my map
> reduce program takes input from one Datanode only. I want my files to be
> on different data node so to check functionality of map reduce properly.
>
> Also before starting my Hadoop server again with replication factor 2 I
> formatted all Datanodes and deleted all old data manually.
>
> Please suggest what I should do now.
>
> Regards,
> Aseem Puri
>
> -----Original Message-----
> From: Mithila Nagendra [mailto:mnage...@asu.edu]
> Sent: Friday, April 10, 2009 10:56 AM
> To: core-user@hadoop.apache.org
> Subject: Re: More Replication on dfs
>
> To add to the question, how does one decide what is the optimal
> replication factor for a cluster. For instance what would be the
> appropriate replication factor for a cluster consisting of 5 nodes.
>
> Mithila
>
> On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard <a...@cloudera.com> wrote:
>
> > Did you load any files when replication was set to 3? If so, you'll
> > have to rebalance:
> >
> > <http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balancer>
> > <http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalancer>
> >
> > Note that most people run HDFS with a replication factor of 3. There
> > have been cases when clusters running with a replication of 2
> > discovered new bugs, because replication is so often set to 3. That
> > said, if you can do it, it's probably advisable to run with a
> > replication factor of 3 instead of 2.
> >
> > Alex
> >
> > On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem <aseem.p...@honeywell.com> wrote:
> >
> > > Hi
> > >
> > > I am a new Hadoop user. I have a small cluster with 3 Datanodes. In
> > > hadoop-site.xml values of dfs.replication property is 2 but then
> > > also it is replicating data on 3 machines.
> > >
> > > Please tell why is it happening?
> > >
> > > Regards,
> > >
> > > Aseem Puri
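
On reading the fsck output requested at the top of this thread: the report can be long, so it may help to pull out only the replication-related summary lines. Below is a minimal Python sketch, assuming the summary prints one "label: value" pair per line. "Under-replicated blocks" and "Mis-replicated blocks" are the labels named in the thread; "Default replication factor" and "Average block replication" are assumed labels, the function name summarize_fsck is made up for illustration, and the sample text is hypothetical rather than output from the cluster discussed here.

```python
import re

# Summary labels to look for. The first two are named in the thread;
# the last two are assumptions about what the fsck summary also prints.
LABELS = (
    "Under-replicated blocks",
    "Mis-replicated blocks",
    "Default replication factor",
    "Average block replication",
)


def summarize_fsck(text, labels=LABELS):
    """Return {label: value} for each summary label found in fsck output."""
    found = {}
    for line in text.splitlines():
        for label in labels:
            # Match "<label>: <value>", allowing any whitespace around it.
            match = re.match(r"\s*" + re.escape(label) + r":\s*(.*\S)", line)
            if match:
                found[label] = match.group(1)
    return found


# Hypothetical sample, NOT real output from the cluster in this thread.
SAMPLE = """\
 Default replication factor:    2
 Average block replication:     3.0
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
"""

for label, value in summarize_fsck(SAMPLE).items():
    print(label + ": " + value)
```

To use it on a real report, one option is to save the output of bin/hadoop fsck / to a file and pass the file's contents to summarize_fsck. If the average block replication turns out higher than the configured dfs.replication, one possible explanation is that the files were written with the higher factor, since HDFS records the replication factor per file when the file is created.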