Hi Tom,

Thanks for the trick :).

I tried setting the replication to 3 in hadoop-default.xml, but then the namenode log file in /var/log/hadoop started filling up with the messages marked in bold:

2009-06-24 14:39:06,338 INFO org.apache.hadoop.dfs.StateChange: STATE* SafeModeInfo.leave: Safe mode is OFF.
2009-06-24 14:39:06,339 INFO org.apache.hadoop.dfs.StateChange: STATE* Network topology has 1 racks and 3 datanodes
2009-06-24 14:39:06,339 INFO org.apache.hadoop.dfs.StateChange: STATE* UnderReplicatedBlocks has 48545 blocks
2009-06-24 14:39:07,655 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.pendingTransfer: ask 10.20.11.45:50010 to replicate blk_-4602580985572290582 to datanode(s) 10.20.11.44:50010
2009-06-24 14:39:07,655 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.pendingTransfer: ask 10.20.11.45:50010 to replicate blk_-4602036196619511999 to datanode(s) 10.20.11.44:50010
2009-06-24 14:39:07,666 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.pendingTransfer: ask 10.20.11.43:50010 to replicate blk_-4601863051065326105 to datanode(s) 10.20.11.44:50010
2009-06-24 14:39:07,666 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.pendingTransfer: ask 10.20.11.43:50010 to replicate blk_-4601770656364938220 to datanode(s) 10.20.11.44:50010
2009-06-24 14:39:10,829 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.20.11.44:50010 is added to blk_-4601770656364938220
2009-06-24 14:39:10,832 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.pendingTransfer: ask 10.20.11.45:50010 to replicate blk_-4601706607039808418 to datanode(s) 10.20.11.44:50010
2009-06-24 14:39:10,833 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.pendingTransfer: ask 10.20.11.45:50010 to replicate blk_-4601652202073012439 to datanode(s) 10.20.11.44:50010
2009-06-24 14:39:10,834 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.pendingTransfer: ask 10.20.11.43:50010 to replicate blk_-4601470720696217621 to datanode(s) 10.20.11.44:50010
2009-06-24 14:39:10,834 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.pendingTransfer: ask 10.20.11.43:50010 to replicate blk_-4601267705629076611 to datanode(s) 10.20.11.44:50010
*2009-06-24 14:39:13,899 WARN org.apache.hadoop.fs.FSNamesystem: Not able to place enough replicas, still in need of 1
2009-06-24 14:39:13,899 WARN org.apache.hadoop.fs.FSNamesystem: Not able to place enough replicas, still in need of 1
2009-06-24 14:39:13,899 WARN org.apache.hadoop.fs.FSNamesystem: Not able to place enough replicas, still in need of 1
2009-06-24 14:39:13,900 WARN org.apache.hadoop.fs.FSNamesystem: Not able to place enough replicas, still in need of 1
2009-06-24 14:39:13,900 WARN org.apache.hadoop.fs.FSNamesystem: Not able to place enough replicas, still in need of 1
2009-06-24 14:39:13,900 WARN org.apache.hadoop.fs.FSNamesystem: Not able to place enough replicas, still in need of 1
2009-06-24 14:39:13,901 WARN org.apache.hadoop.fs.FSNamesystem: Not able to place enough replicas, still in need of 1
2009-06-24 14:39:13,901 WARN org.apache.hadoop.fs.FSNamesystem: Not able to place enough replicas, still in need of 1*

It is a very small cluster with limited disk space. The disk was filling up because of all these extra messages being written to the log file. Eventually the file system would fill up and Hadoop would hang. This happened when I set dfs.replication = 3 in hadoop-default.xml and restarted the cluster.

Is there a way I can turn off these WARN messages that are filling up the file system? I can run the command on the command line as you advised with replication set to 3 and then, once done, set it back to 2.
Currently dfs.replication is set to 2 in hadoop-default.xml.
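One thing I am considering (just a sketch, and I have not verified the logger name applies on this version) is raising the log threshold for the FSNamesystem logger in conf/log4j.properties, since the WARN lines above come from org.apache.hadoop.fs.FSNamesystem:

```properties
# Sketch: suppress the WARN-level replica-placement messages by only
# logging ERROR and above for the FSNamesystem logger. The logger name
# is taken from the log excerpt above; whether this exact property works
# on this Hadoop version is an assumption.
log4j.logger.org.apache.hadoop.fs.FSNamesystem=ERROR
```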

Thanks,
Usman

Hi Usman,

Before the rebalancer was introduced one trick people used was to
increase the replication on all the files in the system, wait for
re-replication to complete, then decrease the replication to the
original level. You can do this using hadoop fs -setrep.
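For example, something along these lines (a sketch; check `hadoop fs -help` for the exact flags on your version, as -R and -w availability in 0.15.x is an assumption):

```
# Raise replication to 3 on everything under /, recursively,
# waiting (-w) for re-replication to complete:
hadoop fs -setrep -R -w 3 /

# Then drop it back to the original level:
hadoop fs -setrep -R 2 /
```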

Cheers,
Tom

On Thu, Jun 25, 2009 at 10:33 AM, Usman Waheed <usm...@opera.com> wrote:
Hi,

One of our test clusters is running Hadoop 0.15.3 with the replication level set
to 2. The datanodes are not balanced at all.

Datanode_1: 52%
Datanode_2: 82%
Datanode_3: 30%

0.15.3 does not have the rebalancer capability; we are planning to upgrade, but
not right now.

If I take Datanode_1 out of the cluster (decommission it for some time), will
Hadoop rebalance so that Datanode_2 and Datanode_3 even out at around 56%?
Then I can re-introduce Datanode_1 into the cluster.
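(The 56% is just the mean of the two remaining nodes' current usage, assuming equal disk sizes on all datanodes; it does not account for any blocks that would have to be re-replicated off Datanode_1.)

```python
# Where the 56% figure comes from: the mean of Datanode_2 and
# Datanode_3's current disk usage, assuming equal disk capacity.
datanode_2 = 82  # percent used
datanode_3 = 30  # percent used
balanced = (datanode_2 + datanode_3) / 2
print(balanced)  # 56.0
```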

Comments/Suggestions please?

Thanks,
Usman
