Unless something has changed recently, it won't automatically relocate the blocks. When I did something similar, I had a script that walked through the whole set of misreplicated files, increased the replication factor, and then dropped it back down. That triggered relocation of the blocks to meet the rack placement requirements.
Doing this worked, but it took about a week to run over the few hundred thousand files that were misreplicated. Here's the script I used (with all the usual caveats: it assumes a replication factor of 3, has no real error handling, etc.):

for f in `hadoop fsck / | grep "Replica placement policy is violated" | head -n80000 | awk -F: '{print $1}'`; do
  hadoop fs -setrep -w 4 $f
  hadoop fs -setrep 3 $f
done

On Tue, Mar 20, 2012 at 16:20, Patai Sangbutsarakum <silvianhad...@gmail.com> wrote:
> Hadoopers!!
>
> I am going to restart the hadoop cluster in order to enable rack-awareness
> for the first time.
> We're currently running 0.20.203 with 500TB of data on 250+ nodes
> (without rack-awareness).
>
> I am thinking, and afraid, that when I start dfs (with rack-awareness
> enabled), HDFS will sit in safemode for hours,
> busy relocating blocks to comply with rack-awareness.
>
> Is there any knob I can dial to prevent that?
>
> Thanks in advance
> Patai
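To make the file-selection step concrete, here's a small standalone demo of the grep/awk pipeline from the script above, run against a hypothetical snippet of fsck output. The sample lines, paths, and block IDs are made up for illustration; real `hadoop fsck /` output may differ in detail, so treat this as a sketch of the filtering logic rather than a guaranteed match for your Hadoop version.

```shell
# Hypothetical sample of `hadoop fsck /` output (paths and block IDs are
# invented; real fsck output may vary by version).
sample='/user/foo/part-00000:  Replica placement policy is violated for blk_1
/user/bar/data.txt:  Under replicated blk_2. Target Replicas is 3 but found 2 replica(s).
/user/baz/log.txt:  Replica placement policy is violated for blk_3'

# Same selection step as the script above: keep only the lines reporting a
# placement-policy violation, then take the path before the first colon.
printf '%s\n' "$sample" \
  | grep "Replica placement policy is violated" \
  | awk -F: '{print $1}'
# prints the two violating paths: /user/foo/part-00000 and /user/baz/log.txt
```

The under-replicated file is deliberately skipped here: the setrep bounce only needs to touch files whose replicas are placed badly, not files that are merely missing replicas (the namenode re-replicates those on its own).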