After a hardware move with an unfortunate mis-setup rack awareness script our hadoop cluster has a large number of mis-replicated blocks. After about a week things haven't gotten better on their own.
Is there a good way to trigger the name node to fix the mis-replicated blocks? Here's what I'm using for now, but it is very slow: for f in `hadoop fsck / | grep "Replica placement policy is violated" | head -n3000 | awk -F: '{print $1}'`; do hadoop fs -setrep 4 $f hadoop fs -setrep 3 $f done John