Thank you, Ravi and Ted. I ran the Hadoop balancer without the default threshold. It has been running for the last 8 hours! How long should it take, given the following DFS stats:
3140 files and directories, 10295 blocks = 13435 total.
Heap Size is 17.88 MB / 963 MB (1%)

Capacity      : 3.93 TB
DFS Remaining : 2.11 TB
DFS Used      : 1.31 TB
DFS Used%     : 33.44 %
Live Nodes <http://megh01:50070/dfshealth.jsp#LiveNodes> : 10
Dead Nodes <http://megh01:50070/dfshealth.jsp#DeadNodes> : 0

If I interrupt it now, what will happen? I have to run a job now. I don't think balancing and running a job should happen together, as one will slow down the other.

Thanks,
Prashant.

On Fri, Aug 7, 2009 at 11:28 PM, Ted Dunning <[email protected]> wrote:

> Make sure you rebalance soon after adding the new node. Otherwise, you will
> have an age bias in file distribution. This can, in some applications, lead
> to some strange effects. For example, if you have log files that you delete
> when they get too old, disk space will be freed non-uniformly. This
> shouldn't much affect performance, but it can lead to a need to rebalance
> again (and again) later. Normal file churn combined with occasional
> rebalancing should eventually fix this, but it is nicer not to.
>
> On Fri, Aug 7, 2009 at 10:48 AM, Ravi Phulari <[email protected]> wrote:
>
> > Use Rebalancer
> >
> > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_user_guide.html#Rebalancer
> > -
> > Ravi
> >
> > On 8/7/09 10:38 AM, "prashant ullegaddi" <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > We had a cluster of 9 machines with one name node, and 8 data nodes (2 had
> > > 220GB hard disk space, rest had 450GB).
> > > Most of the space on first machines with 250GB disk space was consumed.
> > > Now we added two new machines each with 450GB hard disk space as data nodes.
> > >
> > > Is there any way to redistribute files on HDFS so that there will
> > > considerable free space left on first two machines without
> > > downloading the files to one local machine and then uploading it back on
> > > HDFS?
> > >
> > > ~
> > > Prashant,
> > > SIEL,
> > > IIIT-Hyderabad.
>
> --
> Ted Dunning, CTO
> DeepDyve
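
P.S. For a rough sense of scale on the "how long" question, here is a back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions, not a description of how the balancer actually schedules moves. The assumptions: each datanode is throttled to about 1 MB/s of balancing traffic (which I believe is the default for dfs.balance.bandwidthPerSec in 0.20, but please check your own config), the two new 450GB nodes are the only receivers, and balancing stops once they are within a 10% threshold of the cluster average of 33.44%. The threshold actually used in this run is not stated above, so 10% is a placeholder. The function name and variables are made up for illustration.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def estimate_balancer_hours(bytes_to_move, receiving_nodes,
                            bandwidth_bytes_per_sec=1 * MB):
    """Rough lower bound on balancing time, in hours.

    Assumes every receiving node accepts data at the full per-node
    balancing bandwidth for the whole run (an optimistic simplification).
    """
    if receiving_nodes <= 0 or bandwidth_bytes_per_sec <= 0:
        raise ValueError("receiving_nodes and bandwidth must be positive")
    seconds = bytes_to_move / (receiving_nodes * bandwidth_bytes_per_sec)
    return seconds / 3600.0


# Figures taken from the thread: two new, initially empty 450GB datanodes,
# and a cluster-wide DFS Used% of 33.44%.
new_nodes = 2
new_node_capacity = 450 * GB
cluster_used_fraction = 0.3344

# ASSUMPTION: a 10% threshold; the value used in this run is not stated.
threshold = 0.10

# Each new node must be filled up to at least (average - threshold).
target_fraction = max(cluster_used_fraction - threshold, 0.0)
bytes_to_move = new_nodes * new_node_capacity * target_fraction

hours = estimate_balancer_hours(bytes_to_move, new_nodes)
print("Estimated balancing time: about %.0f hours" % hours)
```

Under these assumptions the estimate comes out on the order of a day or more, so 8 hours without finishing would not be surprising. Raising the bandwidth value or loosening the threshold would shorten it proportionally in this simple model.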
