Thank you Ravi and Ted.

I ran the hadoop balancer without the default threshold. It has been running
for the last 8 hours!
How long should it take, given the following DFS stats?

3140 files and directories, 10295 blocks = 13435 total.
Heap Size is 17.88 MB / 963 MB (1%)
Capacity      : 3.93 TB
DFS Remaining : 2.11 TB
DFS Used      : 1.31 TB
DFS Used%     : 33.44 %
Live Nodes <http://megh01:50070/dfshealth.jsp#LiveNodes> : 10
Dead Nodes <http://megh01:50070/dfshealth.jsp#DeadNodes> : 0

If I interrupt it now, what will happen? I have to run a job now, and I think
balancing and running a job may not work well together, as one will slow down
the other.
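
For a rough sense of scale, here is a back-of-the-envelope sketch in Python.
It assumes the rule from the HDFS user guide that a data node counts as
balanced when its utilization is within the threshold (in percent) of the
cluster-wide utilization. The per-node usage figures are made up for
illustration (only the mix of 2 small and 8 large nodes comes from this
thread), the function names are hypothetical, and the transfer rate assumes
the per-datanode cap set by dfs.balance.bandwidthPerSec, which I believe
defaults to 1 MB/s in 0.20. The number of nodes sending at once is also a
guess.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def bytes_to_move(nodes, threshold_pct=10.0):
    """Lower bound on the bytes the balancer must move.

    nodes: list of (capacity_bytes, used_bytes) tuples, one per data node.
    Returns (cluster_utilization_pct, bytes_to_move).
    """
    total_cap = sum(cap for cap, _ in nodes)
    total_used = sum(used for _, used in nodes)
    avg = 100.0 * total_used / total_cap

    over = 0.0   # bytes above (avg + threshold) on over-utilized nodes
    under = 0.0  # bytes missing below (avg - threshold) on under-utilized nodes
    for cap, used in nodes:
        util = 100.0 * used / cap
        if util > avg + threshold_pct:
            over += (util - (avg + threshold_pct)) / 100.0 * cap
        elif util < avg - threshold_pct:
            under += ((avg - threshold_pct) - util) / 100.0 * cap

    # Both sides have to end up inside the band, so the larger one dominates.
    return avg, max(over, under)


def estimate_hours(num_bytes, bandwidth_per_sec=1 * MB, concurrent_sources=1):
    """Crude duration estimate: bytes / (per-node cap * nodes sending at once)."""
    return num_bytes / (bandwidth_per_sec * concurrent_sources) / 3600.0


if __name__ == "__main__":
    # HYPOTHETICAL per-node usage, not taken from the real cluster.
    nodes = (
        [(220 * GB, 198 * GB)] * 2    # two small, nearly full nodes
        + [(450 * GB, 157 * GB)] * 6  # six older large nodes
        + [(450 * GB, 0)] * 2         # two new, empty nodes
    )
    avg, to_move = bytes_to_move(nodes, threshold_pct=10.0)
    hours = estimate_hours(to_move, concurrent_sources=2)
    print("cluster utilization: %.1f%%" % avg)
    print("data to move: %.1f GB" % (to_move / GB))
    print("rough duration: %.1f hours" % hours)
```

With numbers in this range the estimate comes out well beyond a few hours,
so a long-running balancer would not be surprising. Plugging in the real
per-node figures from the name node's Live Nodes page would give a better
estimate.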

Thanks,
Prashant.

On Fri, Aug 7, 2009 at 11:28 PM, Ted Dunning <[email protected]> wrote:

> Make sure you rebalance soon after adding the new node.  Otherwise, you will
> have an age bias in file distribution.  This can, in some applications, lead
> to some strange effects.  For example, if you have log files that you delete
> when they get too old, disk space will be freed non-uniformly.  This
> shouldn't much affect performance, but it can lead to a need to rebalance
> again (and again) later.  Normal file churn combined with occasional
> rebalancing should eventually fix this, but it is nicer not to.
>
> On Fri, Aug 7, 2009 at 10:48 AM, Ravi Phulari <[email protected]> wrote:
>
> > Use Rebalancer
> >
> > http://hadoop.apache.org/common/docs/r0.20.0/hdfs_user_guide.html#Rebalancer
> > -
> > Ravi
> >
> > On 8/7/09 10:38 AM, "prashant ullegaddi" <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > We had a cluster of 9 machines with one name node and 8 data nodes (2 had
> > > 220GB hard disk space, the rest had 450GB).
> > > Most of the space on the first machines with 250GB disk space was consumed.
> > > Now we added two new machines, each with 450GB hard disk space, as data
> > > nodes.
> > >
> > > Is there any way to redistribute files on HDFS so that there will be
> > > considerable free space left on the first two machines, without
> > > downloading the files to one local machine and then uploading them back
> > > onto HDFS?
> > >
> > > ~
> > > Prashant,
> > > SIEL,
> > > IIIT-Hyderabad.
> > >
> >
> >
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>
