Look at hdfs balancer.

Artem Ervits

On Feb 6, 2015 5:54 PM, "Manoj Venkatesh" <manove...@gmail.com> wrote:
> Dear Hadoop experts,
>
> I have a Hadoop cluster of 8 nodes: 6 were added during cluster creation,
> and 2 additional nodes were added later to increase disk and CPU capacity.
> What I see is that processing is shared among all the nodes, but storage
> is reaching capacity on the original 6 nodes while the newly added
> machines still have a relatively large amount of storage unoccupied.
>
> I was wondering if there is an automated way, or any way at all, of
> redistributing data so that all the nodes are equally utilized. I have
> checked the configuration parameter
> *dfs.datanode.fsdataset.volume.choosing.policy*, which has the options
> 'Round Robin' and 'Available Space'. Are there any other configurations
> that need to be reviewed?
>
> Thanks,
> Manoj
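
To expand on the balancer suggestion: it is started from the command line with `hdfs balancer`, optionally with `-threshold <percent>` (the default is 10). It moves blocks between DataNodes until each node's utilization is within that many percentage points of the overall cluster utilization. As far as I know, dfs.datanode.fsdataset.volume.choosing.policy only controls how a single DataNode picks among its own disks, so it will not move data from the old nodes to the new ones.

The Python sketch below is not the balancer's code. It only illustrates the threshold rule, and the node names, capacities, and usage figures are made up to resemble the 6 + 2 node cluster described in the question.

```python
from dataclasses import dataclass


@dataclass
class DataNode:
    # Hypothetical node description; all values are illustrative only.
    name: str
    capacity_gb: float
    used_gb: float

    @property
    def utilization(self) -> float:
        """Percentage of this node's capacity that is in use."""
        return 100.0 * self.used_gb / self.capacity_gb


def cluster_utilization(nodes):
    """Overall utilization: total used space over total capacity, in percent."""
    total_used = sum(n.used_gb for n in nodes)
    total_capacity = sum(n.capacity_gb for n in nodes)
    return 100.0 * total_used / total_capacity


def classify(nodes, threshold=10.0):
    """Label each node relative to the cluster average.

    A node counts as balanced when its utilization differs from the
    cluster utilization by no more than `threshold` percentage points.
    """
    average = cluster_utilization(nodes)
    labels = {}
    for node in nodes:
        diff = node.utilization - average
        if diff > threshold:
            labels[node.name] = "over-utilized"
        elif diff < -threshold:
            labels[node.name] = "under-utilized"
        else:
            labels[node.name] = "balanced"
    return labels


# Assumed example: 6 original nodes nearly full, 2 newer nodes nearly empty.
nodes = [DataNode(f"old{i}", 1000.0, 900.0) for i in range(1, 7)]
nodes += [DataNode(f"new{i}", 1000.0, 100.0) for i in range(1, 3)]

if __name__ == "__main__":
    print(f"cluster utilization: {cluster_utilization(nodes):.1f}%")
    for name, label in classify(nodes).items():
        print(name, label)
```

With these assumed numbers the original nodes fall above the threshold band and the new nodes fall below it, which is the situation the balancer is meant to correct. A smaller threshold gives a more even result but makes the balancer run longer.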