Hi,
Have you tried:
$ hdfs balancer
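
If it helps, here is a rough sketch of how a balancing run could be wrapped in a small script. The threshold and bandwidth values are only illustrative assumptions, not recommendations for your cluster, and the run helper and DRY_RUN variable are made up for this example. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Illustrative values only (assumptions for this sketch, tune for your cluster).
THRESHOLD=10                      # percent: allowed deviation from average utilization
BANDWIDTH=$((50 * 1024 * 1024))   # bytes/second each DataNode may use for balancing

# Hypothetical helper: prints each command, executes it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

run hdfs dfsadmin -report                             # per-DataNode usage before balancing
run hdfs dfsadmin -setBalancerBandwidth "$BANDWIDTH"  # cap balancing traffic
run hdfs balancer -threshold "$THRESHOLD"             # move blocks until within threshold
```

As far as I know, dfs.datanode.fsdataset.volume.choosing.policy only decides which disk inside a single DataNode receives a new block, so it should not by itself move data from the 6 full nodes to the 2 new ones.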
On 02/06/2015 09:34 PM, Manoj Venkatesh wrote:
Dear Hadoop experts,
I have a Hadoop cluster of 8 nodes: 6 were added during cluster
creation and 2 more were added later to increase disk and
CPU capacity. What I see is that processing is shared among all the
nodes, but storage is reaching capacity on the original 6 nodes
while the newly added machines still have a relatively large amount of
unoccupied storage.
I was wondering if there is an automated way, or any way at all, of
redistributing data so that all the nodes are equally utilized. I have
checked the configuration parameter
*dfs.datanode.fsdataset.volume.choosing.policy*, which has the options 'Round
Robin' and 'Available Space'. Are there any other configurations that
need to be reviewed?
Thanks,
Manoj
--
Regards,
Ahmed Ossama