What have you set dfs.datanode.fsdataset.volume.choosing.policy to (assuming you are on a current version of Hadoop)? Is the policy set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?
*"Life should not be a journey to the grave with the intention of arriving safely in a pretty and well preserved body, but rather to skid in broadside in a cloud of smoke, thoroughly used up, totally worn out, and loudly proclaiming 'Wow! What a Ride!'" - Hunter Thompson*

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Wed, Feb 11, 2015 at 2:23 PM, Chen Song <chen.song...@gmail.com> wrote:

> Hey Ravi,
>
> Here are my settings:
> dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
> dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f
>
> Chen
>
> On Wed, Feb 11, 2015 at 4:36 PM, Ravi Prakash <ravi...@ymail.com> wrote:
>
>> Hi Chen!
>>
>> Are you running the balancer? What are you setting
>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold and
>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
>> to?
>>
>> On Wednesday, February 11, 2015 7:44 AM, Chen Song <chen.song...@gmail.com> wrote:
>>
>> We have a Hadoop cluster consisting of 500 nodes, but the nodes are not
>> uniform in terms of disk space. Half of the racks are newer, with 11 volumes
>> of 1.1T on each node, while the other half have 5 volumes of 900GB on each
>> node.
>>
>> dfs.datanode.fsdataset.volume.choosing.policy is set to
>> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
>>
>> It winds up in a state where half of the nodes are full while the other half
>> are underutilized. I am wondering if there is a known solution for this problem.
>>
>> Thank you for any suggestions.
>>
>> --
>> Chen Song
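For reference, the settings discussed in this thread live in hdfs-site.xml on each datanode. A sketch of what the combined configuration would look like, using the values Chen quoted above (the property names are the standard Hadoop ones from the thread; the values are Chen's and should be tuned per cluster):

```xml
<!-- hdfs-site.xml (datanode side): sketch of the settings discussed in this thread -->

<!-- Use the available-space policy instead of round-robin when picking a volume for a new replica -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>

<!-- Volumes whose free space differs by no more than this many bytes are
     considered balanced (21474836480 bytes = 20 GB, per Chen's message) -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>21474836480</value>
</property>

<!-- Fraction of new block allocations biased toward the volumes with more
     free space when volumes are out of balance -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.85f</value>
</property>
```

Note that this policy only balances across the volumes *within* a single datanode; it does not address the cross-node imbalance Chen describes, where whole nodes with fewer/smaller disks fill up before the larger ones.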