Hey, a while ago we added a new disk (volume) to every datanode in our cluster. We configured the new disks in "dfs.data.dir" in hdfs-site.xml, both on the jobtracker and on each machine. This went successfully for all of the machines except one, where the new disk is not recognized by Hadoop.
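For reference, this is roughly what the relevant property looks like in our hdfs-site.xml (the paths here are illustrative, not our actual mount points):

```xml
<!-- hdfs-site.xml: dfs.data.dir takes a comma-separated list of directories,
     and the datanode stripes blocks across all of them.
     Paths below are illustrative, not our real mount points. -->
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/hdfs/data,/disk2/hdfs/data</value>
</property>
```

Each listed directory must exist and be writable by the user running the datanode, and the datanode has to be restarted before it picks up a newly added directory.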
We cannot figure out what is wrong with it. We know the new disk is not recognized because "http://namenode:50070/" shows a smaller capacity for that machine. The mapred and hdfs directories on that drive exist, but their structure is not identical to the directory structure on the other disks: on the problematic drive there is no "local" directory under "mapred", and no "name" or "namesecondary" directories under "hdfs".

This problem was tolerable until now, when the rest of the disks filled up: the logs started showing errors such as "No space left on device" and "DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/". Some Hadoop jobs fail with the same errors, and the datanode and tasktracker on that machine crash frequently.

How do we add this disk properly? Thanks in advance.

Technical info: hadoop-0.20, CentOS; each machine is a datanode and tasktracker (a separate machine is the jobtracker and namenode).

-- Oded
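P.S. Since the "taskTracker/jobcache" error comes from the tasktracker's local storage, I assume the mapred side is configured via "mapred.local.dir" in mapred-site.xml. A sketch of what I believe that should look like (again, the paths are made up, not our real layout):

```xml
<!-- mapred-site.xml: mapred.local.dir is where the tasktracker keeps its
     local working data, including taskTracker/jobcache. Like dfs.data.dir,
     it takes a comma-separated list, one writable directory per disk.
     Paths below are illustrative. -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>
```

If the new disk's directory is missing from this list on the broken machine (or is not writable by the mapred user), that would explain the missing "local" directory and the jobcache errors once the other disks filled up.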
