Apart from checking everything others here have suggested, are your mapred.local.dir and dfs.data.dir pointing to the same directory by any chance? If they are, the tasktracker could potentially wipe out the datanode directories when it is restarted.
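
If it helps, here is a rough way to check for that overlap. It is only a sketch: read_dirs and overlapping are names I made up, and it assumes the two properties are set in ordinary Hadoop-style XML config files (on many installs hdfs-site.xml and mapred-site.xml, but your layout may differ). It also flags directories nested inside one another, which is my own cautious assumption rather than something confirmed as harmful.

```python
import os
import xml.etree.ElementTree as ET


def read_dirs(conf_xml, prop):
    """Return the normalized directory list stored under `prop`.

    `conf_xml` is the text of a Hadoop-style configuration file;
    the value is expected to be a comma-separated list of paths.
    """
    root = ET.fromstring(conf_xml)
    for p in root.iter("property"):
        if p.findtext("name", "").strip() == prop:
            value = p.findtext("value", "")
            return [os.path.normpath(d.strip())
                    for d in value.split(",") if d.strip()]
    return []


def overlapping(dirs_a, dirs_b):
    """Return (a, b) pairs where the paths are equal or one contains the other."""
    hits = []
    for a in dirs_a:
        for b in dirs_b:
            try:
                common = os.path.commonpath([a, b])
            except ValueError:
                # Mixed absolute/relative paths cannot be compared; skip them.
                continue
            if common in (a, b):
                hits.append((a, b))
    return hits
```

Read the text of each config file, pass it to read_dirs with "dfs.data.dir" and "mapred.local.dir" respectively, and hand the two lists to overlapping. An empty result means no shared or nested directories were found.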

On Tue, Apr 12, 2011 at 8:16 PM, felix gao <gre1...@gmail.com> wrote:
> What reason/condition would cause a datanode’s blocks to be removed? Our
> cluster had a one of its datanodes crash because of bad RAM. After the
> system was upgraded and the datanode/tasktracker brought online the next day
> we noticed the amount of space utilized was minimal and the cluster was
> rebalancing blocks to the datanode. It would seem the prior blocks were
> removed. Was this because the datanode was declared dead? What is the
> criteria for a namenode to decide (Assuming its the namenode) when a
> datanode should remove prior blocks?

--
Harsh J