mapred.local.dir is under /hadoop/mapred and dfs.data.dir is under /hadoop/dfs. The logs show:
2011-04-11 14:34:10,987 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2011-04-11 14:34:10,987 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2011-04-11 14:34:10,988 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2011-04-11 14:34:10,988 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(had41.xxx:50010, storageID=DS-922075132-69.170.130.173-50010-1297386088418, infoPort=50075, ipcPort=50020)
2011-04-11 14:34:10,988 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
2011-04-11 14:34:11,021 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(69.170.130.173:50010, storageID=DS-922075132-69.170.130.173-50010-1297386088418, infoPort=50075, ipcPort=50020) In DataNode.run, data = FSDataset{dirpath='/hadoop/dfs/current'}
2011-04-11 14:34:11,021 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
2011-04-11 14:34:15,545 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 110651 blocks got processed in 4493 msecs
2011-04-11 14:34:15,545 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
2011-04-11 14:34:15,692 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block blk_-9213431071914219029_3395500 file /hadoop/dfs/current/subdir7/subdir26/blk_-9213431071914219029

This is then followed by similar "Deleting block" entries for all the blocks in the dfs/current directory. It seems to me Hadoop just wanted to invalidate all the blocks on that box by deleting all of them.

On Tue, Apr 12, 2011 at 9:17 AM, Harsh J <ha...@cloudera.com> wrote:
> Apart from ensuring all that others here have said, are your
> mapred.local.dir and dfs.data.dir pointing to the same directory by
> any chance? If that so happens, the tasktracker could potentially wipe
> out the datanode directories when restarted.
>
> On Tue, Apr 12, 2011 at 8:16 PM, felix gao <gre1...@gmail.com> wrote:
> > What reason/condition would cause a datanode’s blocks to be removed? Our
> > cluster had one of its datanodes crash because of bad RAM. After the
> > system was upgraded and the datanode/tasktracker brought online the next
> > day, we noticed the amount of space utilized was minimal and the cluster
> > was rebalancing blocks to the datanode. It would seem the prior blocks
> > were removed. Was this because the datanode was declared dead? What are
> > the criteria for the namenode to decide (assuming it is the namenode)
> > when a datanode should remove prior blocks?
>
> --
> Harsh J
>
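For reference, the separation Harsh suggests looks like this in the 0.20-era config files. This is only a sketch using the /hadoop/dfs and /hadoop/mapred paths mentioned at the top of this message; adjust the paths to your own layout, and the point is simply that the two values must not overlap:

```xml
<!-- hdfs-site.xml: where the DataNode keeps HDFS block files -->
<property>
  <name>dfs.data.dir</name>
  <value>/hadoop/dfs</value>
</property>

<!-- mapred-site.xml: TaskTracker scratch space; wiped on restart,
     so it must NOT point at (or inside) dfs.data.dir -->
<property>
  <name>mapred.local.dir</name>
  <value>/hadoop/mapred</value>
</property>
```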