[ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karthik Ranganathan updated HDFS-142:
-------------------------------------

    Attachment: HDFS-142-multiple-blocks-datanode-exception.patch

Hey guys,

If we had 2 or more files in the blocks-being-written directory, the datanode would not be able to start up, because the code tries to add the BlockAndFile objects to a TreeSet internally, but BlockAndFile does not implement Comparable. The first addition goes through as the TreeSet does not have to compare anything. The datanode dies on restart with the following exception:

2010-03-23 15:50:23,152 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.lang.ClassCastException: org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockAndFile cannot be cast to java.lang.Comparable
        at java.util.TreeMap.put(TreeMap.java:542)
        at java.util.TreeSet.add(TreeSet.java:238)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSDir.getBlockAndFileInfo(FSDataset.java:247)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.recoverBlocksBeingWritten(FSDataset.java:539)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.<init>(FSDataset.java:381)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.<init>(FSDataset.java:895)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:305)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:219)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1337)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1292)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1300)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1422)

This patch makes the BlockAndFile class implement Comparable, and adds a unit test (thanks Nick) that verifies this case.

> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HDFS-142
>                 URL: https://issues.apache.org/jira/browse/HDFS-142
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>         Attachments: appendQuestions.txt, deleteTmp.patch, deleteTmp2.patch, deleteTmp5_20.txt, deleteTmp5_20.txt, deleteTmp_0.18.patch, handleTmp1.patch, hdfs-142-minidfs-fix-from-409.txt, HDFS-142-multiple-blocks-datanode-exception.patch, HDFS-142_20.patch
>
>
> Before 0.18, when the Datanode restarts, it deletes files under the data-dir/tmp directory since these files are not valid anymore. But in 0.18 it moves these files to the normal directory, incorrectly making them valid blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in pre-0.18 format (i.e. no generation stamp), delete them.
> Currently the effect of this bug is that these files end up failing block verification and eventually get deleted, but they cause incorrect over-replication at the namenode before that.
> Also it looks like our policy regarding treating files under tmp needs to be defined better. Right now there are probably one or two more bugs with it. Dhruba, please file them if you remember.
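The attached patch itself is not inlined in this message. As a rough, self-contained sketch of the failure mode and of the kind of fix described above (making the wrapper class Comparable so TreeSet.add works past the first element), the following illustration uses hypothetical field names rather than the actual FSDataset internals:

    import java.io.File;
    import java.util.TreeSet;

    // Standalone illustration only -- not the actual FSDataset$BlockAndFile code.
    // A wrapper holding a block id and the block's file on disk. Without
    // "implements Comparable", the first TreeSet.add() succeeds (nothing to
    // compare against) and the second throws the ClassCastException shown above.
    public class BlockAndFileSketch implements Comparable<BlockAndFileSketch> {
      private final long blockId; // stand-in for the wrapped Block
      private final File file;    // stand-in for the block file on disk

      public BlockAndFileSketch(long blockId, File file) {
        this.blockId = blockId;
        this.file = file;
      }

      // Order entries by block id; the real patch presumably reuses the ordering
      // already defined on Block, but that detail is not shown in this message.
      public int compareTo(BlockAndFileSketch other) {
        return blockId < other.blockId ? -1 : (blockId == other.blockId ? 0 : 1);
      }

      public static void main(String[] args) {
        TreeSet<BlockAndFileSketch> set = new TreeSet<BlockAndFileSketch>();
        set.add(new BlockAndFileSketch(1L, new File("blk_1")));
        // Without Comparable, this second add is where the restart would die.
        set.add(new BlockAndFileSketch(2L, new File("blk_2")));
        System.out.println("collected " + set.size() + " blocks being written");
      }
    }

With the wrapper class comparable, recoverBlocksBeingWritten can collect any number of blocks being written without tripping over the TreeSet's ordering requirement, which is the case the new unit test covers.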