[ https://issues.apache.org/jira/browse/HDFS-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189281#comment-14189281 ]
Zhe Zhang commented on HDFS-7225:
---------------------------------

AFAICT, the NN won't try to delete orphan blocks. I verified this with the following test:
{code}
public void testOrphanBlocks() throws IOException {
  DataNode dn = cluster.getDataNodes().get(0);
  DatanodeRegistration dnReg = dn.getDNRegistrationForBP(bpid);
  StorageBlockReport[] reports =
      new StorageBlockReport[cluster.getStoragesPerDatanode()];

  // Blocks that do not belong to any file known to the NN.
  ArrayList<Block> blocks = new ArrayList<Block>();
  for (int i = 0; i < 10; i++) {
    blocks.add(new Block());
  }

  // Report the same block list from every storage on this DataNode.
  for (int i = 0; i < cluster.getStoragesPerDatanode(); ++i) {
    BlockListAsLongs bll = new BlockListAsLongs(blocks, null);
    FsVolumeSpi v = dn.getFSDataset().getVolumes().get(i);
    DatanodeStorage dns = new DatanodeStorage(v.getStorageID());
    reports[i] = new StorageBlockReport(dns, bll.getBlockListAsLongs());
  }

  cluster.getNameNodeRpc().blockReport(dnReg, bpid, reports);
  LOG.debug("Scheduling to delete "
      + cluster.getNameNode().getNamesystem().getBlockManager()
          .getPendingDeletionBlocksCount()
      + " blocks");
}
{code}
I wonder whether it's the intended behavior for the NN to keep orphan blocks, or whether we should add logic to delete them. [~andrew.wang] Do you have a clue?

> Failed DataNode lookup can crash NameNode with NullPointerException
> -------------------------------------------------------------------
>
>                 Key: HDFS-7225
>                 URL: https://issues.apache.org/jira/browse/HDFS-7225
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7225-v1.patch, HDFS-7225-v2.patch
>
>
> {{BlockManager#invalidateWorkForOneNode}} looks up a DataNode by the
> {{datanodeUuid}} and passes the resultant {{DatanodeDescriptor}} to
> {{InvalidateBlocks#invalidateWork}}. However, if a wrong or outdated
> {{datanodeUuid}} is used, a null pointer will be passed to {{invalidateWork}},
> which will use it for a lookup in a {{TreeMap}}. Since the key type is
> {{DatanodeDescriptor}}, key comparison is based on the IP address. A null key
> will crash the NameNode with an NPE.
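
For readers who want to see the failure mode from the issue summary in isolation, here is a minimal, self-contained sketch. It is not Hadoop code: {{Node}} is a hypothetical stand-in for {{DatanodeDescriptor}} (ordered by IP address, as the summary describes), and {{lookupUnguarded}} / {{lookupGuarded}} are illustrative names only. The null check shown is one possible guard, not necessarily what the attached patches do.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

class NullKeyLookupSketch {
    /** Hypothetical stand-in for DatanodeDescriptor: ordered by IP address. */
    static final class Node implements Comparable<Node> {
        final String ipAddr;

        Node(String ipAddr) {
            this.ipAddr = ipAddr;
        }

        @Override
        public int compareTo(Node other) {
            return ipAddr.compareTo(other.ipAddr);
        }
    }

    /** Passes the key straight to the TreeMap; a null key throws an NPE. */
    static List<Long> lookupUnguarded(TreeMap<Node, List<Long>> pending, Node dn) {
        return pending.get(dn);
    }

    /** Treats a failed node lookup (null) as "nothing to invalidate". */
    static List<Long> lookupGuarded(TreeMap<Node, List<Long>> pending, Node dn) {
        if (dn == null) {
            return Collections.<Long>emptyList();
        }
        List<Long> blocks = pending.get(dn);
        return blocks == null ? Collections.<Long>emptyList() : blocks;
    }

    public static void main(String[] args) {
        // Assumed shape: node -> IDs of blocks pending invalidation.
        TreeMap<Node, List<Long>> pending = new TreeMap<>();
        pending.put(new Node("10.0.0.1"), Arrays.asList(1L, 2L));

        // Models the result of looking up a wrong or outdated datanodeUuid.
        Node missing = null;

        System.out.println("guarded: " + lookupGuarded(pending, missing));
        try {
            lookupUnguarded(pending, missing);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NullPointerException");
        }
    }
}
```

A {{TreeMap}} that uses natural ordering rejects null keys because it has to compare the key against existing entries, so the check has to happen before the map is touched.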