[ 
https://issues.apache.org/jira/browse/HDFS-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189281#comment-14189281
 ] 

Zhe Zhang commented on HDFS-7225:
---------------------------------

AFAICT, the NN won't try to delete orphan blocks. I verified this with the 
following test:

{code}
  @Test
  public void testOrphanBlocks() throws IOException {
    DataNode dn = cluster.getDataNodes().get(0);
    DatanodeRegistration dnReg = dn.getDNRegistrationForBP(bpid);
    // One block report per storage volume on this DataNode.
    StorageBlockReport[] reports =
        new StorageBlockReport[cluster.getStoragesPerDatanode()];

    // Build ten blocks that the NameNode does not know about (orphans).
    ArrayList<Block> blocks = new ArrayList<Block>();

    for (int i = 0; i < 10; i++) {
      blocks.add(new Block());
    }
    // Report the same orphan blocks from every storage.
    for (int i = 0; i < cluster.getStoragesPerDatanode(); ++i) {
      BlockListAsLongs bll = new BlockListAsLongs(blocks, null);
      FsVolumeSpi v = dn.getFSDataset().getVolumes().get(i);
      DatanodeStorage dns = new DatanodeStorage(v.getStorageID());
      reports[i] = new StorageBlockReport(dns, bll.getBlockListAsLongs());
    }
    // Send the block report straight to the NameNode.
    cluster.getNameNodeRpc().blockReport(dnReg, bpid, reports);
    // Log how many blocks the NN has scheduled for deletion after the report.
    LOG.debug("Scheduling to delete " +
        cluster.getNameNode().getNamesystem().getBlockManager().
            getPendingDeletionBlocksCount() + " blocks");
  }
{code}

I wonder whether the NN is intended to keep orphan blocks, or whether we 
should add logic to delete them. [~andrew.wang], do you know?

> Failed DataNode lookup can crash NameNode with NullPointerException
> -------------------------------------------------------------------
>
>                 Key: HDFS-7225
>                 URL: https://issues.apache.org/jira/browse/HDFS-7225
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7225-v1.patch, HDFS-7225-v2.patch
>
>
> {{BlockManager#invalidateWorkForOneNode}} looks up a DataNode by its 
> {{datanodeUuid}} and passes the resulting {{DatanodeDescriptor}} to 
> {{InvalidateBlocks#invalidateWork}}. However, if a wrong or outdated 
> {{datanodeUuid}} is used, the lookup returns null, and that null is passed 
> to {{invalidateWork}}, which uses it as the key for a lookup in a 
> {{TreeMap}}. Since the key type is {{DatanodeDescriptor}}, key comparison 
> is based on the IP address. A null key will crash the NameNode with an NPE.
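
The failure mode in the description can be reproduced without Hadoop. The sketch below is only an illustration under stated assumptions: {{Node}} is a hypothetical stand-in for {{DatanodeDescriptor}} that orders itself by IP address, and {{unguardedLookup}} / {{guardedLookup}} are made-up helper names, not methods of {{InvalidateBlocks}}. It shows that a naturally ordered {{TreeMap}} rejects a null key with an NPE, and that a null check before the lookup avoids it.

```java
import java.util.TreeMap;

class NullKeyLookupDemo {
    // Hypothetical stand-in for DatanodeDescriptor: ordered by IP address.
    static final class Node implements Comparable<Node> {
        final String ipAddr;

        Node(String ipAddr) {
            this.ipAddr = ipAddr;
        }

        @Override
        public int compareTo(Node other) {
            return ipAddr.compareTo(other.ipAddr);
        }
    }

    // Passes the key straight to the map. A null key throws
    // NullPointerException because the map uses natural ordering.
    static Integer unguardedLookup(TreeMap<Node, Integer> map, Node key) {
        return map.get(key);
    }

    // Treats a failed node lookup (null key) as "nothing to do".
    static Integer guardedLookup(TreeMap<Node, Integer> map, Node key) {
        if (key == null) {
            return null;
        }
        return map.get(key);
    }

    public static void main(String[] args) {
        TreeMap<Node, Integer> pending = new TreeMap<>();
        pending.put(new Node("10.0.0.1"), 3);

        try {
            unguardedLookup(pending, null);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NPE");
        }
        System.out.println("guarded lookup returned " + guardedLookup(pending, null));
    }
}
```

Whether the real fix should skip the node, log a warning, or remove the stale entry is a separate design question; the guard here only demonstrates that the NPE is avoidable at the call site.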



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
