[ https://issues.apache.org/jira/browse/HDFS-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499350#comment-14499350 ]
Vinayakumar B commented on HDFS-8113:
-------------------------------------

bq. As I said earlier, I don't understand the rationale for keeping blocks with no associated INode out of the BlocksMap. It complicates the block report since it requires us to check whether each block has an associated inode or not before adding it to the BlocksMap. But if that change seems too ambitious for this JIRA, we can deal with that later.

As far as I can see from the code (trunk code), it's not kept there for long. In the case of deletion, after being dis-associated from the file under the main writeLock, blocks stay in the blocksMap only until a second writeLock is acquired in the same RPC, at which point they are removed from the blocksMap. This is just to avoid holding the writeLock for a long time when deleting a big directory. But I don't see any case where a block is kept in the blocksMap for a long time without an associated file.

> NullPointerException in BlockInfoContiguous causes block report failure
> -----------------------------------------------------------------------
>
>                 Key: HDFS-8113
>                 URL: https://issues.apache.org/jira/browse/HDFS-8113
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Chengbing Liu
>            Assignee: Chengbing Liu
>         Attachments: HDFS-8113.patch
>
>
> The following copy constructor can throw NullPointerException if {{bc}} is null.
> {code}
> protected BlockInfoContiguous(BlockInfoContiguous from) {
>   this(from, from.bc.getBlockReplication());
>   this.bc = from.bc;
> }
> {code}
> We have observed that some DataNodes keep failing to do block reports with the
> NameNode. The stacktrace is as follows. Though we are not using the latest
> version, the problem still exists.
> {quote}
> 2015-03-08 19:28:13,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: RemoteException in offerService
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.<init>(BlockInfo.java:80)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockToMarkCorrupt.<init>(BlockManager.java:1696)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.checkReplicaCorrupt(BlockManager.java:2185)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReportedBlock(BlockManager.java:2047)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiff(BlockManager.java:1950)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1823)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1750)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:1069)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReport(DatanodeProtocolServerSideTranslatorPB.java:152)
>         at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26382)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1623)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
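To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern (simplified hypothetical stand-in classes, not the actual BlockInfoContiguous/BlockCollection sources): the copy path dereferences {{from.bc}} before any null check, so copying a block whose file has been dis-associated throws a NullPointerException, matching the stack trace above. A guarded variant with a caller-supplied default replication is shown alongside; whether such a default is the right fix for HDFS is a design question for the actual patch.

```java
// Simplified stand-in types to illustrate the NPE; hypothetical, not HDFS code.
class BlockCollection {
    short getBlockReplication() { return (short) 3; }
}

class BlockInfo {
    BlockCollection bc; // may be null for a block with no associated INode

    BlockInfo(BlockCollection bc) { this.bc = bc; }

    // Mirrors the buggy copy constructor: dereferences from.bc unconditionally.
    static BlockInfo copyUnsafe(BlockInfo from) {
        short replication = from.bc.getBlockReplication(); // NPE when bc == null
        return new BlockInfo(from.bc);
    }

    // Guarded variant: fall back to a caller-supplied default replication
    // when the source block currently has no BlockCollection.
    static BlockInfo copySafe(BlockInfo from, short defaultReplication) {
        short replication = (from.bc != null)
                ? from.bc.getBlockReplication()
                : defaultReplication;
        return new BlockInfo(from.bc);
    }
}

public class NpeDemo {
    public static void main(String[] args) {
        // A block dis-associated from its file: bc is null.
        BlockInfo orphan = new BlockInfo(null);
        try {
            BlockInfo.copyUnsafe(orphan);
        } catch (NullPointerException e) {
            System.out.println("NPE reproduced");
        }
        BlockInfo copy = BlockInfo.copySafe(orphan, (short) 3);
        System.out.println("safe copy, bc == null: " + (copy.bc == null));
    }
}
```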