[ https://issues.apache.org/jira/browse/HDFS-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499350#comment-14499350 ]
Vinayakumar B commented on HDFS-8113:
-------------------------------------

bq. As I said earlier, I don't understand the rationale for keeping blocks with no associated INode out of the BlocksMap. It complicates the block report since it requires us to check whether each block has an associated inode or not before adding it to the BlocksMap. But if that change seems too ambitious for this JIRA, we can deal with that later.

As far as I can see from the code (trunk code), it's not kept there for long. In the case of deletion, after being dis-associated from the file under the main writeLock, blocks stay in the blocksMap only until a second writeLock is acquired in the same RPC, at which point they are removed from the blocksMap. This is just to avoid holding the writeLock for a long time when deleting a big directory. But I don't see any case where a block is kept in the blocksMap for a long time without an associated file.

> NullPointerException in BlockInfoContiguous causes block report failure
> -----------------------------------------------------------------------
>
>                 Key: HDFS-8113
>                 URL: https://issues.apache.org/jira/browse/HDFS-8113
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Chengbing Liu
>            Assignee: Chengbing Liu
>         Attachments: HDFS-8113.patch
>
>
> The following copy constructor can throw NullPointerException if {{bc}} is null.
> {code}
> protected BlockInfoContiguous(BlockInfoContiguous from) {
>   this(from, from.bc.getBlockReplication());
>   this.bc = from.bc;
> }
> {code}
> We have observed that some DataNodes keep failing to do block reports with the
> NameNode. The stacktrace is as follows. Though we are not using the latest
> version, the problem still exists.
> {quote}
> 2015-03-08 19:28:13,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: RemoteException in offerService
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.<init>(BlockInfo.java:80)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockToMarkCorrupt.<init>(BlockManager.java:1696)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.checkReplicaCorrupt(BlockManager.java:2185)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReportedBlock(BlockManager.java:2047)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiff(BlockManager.java:1950)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1823)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1750)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:1069)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReport(DatanodeProtocolServerSideTranslatorPB.java:152)
>         at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26382)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1623)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
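To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern (simplified hypothetical stand-in classes, not the actual BlockInfoContiguous/BlockCollection sources): the copy path dereferences {{from.bc}} before any null check, so copying a block whose file has been dis-associated throws a NullPointerException, matching the stack trace above. A guarded variant with a caller-supplied default replication is shown alongside; whether such a default is the right fix for HDFS is a design question for the actual patch.

```java
// Simplified stand-in types to illustrate the NPE; hypothetical, not HDFS code.
class BlockCollection {
    short getBlockReplication() { return (short) 3; }
}

class BlockInfo {
    BlockCollection bc; // may be null for a block with no associated INode

    BlockInfo(BlockCollection bc) { this.bc = bc; }

    // Mirrors the buggy copy constructor: dereferences from.bc unconditionally.
    static BlockInfo copyUnsafe(BlockInfo from) {
        short replication = from.bc.getBlockReplication(); // NPE when bc == null
        return new BlockInfo(from.bc);
    }

    // Guarded variant: fall back to a caller-supplied default replication
    // when the source block currently has no BlockCollection.
    static BlockInfo copySafe(BlockInfo from, short defaultReplication) {
        short replication = (from.bc != null)
                ? from.bc.getBlockReplication()
                : defaultReplication;
        return new BlockInfo(from.bc);
    }
}

public class NpeDemo {
    public static void main(String[] args) {
        // A block dis-associated from its file: bc is null.
        BlockInfo orphan = new BlockInfo(null);
        try {
            BlockInfo.copyUnsafe(orphan);
        } catch (NullPointerException e) {
            System.out.println("NPE reproduced");
        }
        BlockInfo copy = BlockInfo.copySafe(orphan, (short) 3);
        System.out.println("safe copy, bc == null: " + (copy.bc == null));
    }
}
```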