[ 
https://issues.apache.org/jira/browse/HDFS-13953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637221#comment-16637221
 ] 

Brahma Reddy Battula commented on HDFS-13953:
---------------------------------------------

is it related to HDFS-10714? NPE is because fault storages and it's addressed 
in HDFS-12299.?

> Failure of last datanode in the pipeline results in block recovery failure 
> and subsequent NPE during fsck
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13953
>                 URL: https://issues.apache.org/jira/browse/HDFS-13953
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Hrishikesh Gadre
>            Priority: Major
>
> A user reported following scenario,
>  * HBase region server created WAL and attempted to write
>  * As part of the pipeline write, following events happened,
>  ** The last data node in the pipeline failed. 
>  ** The region server could not identify this last data node as the root 
> cause of write failure and instead reported NN the first data node in the 
> pipeline as the cause of failure.
>  ** NN created a new write pipeline by replacing the good data node and 
> retaining the faulty data node.
>  ** This process continued for three iterations until NN encountered an NPE.
>  * Now the fsck on the /bhase directory also failing due to NPE in NN
> Following stack traces was found in region server logs
> {noformat}
> WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeStaleReplicas(BlockManager.java:3238)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.updateLastBlock(BlockManager.java:3633)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:7374)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:7339)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updatePipeline(NameNodeRpcServer.java:777)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.updatePipeline(AuthorizationProviderProxyClientProtocol.java:654){noformat}
>  
> AND
>  
> {noformat}
> WARN org.apache.hadoop.hbase.util.FSHDFSUtils: attempt=0 on 
> file=hdfs://nameservice1/hbase/genie/WALs/ABC,60020,1525325654855-splitting/abc%2C60020%2C1525325654855.null0.1536002440010
>  after 6ms
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
>       at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction$ReplicaUnderConstruction.isAlive(BlockInfoUnderConstruction.java:121)
>       at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction.initializeBlockRecovery(BlockInfoUnderConstruction.java:288)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:4846)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3252)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:3196)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:630)
>       at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.recoverLease(AuthorizationProviderProxyClientProtocol.java:372)
>       at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:681)
>       at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073){noformat}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to