[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016330#comment-16016330 ]
Ravi Prakash commented on HDFS-11817:
-------------------------------------

Thanks for your investigation Kihwal! I am seeing something similar on 2.7.3. A block is holding up decommissioning because recovery failed. (The stack trace below is from the time when the cluster was 2.7.2.) DN2 and DN3 are no longer part of the cluster. DN1 is the node held up for decommissioning. I checked that the block and meta files are indeed in the finalized directory.

{code}
2016-09-19 09:02:25,837 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: recoverBlocks FAILED: RecoveringBlock{BP-<someid>:blk_1094097355_20357090; getBlockSize()=0; corrupt=false; offset=-1; locs=[DatanodeInfoWithStorage[<DN1>:50010,null,null], DatanodeInfoWithStorage[<DN2>:50010,null,null], DatanodeInfoWithStorage[<DN3>:50010,null,null]]}
org.apache.hadoop.ipc.RemoteException(java.lang.IllegalStateException): Failed to finalize INodeFile <filename> since blocks[0] is non-complete, where blocks=[blk_1094097355_20552508{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=0, replicas=[ReplicaUC[[DISK]DS-03bed13e-5cdd-4207-91b6-abd83f9eb7d3:NORMAL:<DN1>:50010|RBW]]}].
	at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.assertAllBlocksComplete(INodeFile.java:222)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.toCompleteFile(INodeFile.java:209)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.finalizeINodeFileUnderConstruction(FSNamesystem.java:4218)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4457)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4419)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:837)
	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:291)
	at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28768)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

	at org.apache.hadoop.ipc.Client.call(Client.java:1475)
	at org.apache.hadoop.ipc.Client.call(Client.java:1412)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
	at com.sun.proxy.$Proxy16.commitBlockSynchronization(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolClientSideTranslatorPB.java:312)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:2780)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2642)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.access$400(DataNode.java:243)
	at org.apache.hadoop.hdfs.server.datanode.DataNode$5.run(DataNode.java:2519)
	at java.lang.Thread.run(Thread.java:744)
{code}

I am not sure what purpose failing {{commitBlockSynchronization()}} in this case serves, so I would be agreeable to your proposed solution:

bq. We can have commitBlockSynchronization() check for valid storage ID before updating data structures. Even if no valid storage ID is found, we can't fail the operation

> A faulty node can cause a lease leak and NPE on accessing data
> --------------------------------------------------------------
>
>                 Key: HDFS-11817
>                 URL: https://issues.apache.org/jira/browse/HDFS-11817
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: hdfs-11817_supplement.txt
>
>
> When the namenode performs a lease recovery for a failed write,
> {{commitBlockSynchronization()}} will fail if none of the new targets has
> sent a received-IBR. At this point, the data is inaccessible, as the
> namenode will throw a {{NullPointerException}} upon {{getBlockLocations()}}.
> The lease recovery will be retried in about an hour by the namenode. If the
> nodes are faulty (usually when there is only one new target), they may not
> block report until this point. If this happens, lease recovery throws an
> {{AlreadyBeingCreatedException}}, which causes LeaseManager to simply remove
> the lease without finalizing the inode.
> This results in an inconsistent lease state. The inode stays
> under-construction, but no more lease recovery is attempted. A manual lease
> recovery is also not allowed.
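The proposed guard could be sketched roughly as below. This is a minimal standalone illustration of the idea only, not the actual FSNamesystem code: {{StorageIdGuard}}, {{ReplicaLocation}}, and {{filterValidStorages}} are hypothetical names, and the real change would operate on the namenode's internal block/storage structures. The key property is that an empty result does not fail the operation, so the file can still be finalized and the lease released.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed commitBlockSynchronization() guard:
// skip replicas whose storage ID is unknown to the namenode (e.g. the DN
// never sent a received-IBR) instead of failing the whole operation.
public class StorageIdGuard {

    // Stand-in for a (datanode, storageID) pair reported during recovery.
    static final class ReplicaLocation {
        final String datanode;
        final String storageId; // null when no received-IBR arrived

        ReplicaLocation(String datanode, String storageId) {
            this.datanode = datanode;
            this.storageId = storageId;
        }
    }

    // Keep only replicas with a known storage ID. Crucially, an empty
    // result is allowed: the commit still proceeds so the lease can be
    // finalized; replication repair can restore redundancy later.
    static List<ReplicaLocation> filterValidStorages(List<ReplicaLocation> locs) {
        List<ReplicaLocation> valid = new ArrayList<>();
        for (ReplicaLocation loc : locs) {
            if (loc.storageId != null) {
                valid.add(loc);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        List<ReplicaLocation> reported = Arrays.asList(
            new ReplicaLocation("DN1", "DS-03bed13e"),
            new ReplicaLocation("DN2", null),  // gone from cluster, no IBR
            new ReplicaLocation("DN3", null)); // gone from cluster, no IBR

        List<ReplicaLocation> valid = filterValidStorages(reported);
        System.out.println("valid storages: " + valid.size());
        // Even when valid.isEmpty(), the commit would proceed rather
        // than throw, avoiding the stuck under-construction inode.
    }
}
```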
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org