[ https://issues.apache.org/jira/browse/HDFS-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-10989:
-------------------------------
    Affects Version/s: 2.4.1

> Cannot get last block length after namenode failover
> ----------------------------------------------------
>
>                 Key: HDFS-10989
>                 URL: https://issues.apache.org/jira/browse/HDFS-10989
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.1
>            Reporter: zhouyingchao
>
> On a 2.4 cluster, reading a file failed because the length of its last block 
> could not be obtained.  The fsck output for the file at the time of the 
> failure was:
> /user/XXXXXXXXX 483600487 bytes, 2 block(s), OPENFORWRITE:  MISSING 1 blocks of total size 215165031 B
> 0. BP-219149063-10.108.84.25-1446859315800:blk_2102504098_1035525341 len=268435456 repl=3 [10.112.17.43:11402, 10.118.22.46:11402, 10.118.22.49:11402]
> 1. BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036219054{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, replicas=[ReplicaUnderConstruction[[DISK]DS-60be75ad-e4a7-4b1e-b3aa-327c85331d42:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-184a1ce9-655a-4e67-b0cc-29ab9984bd0a:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-6d037ac8-4bcc-4cdc-a803-55b1817e0200:NORMAL|RBW]]} len=215165031 MISSING!  Recorded locations [10.114.10.14:11402, 10.118.29.3:11402, 10.118.22.42:11402]
> On those three data nodes, the logs showed IOExceptions related to the block 
> as well as pipeline re-creation events.
> We determined that a namenode failover had occurred before the issue 
> appeared, and that the previously active namenode had received several 
> updatePipeline calls:
> 2016-09-27,15:04:36,437 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036137092, newGenerationStamp=1036170430, newLength=2624000, newNodes=[10.118.22.42:11402, 10.118.22.49:11402, 10.118.24.3:11402], clientName=DFSClient_NONMAPREDUCE_-442153643_1)
> 2016-09-27,15:04:36,438 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036137092) successfully to BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036170430
> 2016-09-27,15:10:10,596 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036170430, newGenerationStamp=1036219054, newLength=17138265, newNodes=[10.118.22.49:11402, 10.118.24.3:11402, 10.114.6.45:11402], clientName=DFSClient_NONMAPREDUCE_-442153643_1)
> 2016-09-27,15:10:10,601 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036170430) successfully to BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036219054
> However, these new data nodes did not show up in the fsck output. It appears 
> that when a data node recovers a pipeline (PIPELINE_SETUP_STREAMING_RECOVERY), 
> the new data nodes do not call notifyNamenodeReceivingBlock for the 
> transferred block. 
> From code review, the issue also exists in more recent branches.
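
The comparison described above can be reproduced mechanically. The following is only a rough sketch in Python, not part of HDFS or of the original report: it takes the two updatePipeline calls and the fsck "Recorded locations" exactly as quoted in the description, picks the pipeline with the highest generation stamp, and lists the nodes that fsck does not know about. The function and variable names (parse_update_pipeline, latest_pipeline, FSCK_RECORDED and so on) are made up for illustration.

```python
import re

# updatePipeline calls copied from the namenode log quoted in the description.
UPDATE_PIPELINE_LOG = [
    "updatePipeline(block=BP-219149063-10.108.84.25-1446859315800:"
    "blk_2103114087_1036137092, newGenerationStamp=1036170430, "
    "newLength=2624000, newNodes=[10.118.22.42:11402, 10.118.22.49:11402, "
    "10.118.24.3:11402], clientName=DFSClient_NONMAPREDUCE_-442153643_1)",
    "updatePipeline(block=BP-219149063-10.108.84.25-1446859315800:"
    "blk_2103114087_1036170430, newGenerationStamp=1036219054, "
    "newLength=17138265, newNodes=[10.118.22.49:11402, 10.118.24.3:11402, "
    "10.114.6.45:11402], clientName=DFSClient_NONMAPREDUCE_-442153643_1)",
]

# "Recorded locations" for blk_2103114087 in the fsck output quoted above.
FSCK_RECORDED = ["10.114.10.14:11402", "10.118.29.3:11402", "10.118.22.42:11402"]

# Hypothetical pattern written for the log format shown in this report only.
PATTERN = re.compile(
    r"newGenerationStamp=(\d+), newLength=(\d+), newNodes=\[([^\]]*)\]"
)


def parse_update_pipeline(line):
    """Return (generation_stamp, length, nodes) or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    stamp, length, nodes = match.groups()
    return int(stamp), int(length), [node.strip() for node in nodes.split(",")]


def latest_pipeline(lines):
    """Return the parsed updatePipeline call with the highest generation stamp."""
    parsed = [p for p in (parse_update_pipeline(line) for line in lines) if p]
    return max(parsed, key=lambda p: p[0])


stamp, length, nodes = latest_pipeline(UPDATE_PIPELINE_LOG)
# Nodes in the most recent pipeline that fsck has no recorded location for.
unknown_to_fsck = sorted(set(nodes) - set(FSCK_RECORDED))

print("latest generation stamp:", stamp)
print("pipeline nodes missing from fsck:", unknown_to_fsck)
```

With the values from this report, the latest generation stamp is the one fsck shows for the block (1036219054), yet none of the three nodes in that pipeline appear among the recorded locations.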



