[ 
https://issues.apache.org/jira/browse/HDFS-16689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582125#comment-17582125
 ] 

ASF GitHub Bot commented on HDFS-16689:
---------------------------------------

ZanderXu commented on PR #4744:
URL: https://github.com/apache/hadoop/pull/4744#issuecomment-1221235271

   @xkrogen Thanks.
   > I think you might have also understood what I was saying in my last comment
   
   Yes, I got it. 
   
   > I am thinking we can add a new parameter to 
LogsPurgeable#selectInputStreams() like preferBulkReads.
   
   This is a good idea, I will fix this patch like this.
   
   > In recoverUnclosedStreams, if the finalization fails, it will just ignore 
it and assume that it will be handled later (by Journal#startLogSegment(), 
which will automatically close an old stream when you try to open a new one).
   
   > In recoverUnclosedStreams, if the finalization fails, it will just ignore 
it
   
   Sorry, I didn't notice this. But I think it's crazy. 
   
   > assume that it will be handled later (by Journal#startLogSegment(), which 
will automatically close an old stream when you try to open a new one).
   
   I'm sorry, I just find this comment, but didn't find related code to 
finalize the previous inProgress segment. Can you share the related code? 
Thanks. 
   
   
   > If there is something preventing the new active from communicating with 
the JNs, or something preventing the JNs from finalizing the old segment, then 
the NN will eventually fail to become active regardless.
   
   Yes, I agree. Standby should crash or fail to become active if it cannot 
finalize the old segment. About this case, How about fix it in a new PR? 
   




> Standby NameNode crashes when transitioning to Active with in-progress tailer
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-16689
>                 URL: https://issues.apache.org/jira/browse/HDFS-16689
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Standby NameNode crashes when transitioning to Active with a in-progress 
> tailer. And the error message like blew:
> {code:java}
> Caused by: java.lang.IllegalStateException: Cannot start writing at txid X 
> when there is a stream available for read: ByteStringEditLog[X, Y], 
> ByteStringEditLog[X, 0]
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:344)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.openForWrite(FSEditLogAsync.java:113)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1423)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:2132)
>       ... 36 more
> {code}
> After tracing and found there is a critical bug in 
> *EditlogTailer#catchupDuringFailover()* when 
> *DFS_HA_TAILEDITS_INPROGRESS_KEY* is true. Because *catchupDuringFailover()* 
> try to replay all missed edits from JournalNodes with *onlyDurableTxns=true*. 
> It may cannot replay any edits when they are some abnormal JournalNodes. 
> Reproduce method, suppose:
> - There are 2 namenode, namely NN0 and NN1, and the status of echo namenode 
> is Active, Standby respectively. And there are 3 JournalNodes, namely JN0, 
> JN1 and JN2. 
> - NN0 try to sync 3 edits to JNs with started txid 3, but only successfully 
> synced them to JN1 and JN3. And JN0 is abnormal, such as GC, bad network or 
> restarted.
> - NN1's lastAppliedTxId is 2, and at the moment, we are trying failover 
> active from NN0 to NN1. 
> - NN1 only got two responses from JN0 and JN1 when it try to selecting 
> inputStreams with *fromTxnId=3*  and *onlyDurableTxns=true*, and the count 
> txid of response is 0, 3 respectively. JN2 is abnormal, such as GC,  bad 
> network or restarted.
> - NN1 will cannot replay any Edits with *fromTxnId=3* from JournalNodes 
> because the *maxAllowedTxns* is 0.
> So I think Standby NameNode should *catchupDuringFailover()* with 
> *onlyDurableTxns=false* , so that it can replay all missed edits from 
> JournalNode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to