[ 
https://issues.apache.org/jira/browse/HDFS-14674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896844#comment-16896844
 ] 

Yuxuan Wang commented on HDFS-14674:
------------------------------------

I find the same problem after HDFS-12978. The fix looks good.

> Got an unexpected txid when tail editlog
> ----------------------------------------
>
>                 Key: HDFS-14674
>                 URL: https://issues.apache.org/jira/browse/HDFS-14674
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: wangzhaohui
>            Assignee: wangzhaohui
>            Priority: Major
>         Attachments: HDFS-14674-001.patch, image-2019-07-26-11-34-23-405.png
>
>
> Add the following configuration
> !image-2019-07-26-11-34-23-405.png!
> error:
> {code:java}
> //代码占位符
> [2019-07-17T11:50:21.048+08:00] [INFO] [Edit log tailer] : replaying edit 
> log: 1/20512836 transactions completed. (0%) [2019-07-17T11:50:21.059+08:00] 
> [INFO] [Edit log tailer] : Edits file 
> http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH
>  of size 3126782311 edits # 500 loaded in 3 seconds 
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc 
> expecting start txid #232056752162 [2019-07-17T11:50:21.059+08:00] [INFO] 
> [Edit log tailer] : Start loading edits file 
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH
>  maxTxnipsToRead = 500 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log 
> tailer] : Fast-forwarding stream 
> 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH'
>  to transaction ID 232056751662 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit 
> log tailer] ip: Fast-forwarding stream 
> 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH'
>  to transaction ID 232056751662 [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit 
> log tailer] : Unknown error encountered while tailing edits. Shutting down 
> standby NN. java.io.IOException: There appears to be a gap in the edit log. 
> We expected txid 232056752162, but got txid 232077264498. at 
> org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:239)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:895) at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427)
>  at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423)
>  [2019-07-17T11:50:21.064+08:00] [INFO] [Edit log tailer] : Exiting with 
> status 1 [2019-07-17T11:50:21.066+08:00] [INFO] [Thread-1] : SHUTDOWN_MSG: 
> /************************************************************ SHUTDOWN_MSG: 
> Shutting down NameNode at ip 
> ************************************************************/
> {code}
>  
> if dfs.ha.tail-edits.max-txns-per-lock value is 500,when the namenode load 
> the editlog util 500,the current namenode will load the next editlog,but 
> editlog more than 500.So,namenode got an unexpected txid when tail editlog.
>  
>  
> {code:java}
> //代码占位符[2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Edits file 
> http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH
>  of size 3126782311 edits # 500 loaded in 3 seconds
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc 
> expecting start txid #232056752162
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Start loading 
> edits file 
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH
>  maxTxnsToRead = 500
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding 
> stream 
> 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH'
>  to transaction ID 232056751662
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding 
> stream 
> 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH'
>  to transaction ID 232056751662
> [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit log tailer] : Unknown error 
> encountered while tailing edits. Shutting down standby NN.
> java.io.IOException: There appears to be a gap in the edit log.  We expected 
> txid 232056752162, but got txid 232077264498.
> {code}
> Read data from JN twice in the same second,changed segmentTxId,finally quit 
> because of txid mismatch.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to