[ https://issues.apache.org/jira/browse/HDFS-14674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16915524#comment-16915524 ]
Feilong He edited comment on HDFS-14674 at 8/26/19 6:59 AM:
------------------------------------------------------------

I am looking over branch-3.1. In TestEditLog, it looks like there is an issue in the following code. The 2nd argument should be "e" instead of "e.getMessage()". Am I missing something?

{{LOG.error("There appears to be an out-of-order edit in the edit log", e.getMessage());}}

was (Author: philohe):
I am looking over branch-3.1. In TestEditLog, it looks like there is an issue in the following code. The 2nd argument should be "e" instead of "e.getMessage()". Am I missing something?

LOG.error("There appears to be an out-of-order edit in the edit log", e.getMessage());

> [SBN read] Got an unexpected txid when tail editlog
> ---------------------------------------------------
>
> Key: HDFS-14674
> URL: https://issues.apache.org/jira/browse/HDFS-14674
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: wangzhaohui
> Assignee: wangzhaohui
> Priority: Blocker
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14674-001.patch, HDFS-14674-003.patch,
> HDFS-14674-004.patch, HDFS-14674-005.patch, HDFS-14674-006.patch,
> HDFS-14674-007.patch, HDFS-14674-008.patch, HDFS-14674-009.patch,
> HDFS-14674-010.patch, HDFS-14674-011.patch,
> image-2019-08-22-16-24-06-518.png, image.png
>
>
> Add the following configuration
> !image-2019-08-22-16-24-06-518.png|width=451,height=80!
> error:
> {code:java}
> //
> [2019-07-17T11:50:21.048+08:00] [INFO] [Edit log tailer] : replaying edit log: 1/20512836 transactions completed. (0%)
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Edits file http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH of size 3126782311 edits # 500 loaded in 3 seconds
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc expecting start txid #232056752162
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Start loading edits file http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH maxTxnsToRead = 500
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662
> [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit log tailer] : Unknown error encountered while tailing edits. Shutting down standby NN.
> java.io.IOException: There appears to be a gap in the edit log. We expected txid 232056752162, but got txid 232077264498.
>     at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:239)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:895)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427)
>     at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423)
> [2019-07-17T11:50:21.064+08:00] [INFO] [Edit log tailer] : Exiting with status 1
> [2019-07-17T11:50:21.066+08:00] [INFO] [Thread-1] : SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at ip
> ************************************************************/
> {code}
>
> If dfs.ha.tail-edits.max-txns-per-lock is set to 500, the namenode stops loading an edit log after 500 transactions and then moves on to load the next edit log, even though the current edit log contains more than 500 transactions. As a result, the namenode gets an unexpected txid when tailing the edit log.
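The arithmetic behind the reported gap can be sketched as follows. This is a toy model, not HDFS code: the class TailGapSketch and its methods are hypothetical. Two further assumptions are mine rather than the report's: that the 500-edit pass began at the fast-forward target 232056751662 (inferred because 232056751662 + 500 equals the expected txid in the log), and that the first segment ends just before 232077264498.

```java
final class TailGapSketch {

    // Last txid applied by one tailing pass that starts at firstTxId and is
    // capped at maxTxns edits (the role played by the 500 limit in the report).
    static long lastTxIdAfterPass(long firstTxId, long segmentLastTxId, long maxTxns) {
        return Math.min(firstTxId + maxTxns - 1, segmentLastTxId);
    }

    // Number of txids that would be skipped if the next stream began at
    // nextStreamFirstTxId. Zero means the stream continues contiguously.
    static long missingTxns(long lastAppliedTxId, long nextStreamFirstTxId) {
        long expected = lastAppliedTxId + 1;
        return Math.max(0L, nextStreamFirstTxId - expected);
    }

    public static void main(String[] args) {
        long passStart = 232056751662L;         // fast-forward target from the log (assumed pass start)
        long nextSegmentStart = 232077264498L;  // segmentTxId of the second stream in the log
        long segmentLast = nextSegmentStart - 1; // ASSUMPTION: segments are contiguous
        long maxTxns = 500L;                    // value of the cap described in the report

        long lastApplied = lastTxIdAfterPass(passStart, segmentLast, maxTxns);
        System.out.println("expected next txid: " + (lastApplied + 1));
        System.out.println("missing if the tailer jumps to the next segment: "
                + missingTxns(lastApplied, nextSegmentStart));
        System.out.println("missing if the tailer resumes in the same segment: "
                + missingTxns(lastApplied, lastApplied + 1));
    }
}
```

Under these assumptions, the capped pass stops partway through the segment, so a stream that starts at the next segment's first txid cannot be contiguous with what was already applied.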
>
>
> {code:java}
> //
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Edits file http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH of size 3126782311 edits # 500 loaded in 3 seconds
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc expecting start txid #232056752162
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Start loading edits file http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH maxTxnsToRead = 500
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH, http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding stream 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH' to transaction ID 232056751662
> [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit log tailer] : Unknown error encountered while tailing edits. Shutting down standby NN.
> java.io.IOException: There appears to be a gap in the edit log. We expected txid 232056752162, but got txid 232077264498.
> {code}
> The standby read data from the JournalNode (JN) twice within the same second. The segmentTxId changed between the two reads, and the namenode finally quit because of the txid mismatch.
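On the logging question raised in the comment at the top of this message, the difference between passing "e.getMessage()" and passing "e" can be illustrated with a small stand-in. The logger used by TestEditLog is not shown in this message, so the sketch below uses java.util.logging purely as an analogy, and the class ThrowableArgSketch is hypothetical. In this API a String third argument is treated as a message parameter, and it is only rendered if the message contains a placeholder, whereas a Throwable third argument is attached to the log record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class ThrowableArgSketch {

    // Logs the same message twice, first with e.getMessage() and then with e,
    // and returns the two captured records in that order.
    static List<LogRecord> logBothWays(Exception e) {
        List<LogRecord> records = new ArrayList<>();
        Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false); // keep the demonstration off the console
        log.addHandler(new Handler() {
            @Override public void publish(LogRecord record) { records.add(record); }
            @Override public void flush() { }
            @Override public void close() { }
        });

        String msg = "There appears to be an out-of-order edit in the edit log";
        log.log(Level.SEVERE, msg, e.getMessage()); // String: stored as a message parameter
        log.log(Level.SEVERE, msg, e);              // Throwable: attached to the record
        return records;
    }

    public static void main(String[] args) {
        for (LogRecord record : logBothWays(new java.io.IOException("txid mismatch"))) {
            System.out.println("throwable attached: " + (record.getThrown() != null));
        }
    }
}
```

Whether the same reasoning applies to the call in TestEditLog depends on the logging API that class actually uses, which would need to be checked in branch-3.1.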