Chao Sun created HDFS-12836: ------------------------------- Summary: startTxId could be greater than endTxId when tailing in-progress edit log Key: HDFS-12836 URL: https://issues.apache.org/jira/browse/HDFS-12836 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Reporter: Chao Sun Assignee: Chao Sun
When {{dfs.ha.tail-edits.in-progress}} is true, edit log tailer will also tail those in progress edit log segments. However, in the following code: {code} if (onlyDurableTxns && inProgressOk) { endTxId = Math.min(endTxId, committedTxnId); } EditLogInputStream elis = EditLogFileInputStream.fromUrl( connectionFactory, url, remoteLog.getStartTxId(), endTxId, remoteLog.isInProgress()); {code} it is possible that {{remoteLog.getStartTxId()}} could be greater than {{endTxId}}, and therefore will cause the following error: {code} 2017-11-17 19:55:41,165 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: Error replaying edit log at offset 1048576. Expected transaction ID was 87 Recent opcode offsets: 1048576 org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 86; expected file to go up to 85 at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197) at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189) at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:205) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393) 2017-11-17 19:55:41,165 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Error while reading edits from disk. Will try again. org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 1048576. Expected transaction ID was 87 Recent opcode offsets: 1048576 at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:218) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393) Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 86; expected file to go up to 85 at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197) at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189) at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:205) ... 9 more {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org