[ https://issues.apache.org/jira/browse/HDFS-17631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888652#comment-17888652 ]
ASF GitHub Bot commented on HDFS-17631:
---------------------------------------

Hexiaoqiao commented on PR #7066:
URL: https://github.com/apache/hadoop/pull/7066#issuecomment-2407402517

   @LiuGuH Thanks for your work. I did not fully understand the description. Do you mean that the NameNode cannot fail over to another EditLogInputStream and continue retrying when it hits an unexpected issue while replaying the EditLog? I also wonder what happens in that case when running with the current logic. Thanks again.

> Fix RedundantEditLogInputStream.nextOp() state error when EditLogInputStream.skipUntil() throw IOException
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-17631
>                 URL: https://issues.apache.org/jira/browse/HDFS-17631
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: liuguanghua
>            Assignee: liuguanghua
>            Priority: Major
>              Labels: pull-request-available
>
> In NameNode HA mode, the standby NameNode loads the edit log from the
> JournalNodes via QuorumJournalManager.selectInputStreams().
> RedundantEditLogInputStream is used to combine the input streams of multiple
> remote JournalNodes.
> The problem is that when the edit log is read with
> RedundantEditLogInputStream.nextOp() and skipUntil() on the first stream
> throws an IOException (network errors, hardware problems, etc.), the state
> becomes State.OK rather than State.STREAM_FAILED.
> The proper, fault-tolerant state sequence is shown below:
> State.SKIP_UNTIL -> State.STREAM_FAILED -> (try next stream) State.SKIP_UNTIL
> -> State.OK
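
The state sequence described in the issue can be sketched as a small, self-contained Java model. This is not the Hadoop source: RedundantReaderSketch, the Stream interface, FailingStream, HealthyStream and demo() are hypothetical names invented for illustration, and the real RedundantEditLogInputStream has additional states and resync handling that are omitted here. The sketch only shows the intended behaviour, assuming that an IOException from skipUntil() should move the reader to STREAM_FAILED so that the next stream is tried.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Simplified model only; NOT the Hadoop RedundantEditLogInputStream source.
class RedundantReaderSketch {
    enum State { SKIP_UNTIL, OK, STREAM_FAILED, EOF }

    // Hypothetical stand-in for EditLogInputStream.
    interface Stream {
        void skipUntil(long txid) throws IOException;
        long readOp() throws IOException; // next txid, or -1 at end of stream
    }

    // Simulates a JournalNode stream whose skipUntil() hits a network error.
    static class FailingStream implements Stream {
        public void skipUntil(long txid) throws IOException {
            throw new IOException("simulated network error");
        }
        public long readOp() throws IOException {
            throw new IOException("not reached in this demo");
        }
    }

    // Simulates a healthy stream holding txids 1..lastTxId.
    static class HealthyStream implements Stream {
        private final long lastTxId;
        private long next = 1;
        HealthyStream(long lastTxId) { this.lastTxId = lastTxId; }
        public void skipUntil(long txid) { next = Math.max(next, txid); }
        public long readOp() { return next <= lastTxId ? next++ : -1; }
    }

    private final List<Stream> streams;
    private final List<State> trace = new ArrayList<>(); // states visited, for the demo
    private int cur = 0;
    private long prevTxId = 0;
    private State state = State.SKIP_UNTIL;
    private IOException lastError;

    RedundantReaderSketch(List<Stream> streams) { this.streams = streams; }

    List<State> trace() { return trace; }

    long nextOp() throws IOException {
        while (true) {
            trace.add(state);
            switch (state) {
                case SKIP_UNTIL:
                    try {
                        streams.get(cur).skipUntil(prevTxId + 1);
                        state = State.OK; // only reached if skipUntil() succeeded
                    } catch (IOException e) {
                        lastError = e;
                        state = State.STREAM_FAILED; // do not fall through to OK
                    }
                    break;
                case OK:
                    try {
                        long txid = streams.get(cur).readOp();
                        if (txid < 0) { state = State.EOF; break; }
                        prevTxId = txid;
                        return txid;
                    } catch (IOException e) {
                        lastError = e;
                        state = State.STREAM_FAILED;
                    }
                    break;
                case STREAM_FAILED:
                    if (cur + 1 >= streams.size()) {
                        throw new IOException("all streams failed", lastError);
                    }
                    cur++; // fail over to the next redundant stream
                    state = State.SKIP_UNTIL;
                    break;
                case EOF:
                    return -1;
            }
        }
    }

    // Reads one op with a failing first stream and returns the visited states.
    static String demo() {
        RedundantReaderSketch r = new RedundantReaderSketch(
            List.of(new FailingStream(), new HealthyStream(3)));
        try {
            r.nextOp();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return r.trace().toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [SKIP_UNTIL, STREAM_FAILED, SKIP_UNTIL, OK]
    }
}
```

In this model, setting State.OK inside the try block, after skipUntil() returns, is what keeps a failed skip from being recorded as OK. Where the actual patch places that assignment is not stated in this message.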