[ https://issues.apache.org/jira/browse/HDFS-16659?focusedWorklogId=793334&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793334 ]
ASF GitHub Bot logged work on HDFS-16659:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Jul/22 16:19
            Start Date: 20/Jul/22 16:19
    Worklog Time Spent: 10m
      Work Description: ZanderXu commented on PR #4560:
URL: https://github.com/apache/hadoop/pull/4560#issuecomment-1190487791

   @jojochuang @goiri Can you help me review this patch? Thanks


Issue Time Tracking
-------------------

    Worklog Id:     (was: 793334)
    Time Spent: 40m  (was: 0.5h)

> JournalNode should throw CacheMissException if SinceTxId is bigger than HighestWrittenTxId
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16659
>                 URL: https://issues.apache.org/jira/browse/HDFS-16659
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> JournalNode should throw `CacheMissException` if `sinceTxId` is bigger than
> `highestWrittenTxId`. The current behavior can prevent EditLogTailer from
> tailing edits, which in turn may leave an Observer NameNode unable to handle
> requests from clients.
> Suppose there are 3 JournalNodes, JN0 ~ JN2.
> The corner case is as follows:
> * JN0 becomes abnormal while the Active NameNode is journaling edits
> starting at txId 11
> * The NameNode ignores the abnormal JN0 and continues to write edits to
> JN1 and JN2
> * JN0 returns to a healthy state
> * The Observer NameNode tries to select an EditLogInputStream via RPC with
> start txId 21
> * JN1 becomes abnormal, which causes slow RPC responses
> The expected selection result is a response containing 20 edits, from
> txId 21 to txId 40, served by JN1 and JN2. The Active NameNode successfully
> wrote these edits to JN1 and JN2 but failed to write them to JN0, so the
> cache of JN0 holds no edits from txId 21 to 40.
> In the current implementation, however, the response contains no edits,
> because the NameNode successfully got a response from JN0 that did not
> contain any edits.
> The buggy code is as follows:
> {code:java}
> if (sinceTxId > getHighestWrittenTxId()) {
>   // Requested edits that don't exist yet; short-circuit the cache here
>   metrics.rpcEmptyResponses.incr();
>   return GetJournaledEditsResponseProto.newBuilder().setTxnCount(0).build();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
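
The corner case described in the issue can be reproduced with a small toy model. The sketch below is not Hadoop code: ToyJournalNode, selectDurableEditCount, simulate and the local CacheMissException class are hypothetical stand-ins, and the quorum logic is a simplification that assumes the caller stops at the first majority of successful responses and keeps only the edits that every node in that majority can serve. It also assumes that JN0 stopped at txId 10 and that the responses arrive in the order JN0, JN2, JN1 (JN1 being the slow node).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CacheMissSketch {
    /** Local stand-in for the exception named in the issue title; not the Hadoop class. */
    static class CacheMissException extends Exception {
        CacheMissException(String message) { super(message); }
    }

    /** Toy journal node that only tracks the highest txId it has written. */
    static class ToyJournalNode {
        final String name;
        final long highestWrittenTxId;
        final boolean throwOnMiss;

        ToyJournalNode(String name, long highestWrittenTxId, boolean throwOnMiss) {
            this.name = name;
            this.highestWrittenTxId = highestWrittenTxId;
            this.throwOnMiss = throwOnMiss;
        }

        /** Returns how many edits this node can serve starting at sinceTxId. */
        int getJournaledEditCount(long sinceTxId) throws CacheMissException {
            if (sinceTxId > highestWrittenTxId) {
                if (throwOnMiss) {
                    // Behavior proposed in the issue title: report a miss, not "no edits".
                    throw new CacheMissException(name + " has no edits at txId " + sinceTxId);
                }
                return 0; // behavior of the check quoted in the issue: empty response
            }
            return (int) (highestWrittenTxId - sinceTxId + 1);
        }
    }

    /**
     * Simplified selection: collect successful responses in arrival order until a
     * majority has answered, then return the count that all of them can serve.
     */
    static int selectDurableEditCount(List<ToyJournalNode> nodesInResponseOrder, long sinceTxId) {
        int majority = nodesInResponseOrder.size() / 2 + 1;
        List<Integer> counts = new ArrayList<>();
        for (ToyJournalNode node : nodesInResponseOrder) {
            if (counts.size() == majority) {
                break; // quorum reached, slower nodes are not awaited
            }
            try {
                counts.add(node.getJournaledEditCount(sinceTxId));
            } catch (CacheMissException e) {
                // A miss counts as a failed response, so the caller keeps waiting.
            }
        }
        return counts.size() < majority ? 0 : Collections.min(counts);
    }

    /** Runs the scenario from the issue with start txId 21. */
    static int simulate(boolean throwOnMiss) {
        List<ToyJournalNode> nodes = new ArrayList<>();
        nodes.add(new ToyJournalNode("JN0", 10, throwOnMiss)); // assumed to have stopped at txId 10
        nodes.add(new ToyJournalNode("JN2", 40, throwOnMiss));
        nodes.add(new ToyJournalNode("JN1", 40, throwOnMiss)); // slow node, answers last
        return selectDurableEditCount(nodes, 21);
    }

    public static void main(String[] args) {
        System.out.println("empty response on miss: " + simulate(false) + " edits"); // 0 edits
        System.out.println("exception on miss:      " + simulate(true) + " edits");  // 20 edits
    }
}
```

In this model the empty response from JN0 is counted toward the quorum and drags the result down to zero, whereas the exception removes JN0 from the quorum, so the 20 edits held by JN1 and JN2 are returned once JN1 answers.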