[ https://issues.apache.org/jira/browse/HDFS-16659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584712#comment-17584712 ]
ASF GitHub Bot commented on HDFS-16659:
---------------------------------------

hadoop-yetus commented on PR #4560:
URL: https://github.com/apache/hadoop/pull/4560#issuecomment-1226954004

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 44s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 38m 30s | | trunk passed |
| +1 :green_heart: | compile | 1m 42s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 1m 30s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 1m 18s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 46s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 25s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 1m 41s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 34s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 54s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 20s | | the patch passed |
| +1 :green_heart: | compile | 1m 26s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 1m 26s | | the patch passed |
| +1 :green_heart: | compile | 1m 20s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 1m 20s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 59s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 25s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 55s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 1m 30s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 23s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 22s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 237m 56s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 6s | | The patch does not generate ASF License warnings. |
| | | 347m 4s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4560/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4560 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux b6ef6639ef5c 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 08e0dae7ddf43a22e699a34b21dbe7e32755969c |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4560/3/testReport/ |
| Max. process+thread count | 3832 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4560/3/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> JournalNode should throw NewerTxnIdException if SinceTxId is bigger than HighestWrittenTxId
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16659
>                 URL: https://issues.apache.org/jira/browse/HDFS-16659
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> JournalNode should throw `CacheMissException` if `sinceTxId` is bigger than
> `highestWrittenTxId` when handling the `getJournaledEdits` RPC from NameNodes.
> With the current logic, in some corner cases the in-progress EditLogTailer
> cannot replay any edits from the JournalNodes, so the Observer NameNode
> cannot handle requests from clients.
> Suppose there are 3 JournalNodes, JN0 ~ JN2.
> * JN0 hits an abnormal condition while the Active NameNode is syncing 10
> edits with first txid 11.
> * The NameNode ignores the abnormal JN0 and continues to sync edits to JN1
> and JN2.
> * JN0 becomes healthy again.
> * The NameNode continues to sync 10 edits with first txid 21.
> * At this point, there are no edits 11 ~ 30 in the cache of JN0.
> * The Observer NameNode tries to select an EditLogInputStream through
> `getJournaledEdits` with since txId 21.
> * JN2 hits an abnormal condition that causes a slow response.
> The expected result is: the response should contain the 10 edits from txId 21
> to txId 30, served by JN1 and JN2, because the Active NameNode successfully
> wrote these edits to JN1 and JN2 and failed to write them to JN0.
> But in the current implementation, the response is [Response(0) from JN0,
> Response(10) from JN1], because an abnormal condition in JN2 (such as GC or
> a bad network) causes a slow response. So `maxAllowedTxns` will be 0 and the
> NameNode will not replay any edits.
> As shown above, the root cause is that the JournalNode should throw a
> cache-miss exception when `sinceTxId` is greater than `highestWrittenTxId`.
> The buggy code is shown below:
> {code:java}
> if (sinceTxId > getHighestWrittenTxId()) {
>   // Requested edits that don't exist yet; short-circuit the cache here
>   metrics.rpcEmptyResponses.incr();
>   return GetJournaledEditsResponseProto.newBuilder().setTxnCount(0).build();
> }
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
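
To make the quorum arithmetic in the issue description concrete, here is a minimal Java sketch. It is a simplified model, not the Hadoop implementation: the class `JournaledEditsSketch`, the helpers `maxAllowedTxns` and `checkSinceTxId`, and the local `NewerTxnIdException` class are hypothetical stand-ins. It assumes the NameNode treats as usable only the number of transactions that a majority of the responding JournalNodes can serve, and that JN0's highest written txid is 10 in the scenario.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class JournaledEditsSketch {

  /** Local stand-in for the exception named in the issue title (assumption). */
  static class NewerTxnIdException extends IOException {
    NewerTxnIdException(String msg) {
      super(msg);
    }
  }

  /**
   * Hypothetical JournalNode-side check in the spirit of the proposal:
   * reject the request instead of answering with an empty response.
   */
  static void checkSinceTxId(long sinceTxId, long highestWrittenTxId)
      throws NewerTxnIdException {
    if (sinceTxId > highestWrittenTxId) {
      throw new NewerTxnIdException("Requested txid " + sinceTxId
          + " is newer than highest written txid " + highestWrittenTxId);
    }
  }

  /**
   * Hypothetical NameNode-side model: given the txn counts of the responses
   * received, return the count that at least majoritySize responders can serve.
   */
  static int maxAllowedTxns(List<Integer> responseCounts, int majoritySize) {
    if (responseCounts.size() < majoritySize) {
      throw new IllegalStateException("no read quorum");
    }
    List<Integer> sorted = new ArrayList<>(responseCounts);
    Collections.sort(sorted);
    // The majoritySize-th largest count is covered by a majority of responders.
    return sorted.get(sorted.size() - majoritySize);
  }

  public static void main(String[] args) {
    int majority = 2; // 3 JournalNodes

    // Current behavior: JN0 answers with 0 edits, JN1 with 10, JN2 is slow.
    System.out.println(maxAllowedTxns(Arrays.asList(0, 10), majority)); // prints 0

    // Proposed behavior: JN0 (assumed highest written txid 10) rejects
    // sinceTxId 21, so its answer no longer counts toward the quorum.
    try {
      checkSinceTxId(21, 10);
    } catch (NewerTxnIdException e) {
      System.out.println("JN0 rejected: " + e.getMessage());
    }

    // The quorum is then formed by JN1 and JN2 once JN2 responds.
    System.out.println(maxAllowedTxns(Arrays.asList(10, 10), majority)); // prints 10
  }
}
```

In this model the empty response from JN0 is indistinguishable from "no new edits", which drags the majority count down to 0. Turning it into an error removes JN0 from the set of counted responses, so the result depends only on JournalNodes that actually hold the requested edits.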