[ https://issues.apache.org/jira/browse/HDFS-17093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752071#comment-17752071 ]
ASF GitHub Bot commented on HDFS-17093:
---------------------------------------

hadoop-yetus commented on PR #5855:
URL: https://github.com/apache/hadoop/pull/5855#issuecomment-1669741496

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 30s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 33m 7s | | trunk passed |
| +1 :green_heart: | compile | 0m 52s | | trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | compile | 0m 49s | | trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | checkstyle | 0m 47s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 55s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 52s | | trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 11s | | trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | spotbugs | 1m 59s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 2s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 46s | | the patch passed |
| +1 :green_heart: | compile | 0m 50s | | the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javac | 0m 50s | | the patch passed |
| +1 :green_heart: | compile | 0m 42s | | the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | javac | 0m 42s | | the patch passed |
| -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5855/12/artifact/out/blanks-eol.txt) | The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
| +1 :green_heart: | checkstyle | 0m 35s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 47s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 37s | | the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 6s | | the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | spotbugs | 1m 53s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 21s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 202m 35s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5855/12/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 40s | | The patch does not generate ASF License warnings. |
| | | 297m 10s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestObserverNode |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5855/12/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5855 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux d0dcaa24cb96 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 5af06d98849707bed42863172dc38247aba428c8 |
| Default Java | Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5855/12/testReport/ |
| Max. process+thread count | 3592 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5855/12/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> In the case of all datanodes sending FBR when the namenode restarts (large
> clusters), there is an issue with incomplete block reporting
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-17093
>                 URL: https://issues.apache.org/jira/browse/HDFS-17093
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.3.4
>            Reporter: Yanlei Yu
>            Priority: Minor
>              Labels: pull-request-available
>
> In our cluster of 800+ nodes, after restarting the namenode, we found that
> some datanodes did not report enough blocks, causing the namenode to stay in
> safe mode for a long time after restarting because of incomplete block
> reporting.
> In the logs of a datanode with incomplete block reporting, I found that the
> first FBR attempt failed, possibly due to namenode stress, and that a second
> FBR attempt was then made, as follows:
> {code:java}
> ....
> 2023-07-17 11:29:28,982 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
> Unsuccessfully sent block report 0x6237a52c1e817e, containing 12 storage
> report(s), of which we sent 1. The reports had 1099057 total blocks and used
> 1 RPC(s). This took 294 msec to generate and 101721 msecs for RPC and NN
> processing. Got back no commands.
> 2023-07-17 11:37:04,014 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
> Successfully sent block report 0x62382416f3f055, containing 12 storage
> report(s), of which we sent 12. The reports had 1099048 total blocks and used
> 12 RPC(s). This took 295 msec to generate and 11647 msecs for RPC and NN
> processing. Got back no commands. {code}
> There is nothing wrong with that: the datanode retries the send if it fails.
> But on the namenode side, the logic is:
> {code:java}
> if (namesystem.isInStartupSafeMode()
>     && !StorageType.PROVIDED.equals(storageInfo.getStorageType())
>     && storageInfo.getBlockReportCount() > 0) {
>   blockLog.info("BLOCK* processReport 0x{} with lease ID 0x{}: "
>       + "discarded non-initial block report from {}"
>       + " because namenode still in startup phase",
>       strBlockReportId, fullBrLeaseId, nodeID);
>   blockReportLeaseManager.removeLease(node);
>   return !node.hasStaleStorages();
> } {code}
> When a disk is identified as not reporting for the first time, namely
> storageInfo.getBlockReportCount() > 0, the namenode removes the datanode's
> lease, which causes the second report to fail because it has no lease.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
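
The sequence described in the issue (a first FBR that delivers 1 of 12 storage reports, then a full retry) can be replayed with a small toy model. This is only a sketch under stated assumptions, not Hadoop code: the class `StartupFbrSketch` and its methods `grantLease`, `processReport`, and `replay` are hypothetical names, the namenode is assumed to stay in startup safe mode throughout, and the datanode's lease is assumed to remain valid between the failed attempt and the retry.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the startup safe-mode check quoted in the issue; not Hadoop code. */
class StartupFbrSketch {
    // Hypothetical stand-ins for the per-storage block report count and
    // for the leases tracked by the block report lease manager.
    private final Map<String, Integer> blockReportCount = new HashMap<>();
    private final Set<String> leases = new HashSet<>();

    void grantLease(String datanode) {
        leases.add(datanode);
    }

    /** Returns true if the per-storage report was processed, false if it was rejected or discarded. */
    boolean processReport(String datanode, String storage) {
        if (!leases.contains(datanode)) {
            return false; // assumption: a report without a lease is rejected
        }
        String key = datanode + "/" + storage;
        if (blockReportCount.getOrDefault(key, 0) > 0) {
            // Mirrors the quoted branch: a non-initial report during startup
            // safe mode is discarded and the datanode's lease is removed.
            leases.remove(datanode);
            return false;
        }
        blockReportCount.merge(key, 1, Integer::sum);
        return true;
    }

    /** Replays a partial first attempt followed by a full retry; returns how many storages were processed. */
    static int replay(int storages, int sentBeforeFailure) {
        StartupFbrSketch nn = new StartupFbrSketch();
        String dn = "dn1";
        int processed = 0;
        nn.grantLease(dn);
        // First attempt: only some storages reach the namenode before the RPC fails.
        for (int s = 0; s < sentBeforeFailure; s++) {
            if (nn.processReport(dn, "storage-" + s)) {
                processed++;
            }
        }
        // Retry: the datanode resends every storage, one RPC per storage.
        for (int s = 0; s < storages; s++) {
            if (nn.processReport(dn, "storage-" + s)) {
                processed++;
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println("storages reported: " + replay(12, 1) + " of 12");
    }
}
```

In this model, the storage that got through on the first attempt trips the non-initial check on the retry, the lease is dropped, and the remaining 11 storages are never counted, which matches the incomplete reporting described in the issue.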