[ https://issues.apache.org/jira/browse/HDFS-16657?focusedWorklogId=792162&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-792162 ]
ASF GitHub Bot logged work on HDFS-16657:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jul/22 14:39
            Start Date: 18/Jul/22 14:39
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on PR #4558:
URL: https://github.com/apache/hadoop/pull/4558#issuecomment-1187583332

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 49s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 40m 6s | | trunk passed |
| +1 :green_heart: | compile | 1m 40s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 1m 32s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 1m 19s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 44s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 19s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 1m 44s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 53s | | trunk passed |
| +1 :green_heart: | shadedclient | 28m 19s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 27s | | the patch passed |
| +1 :green_heart: | compile | 1m 33s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 1m 33s | | the patch passed |
| +1 :green_heart: | compile | 1m 22s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 1m 22s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 1m 3s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4558/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 52 unchanged - 2 fixed = 57 total (was 54) |
| +1 :green_heart: | mvnsite | 1m 30s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 58s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 1m 33s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 3m 37s | | the patch passed |
| +1 :green_heart: | shadedclient | 25m 45s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 367m 22s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 59s | | The patch does not generate ASF License warnings. |
| | | 486m 46s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4558/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4558 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 75c39d2d8fca 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / f025157cdb5b3b06e343b673e227070848f3ae97 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4558/2/testReport/ |
| Max. process+thread count | 1866 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4558/2/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 792162)
    Time Spent: 50m  (was: 40m)

> Changing pool-level lock to volume-level lock for invalidation of blocks
> ------------------------------------------------------------------------
>
>                 Key: HDFS-16657
>                 URL: https://issues.apache.org/jira/browse/HDFS-16657
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Yuanbo Liu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2022-07-13-10-25-37-383.png,
> image-2022-07-13-10-27-01-386.png, image-2022-07-13-10-27-44-258.png
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Recently we saw that DataNode (DN) heartbeats became slow in a very busy
> cluster. Here is the chart:
> !image-2022-07-13-10-25-37-383.png|width=665,height=245!
>
> After taking a jstack of the DN, we found that the DN heartbeat was stuck in
> the invalidation of blocks:
> !image-2022-07-13-10-27-01-386.png|width=658,height=308!
> !image-2022-07-13-10-27-44-258.png|width=502,height=325!
> The key code is:
> {code:java}
> try {
>   File blockFile = new File(info.getBlockURI());
>   if (blockFile != null && blockFile.getParentFile() == null) {
>     errors.add("Failed to delete replica " + invalidBlks[i]
>         + ". Parent not found for block file: " + blockFile);
>     continue;
>   }
> } catch(IllegalArgumentException e) {
>   LOG.warn("Parent directory check failed; replica " + info
>       + " is not backed by a local file");
> } {code}
> The DN is trying to locate the parent path of the block file, so there is
> disk I/O inside the pool-level lock. When the disk becomes very busy with
> high I/O wait, all pending threads are blocked by the pool-level lock, and
> the heartbeat time becomes high. We propose changing the pool-level lock to
> a volume-level lock for block invalidation.
> cc: [~hexiaoqiao] [~Aiphag0]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
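
The proposal in the issue description, replacing one pool-level lock with one lock per volume, can be illustrated with a minimal sketch. This is not the code from PR #4558 and does not use HDFS classes: `VolumeLockSketch`, `withVolumeLock`, `isVolumeFree`, `demo`, and the pool and volume names are all hypothetical, and the slow disk is simulated by a thread that simply holds its lock until released.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of volume-level locking; not the HDFS DataNode implementation. */
final class VolumeLockSketch {
  // One lock per (block pool, volume) pair, created lazily on first use.
  private final Map<String, ReentrantLock> volumeLocks = new ConcurrentHashMap<>();

  private ReentrantLock lockFor(String bpid, String volume) {
    return volumeLocks.computeIfAbsent(bpid + "|" + volume, k -> new ReentrantLock());
  }

  /** Runs the task while holding only the lock of the given volume. */
  void withVolumeLock(String bpid, String volume, Runnable task) {
    ReentrantLock lock = lockFor(bpid, volume);
    lock.lock();
    try {
      task.run();
    } finally {
      lock.unlock();
    }
  }

  /** Returns true if the calling thread can take the volume lock without waiting. */
  boolean isVolumeFree(String bpid, String volume) {
    ReentrantLock lock = lockFor(bpid, volume);
    if (lock.tryLock()) {
      lock.unlock();
      return true;
    }
    return false;
  }

  /** Returns {busy volume is free?, other volume is free?} while one volume is held. */
  static boolean[] demo() {
    VolumeLockSketch locks = new VolumeLockSketch();
    CountDownLatch holding = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    // Simulated slow invalidation on /data1: keeps the lock until told to stop.
    Thread slow = new Thread(() -> locks.withVolumeLock("BP-1", "/data1", () -> {
      holding.countDown();
      try {
        release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }));
    slow.start();
    try {
      holding.await();
      boolean busyFree = locks.isVolumeFree("BP-1", "/data1");
      boolean otherFree = locks.isVolumeFree("BP-1", "/data2");
      release.countDown();
      slow.join();
      return new boolean[] {busyFree, otherFree};
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] result = demo();
    System.out.println("/data1 free while held: " + result[0]); // false
    System.out.println("/data2 free while /data1 is held: " + result[1]); // true
  }
}
```

With a single pool-level lock, both checks would report the lock as taken. With per-volume locks, work on `/data2` can proceed while `/data1` is stalled, which is the effect the issue is after for heartbeat handling.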