[
https://issues.apache.org/jira/browse/HDFS-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18045241#comment-18045241
]
ASF GitHub Bot commented on HDFS-17864:
---------------------------------------
hadoop-yetus commented on PR #8130:
URL: https://github.com/apache/hadoop/pull/8130#issuecomment-3656885175
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 2m 44s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 8m 26s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 35m 5s | | trunk passed |
| +1 :green_heart: | compile | 19m 9s | | trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | compile | 19m 40s | | trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | checkstyle | 3m 25s | | trunk passed |
| +1 :green_heart: | mvnsite | 5m 25s | | trunk passed |
| +1 :green_heart: | javadoc | 4m 2s | | trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 3m 58s | | trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | spotbugs | 11m 25s | | trunk passed |
| +1 :green_heart: | shadedclient | 35m 18s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 31s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 3m 30s | | the patch passed |
| +1 :green_heart: | compile | 18m 25s | | the patch passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javac | 18m 25s | | the patch passed |
| +1 :green_heart: | compile | 19m 21s | | the patch passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javac | 19m 21s | | the patch passed |
| -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/blanks-eol.txt) | The patch has 16 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
| -0 :warning: | checkstyle | 3m 49s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 414 unchanged - 2 fixed = 416 total (was 416) |
| +1 :green_heart: | mvnsite | 5m 21s | | the patch passed |
| -1 :x: | javadoc | 1m 25s | [/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt) | hadoop-common-project_hadoop-common-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 4 new + 4453 unchanged - 0 fixed = 4457 total (was 4453) |
| -1 :x: | javadoc | 1m 5s | [/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt) | hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 1 new + 3604 unchanged - 0 fixed = 3605 total (was 3604) |
| -1 :x: | javadoc | 1m 28s | [/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt) | hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 12 new + 9988 unchanged - 12 fixed = 10000 total (was 10000) |
| -1 :x: | javadoc | 1m 24s | [/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt) | hadoop-common-project_hadoop-common-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 3 new + 3737 unchanged - 0 fixed = 3740 total (was 3737) |
| -1 :x: | javadoc | 1m 5s | [/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt) | hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 1 new + 3354 unchanged - 0 fixed = 3355 total (was 3354) |
| -1 :x: | javadoc | 1m 32s | [/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt) | hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 11 new + 9676 unchanged - 0 fixed = 9687 total (was 9676) |
| +1 :green_heart: | spotbugs | 12m 5s | | the patch passed |
| +1 :green_heart: | shadedclient | 35m 8s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 23m 35s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 59s | | hadoop-hdfs-client in the patch passed. |
| -1 :x: | unit | 259m 56s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 31s | | The patch does not generate ASF License warnings. |
| | | 543m 35s | | |
| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.tools.TestDFSAdmin |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/8130 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
| uname | Linux da0c757e4d89 5.15.0-160-generic #170-Ubuntu SMP Wed Oct 1 10:06:56 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / e8c530695c9d138a67dd0e659b72a7ac0de6f217 |
| Default Java | Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/testReport/ |
| Max. process+thread count | 3682 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8130/2/console |
| versions | git=2.25.1 maven=3.9.11 spotbugs=4.9.7 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
This message was automatically generated.
> Improve fsimage load time by making LightWeightGSet and NameCache thread-safe
> -----------------------------------------------------------------------------
>
> Key: HDFS-17864
> URL: https://issues.apache.org/jira/browse/HDFS-17864
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: khazhen
> Priority: Major
> Labels: pull-request-available
>
> HDFS-14617 allows the inode and inode directory sections of the fsimage to be
> loaded in parallel.
> However, increasing the configured number of sections and threads yields
> diminishing returns, because the loading code still contains synchronized
> sections that protect several in-memory structures.
> Currently, three main data structures need to be protected by
> synchronized blocks:
> # INodeMap (backed by LightWeightGSet, which is not thread-safe)
> # BlocksMap (backed by LightWeightGSet, which is not thread-safe)
> # NameCache (not thread-safe by itself)
> To further improve FSImage loading speed, this PR makes the above
> 3 data structures thread-safe, and then uses multiple threads to initialize
> them when the NameNode starts.
> Additionally, some optimizations have been made to reduce GC overhead during
> FSImage parsing.
> In our tests, the FSImage loading time (165M inodes & 258M blocks) was
> reduced from 183s to 73s.
> *1. Making LightWeightGSet thread-safe*
> LightWeightGSet is a HashMap-like data structure that uses a fixed-length
> array as hash buckets, with each array element storing the head node of an
> independent linked list.
> Since each linked list is independent, we can allocate a lock for each
> bucket to protect the corresponding linked list.
> To trade off between memory consumption and concurrency, we can let
> multiple buckets share a lock and use a hash-based mapping.
> To minimize changes, we don't plan to implement a completely thread-safe
> GSet to replace LightWeightGSet, as this would require significant changes
> and is unnecessary since all operations on LightWeightGSet are synchronized
> once NameNode finishes starting up.
> We introduced an external synchronization tool GSetConcurrencyController
> to ensure the thread safety of LightWeightGSet during NameNode startup.
> Another issue that needs to be addressed is the GSet's size. Currently,
> the size in LightWeightGSet is not an atomic variable, so even if we use
> segmented locks to protect the hash buckets, concurrent updates can leave
> the size inaccurate.
> Fortunately, in the FSImage loading scenario, we know the expected sizes
> of INodeMap and BlocksMap in advance, so we can correct each size once
> loading is complete.
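
For readers following along, the per-bucket locking and post-load size correction described above can be illustrated roughly as follows. This is only a sketch under stated assumptions: `StripedBucketSet`, `loadInParallel`, and `correctSize` are hypothetical names invented for this example, the keys are plain `long` values, and none of this is taken from the patch or from the actual GSetConcurrencyController API.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration of lock striping over hash buckets; not the patch's code. */
final class StripedBucketSet {
  private static final class Node {
    final long key;
    final Node next;
    Node(long key, Node next) { this.key = key; this.next = next; }
  }

  private final Node[] buckets;          // fixed-length array of list heads
  private final ReentrantLock[] stripes; // fewer locks than buckets
  private int size;                      // not maintained by put(); fixed up after loading

  StripedBucketSet(int bucketCount, int stripeCount) {
    buckets = new Node[bucketCount];
    stripes = new ReentrantLock[stripeCount];
    for (int i = 0; i < stripeCount; i++) {
      stripes[i] = new ReentrantLock();
    }
  }

  private int bucketOf(long key) {
    return (Long.hashCode(key) & 0x7fffffff) % buckets.length;
  }

  /** Adds the key if absent. Several buckets share one lock via a modulo mapping. */
  boolean put(long key) {
    int b = bucketOf(key);
    ReentrantLock lock = stripes[b % stripes.length];
    lock.lock();
    try {
      for (Node n = buckets[b]; n != null; n = n.next) {
        if (n.key == key) return false;
      }
      buckets[b] = new Node(key, buckets[b]);
      return true;
    } finally {
      lock.unlock();
    }
  }

  boolean contains(long key) {
    int b = bucketOf(key);
    ReentrantLock lock = stripes[b % stripes.length];
    lock.lock();
    try {
      for (Node n = buckets[b]; n != null; n = n.next) {
        if (n.key == key) return true;
      }
      return false;
    } finally {
      lock.unlock();
    }
  }

  /** Called once after loading, when the expected element count is known. */
  void correctSize(int expected) { size = expected; }

  int size() { return size; }

  /** Usage example: each worker inserts a disjoint key range, then the size is corrected. */
  static StripedBucketSet loadInParallel(int threads, int perThread) {
    StripedBucketSet set = new StripedBucketSet(1 << 12, 64);
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      final long start = (long) t * perThread;
      workers[t] = new Thread(() -> {
        for (long k = start; k < start + perThread; k++) set.put(k);
      });
      workers[t].start();
    }
    try {
      for (Thread w : workers) w.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    set.correctSize(threads * perThread);
    return set;
  }
}
```

The stripe count is the knob for the memory-versus-concurrency trade-off mentioned in the description: more stripes mean less contention but more lock objects.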
> *2. Making NameCache thread-safe*
> This is simpler than the LightWeightGSet case. We only need to combine
> ConcurrentHashMap and AtomicInteger to implement a thread-safe version of
> NameCache.
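
As a rough illustration of combining ConcurrentHashMap and AtomicInteger for this purpose, a minimal sketch might look like the one below. The class `ConcurrentNameCacheSketch` and its methods are hypothetical; it also promotes a name as soon as its use count reaches the threshold, which is a simplification chosen for brevity and not necessarily how NameCache or the patch behaves.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical thread-safe name cache; a simplified sketch, not the patch's NameCache. */
final class ConcurrentNameCacheSketch<K> {
  private final int useThreshold;
  // Use counts for names that have not been promoted yet.
  private final ConcurrentHashMap<K, AtomicInteger> useCounts = new ConcurrentHashMap<>();
  // Promoted names, mapped to their canonical instance.
  private final ConcurrentHashMap<K, K> cache = new ConcurrentHashMap<>();
  private final AtomicInteger lookups = new AtomicInteger();

  ConcurrentNameCacheSketch(int useThreshold) {
    this.useThreshold = useThreshold;
  }

  /**
   * Records one use of the name. Returns the canonical cached instance once the
   * name has been used at least useThreshold times, otherwise null.
   */
  K put(K name) {
    K cached = cache.get(name);
    if (cached != null) {
      lookups.incrementAndGet();
      return cached;
    }
    int uses = useCounts.computeIfAbsent(name, k -> new AtomicInteger()).incrementAndGet();
    if (uses >= useThreshold) {
      // putIfAbsent keeps whichever instance was promoted first.
      K previous = cache.putIfAbsent(name, name);
      return previous != null ? previous : name;
    }
    return null;
  }

  /** Number of calls that were answered from the promoted cache. */
  int getLookupCount() { return lookups.get(); }

  int size() { return cache.size(); }
}
```

Both maps tolerate concurrent callers without external locking, and the counters never lose increments because each one is an AtomicInteger.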
> *3. Reducing GC pressure during FSImage loading*
> After completing steps 1 and 2, we found that GC gradually became the new
> bottleneck. After analysis, we discovered that protobuf's parseDelimitedFrom
> method allocates a new 4096-byte buffer array each time it parses an INode
> object.
> To address this, we introduced the DelimitedProtoBufParseHelper
> utility class, which reuses the buffer array.
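
The buffer-reuse idea can be shown without any protobuf dependency. The sketch below reads varint-length-prefixed records (the framing that delimited protobuf messages use) into a single buffer that is allocated once and only grown when a record does not fit. `ReusableDelimitedReader` is a hypothetical stand-in written for this example and is not the DelimitedProtoBufParseHelper from the patch; with protobuf-java, the filled region could presumably then be handed to a parser, for instance through `CodedInputStream.newInstance(byte[], int, int)`.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical reader that reuses one buffer across length-delimited records. */
final class ReusableDelimitedReader {
  private byte[] buffer = new byte[4096]; // allocated once, grown only when needed
  private int length;                     // valid bytes of the current record

  /** Reads the next record into the shared buffer; returns false at a clean end of stream. */
  boolean next(InputStream in) throws IOException {
    int first = in.read();
    if (first < 0) return false;
    int size = readVarint32(first, in);
    if (size < 0) throw new IOException("negative record size");
    if (size > buffer.length) {
      buffer = new byte[Math.max(size, buffer.length * 2)];
    }
    int off = 0;
    while (off < size) {
      int n = in.read(buffer, off, size - off);
      if (n < 0) throw new EOFException("truncated record");
      off += n;
    }
    length = size;
    return true;
  }

  /** The shared buffer; only the first length() bytes belong to the current record. */
  byte[] buffer() { return buffer; }

  int length() { return length; }

  // Decodes a base-128 varint whose first byte has already been read.
  private static int readVarint32(int firstByte, InputStream in) throws IOException {
    int result = firstByte & 0x7f;
    int shift = 7;
    int b = firstByte;
    while ((b & 0x80) != 0) {
      if (shift >= 32) throw new IOException("malformed varint");
      b = in.read();
      if (b < 0) throw new EOFException("truncated varint");
      result |= (b & 0x7f) << shift;
      shift += 7;
    }
    return result;
  }

  /** Usage example: counts records while allocating the buffer only once. */
  static int countRecords(byte[] data) {
    ReusableDelimitedReader reader = new ReusableDelimitedReader();
    InputStream in = new ByteArrayInputStream(data);
    try {
      int count = 0;
      while (reader.next(in)) count++;
      return count;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Because the caller must finish with the current record before asking for the next one, a reader like this is best kept per thread and not shared between loader threads.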
> Appendix: Test environment and configuration information
> *Hadoop version*: current master, including previous fsimage loading
> optimizations: HDFS-13694, HDFS-14617, HDFS-15493
> *FSImage information*:
> Size: 20G (165M inodes & 258M blocks)
> *Config:*
> dfs.image.parallel.threads=16
> dfs.image.parallel.target.sections=128
> dfs.image.parallel.load=true
> *new config in this patch:*
> dfs.image.concurrent.init.inode.map.enable=true
> dfs.image.name.cache.init.thread.num=16
> dfs.image.block.map.init.thread.num=16
> *java version*: jdk17 (because GC became the new bottleneck, jdk17 performs
> better than jdk8 in our tests)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)