pan3793 opened a new pull request, #8239:
URL: https://github.com/apache/hadoop/pull/8239
### Description of PR
Replace `FastByteComparisons` (derived from Guava) with the JDK's
`Arrays.compareUnsigned` (added in JDK 9 and optimized in modern JDKs).
Guava itself has used `Arrays.compareUnsigned`, when it is available, since v33.4.5:
https://github.com/google/guava/commit/b3bb29a54b8f13d6f6630b6cb929867adbf6b9a0
> The benchmarks suggest that the old, `Unsafe`-based implementation is faster up to 64 elements or so but that it's a matter of only about a nanosecond. The new implementation pulls ahead for larger arrays, including an advantage of about 5-10 ns for 1024 elements.
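
To illustrate what the replacement relies on, here is a minimal sketch of an unsigned lexicographic comparison of two byte ranges using `Arrays.compareUnsigned`. The class name `UnsignedBytesExample` and the `(array, offset, length)` parameter shape of `compareTo` are assumptions made for illustration; this is not the code changed by this PR.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not Hadoop's actual class.
final class UnsignedBytesExample {

    // Compares b1[s1, s1 + l1) with b2[s2, s2 + l2) lexicographically,
    // treating every byte as an unsigned value in 0..255.
    static int compareTo(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Arrays.compareUnsigned takes exclusive end indices, not lengths.
        return Arrays.compareUnsigned(b1, s1, s1 + l1, b2, s2, s2 + l2);
    }

    public static void main(String[] args) {
        byte[] a = {0x01, (byte) 0x80};
        byte[] b = {0x01, 0x7f};

        // Unsigned view: 0x80 (128) is greater than 0x7f (127).
        System.out.println(Integer.signum(compareTo(a, 0, 2, b, 0, 2))); // prints 1

        // Signed comparison orders the same arrays the other way round,
        // because (byte) 0x80 is -128.
        System.out.println(Integer.signum(Arrays.compare(a, b))); // prints -1

        // A range that is a proper prefix of the other sorts first.
        System.out.println(Integer.signum(compareTo(a, 0, 1, b, 0, 2))); // prints -1
    }
}
```

Only the sign of the result should be relied on: the JDK documents the return value as negative, zero, or positive, not as a specific magnitude.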
### How was this patch tested?
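Created a JMH benchmark: https://github.com/pan3793/HADOOP-19810.
TL;DR - in the results below, JDK `Arrays.compareUnsigned` is faster on the
x86_64 platform for lengths of 1048576 and up (about 1.25x to 1.9x, with wider
error bars on some JDK runs) and performs the same, within error, on Apple
Silicon. For lengths up to 1024 the single-shot scores are too small to
separate the two implementations.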
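
As a complement to the timing runs, a quick equivalence check can confirm that a plain byte-by-byte unsigned comparison and `Arrays.compareUnsigned` agree in sign. The sketch below is not the benchmark from the linked repository. The `equalPair` and `diffLastPair` input shapes are only inferred from the benchmark method names (`*Equal`, `*DiffLast`), and `naiveCompare` is a stand-in baseline, not Hadoop's `FastByteComparisons`.

```java
import java.util.Arrays;

// Hypothetical sanity check; class and method names are illustrative.
final class CompareSanityCheck {

    // Baseline: byte-by-byte unsigned lexicographic comparison.
    static int naiveCompare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Two distinct arrays with identical contents (assumed "Equal" case).
    static byte[][] equalPair(int length) {
        byte[] a = new byte[length];
        Arrays.fill(a, (byte) 0x5a);
        return new byte[][] {a, a.clone()};
    }

    // Identical except for the final byte (assumed "DiffLast" case).
    static byte[][] diffLastPair(int length) {
        byte[][] pair = equalPair(length);
        pair[1][length - 1] = (byte) 0xff;
        return pair;
    }

    // True when both comparisons report the same ordering.
    static boolean agree(byte[] a, byte[] b) {
        return Integer.signum(naiveCompare(a, b))
            == Integer.signum(Arrays.compareUnsigned(a, b));
    }

    public static void main(String[] args) {
        // A subset of the lengths used in the benchmark tables.
        int[] lengths = {4, 8, 64, 1024, 1048576, 1048577};
        for (int length : lengths) {
            byte[][] eq = equalPair(length);
            byte[][] diff = diffLastPair(length);
            System.out.printf("length=%d equal=%b diffLast=%b%n",
                length, agree(eq[0], eq[1]), agree(diff[0], diff[1]));
        }
    }
}
```

Both input shapes force a full scan of the arrays, which is presumably why they are useful for timing: the comparison cannot exit early.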
## Benchmark results
- Ubuntu 24.04 LTS x86_64
- CPU: Intel i5-9500 (6) @ 4.400GHz
- Kernel: 6.17.9-76061709-generic
- OpenJDK Runtime Environment Temurin-25.0.2+10 (build 25.0.2+10-LTS)
```
Benchmark                                (length)  Mode  Cnt   Score   Error  Units
BytesComparatorBenchmark.hadoopDiffLast         4    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopDiffLast         8    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopDiffLast        64    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopDiffLast      1024    ss    5   0.001 ± 0.001  s/op
BytesComparatorBenchmark.hadoopDiffLast   1048576    ss    5   0.598 ± 0.026  s/op
BytesComparatorBenchmark.hadoopDiffLast   1048577    ss    5   0.711 ± 0.021  s/op
BytesComparatorBenchmark.hadoopDiffLast   6710884    ss    5   6.213 ± 0.152  s/op
BytesComparatorBenchmark.hadoopDiffLast   6710883    ss    5   6.198 ± 0.253  s/op
BytesComparatorBenchmark.jdkDiffLast            4    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkDiffLast            8    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkDiffLast           64    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkDiffLast         1024    ss    5  ≈ 10⁻³          s/op
BytesComparatorBenchmark.jdkDiffLast      1048576    ss    5   0.391 ± 0.128  s/op
BytesComparatorBenchmark.jdkDiffLast      1048577    ss    5   0.377 ± 0.106  s/op
BytesComparatorBenchmark.jdkDiffLast      6710884    ss    5   4.955 ± 2.631  s/op
BytesComparatorBenchmark.jdkDiffLast      6710883    ss    5   4.310 ± 1.008  s/op
BytesComparatorBenchmark.hadoopEqual            4    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopEqual            8    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopEqual           64    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopEqual         1024    ss    5   0.001 ± 0.001  s/op
BytesComparatorBenchmark.hadoopEqual      1048576    ss    5   0.610 ± 0.025  s/op
BytesComparatorBenchmark.hadoopEqual      1048577    ss    5   0.711 ± 0.016  s/op
BytesComparatorBenchmark.hadoopEqual      6710884    ss    5   6.512 ± 1.566  s/op
BytesComparatorBenchmark.hadoopEqual      6710883    ss    5   6.714 ± 1.409  s/op
BytesComparatorBenchmark.jdkEqual               4    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkEqual               8    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkEqual              64    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkEqual            1024    ss    5  ≈ 10⁻³          s/op
BytesComparatorBenchmark.jdkEqual         1048576    ss    5   0.381 ± 0.123  s/op
BytesComparatorBenchmark.jdkEqual         1048577    ss    5   0.369 ± 0.049  s/op
BytesComparatorBenchmark.jdkEqual         6710884    ss    5   4.236 ± 0.252  s/op
BytesComparatorBenchmark.jdkEqual         6710883    ss    5   4.184 ± 0.236  s/op
```
- CPU: Apple M1 Max
- OpenJDK Runtime Environment Temurin-25.0.2+10 (build 25.0.2+10-LTS)
```
Benchmark                                (length)  Mode  Cnt   Score   Error  Units
BytesComparatorBenchmark.hadoopDiffLast         4    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopDiffLast         8    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopDiffLast        64    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopDiffLast      1024    ss    5  ≈ 10⁻³          s/op
BytesComparatorBenchmark.hadoopDiffLast   1048576    ss    5   0.404 ± 0.015  s/op
BytesComparatorBenchmark.hadoopDiffLast   1048577    ss    5   0.392 ± 0.013  s/op
BytesComparatorBenchmark.hadoopDiffLast   6710884    ss    5   2.872 ± 0.057  s/op
BytesComparatorBenchmark.hadoopDiffLast   6710883    ss    5   2.893 ± 0.046  s/op
BytesComparatorBenchmark.jdkDiffLast            4    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkDiffLast            8    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkDiffLast           64    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkDiffLast         1024    ss    5   0.001 ± 0.001  s/op
BytesComparatorBenchmark.jdkDiffLast      1048576    ss    5   0.399 ± 0.009  s/op
BytesComparatorBenchmark.jdkDiffLast      1048577    ss    5   0.390 ± 0.004  s/op
BytesComparatorBenchmark.jdkDiffLast      6710884    ss    5   2.913 ± 0.029  s/op
BytesComparatorBenchmark.jdkDiffLast      6710883    ss    5   2.897 ± 0.015  s/op
BytesComparatorBenchmark.hadoopEqual            4    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopEqual            8    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopEqual           64    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.hadoopEqual         1024    ss    5  ≈ 10⁻³          s/op
BytesComparatorBenchmark.hadoopEqual      1048576    ss    5   0.393 ± 0.011  s/op
BytesComparatorBenchmark.hadoopEqual      1048577    ss    5   0.395 ± 0.012  s/op
BytesComparatorBenchmark.hadoopEqual      6710884    ss    5   2.877 ± 0.005  s/op
BytesComparatorBenchmark.hadoopEqual      6710883    ss    5   2.903 ± 0.023  s/op
BytesComparatorBenchmark.jdkEqual               4    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkEqual               8    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkEqual              64    ss    5  ≈ 10⁻⁴          s/op
BytesComparatorBenchmark.jdkEqual            1024    ss    5  ≈ 10⁻³          s/op
BytesComparatorBenchmark.jdkEqual         1048576    ss    5   0.406 ± 0.009  s/op
BytesComparatorBenchmark.jdkEqual         1048577    ss    5   0.390 ± 0.008  s/op
BytesComparatorBenchmark.jdkEqual         6710884    ss    5   2.900 ± 0.044  s/op
BytesComparatorBenchmark.jdkEqual         6710883    ss    5   2.912 ± 0.015  s/op
```
### For code changes:
- [x] Does the title of this PR start with the corresponding JIRA issue id
(HADOOP-19810)?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
### AI Tooling
No AI usage.