> In JDK-8248188, an IntrinsicCandidate and API were added for Base64 decoding.
> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3; the basic idea
> is described at
> http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>
> The patch passed jtreg tier1-3 tests with a linux-aarch64-server-fastdebug build.
> The tests in `test/jdk/java/util/Base64/` and
> `compiler/intrinsics/base64/TestBase64.java` were also run specifically to check
> the correctness of the implementation.
>
> There can be illegal characters at the start of the input if the data is MIME
> encoded.
> There would be no benefit in using SIMD for this case, so the stub currently
> uses non-SIMD instructions for MIME encoded data.
>
> A JMH micro, Base64Decode.java, is added for performance testing.
> With different input lengths (upper-bounded by the parameter `maxNumBytes` in the
> JMH micro),
> we see ~2.5x improvements with long inputs and no regression with short
> inputs for raw base64 decoding, and minor improvements (~10.95%) for MIME on
> Kunpeng916.
>
> The Base64Decode.java JMH micro-benchmark results:
>
> Benchmark                          (lineSize)  (maxNumBytes)  Mode  Cnt       Score      Error  Units
>
> # Kunpeng916, intrinsic
> Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±    0.609  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±    1.650  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±    0.931  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±    1.687  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±    9.217  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±    1.667  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±    1.751  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±    1.178  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±    0.584  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±    1.050  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±   45.172  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±  192.070  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±    0.759  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op
>
> # Kunpeng916, default
> Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±    0.571  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±    0.505  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±    1.452  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±    1.243  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±    1.188  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±    0.177  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±    1.559  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±    0.813  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±   24.669  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ±  523.155  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±    0.238  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±    0.205  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±    0.610  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±    4.732  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±    0.177  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±    4.572  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±    0.301  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±    0.389  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±    2.519  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±    9.281  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ±  157.920  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ±  216.542  ns/op
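
As an illustration of the MIME remark in the quoted description (this is not part of the patch), the sketch below uses only the public `java.util.Base64` API to show the behavioural difference the stub has to respect: the basic decoder rejects characters outside the Base64 alphabet, while the MIME decoder skips them, including at the very start of the input. The class name `MimeVsBasicDecode` and its helper methods are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class, not taken from the pull request.
public class MimeVsBasicDecode {

    // Returns true if the basic (RFC 4648) decoder rejects the input.
    public static boolean basicRejects(String s) {
        try {
            Base64.getDecoder().decode(s);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown when the input contains a character outside the alphabet.
            return true;
        }
    }

    // Decodes with the MIME decoder, which ignores non-alphabet characters.
    public static String mimeDecode(String s) {
        byte[] raw = Base64.getMimeDecoder().decode(s);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String payload = Base64.getEncoder()
                .encodeToString("Hello".getBytes(StandardCharsets.UTF_8));
        // A line separator placed before the first valid Base64 character.
        String withPrefix = "\r\n" + payload;

        System.out.println(basicRejects(payload));    // false
        System.out.println(basicRejects(withPrefix)); // true
        System.out.println(mimeDecode(withPrefix));   // Hello
    }
}
```

Because the MIME decoder may have to step over such characters one at a time, a vector loop that assumes a run of valid characters has little to work with, which is consistent with the description's choice of non-SIMD instructions for that path.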

Dong Bo has updated the pull request incrementally with one additional commit since the last revision:

  fix misleading annotations

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3228/files
  - new: https://git.openjdk.java.net/jdk/pull/3228/files/faa830cf..a342ad1e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=04-05

  Stats: 4 lines in 1 file changed: 0 ins; 1 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3228.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228

PR: https://git.openjdk.java.net/jdk/pull/3228