Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]
On Thu, 2 Feb 2023 15:33:29 GMT, Scott Gibbons wrote: >> Names are important, but always hard to get right. At the very least they >> need to be correct. Maybe call it something like >> `..parameterized_decode_tables..` and the other `..shared_decode_tables..`? > > I prefer leaving them the way they are. I don't think the names, along with > the associated comments within the tables, causes any undue confusion as to > their function. However I will implement the name change if that's all it > takes to procure a review approval. Please provide the specific names you'd > like me to use and I'll change them. Or just approve as-is :-). I meant no disrespect here. By your own words `base64_AVX2_decode_URL_tables_addr` is "essentially incorrect", so I suggested some alternatives that I thought would make the code slightly more approachable. Regardless of whether you fix this detail I am not ready to approve this PR since I would need time to digest the latest changes. I'll ask a more senior engineer to review and give final approval (these changes need 2 reviewers approval anyhow). - PR: https://git.openjdk.org/jdk/pull/12126
Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]
On Thu, 2 Feb 2023 00:34:32 GMT, Claes Redestad wrote: >> These tables are used for both URL and non-URL based on the parameter, and >> they are only two of the three lut tables used (the other is in >> `base64_AVX2_decode_tables_addr` ). Both names are essentially incorrect. >> Does the name really matter that much? It's the same as >> `base64_AVX2_decode_tables_addr` with the addition of URL tables. > > Names are important, but always hard to get right. At the very least they > need to be correct. Maybe call it something like > `..parameterized_decode_tables..` and the other `..shared_decode_tables..`? I prefer leaving them the way they are. I don't think the names, along with the associated comments within the tables, causes any undue confusion as to their function. However I will implement the name change if that's all it takes to procure a review approval. Please provide the specific names you'd like me to use and I'll change them. Or just approve as-is :-). - PR: https://git.openjdk.org/jdk/pull/12126
Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]
On Wed, 1 Feb 2023 20:59:24 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2202: >> >>> 2200: } >>> 2201: >>> 2202: address StubGenerator::base64_AVX2_decode_URL_tables_addr() { >> >> Shouldn't this be `decode_lut_tables`? As it's used for URL and non-URL >> decoding alike. > > These tables are used for both URL and non-URL based on the parameter, and > they are only two of the three lut tables used (the other is in > `base64_AVX2_decode_tables_addr` ). Both names are essentially incorrect. > Does the name really matter that much? It's the same as > `base64_AVX2_decode_tables_addr` with the addition of URL tables. Names are important, but always hard to get right. At the very least they need to be correct. Maybe call it something like `..parameterized_decode_tables..` and the other `..shared_decode_tables..`? - PR: https://git.openjdk.org/jdk/pull/12126
Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]
On Wed, 1 Feb 2023 20:53:54 GMT, Claes Redestad wrote: >> Scott Gibbons has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Handle AVX2 URL; address review comments > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2202: > >> 2200: } >> 2201: >> 2202: address StubGenerator::base64_AVX2_decode_URL_tables_addr() { > > Shouldn't this be `decode_lut_tables`? As it's used for URL and non-URL > decoding alike. These tables are used for both URL and non-URL based on the parameter, and they are only two of the three lut tables used (the other is in `base64_AVX2_decode_tables_addr` ). Both names are essentially incorrect. Does the name really matter that much? It's the same as `base64_AVX2_decode_tables_addr` with the addition of URL tables. - PR: https://git.openjdk.org/jdk/pull/12126
Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]
On Wed, 1 Feb 2023 18:28:25 GMT, Scott Gibbons wrote: >> Added code for Base64 acceleration (encode and decode) which will accelerate >> ~4x for AVX2 platforms. >> >> Encode performance: >> **Old:** >> >> Benchmark (maxNumBytes) Mode Cnt Score Error >> Units >> Base64Encode.testBase64Encode 1024 thrpt3 4309.439 ± 2.632 >> ops/ms >> >> >> **New:** >> >> Benchmark (maxNumBytes) Mode Cnt Score >> Error Units >> Base64Encode.testBase64Encode 1024 thrpt3 24211.397 ± >> 102.026 ops/ms >> >> >> Decode performance: >> **Old:** >> >> Benchmark (errorIndex) (lineSize) (maxNumBytes) >> Mode Cnt ScoreError Units >> Base64Decode.testBase64Decode 144 4 1024 >> thrpt3 3961.768 ± 93.409 ops/ms >> >> **New:** >> Benchmark (errorIndex) (lineSize) (maxNumBytes) >> Mode Cnt ScoreError Units >> Base64Decode.testBase64Decode 144 4 1024 >> thrpt3 14738.051 ± 24.383 ops/ms > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Handle AVX2 URL; address review comments src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2202: > 2200: } > 2201: > 2202: address StubGenerator::base64_AVX2_decode_URL_tables_addr() { Shouldn't this be `decode_lut_tables`? As it's used for URL and non-URL decoding alike. - PR: https://git.openjdk.org/jdk/pull/12126
Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]
> Added code for Base64 acceleration (encode and decode) which will accelerate > ~4x for AVX2 platforms. > > Encode performance: > **Old:** > > Benchmark (maxNumBytes) Mode Cnt Score Error > Units > Base64Encode.testBase64Encode 1024 thrpt3 4309.439 ± 2.632 > ops/ms > > > **New:** > > Benchmark (maxNumBytes) Mode Cnt Score Error > Units > Base64Encode.testBase64Encode 1024 thrpt3 24211.397 ± 102.026 > ops/ms > > > Decode performance: > **Old:** > > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode > Cnt ScoreError Units > Base64Decode.testBase64Decode 144 4 1024 thrpt >3 3961.768 ± 93.409 ops/ms > > **New:** > Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode > Cnt ScoreError Units > Base64Decode.testBase64Decode 144 4 1024 thrpt >3 14738.051 ± 24.383 ops/ms Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Handle AVX2 URL; address review comments - Changes: - all: https://git.openjdk.org/jdk/pull/12126/files - new: https://git.openjdk.org/jdk/pull/12126/files/3e66f7be..4007e984 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=05-06 Stats: 94 lines in 4 files changed: 48 ins; 32 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/12126.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126 PR: https://git.openjdk.org/jdk/pull/12126