Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]

2023-02-05 Thread Claes Redestad
On Thu, 2 Feb 2023 15:33:29 GMT, Scott Gibbons  wrote:

>> Names are important, but always hard to get right. At the very least they 
>> need to be correct. Maybe call it something like 
>> `..parameterized_decode_tables..` and the other `..shared_decode_tables..`?
>
> I prefer leaving them the way they are.  I don't think the names, along with 
> the associated comments within the tables, causes any undue confusion as to 
> their function.  However I will implement the name change if that's all it 
> takes to procure a review approval.  Please provide the specific names you'd 
> like me to use and I'll change them.  Or just approve as-is :-).

I meant no disrespect here. By your own words 
`base64_AVX2_decode_URL_tables_addr` is "essentially incorrect", so I suggested 
some alternatives that I thought would make the code slightly more 
approachable. Regardless of whether you fix this detail I am not ready to 
approve this PR since I would need time to digest the latest changes. I'll ask 
a more senior engineer to review and give final approval (these changes need 2 
reviewers approval anyhow).

-

PR: https://git.openjdk.org/jdk/pull/12126


Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]

2023-02-02 Thread Scott Gibbons
On Thu, 2 Feb 2023 00:34:32 GMT, Claes Redestad  wrote:

>> These tables are used for both URL and non-URL based on the parameter, and 
>> they are only two of the three lut tables used (the other is in 
>> `base64_AVX2_decode_tables_addr` ).  Both names are essentially incorrect.  
>> Does the name really matter that much?  It's the same as 
>> `base64_AVX2_decode_tables_addr` with the addition of URL tables.
>
> Names are important, but always hard to get right. At the very least they 
> need to be correct. Maybe call it something like 
> `..parameterized_decode_tables..` and the other `..shared_decode_tables..`?

I prefer leaving them the way they are.  I don't think the names, along with 
the associated comments within the tables, causes any undue confusion as to 
their function.  However I will implement the name change if that's all it 
takes to procure a review approval.  Please provide the specific names you'd 
like me to use and I'll change them.  Or just approve as-is :-).

-

PR: https://git.openjdk.org/jdk/pull/12126


Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]

2023-02-01 Thread Claes Redestad
On Wed, 1 Feb 2023 20:59:24 GMT, Scott Gibbons  wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2202:
>> 
>>> 2200: }
>>> 2201: 
>>> 2202: address StubGenerator::base64_AVX2_decode_URL_tables_addr() {
>> 
>> Shouldn't this be `decode_lut_tables`? As it's used for URL and non-URL 
>> decoding alike.
>
> These tables are used for both URL and non-URL based on the parameter, and 
> they are only two of the three lut tables used (the other is in 
> `base64_AVX2_decode_tables_addr` ).  Both names are essentially incorrect.  
> Does the name really matter that much?  It's the same as 
> `base64_AVX2_decode_tables_addr` with the addition of URL tables.

Names are important, but always hard to get right. At the very least they need 
to be correct. Maybe call it something like `..parameterized_decode_tables..` 
and the other `..shared_decode_tables..`?

-

PR: https://git.openjdk.org/jdk/pull/12126


Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]

2023-02-01 Thread Scott Gibbons
On Wed, 1 Feb 2023 20:53:54 GMT, Claes Redestad  wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Handle AVX2 URL; address review comments
>
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2202:
> 
>> 2200: }
>> 2201: 
>> 2202: address StubGenerator::base64_AVX2_decode_URL_tables_addr() {
> 
> Shouldn't this be `decode_lut_tables`? As it's used for URL and non-URL 
> decoding alike.

These tables are used for both URL and non-URL based on the parameter, and they 
are only two of the three lut tables used (the other is in 
`base64_AVX2_decode_tables_addr` ).  Both names are essentially incorrect.  
Does the name really matter that much?  It's the same as 
`base64_AVX2_decode_tables_addr` with the addition of URL tables.

-

PR: https://git.openjdk.org/jdk/pull/12126


Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]

2023-02-01 Thread Claes Redestad
On Wed, 1 Feb 2023 18:28:25 GMT, Scott Gibbons  wrote:

>> Added code for Base64 acceleration (encode and decode) which will accelerate 
>> ~4x for AVX2 platforms.
>> 
>> Encode performance:
>> **Old:**
>> 
>> Benchmark  (maxNumBytes)   Mode  Cnt Score   Error   
>> Units
>> Base64Encode.testBase64Encode   1024  thrpt3  4309.439 ± 2.632  
>> ops/ms
>> 
>> 
>> **New:**
>> 
>> Benchmark  (maxNumBytes)   Mode  Cnt  Score 
>> Error   Units
>> Base64Encode.testBase64Encode   1024  thrpt3  24211.397 ± 
>> 102.026  ops/ms
>> 
>> 
>> Decode performance:
>> **Old:**
>> 
>> Benchmark  (errorIndex)  (lineSize)  (maxNumBytes)   
>> Mode  Cnt ScoreError   Units
>> Base64Decode.testBase64Decode   144   4   1024  
>> thrpt3  3961.768 ± 93.409  ops/ms
>> 
>> **New:**
>> Benchmark  (errorIndex)  (lineSize)  (maxNumBytes)   
>> Mode  Cnt  ScoreError   Units
>> Base64Decode.testBase64Decode   144   4   1024  
>> thrpt3  14738.051 ± 24.383  ops/ms
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Handle AVX2 URL; address review comments

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2202:

> 2200: }
> 2201: 
> 2202: address StubGenerator::base64_AVX2_decode_URL_tables_addr() {

Shouldn't this be `decode_lut_tables`? As it's used for URL and non-URL 
decoding alike.

-

PR: https://git.openjdk.org/jdk/pull/12126


Re: RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v7]

2023-02-01 Thread Scott Gibbons
> Added code for Base64 acceleration (encode and decode) which will accelerate 
> ~4x for AVX2 platforms.
> 
> Encode performance:
> **Old:**
> 
> Benchmark  (maxNumBytes)   Mode  Cnt Score   Error   
> Units
> Base64Encode.testBase64Encode   1024  thrpt3  4309.439 ± 2.632  
> ops/ms
> 
> 
> **New:**
> 
> Benchmark  (maxNumBytes)   Mode  Cnt  Score Error 
>   Units
> Base64Encode.testBase64Encode   1024  thrpt3  24211.397 ± 102.026 
>  ops/ms
> 
> 
> Decode performance:
> **Old:**
> 
> Benchmark  (errorIndex)  (lineSize)  (maxNumBytes)   Mode 
>  Cnt ScoreError   Units
> Base64Decode.testBase64Decode   144   4   1024  thrpt 
>3  3961.768 ± 93.409  ops/ms
> 
> **New:**
> Benchmark  (errorIndex)  (lineSize)  (maxNumBytes)   Mode 
>  Cnt  ScoreError   Units
> Base64Decode.testBase64Decode   144   4   1024  thrpt 
>3  14738.051 ± 24.383  ops/ms

Scott Gibbons has updated the pull request incrementally with one additional 
commit since the last revision:

  Handle AVX2 URL; address review comments

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12126/files
  - new: https://git.openjdk.org/jdk/pull/12126/files/3e66f7be..4007e984

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12126&range=05-06

  Stats: 94 lines in 4 files changed: 48 ins; 32 del; 14 mod
  Patch: https://git.openjdk.org/jdk/pull/12126.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12126/head:pull/12126

PR: https://git.openjdk.org/jdk/pull/12126