Re: RFR: 8333893: Optimization for StringBuilder append boolean & null [v8]

2024-06-19 Thread Shaojin Wen
On Thu, 13 Jun 2024 09:51:21 GMT, Claes Redestad  wrote:

>> Shaojin Wen has updated the pull request incrementally with two additional 
>> commits since the last revision:
>> 
>>  - rename benchmark
>>  - code format & use long address
>
> Replacing `StringUTF16.putChar` with an `Unsafe.putByte` construct seems safe 
> since the former already depends on bounds checks having been done elsewhere. 
> Replacing `byte[]` stores with `Unsafe.putByte` requires a thorough 
> examination as that would drop implicit bounds checks.  
> 
> On my M1 laptop I get similar numbers:
> 
> Name Cnt  Base   Error   Test   Error  Unit  
> Change
> StringBuilders.appendWithBool8Latin1  15 7,500 ± 0,072  6,112 ± 0,049 ns/op   
> 1,23x (p = 0,000*)
> StringBuilders.appendWithBool8Utf16   15 9,650 ± 0,021  7,304 ± 0,047 ns/op   
> 1,32x (p = 0,000*)
> StringBuilders.appendWithNull8Latin1  15 7,118 ± 0,033  5,158 ± 0,062 ns/op   
> 1,38x (p = 0,000*)
> StringBuilders.appendWithNull8Utf16   15 9,336 ± 0,083  6,870 ± 0,069 ns/op   
> 1,36x (p = 0,000*)
>   * = significant
> 
> 
> On a linux-x64 workstation some of the micros regress, though:
> 
> Name Cnt   BaseErrorTest   Error  
> Unit  Change
> StringBuilders.appendWithBool8Latin1  15 10.781 ±  0.236  16.279 ± 1.087 
> ns/op   0.66x (p = 0.000*)
> StringBuilders.appendWithBool8Utf16   15 22.942 ±  0.405  23.733 ± 0.792 
> ns/op   0.97x (p = 0.001*)
> StringBuilders.appendWithNull8Latin1  15 24.313 ± 10.479  24.397 ± 7.483 
> ns/op   1.00x (p = 0.979 )
> StringBuilders.appendWithNull8Utf16   15 38.704 ±  5.972  27.542 ± 5.620 
> ns/op   1.41x (p = 0.000*)
>   * = significant
> 
> 
> `-XX:+TraceMergeStores` seem to indicate the merging happens as it should in 
> each case:
> 
> 
> [TraceMergeStores]: Replace
>  2493  StoreB  === 1387 5370 2980 2939  [[ 1962 ]]  @byte[int:>=0] 
> (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe  
> Memory: @byte[int:>=0] 
> (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; !jvms: 
> StringLatin1::putCharsAt @ bci:50 (line 834) 
> AbstractStringBuilder::appendNull @ bci:34 (line 642) 
> AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ 
> bci:2 (line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) 
> StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub
>  @ bci:17 (line 190)
>  1962  StoreB  === 1387 2493 2494 2442  [[ 1388 ]]  @byte[int:>=0] 
> (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe  
> Memory: @byte[int:>=0] 
> (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; !jvms: 
> StringLatin1::putCharsAt @ bci:62 (line 835) 
> AbstractStringBuilder::appendNull @ bci:34 (line 642) 
> AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append...

The performance regression issue on x64 platform has been fixed. Please 
continue to review @cl4es @liach

-

PR Comment: https://git.openjdk.org/jdk/pull/19626#issuecomment-2179620100


Re: RFR: 8333893: Optimization for StringBuilder append boolean & null [v8]

2024-06-13 Thread Shaojin Wen
On Thu, 13 Jun 2024 02:06:42 GMT, Shaojin Wen  wrote:

>> After PR https://github.com/openjdk/jdk/pull/16245, C2 optimizes stores into 
>> primitive arrays by combining values ​​into larger stores.
>> 
>> This PR rewrites the code of appendNull and append(boolean) methods so that 
>> these two methods can be optimized by C2.
>
> Shaojin Wen has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - rename benchmark
>  - code format & use long address

I also found that the `StringBuilders.appendWithBool8Latin1` case has 
performance regression on Intel® Xeon® Emerald Rapids and AMD EPYCTM Genoa.

I haven't found the reason yet

-

PR Comment: https://git.openjdk.org/jdk/pull/19626#issuecomment-2165434407


Re: RFR: 8333893: Optimization for StringBuilder append boolean & null [v8]

2024-06-13 Thread Claes Redestad
On Thu, 13 Jun 2024 02:06:42 GMT, Shaojin Wen  wrote:

>> After PR https://github.com/openjdk/jdk/pull/16245, C2 optimizes stores into 
>> primitive arrays by combining values ​​into larger stores.
>> 
>> This PR rewrites the code of appendNull and append(boolean) methods so that 
>> these two methods can be optimized by C2.
>
> Shaojin Wen has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - rename benchmark
>  - code format & use long address

Replacing `StringUTF16.putChar` with an `Unsafe.putByte` construct seems safe 
since the former already depends on bounds checks having been done elsewhere. 
Replacing `byte[]` stores with `Unsafe.putByte` requires a thorough examination 
as that would drop implicit bounds checks.  

On my M1 laptop I get similar numbers:

Name Cnt  Base   Error   Test   Error  Unit  
Change
StringBuilders.appendWithBool8Latin1  15 7,500 ± 0,072  6,112 ± 0,049 ns/op   
1,23x (p = 0,000*)
StringBuilders.appendWithBool8Utf16   15 9,650 ± 0,021  7,304 ± 0,047 ns/op   
1,32x (p = 0,000*)
StringBuilders.appendWithNull8Latin1  15 7,118 ± 0,033  5,158 ± 0,062 ns/op   
1,38x (p = 0,000*)
StringBuilders.appendWithNull8Utf16   15 9,336 ± 0,083  6,870 ± 0,069 ns/op   
1,36x (p = 0,000*)
  * = significant


On a linux-x64 workstation some of the micros regress, though:

Name Cnt   BaseErrorTest   Error  Unit  
Change
StringBuilders.appendWithBool8Latin1  15 10.781 ±  0.236  16.279 ± 1.087 ns/op  
 0.66x (p = 0.000*)
StringBuilders.appendWithBool8Utf16   15 22.942 ±  0.405  23.733 ± 0.792 ns/op  
 0.97x (p = 0.001*)
StringBuilders.appendWithNull8Latin1  15 24.313 ± 10.479  24.397 ± 7.483 ns/op  
 1.00x (p = 0.979 )
StringBuilders.appendWithNull8Utf16   15 38.704 ±  5.972  27.542 ± 5.620 ns/op  
 1.41x (p = 0.000*)
  * = significant


`-XX:+TraceMergeStores` seem to indicate the merging happens as it should in 
each case:


[TraceMergeStores]: Replace
 2493  StoreB  === 1387 5370 2980 2939  [[ 1962 ]]  @byte[int:>=0] 
(java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe  Memory: 
@byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, 
idx=13; !jvms: StringLatin1::putCharsAt @ bci:50 (line 834) 
AbstractStringBuilder::appendNull @ bci:34 (line 642) 
AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ bci:2 
(line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) 
StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub
 @ bci:17 (line 190)
 1962  StoreB  === 1387 2493 2494 2442  [[ 1388 ]]  @byte[int:>=0] 
(java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe  Memory: 
@byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, 
idx=13; !jvms: StringLatin1::putCharsAt @ bci:62 (line 835) 
AbstractStringBuilder::appendNull @ bci:34 (line 642) 
AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ bci:2 
(line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) 
StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub
 @ bci:17 (line 190)
 1388  StoreB  === 1387 1962 1963 1341  [[ 833 ]]  @byte[int:>=0] 
(java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe  Memory: 
@byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, 
idx=13; !jvms: StringLatin1::putCharsAt @ bci:77 (line 836) 
AbstractStringBuilder::appendNull @ bci:34 (line 642) 
AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ bci:2 
(line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) 
StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub
 @ bci:17 (line 190)
  833  StoreB  === 1387 1388 1389 1341  [[ 5311 425 1400 ]]  @byte[int:>=0] 
(java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe  Memory: 
@byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, 
idx=13; !jvms: StringLatin1::putCharsAt @ bci:92 (line 837) 
AbstractStringBuilder::appendNull @ bci:34 (line 642) 
AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ bci:2 
(line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) 
StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub
 @ bci:17 (line 190)
[TraceMergeStores]: with
 6207  ConI  === 0  [[ 6208 6210 ]]  #int:1819047278
 6210  StoreI  === 1387 5370 2980 6207  [[ ]]  @byte[int:>=0] 
(java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; 
mismatched  Memory: @byte[int:>=0] 
(java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13;

-

PR Comment: https://git.openjdk.org/jdk/pull/19626#issuecomment-2165167049


Re: RFR: 8333893: Optimization for StringBuilder append boolean & null [v8]

2024-06-12 Thread Shaojin Wen
> After PR https://github.com/openjdk/jdk/pull/16245, C2 optimizes stores into 
> primitive arrays by combining values ​​into larger stores.
> 
> This PR rewrites the code of appendNull and append(boolean) methods so that 
> these two methods can be optimized by C2.

Shaojin Wen has updated the pull request incrementally with two additional 
commits since the last revision:

 - rename benchmark
 - code format & use long address

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19626/files
  - new: https://git.openjdk.org/jdk/pull/19626/files/7b3cf602..22d45124

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=19626&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=19626&range=06-07

  Stats: 43 lines in 3 files changed: 8 ins; 2 del; 33 mod
  Patch: https://git.openjdk.org/jdk/pull/19626.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19626/head:pull/19626

PR: https://git.openjdk.org/jdk/pull/19626