Re: RFR: 8333893: Optimization for StringBuilder append boolean & null [v8]
On Thu, 13 Jun 2024 09:51:21 GMT, Claes Redestad wrote: >> Shaojin Wen has updated the pull request incrementally with two additional >> commits since the last revision: >> >> - rename benchmark >> - code format & use long address > > Replacing `StringUTF16.putChar` with an `Unsafe.putByte` construct seems safe > since the former already depends on bounds checks having been done elsewhere. > Replacing `byte[]` stores with `Unsafe.putByte` requires a thorough > examination as that would drop implicit bounds checks. > > On my M1 laptop I get similar numbers: > > Name Cnt Base Error Test Error Unit > Change > StringBuilders.appendWithBool8Latin1 15 7,500 ± 0,072 6,112 ± 0,049 ns/op > 1,23x (p = 0,000*) > StringBuilders.appendWithBool8Utf16 15 9,650 ± 0,021 7,304 ± 0,047 ns/op > 1,32x (p = 0,000*) > StringBuilders.appendWithNull8Latin1 15 7,118 ± 0,033 5,158 ± 0,062 ns/op > 1,38x (p = 0,000*) > StringBuilders.appendWithNull8Utf16 15 9,336 ± 0,083 6,870 ± 0,069 ns/op > 1,36x (p = 0,000*) > * = significant > > > On a linux-x64 workstation some of the micros regress, though: > > Name Cnt BaseErrorTest Error > Unit Change > StringBuilders.appendWithBool8Latin1 15 10.781 ± 0.236 16.279 ± 1.087 > ns/op 0.66x (p = 0.000*) > StringBuilders.appendWithBool8Utf16 15 22.942 ± 0.405 23.733 ± 0.792 > ns/op 0.97x (p = 0.001*) > StringBuilders.appendWithNull8Latin1 15 24.313 ± 10.479 24.397 ± 7.483 > ns/op 1.00x (p = 0.979 ) > StringBuilders.appendWithNull8Utf16 15 38.704 ± 5.972 27.542 ± 5.620 > ns/op 1.41x (p = 0.000*) > * = significant > > > `-XX:+TraceMergeStores` seem to indicate the merging happens as it should in > each case: > > > [TraceMergeStores]: Replace > 2493 StoreB === 1387 5370 2980 2939 [[ 1962 ]] @byte[int:>=0] > (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe > Memory: @byte[int:>=0] > (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; !jvms: > StringLatin1::putCharsAt @ bci:50 (line 834) > AbstractStringBuilder::appendNull @ bci:34 (line 642) > AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ > bci:2 (line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) > StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub > @ bci:17 (line 190) > 1962 StoreB === 1387 2493 2494 2442 [[ 1388 ]] @byte[int:>=0] > (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe > Memory: @byte[int:>=0] > (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; !jvms: > StringLatin1::putCharsAt @ bci:62 (line 835) > AbstractStringBuilder::appendNull @ bci:34 (line 642) > AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append... The performance regression issue on x64 platform has been fixed. Please continue to review @cl4es @liach - PR Comment: https://git.openjdk.org/jdk/pull/19626#issuecomment-2179620100
Re: RFR: 8333893: Optimization for StringBuilder append boolean & null [v8]
On Thu, 13 Jun 2024 02:06:42 GMT, Shaojin Wen wrote: >> After PR https://github.com/openjdk/jdk/pull/16245, C2 optimizes stores into >> primitive arrays by combining values into larger stores. >> >> This PR rewrites the code of appendNull and append(boolean) methods so that >> these two methods can be optimized by C2. > > Shaojin Wen has updated the pull request incrementally with two additional > commits since the last revision: > > - rename benchmark > - code format & use long address I also found that the `StringBuilders.appendWithBool8Latin1` case has performance regression on Intel® Xeon® Emerald Rapids and AMD EPYCTM Genoa. I haven't found the reason yet - PR Comment: https://git.openjdk.org/jdk/pull/19626#issuecomment-2165434407
Re: RFR: 8333893: Optimization for StringBuilder append boolean & null [v8]
On Thu, 13 Jun 2024 02:06:42 GMT, Shaojin Wen wrote: >> After PR https://github.com/openjdk/jdk/pull/16245, C2 optimizes stores into >> primitive arrays by combining values into larger stores. >> >> This PR rewrites the code of appendNull and append(boolean) methods so that >> these two methods can be optimized by C2. > > Shaojin Wen has updated the pull request incrementally with two additional > commits since the last revision: > > - rename benchmark > - code format & use long address Replacing `StringUTF16.putChar` with an `Unsafe.putByte` construct seems safe since the former already depends on bounds checks having been done elsewhere. Replacing `byte[]` stores with `Unsafe.putByte` requires a thorough examination as that would drop implicit bounds checks. On my M1 laptop I get similar numbers: Name Cnt Base Error Test Error Unit Change StringBuilders.appendWithBool8Latin1 15 7,500 ± 0,072 6,112 ± 0,049 ns/op 1,23x (p = 0,000*) StringBuilders.appendWithBool8Utf16 15 9,650 ± 0,021 7,304 ± 0,047 ns/op 1,32x (p = 0,000*) StringBuilders.appendWithNull8Latin1 15 7,118 ± 0,033 5,158 ± 0,062 ns/op 1,38x (p = 0,000*) StringBuilders.appendWithNull8Utf16 15 9,336 ± 0,083 6,870 ± 0,069 ns/op 1,36x (p = 0,000*) * = significant On a linux-x64 workstation some of the micros regress, though: Name Cnt BaseErrorTest Error Unit Change StringBuilders.appendWithBool8Latin1 15 10.781 ± 0.236 16.279 ± 1.087 ns/op 0.66x (p = 0.000*) StringBuilders.appendWithBool8Utf16 15 22.942 ± 0.405 23.733 ± 0.792 ns/op 0.97x (p = 0.001*) StringBuilders.appendWithNull8Latin1 15 24.313 ± 10.479 24.397 ± 7.483 ns/op 1.00x (p = 0.979 ) StringBuilders.appendWithNull8Utf16 15 38.704 ± 5.972 27.542 ± 5.620 ns/op 1.41x (p = 0.000*) * = significant `-XX:+TraceMergeStores` seem to indicate the merging happens as it should in each case: [TraceMergeStores]: Replace 2493 StoreB === 1387 5370 2980 2939 [[ 1962 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; !jvms: StringLatin1::putCharsAt @ bci:50 (line 834) AbstractStringBuilder::appendNull @ bci:34 (line 642) AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ bci:2 (line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub @ bci:17 (line 190) 1962 StoreB === 1387 2493 2494 2442 [[ 1388 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; !jvms: StringLatin1::putCharsAt @ bci:62 (line 835) AbstractStringBuilder::appendNull @ bci:34 (line 642) AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ bci:2 (line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub @ bci:17 (line 190) 1388 StoreB === 1387 1962 1963 1341 [[ 833 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; !jvms: StringLatin1::putCharsAt @ bci:77 (line 836) AbstractStringBuilder::appendNull @ bci:34 (line 642) AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ bci:2 (line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub @ bci:17 (line 190) 833 StoreB === 1387 1388 1389 1341 [[ 5311 425 1400 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=13; unsafe Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; !jvms: StringLatin1::putCharsAt @ bci:92 (line 837) AbstractStringBuilder::appendNull @ bci:34 (line 642) AbstractStringBuilder::append @ bci:5 (line 587) StringBuilder::append @ bci:2 (line 179) StringBuilders::appendWithNull8Latin1 @ bci:38 (line 267) StringBuilders_appendWithNull8Latin1_jmhTest::appendWithNull8Latin1_avgt_jmhStub @ bci:17 (line 190) [TraceMergeStores]: with 6207 ConI === 0 [[ 6208 6210 ]] #int:1819047278 6210 StoreI === 1387 5370 2980 6207 [[ ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; mismatched Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact[2] *, idx=13; - PR Comment: https://git.openjdk.org/jdk/pull/19626#issuecomment-2165167049
Re: RFR: 8333893: Optimization for StringBuilder append boolean & null [v8]
> After PR https://github.com/openjdk/jdk/pull/16245, C2 optimizes stores into > primitive arrays by combining values into larger stores. > > This PR rewrites the code of appendNull and append(boolean) methods so that > these two methods can be optimized by C2. Shaojin Wen has updated the pull request incrementally with two additional commits since the last revision: - rename benchmark - code format & use long address - Changes: - all: https://git.openjdk.org/jdk/pull/19626/files - new: https://git.openjdk.org/jdk/pull/19626/files/7b3cf602..22d45124 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=19626&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=19626&range=06-07 Stats: 43 lines in 3 files changed: 8 ins; 2 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/19626.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19626/head:pull/19626 PR: https://git.openjdk.org/jdk/pull/19626