Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Mon, 13 Dec 2021 09:39:55 GMT, Сергей Цыпанов wrote: > Originally this was spotted by by Amir Hadadi in > https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor > > It looks like in the following code in `String(byte[], int, int, Charset)` > > while (offset < sl) { > int b1 = bytes[offset]; > if (b1 >= 0) { > dst[dp++] = (byte)b1; > offset++; // <--- > continue; > } > if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) && > offset + 1 < sl) { > int b2 = bytes[offset + 1]; > if (!isNotContinuation(b2)) { > dst[dp++] = (byte)decode2(b1, b2); > offset += 2; > continue; > } > } > // anything not a latin1, including the repl > // we have to go with the utf16 > break; > } > > bounds check elimination is not executed when accessing byte array via > `bytes[offset]. > > The reason, I guess, is that offset variable is modified within the loop > (marked with arrow). > > Possible fix for this could be changing: > > `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)` > > However the best is to invest in C2 optimization to handle all such cases. > > The following benchmark demonstrates good improvement: > > @State(Scope.Thread) > @BenchmarkMode(Mode.AverageTime) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > public class StringConstructorBenchmark { > private byte[] array; > private String str; > > @Setup > public void setup() { > str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. > Я";//Latin1 ending with Russian > array = str.getBytes(StandardCharsets.UTF_8); > } > > @Benchmark > public String newString() { > return new String(array, 0, array.length, StandardCharsets.UTF_8); > } > > @Benchmark > public String translateEscapes() { > return str.translateEscapes(); > } > } > > Results: > > //baseline > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op > > //patched > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op > > The same is observed in String.translateEscapes() for the same String as in > the benchmark above: > > //baseline > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op > > //patched > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op > > Also I've looked into this with `LinuxPerfAsmProfiler`, full output for > baseline is available here > https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for > patched code here > https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7. > > Here's the part of baseline assembly responsible for `while` loop: > > 3.62% │││ 0x7fed70eb4c1c: mov%ebx,%ecx > 2.29% │││ 0x7fed70eb4c1e: mov%edx,%r9d > 2.22% │││ 0x7fed70eb4c21: mov(%rsp),%r8 >;*iload_2 {reexecute=0 rethrow=0 return_oop=0} > │││ >; - java.lang.String::init@107 (line 537) > 2.32% ↘││ 0x7fed70eb4c25: cmp%r13d,%ecx >││ 0x7fed70eb4c28: jge > 0x7fed70eb5388 ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0} >││ >; - java.lang.String::init@110 (line 537) > 3.05%││ 0x7fed70eb4c2e: cmp0x8(%rsp),%ecx >││ 0x7fed70eb4c32: jae > 0x7fed70eb5319 > 2.38%││ 0x7fed70eb4c38: mov%r8,(%rsp) > 2.64%││ 0x7fed70eb4c3c: movslq %ecx,%r8 > 2.46%││ 0x7fed70eb4c3f: mov%rax,%rbx > 3.44%││ 0x7fed70eb4c42: sub%r8,%rbx > 2.62%││ 0x7fed70eb4c45: add$0x1,%rbx > 2.64%││ 0x7fed70eb4c49: and > $0xfffe,%rbx > 2.30%││ 0x7fed70eb4c4d: mov%ebx,%r8d > 3.08%││ 0x7fed70eb4c50: add%ecx,%r8d > 2.55%││ 0x7fed70eb4c53: movslq %r8d,%r8 > 2.45%││ 0x7fed70eb4c56: add > $0xfffe,%r8 > 2.13%││ 0x7fed70eb4c5a: cmp(%rsp),%r8 >││ 0x7fed70eb4c5e: jae > 0x7fed70eb5319 > 3.36%││ 0x7fed70eb4c64: mov%ecx,%edi >;*aload_1
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Tue, 21 Dec 2021 12:50:36 GMT, Roland Westrelin wrote: >> @dean-long actually the issue reproduces with Java 17 where >> `checkBoundsOffCount` was implemented in a more straight forward manner: >> >> >> static void checkBoundsOffCount(int offset, int count, int length) { >> if (offset < 0 || count < 0 || offset > length - count) { >> throw new StringIndexOutOfBoundsException( >> "offset " + offset + ", count " + count + ", length " + length); >> } >> } >> >> >> >> Here's a >> [gist](https://gist.github.com/amirhadadi/9505c3f5d9ad68cad2fbfd1b9e01f0b8) >> with a benchmark you can run. This benchmark compares safe and unsafe reads >> from the byte array (In this gist I didn't modify the code to add the >> offset >= 0 condition). >> >> Here are the results: >> >> >> OpenJDK 17.0.1+12 >> OSX with 2.9 GHz Quad-Core Intel Core i7 >> >> >> >> Benchmark Mode CntScoreError Units >> StringBenchmark.safeDecodingavgt 20 120.312 ± 11.674 ns/op >> StringBenchmark.unsafeDecoding avgt 20 72.628 ± 0.479 ns/op > > @amirhadadi unsafeDecode() is buggy I think. Offsets in the array when read > with unsafe should be computed as `offset * unsafe.ARRAY_BYTE_INDEX_SCALE + > unsafe.ARRAY_BYTE_BASE_OFFSET`. @rwestrel thanks for the correction! Here are the updated results: Benchmark Mode CntScore Error Units StringBenchmark.safeDecodingavgt 20 113.849 ± 1.609 ns/op StringBenchmark.unsafeDecoding avgt 20 85.272 ± 1.462 ns/op - PR: https://git.openjdk.java.net/jdk/pull/6812
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Sat, 18 Dec 2021 21:48:33 GMT, amirhadadi wrote: >> This does look like a HotSpot JIT compiler issue to me. My guess is that it >> is related to how checkBoundsOffCount() checks for offset < 0: >> >> 396 if ((length | fromIndex | size) < 0 || size > length - >> fromIndex) >> >> using | to combine three values. > > @dean-long actually the issue reproduces with Java 17 where > `checkBoundsOffCount` was implemented in a more straight forward manner: > > > static void checkBoundsOffCount(int offset, int count, int length) { > if (offset < 0 || count < 0 || offset > length - count) { > throw new StringIndexOutOfBoundsException( > "offset " + offset + ", count " + count + ", length " + length); > } > } > > > > Here's a > [gist](https://gist.github.com/amirhadadi/9505c3f5d9ad68cad2fbfd1b9e01f0b8) > with a benchmark you can run. This benchmark compares safe and unsafe reads > from the byte array (In this gist I didn't modify the code to add the offset > >= 0 condition). > > Here are the results: > > > OpenJDK 17.0.1+12 > OSX with 2.9 GHz Quad-Core Intel Core i7 > > > > Benchmark Mode CntScoreError Units > StringBenchmark.safeDecodingavgt 20 120.312 ± 11.674 ns/op > StringBenchmark.unsafeDecoding avgt 20 72.628 ± 0.479 ns/op @amirhadadi unsafeDecode() is buggy I think. Offsets in the array when read with unsafe should be computed as `offset * unsafe.ARRAY_BYTE_INDEX_SCALE + unsafe.ARRAY_BYTE_BASE_OFFSET`. - PR: https://git.openjdk.java.net/jdk/pull/6812
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Fri, 17 Dec 2021 22:33:30 GMT, Dean Long wrote: >> Originally this was spotted by by Amir Hadadi in >> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor >> >> It looks like in the following code in `String(byte[], int, int, Charset)` >> >> while (offset < sl) { >> int b1 = bytes[offset]; >> if (b1 >= 0) { >> dst[dp++] = (byte)b1; >> offset++; // <--- >> continue; >> } >> if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) && >> offset + 1 < sl) { >> int b2 = bytes[offset + 1]; >> if (!isNotContinuation(b2)) { >> dst[dp++] = (byte)decode2(b1, b2); >> offset += 2; >> continue; >> } >> } >> // anything not a latin1, including the repl >> // we have to go with the utf16 >> break; >> } >> >> bounds check elimination is not executed when accessing byte array via >> `bytes[offset]. >> >> The reason, I guess, is that offset variable is modified within the loop >> (marked with arrow). >> >> Possible fix for this could be changing: >> >> `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)` >> >> However the best is to invest in C2 optimization to handle all such cases. >> >> The following benchmark demonstrates good improvement: >> >> @State(Scope.Thread) >> @BenchmarkMode(Mode.AverageTime) >> @OutputTimeUnit(TimeUnit.NANOSECONDS) >> public class StringConstructorBenchmark { >> private byte[] array; >> private String str; >> >> @Setup >> public void setup() { >> str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. >> Я";//Latin1 ending with Russian >> array = str.getBytes(StandardCharsets.UTF_8); >> } >> >> @Benchmark >> public String newString() { >> return new String(array, 0, array.length, StandardCharsets.UTF_8); >> } >> >> @Benchmark >> public String translateEscapes() { >> return str.translateEscapes(); >> } >> } >> >> Results: >> >> //baseline >> Benchmark Mode Cnt Score Error Units >> StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op >> >> //patched >> Benchmark Mode Cnt Score Error Units >> StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op >> >> The same is observed in String.translateEscapes() for the same String as in >> the benchmark above: >> >> //baseline >> Benchmark Mode Cnt Score Error Units >> StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op >> >> //patched >> Benchmark Mode Cnt Score Error Units >> StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op >> >> Also I've looked into this with `LinuxPerfAsmProfiler`, full output for >> baseline is available here >> https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for >> patched code here >> https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7. >> >> Here's the part of baseline assembly responsible for `while` loop: >> >> 3.62% │││ 0x7fed70eb4c1c: mov%ebx,%ecx >> 2.29% │││ 0x7fed70eb4c1e: mov%edx,%r9d >> 2.22% │││ 0x7fed70eb4c21: mov(%rsp),%r8 >> ;*iload_2 {reexecute=0 rethrow=0 return_oop=0} >> │││ >> ; - java.lang.String::init@107 (line 537) >> 2.32% ↘││ 0x7fed70eb4c25: cmp%r13d,%ecx >>││ 0x7fed70eb4c28: jge >> 0x7fed70eb5388 ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0} >>││ >> ; - java.lang.String::init@110 (line 537) >> 3.05%││ 0x7fed70eb4c2e: cmp >> 0x8(%rsp),%ecx >>││ 0x7fed70eb4c32: jae >> 0x7fed70eb5319 >> 2.38%││ 0x7fed70eb4c38: mov%r8,(%rsp) >> 2.64%││ 0x7fed70eb4c3c: movslq %ecx,%r8 >> 2.46%││ 0x7fed70eb4c3f: mov%rax,%rbx >> 3.44%││ 0x7fed70eb4c42: sub%r8,%rbx >> 2.62%││ 0x7fed70eb4c45: add$0x1,%rbx >> 2.64%││ 0x7fed70eb4c49: and >> $0xfffe,%rbx >> 2.30%││ 0x7fed70eb4c4d: mov%ebx,%r8d >> 3.08%││ 0x7fed70eb4c50: add%ecx,%r8d >> 2.55%││ 0x7fed70eb4c53: movslq %r8d,%r8 >> 2.45%││ 0x7fed70eb4c56: add >> $0xfffe,%r8 >> 2.13%││ 0x7fed70eb4c5a: cmp(%rsp),%r8 >>││ 0x7fed70eb4c5e: jae >>
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Mon, 13 Dec 2021 09:39:55 GMT, Сергей Цыпанов wrote: > Originally this was spotted by by Amir Hadadi in > https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor > > It looks like in the following code in `String(byte[], int, int, Charset)` > > while (offset < sl) { > int b1 = bytes[offset]; > if (b1 >= 0) { > dst[dp++] = (byte)b1; > offset++; // <--- > continue; > } > if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) && > offset + 1 < sl) { > int b2 = bytes[offset + 1]; > if (!isNotContinuation(b2)) { > dst[dp++] = (byte)decode2(b1, b2); > offset += 2; > continue; > } > } > // anything not a latin1, including the repl > // we have to go with the utf16 > break; > } > > bounds check elimination is not executed when accessing byte array via > `bytes[offset]. > > The reason, I guess, is that offset variable is modified within the loop > (marked with arrow). > > Possible fix for this could be changing: > > `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)` > > However the best is to invest in C2 optimization to handle all such cases. > > The following benchmark demonstrates good improvement: > > @State(Scope.Thread) > @BenchmarkMode(Mode.AverageTime) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > public class StringConstructorBenchmark { > private byte[] array; > private String str; > > @Setup > public void setup() { > str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. > Я";//Latin1 ending with Russian > array = str.getBytes(StandardCharsets.UTF_8); > } > > @Benchmark > public String newString() { > return new String(array, 0, array.length, StandardCharsets.UTF_8); > } > > @Benchmark > public String translateEscapes() { > return str.translateEscapes(); > } > } > > Results: > > //baseline > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op > > //patched > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op > > The same is observed in String.translateEscapes() for the same String as in > the benchmark above: > > //baseline > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op > > //patched > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op > > Also I've looked into this with `LinuxPerfAsmProfiler`, full output for > baseline is available here > https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for > patched code here > https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7. > > Here's the part of baseline assembly responsible for `while` loop: > > 3.62% │││ 0x7fed70eb4c1c: mov%ebx,%ecx > 2.29% │││ 0x7fed70eb4c1e: mov%edx,%r9d > 2.22% │││ 0x7fed70eb4c21: mov(%rsp),%r8 >;*iload_2 {reexecute=0 rethrow=0 return_oop=0} > │││ >; - java.lang.String::init@107 (line 537) > 2.32% ↘││ 0x7fed70eb4c25: cmp%r13d,%ecx >││ 0x7fed70eb4c28: jge > 0x7fed70eb5388 ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0} >││ >; - java.lang.String::init@110 (line 537) > 3.05%││ 0x7fed70eb4c2e: cmp0x8(%rsp),%ecx >││ 0x7fed70eb4c32: jae > 0x7fed70eb5319 > 2.38%││ 0x7fed70eb4c38: mov%r8,(%rsp) > 2.64%││ 0x7fed70eb4c3c: movslq %ecx,%r8 > 2.46%││ 0x7fed70eb4c3f: mov%rax,%rbx > 3.44%││ 0x7fed70eb4c42: sub%r8,%rbx > 2.62%││ 0x7fed70eb4c45: add$0x1,%rbx > 2.64%││ 0x7fed70eb4c49: and > $0xfffe,%rbx > 2.30%││ 0x7fed70eb4c4d: mov%ebx,%r8d > 3.08%││ 0x7fed70eb4c50: add%ecx,%r8d > 2.55%││ 0x7fed70eb4c53: movslq %r8d,%r8 > 2.45%││ 0x7fed70eb4c56: add > $0xfffe,%r8 > 2.13%││ 0x7fed70eb4c5a: cmp(%rsp),%r8 >││ 0x7fed70eb4c5e: jae > 0x7fed70eb5319 > 3.36%││ 0x7fed70eb4c64: mov%ecx,%edi >;*aload_1
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Wed, 15 Dec 2021 08:29:20 GMT, Сергей Цыпанов wrote: >>> @AlanBateman the benchmark is mine along with the changes for >>> `translateEscapes` and `newStringUTF8NoRepl`, the change for constructor is >>> from SO. >> >> I don't know how we can progress this PR if the patch includes code copied >> from SO. Maybe the PR should be closed, the JBS issue unassigned, and leave >> it to someone else to start again? Maybe you could get Amit to sign the OCA >> and you co-contribute/author the PR? I can't look at the patch so I don't >> know how significant the changes, maybe there are other options. > > @AlanBateman I suggest to decide first whether this should be fixed on > HotSpot level (which is the best option to me) or on Java level. > > If we choose HotSpot then the issue should be reassigned, because I cannot do > the fix myself. > > If we choose Java then I would revert the change for String constructor and > Amir would commit it into this branch having OCA signed. > > What do you think? @stsypanov I just looked at the generated code and make a guess, which may be not entirely true though. I think you could ask hotspot compiler mailing list for more profound analysis. - PR: https://git.openjdk.java.net/jdk/pull/6812
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Tue, 14 Dec 2021 13:20:46 GMT, Alan Bateman wrote: >>> Originally this was spotted by by Amir Hadadi in >>> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor >> >> Before anyone looks at this, can you confirm that the patch does not include >> any code or tests/benchmarks that were taken from SO? > >> @AlanBateman the benchmark is mine along with the changes for >> `translateEscapes` and `newStringUTF8NoRepl`, the change for constructor is >> from SO. > > I don't know how we can progress this PR if the patch includes code copied > from SO. Maybe the PR should be closed, the JBS issue unassigned, and leave > it to someone else to start again? Maybe you could get Amit to sign the OCA > and you co-contribute/author the PR? I can't look at the patch so I don't > know how significant the changes, maybe there are other options. @AlanBateman I suggest to decide first whether this should be fixed on HotSpot level (which is the best option to me) or on Java level. If we choose HotSpot then the issue should be reassigned, because I cannot do the fix myself. If we choose Java then I would revert the change for String constructor and Amir would commit it into this branch having OCA signed. What do you think? - PR: https://git.openjdk.java.net/jdk/pull/6812
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Tue, 14 Dec 2021 19:12:29 GMT, Mai Đặng Quân Anh wrote: > The problem, at first glance, seems to be that our compiled code is trying to > compute this mysterious number @merykitty how do you view it? 樂 - PR: https://git.openjdk.java.net/jdk/pull/6812
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Mon, 13 Dec 2021 09:39:55 GMT, Сергей Цыпанов wrote: > Originally this was spotted by by Amir Hadadi in > https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor > > It looks like in the following code in `String(byte[], int, int, Charset)` > > while (offset < sl) { > int b1 = bytes[offset]; > if (b1 >= 0) { > dst[dp++] = (byte)b1; > offset++; // <--- > continue; > } > if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) && > offset + 1 < sl) { > int b2 = bytes[offset + 1]; > if (!isNotContinuation(b2)) { > dst[dp++] = (byte)decode2(b1, b2); > offset += 2; > continue; > } > } > // anything not a latin1, including the repl > // we have to go with the utf16 > break; > } > > bounds check elimination is not executed when accessing byte array via > `bytes[offset]. > > The reason, I guess, is that offset variable is modified within the loop > (marked with arrow). > > Possible fix for this could be changing: > > `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)` > > However the best is to invest in C2 optimization to handle all such cases. > > The following benchmark demonstrates good improvement: > > @State(Scope.Thread) > @BenchmarkMode(Mode.AverageTime) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > public class StringConstructorBenchmark { > private byte[] array; > private String str; > > @Setup > public void setup() { > str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. > Я";//Latin1 ending with Russian > array = str.getBytes(StandardCharsets.UTF_8); > } > > @Benchmark > public String newString() { > return new String(array, 0, array.length, StandardCharsets.UTF_8); > } > > @Benchmark > public String translateEscapes() { > return str.translateEscapes(); > } > } > > Results: > > //baseline > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op > > //patched > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op > > The same is observed in String.translateEscapes() for the same String as in > the benchmark above: > > //baseline > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op > > //patched > Benchmark Mode Cnt Score Error Units > StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op > > Also I've looked into this with `LinuxPerfAsmProfiler`, full output for > baseline is available here > https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for > patched code here > https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7. > > Here's the part of baseline assembly responsible for `while` loop: > > 3.62% │││ 0x7fed70eb4c1c: mov%ebx,%ecx > 2.29% │││ 0x7fed70eb4c1e: mov%edx,%r9d > 2.22% │││ 0x7fed70eb4c21: mov(%rsp),%r8 >;*iload_2 {reexecute=0 rethrow=0 return_oop=0} > │││ >; - java.lang.String::init@107 (line 537) > 2.32% ↘││ 0x7fed70eb4c25: cmp%r13d,%ecx >││ 0x7fed70eb4c28: jge > 0x7fed70eb5388 ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0} >││ >; - java.lang.String::init@110 (line 537) > 3.05%││ 0x7fed70eb4c2e: cmp0x8(%rsp),%ecx >││ 0x7fed70eb4c32: jae > 0x7fed70eb5319 > 2.38%││ 0x7fed70eb4c38: mov%r8,(%rsp) > 2.64%││ 0x7fed70eb4c3c: movslq %ecx,%r8 > 2.46%││ 0x7fed70eb4c3f: mov%rax,%rbx > 3.44%││ 0x7fed70eb4c42: sub%r8,%rbx > 2.62%││ 0x7fed70eb4c45: add$0x1,%rbx > 2.64%││ 0x7fed70eb4c49: and > $0xfffe,%rbx > 2.30%││ 0x7fed70eb4c4d: mov%ebx,%r8d > 3.08%││ 0x7fed70eb4c50: add%ecx,%r8d > 2.55%││ 0x7fed70eb4c53: movslq %r8d,%r8 > 2.45%││ 0x7fed70eb4c56: add > $0xfffe,%r8 > 2.13%││ 0x7fed70eb4c5a: cmp(%rsp),%r8 >││ 0x7fed70eb4c5e: jae > 0x7fed70eb5319 > 3.36%││ 0x7fed70eb4c64: mov%ecx,%edi >;*aload_1
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Tue, 14 Dec 2021 13:20:46 GMT, Alan Bateman wrote: >>> Originally this was spotted by by Amir Hadadi in >>> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor >> >> Before anyone looks at this, can you confirm that the patch does not include >> any code or tests/benchmarks that were taken from SO? > >> @AlanBateman the benchmark is mine along with the changes for >> `translateEscapes` and `newStringUTF8NoRepl`, the change for constructor is >> from SO. > > I don't know how we can progress this PR if the patch includes code copied > from SO. Maybe the PR should be closed, the JBS issue unassigned, and leave > it to someone else to start again? Maybe you could get Amit to sign the OCA > and you co-contribute/author the PR? I can't look at the patch so I don't > know how significant the changes, maybe there are other options. @AlanBateman @stsypanov I have no problem signing the OCA. However like @stsypanov mentioned, I don't think this is the right approach. Adding explicit check that the offset is non negative makes no semantic difference, it just helps the optimizer to figure out that no bounds checking is needed. A better approach is to improve the optimizer so that it can apply bounds checking elimination in this case (and hopefully in many other cases). - PR: https://git.openjdk.java.net/jdk/pull/6812
Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination
On Mon, 13 Dec 2021 09:55:36 GMT, Alan Bateman wrote: >> Originally this was spotted by by Amir Hadadi in >> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor >> >> It looks like in the following code in `String(byte[], int, int, Charset)` >> >> while (offset < sl) { >> int b1 = bytes[offset]; >> if (b1 >= 0) { >> dst[dp++] = (byte)b1; >> offset++; // <--- >> continue; >> } >> if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) && >> offset + 1 < sl) { >> int b2 = bytes[offset + 1]; >> if (!isNotContinuation(b2)) { >> dst[dp++] = (byte)decode2(b1, b2); >> offset += 2; >> continue; >> } >> } >> // anything not a latin1, including the repl >> // we have to go with the utf16 >> break; >> } >> >> bounds check elimination is not executed when accessing byte array via >> `bytes[offset]. >> >> The reason, I guess, is that offset variable is modified within the loop >> (marked with arrow). >> >> Possible fix for this could be changing: >> >> `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)` >> >> However the best is to invest in C2 optimization to handle all such cases. >> >> The following benchmark demonstrates good improvement: >> >> @State(Scope.Thread) >> @BenchmarkMode(Mode.AverageTime) >> @OutputTimeUnit(TimeUnit.NANOSECONDS) >> public class StringConstructorBenchmark { >> private byte[] array; >> private String str; >> >> @Setup >> public void setup() { >> str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. >> Я";//Latin1 ending with Russian >> array = str.getBytes(StandardCharsets.UTF_8); >> } >> >> @Benchmark >> public String newString() { >> return new String(array, 0, array.length, StandardCharsets.UTF_8); >> } >> >> @Benchmark >> public String translateEscapes() { >> return str.translateEscapes(); >> } >> } >> >> Results: >> >> //baseline >> Benchmark Mode Cnt Score Error Units >> StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op >> >> //patched >> Benchmark Mode Cnt Score Error Units >> StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op >> >> The same is observed in String.translateEscapes() for the same String as in >> the benchmark above: >> >> //baseline >> Benchmark Mode Cnt Score Error Units >> StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op >> >> //patched >> Benchmark Mode Cnt Score Error Units >> StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op >> >> Also I've looked into this with `LinuxPerfAsmProfiler`, full output for >> baseline is available here >> https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for >> patched code here >> https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7. >> >> Here's the part of baseline assembly responsible for `while` loop: >> >> 3.62% │││ 0x7fed70eb4c1c: mov%ebx,%ecx >> 2.29% │││ 0x7fed70eb4c1e: mov%edx,%r9d >> 2.22% │││ 0x7fed70eb4c21: mov(%rsp),%r8 >> ;*iload_2 {reexecute=0 rethrow=0 return_oop=0} >> │││ >> ; - java.lang.String::init@107 (line 537) >> 2.32% ↘││ 0x7fed70eb4c25: cmp%r13d,%ecx >>││ 0x7fed70eb4c28: jge >> 0x7fed70eb5388 ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0} >>││ >> ; - java.lang.String::init@110 (line 537) >> 3.05%││ 0x7fed70eb4c2e: cmp >> 0x8(%rsp),%ecx >>││ 0x7fed70eb4c32: jae >> 0x7fed70eb5319 >> 2.38%││ 0x7fed70eb4c38: mov%r8,(%rsp) >> 2.64%││ 0x7fed70eb4c3c: movslq %ecx,%r8 >> 2.46%││ 0x7fed70eb4c3f: mov%rax,%rbx >> 3.44%││ 0x7fed70eb4c42: sub%r8,%rbx >> 2.62%││ 0x7fed70eb4c45: add$0x1,%rbx >> 2.64%││ 0x7fed70eb4c49: and >> $0xfffe,%rbx >> 2.30%││ 0x7fed70eb4c4d: mov%ebx,%r8d >> 3.08%││ 0x7fed70eb4c50: add%ecx,%r8d >> 2.55%││ 0x7fed70eb4c53: movslq %r8d,%r8 >> 2.45%││ 0x7fed70eb4c56: add >> $0xfffe,%r8 >> 2.13%││ 0x7fed70eb4c5a: cmp(%rsp),%r8 >>││ 0x7fed70eb4c5e: jae >>