Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2022-01-12 Thread Сергей Цыпанов
On Mon, 13 Dec 2021 09:39:55 GMT, Сергей Цыпанов  wrote:

> Originally this was spotted by by Amir Hadadi in 
> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor
> 
> It looks like in the following code in `String(byte[], int, int, Charset)`
> 
> while (offset < sl) {
> int b1 = bytes[offset];
> if (b1 >= 0) {
> dst[dp++] = (byte)b1;
> offset++;  // <---
> continue;
> }
> if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) &&
> offset + 1 < sl) {
> int b2 = bytes[offset + 1];
> if (!isNotContinuation(b2)) {
> dst[dp++] = (byte)decode2(b1, b2);
> offset += 2;
> continue;
> }
> }
> // anything not a latin1, including the repl
> // we have to go with the utf16
> break;
> }
> 
> bounds check elimination is not executed when accessing byte array via 
> `bytes[offset].
> 
> The reason, I guess, is that offset variable is modified within the loop 
> (marked with arrow).
> 
> Possible fix for this could be changing:
> 
> `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)`
> 
> However the best is to invest in C2 optimization to handle all such cases.
> 
> The following benchmark demonstrates good improvement:
> 
> @State(Scope.Thread)
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> public class StringConstructorBenchmark {
>   private byte[] array;
>   private String str;
> 
>   @Setup
>   public void setup() {
> str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. 
> Я";//Latin1 ending with Russian
> array = str.getBytes(StandardCharsets.UTF_8);
>   }
> 
>   @Benchmark
>   public String newString()  {
>   return new String(array, 0, array.length, StandardCharsets.UTF_8);
>   }
> 
>   @Benchmark
>   public String translateEscapes()  {
>   return str.translateEscapes();
>   }
> }
> 
> Results:
> 
> //baseline
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op
> 
> //patched
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op
> 
> The same is observed in String.translateEscapes() for the same String as in 
> the benchmark above:
> 
> //baseline
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op
> 
> //patched
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op
> 
> Also I've looked into this with `LinuxPerfAsmProfiler`, full output for 
> baseline is available here 
> https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for 
> patched code here 
> https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7.
> 
> Here's the part of baseline assembly responsible for `while` loop: 
> 
>   3.62%   │││  0x7fed70eb4c1c:   mov%ebx,%ecx
>   2.29%   │││  0x7fed70eb4c1e:   mov%edx,%r9d
>   2.22%   │││  0x7fed70eb4c21:   mov(%rsp),%r8
>;*iload_2 {reexecute=0 rethrow=0 return_oop=0}
>   │││ 
>; - java.lang.String::init@107 (line 537)
>   2.32%   ↘││  0x7fed70eb4c25:   cmp%r13d,%ecx
>││  0x7fed70eb4c28:   jge
> 0x7fed70eb5388   ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>││ 
>; - java.lang.String::init@110 (line 537)
>   3.05%││  0x7fed70eb4c2e:   cmp0x8(%rsp),%ecx
>││  0x7fed70eb4c32:   jae
> 0x7fed70eb5319
>   2.38%││  0x7fed70eb4c38:   mov%r8,(%rsp)
>   2.64%││  0x7fed70eb4c3c:   movslq %ecx,%r8
>   2.46%││  0x7fed70eb4c3f:   mov%rax,%rbx
>   3.44%││  0x7fed70eb4c42:   sub%r8,%rbx
>   2.62%││  0x7fed70eb4c45:   add$0x1,%rbx
>   2.64%││  0x7fed70eb4c49:   and
> $0xfffe,%rbx
>   2.30%││  0x7fed70eb4c4d:   mov%ebx,%r8d
>   3.08%││  0x7fed70eb4c50:   add%ecx,%r8d
>   2.55%││  0x7fed70eb4c53:   movslq %r8d,%r8
>   2.45%││  0x7fed70eb4c56:   add
> $0xfffe,%r8
>   2.13%││  0x7fed70eb4c5a:   cmp(%rsp),%r8
>││  0x7fed70eb4c5e:   jae
> 0x7fed70eb5319
>   3.36%││  0x7fed70eb4c64:   mov%ecx,%edi 
>;*aload_1 

Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-21 Thread amirhadadi
On Tue, 21 Dec 2021 12:50:36 GMT, Roland Westrelin  wrote:

>> @dean-long actually the issue reproduces with Java 17 where 
>> `checkBoundsOffCount` was implemented in a more straight forward manner:
>> 
>> 
>> static void checkBoundsOffCount(int offset, int count, int length) {
>> if (offset < 0 || count < 0 || offset > length - count) {
>> throw new StringIndexOutOfBoundsException(
>> "offset " + offset + ", count " + count + ", length " + length);
>> }
>> }
>> 
>> 
>> 
>> Here's a 
>> [gist](https://gist.github.com/amirhadadi/9505c3f5d9ad68cad2fbfd1b9e01f0b8) 
>> with a benchmark you can run. This benchmark compares safe and unsafe reads 
>> from the byte array  (In this gist I didn't modify the code to add the 
>> offset >= 0 condition).
>> 
>> Here are the results:
>> 
>> 
>> OpenJDK 17.0.1+12
>> OSX with 2.9 GHz Quad-Core Intel Core i7
>> 
>> 
>> 
>> Benchmark   Mode  CntScoreError  Units
>> StringBenchmark.safeDecodingavgt   20  120.312 ± 11.674  ns/op
>> StringBenchmark.unsafeDecoding  avgt   20   72.628 ±  0.479  ns/op
>
> @amirhadadi unsafeDecode() is buggy I think. Offsets in the array when read 
> with unsafe should be computed as `offset * unsafe.ARRAY_BYTE_INDEX_SCALE + 
> unsafe.ARRAY_BYTE_BASE_OFFSET`.

@rwestrel thanks for the correction!

Here are the updated results:


Benchmark   Mode  CntScore   Error  Units
StringBenchmark.safeDecodingavgt   20  113.849 ± 1.609  ns/op
StringBenchmark.unsafeDecoding  avgt   20   85.272 ± 1.462  ns/op

-

PR: https://git.openjdk.java.net/jdk/pull/6812


Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-21 Thread Roland Westrelin
On Sat, 18 Dec 2021 21:48:33 GMT, amirhadadi  wrote:

>> This does look like a HotSpot JIT compiler issue to me.  My guess is that it 
>> is related to how checkBoundsOffCount() checks for offset < 0:
>> 
>> 396  if ((length | fromIndex | size) < 0 || size > length - 
>> fromIndex)
>> 
>> using | to combine three values.
>
> @dean-long actually the issue reproduces with Java 17 where 
> `checkBoundsOffCount` was implemented in a more straight forward manner:
> 
> 
> static void checkBoundsOffCount(int offset, int count, int length) {
> if (offset < 0 || count < 0 || offset > length - count) {
> throw new StringIndexOutOfBoundsException(
> "offset " + offset + ", count " + count + ", length " + length);
> }
> }
> 
> 
> 
> Here's a 
> [gist](https://gist.github.com/amirhadadi/9505c3f5d9ad68cad2fbfd1b9e01f0b8) 
> with a benchmark you can run. This benchmark compares safe and unsafe reads 
> from the byte array  (In this gist I didn't modify the code to add the offset 
> >= 0 condition).
> 
> Here are the results:
> 
> 
> OpenJDK 17.0.1+12
> OSX with 2.9 GHz Quad-Core Intel Core i7
> 
> 
> 
> Benchmark   Mode  CntScoreError  Units
> StringBenchmark.safeDecodingavgt   20  120.312 ± 11.674  ns/op
> StringBenchmark.unsafeDecoding  avgt   20   72.628 ±  0.479  ns/op

@amirhadadi unsafeDecode() is buggy I think. Offsets in the array when read 
with unsafe should be computed as `offset * unsafe.ARRAY_BYTE_INDEX_SCALE + 
unsafe.ARRAY_BYTE_BASE_OFFSET`.

-

PR: https://git.openjdk.java.net/jdk/pull/6812


Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-18 Thread amirhadadi
On Fri, 17 Dec 2021 22:33:30 GMT, Dean Long  wrote:

>> Originally this was spotted by by Amir Hadadi in 
>> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor
>> 
>> It looks like in the following code in `String(byte[], int, int, Charset)`
>> 
>> while (offset < sl) {
>> int b1 = bytes[offset];
>> if (b1 >= 0) {
>> dst[dp++] = (byte)b1;
>> offset++;  // <---
>> continue;
>> }
>> if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) &&
>> offset + 1 < sl) {
>> int b2 = bytes[offset + 1];
>> if (!isNotContinuation(b2)) {
>> dst[dp++] = (byte)decode2(b1, b2);
>> offset += 2;
>> continue;
>> }
>> }
>> // anything not a latin1, including the repl
>> // we have to go with the utf16
>> break;
>> }
>> 
>> bounds check elimination is not executed when accessing byte array via 
>> `bytes[offset].
>> 
>> The reason, I guess, is that offset variable is modified within the loop 
>> (marked with arrow).
>> 
>> Possible fix for this could be changing:
>> 
>> `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)`
>> 
>> However the best is to invest in C2 optimization to handle all such cases.
>> 
>> The following benchmark demonstrates good improvement:
>> 
>> @State(Scope.Thread)
>> @BenchmarkMode(Mode.AverageTime)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> public class StringConstructorBenchmark {
>>   private byte[] array;
>>   private String str;
>> 
>>   @Setup
>>   public void setup() {
>> str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. 
>> Я";//Latin1 ending with Russian
>> array = str.getBytes(StandardCharsets.UTF_8);
>>   }
>> 
>>   @Benchmark
>>   public String newString()  {
>>   return new String(array, 0, array.length, StandardCharsets.UTF_8);
>>   }
>> 
>>   @Benchmark
>>   public String translateEscapes()  {
>>   return str.translateEscapes();
>>   }
>> }
>> 
>> Results:
>> 
>> //baseline
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op
>> 
>> //patched
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op
>> 
>> The same is observed in String.translateEscapes() for the same String as in 
>> the benchmark above:
>> 
>> //baseline
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op
>> 
>> //patched
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op
>> 
>> Also I've looked into this with `LinuxPerfAsmProfiler`, full output for 
>> baseline is available here 
>> https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for 
>> patched code here 
>> https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7.
>> 
>> Here's the part of baseline assembly responsible for `while` loop: 
>> 
>>   3.62%   │││  0x7fed70eb4c1c:   mov%ebx,%ecx
>>   2.29%   │││  0x7fed70eb4c1e:   mov%edx,%r9d
>>   2.22%   │││  0x7fed70eb4c21:   mov(%rsp),%r8   
>> ;*iload_2 {reexecute=0 rethrow=0 return_oop=0}
>>   │││
>> ; - java.lang.String::init@107 (line 537)
>>   2.32%   ↘││  0x7fed70eb4c25:   cmp%r13d,%ecx
>>││  0x7fed70eb4c28:   jge
>> 0x7fed70eb5388   ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>>││
>> ; - java.lang.String::init@110 (line 537)
>>   3.05%││  0x7fed70eb4c2e:   cmp
>> 0x8(%rsp),%ecx
>>││  0x7fed70eb4c32:   jae
>> 0x7fed70eb5319
>>   2.38%││  0x7fed70eb4c38:   mov%r8,(%rsp)
>>   2.64%││  0x7fed70eb4c3c:   movslq %ecx,%r8
>>   2.46%││  0x7fed70eb4c3f:   mov%rax,%rbx
>>   3.44%││  0x7fed70eb4c42:   sub%r8,%rbx
>>   2.62%││  0x7fed70eb4c45:   add$0x1,%rbx
>>   2.64%││  0x7fed70eb4c49:   and
>> $0xfffe,%rbx
>>   2.30%││  0x7fed70eb4c4d:   mov%ebx,%r8d
>>   3.08%││  0x7fed70eb4c50:   add%ecx,%r8d
>>   2.55%││  0x7fed70eb4c53:   movslq %r8d,%r8
>>   2.45%││  0x7fed70eb4c56:   add
>> $0xfffe,%r8
>>   2.13%││  0x7fed70eb4c5a:   cmp(%rsp),%r8
>>││  0x7fed70eb4c5e:   jae
>> 

Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-17 Thread Dean Long
On Mon, 13 Dec 2021 09:39:55 GMT, Сергей Цыпанов  wrote:

> Originally this was spotted by by Amir Hadadi in 
> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor
> 
> It looks like in the following code in `String(byte[], int, int, Charset)`
> 
> while (offset < sl) {
> int b1 = bytes[offset];
> if (b1 >= 0) {
> dst[dp++] = (byte)b1;
> offset++;  // <---
> continue;
> }
> if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) &&
> offset + 1 < sl) {
> int b2 = bytes[offset + 1];
> if (!isNotContinuation(b2)) {
> dst[dp++] = (byte)decode2(b1, b2);
> offset += 2;
> continue;
> }
> }
> // anything not a latin1, including the repl
> // we have to go with the utf16
> break;
> }
> 
> bounds check elimination is not executed when accessing byte array via 
> `bytes[offset].
> 
> The reason, I guess, is that offset variable is modified within the loop 
> (marked with arrow).
> 
> Possible fix for this could be changing:
> 
> `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)`
> 
> However the best is to invest in C2 optimization to handle all such cases.
> 
> The following benchmark demonstrates good improvement:
> 
> @State(Scope.Thread)
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> public class StringConstructorBenchmark {
>   private byte[] array;
>   private String str;
> 
>   @Setup
>   public void setup() {
> str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. 
> Я";//Latin1 ending with Russian
> array = str.getBytes(StandardCharsets.UTF_8);
>   }
> 
>   @Benchmark
>   public String newString()  {
>   return new String(array, 0, array.length, StandardCharsets.UTF_8);
>   }
> 
>   @Benchmark
>   public String translateEscapes()  {
>   return str.translateEscapes();
>   }
> }
> 
> Results:
> 
> //baseline
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op
> 
> //patched
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op
> 
> The same is observed in String.translateEscapes() for the same String as in 
> the benchmark above:
> 
> //baseline
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op
> 
> //patched
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op
> 
> Also I've looked into this with `LinuxPerfAsmProfiler`, full output for 
> baseline is available here 
> https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for 
> patched code here 
> https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7.
> 
> Here's the part of baseline assembly responsible for `while` loop: 
> 
>   3.62%   │││  0x7fed70eb4c1c:   mov%ebx,%ecx
>   2.29%   │││  0x7fed70eb4c1e:   mov%edx,%r9d
>   2.22%   │││  0x7fed70eb4c21:   mov(%rsp),%r8
>;*iload_2 {reexecute=0 rethrow=0 return_oop=0}
>   │││ 
>; - java.lang.String::init@107 (line 537)
>   2.32%   ↘││  0x7fed70eb4c25:   cmp%r13d,%ecx
>││  0x7fed70eb4c28:   jge
> 0x7fed70eb5388   ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>││ 
>; - java.lang.String::init@110 (line 537)
>   3.05%││  0x7fed70eb4c2e:   cmp0x8(%rsp),%ecx
>││  0x7fed70eb4c32:   jae
> 0x7fed70eb5319
>   2.38%││  0x7fed70eb4c38:   mov%r8,(%rsp)
>   2.64%││  0x7fed70eb4c3c:   movslq %ecx,%r8
>   2.46%││  0x7fed70eb4c3f:   mov%rax,%rbx
>   3.44%││  0x7fed70eb4c42:   sub%r8,%rbx
>   2.62%││  0x7fed70eb4c45:   add$0x1,%rbx
>   2.64%││  0x7fed70eb4c49:   and
> $0xfffe,%rbx
>   2.30%││  0x7fed70eb4c4d:   mov%ebx,%r8d
>   3.08%││  0x7fed70eb4c50:   add%ecx,%r8d
>   2.55%││  0x7fed70eb4c53:   movslq %r8d,%r8
>   2.45%││  0x7fed70eb4c56:   add
> $0xfffe,%r8
>   2.13%││  0x7fed70eb4c5a:   cmp(%rsp),%r8
>││  0x7fed70eb4c5e:   jae
> 0x7fed70eb5319
>   3.36%││  0x7fed70eb4c64:   mov%ecx,%edi 
>;*aload_1 

Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-15 Thread Mai Đặng Quân Anh
On Wed, 15 Dec 2021 08:29:20 GMT, Сергей Цыпанов  wrote:

>>> @AlanBateman the benchmark is mine along with the changes for 
>>> `translateEscapes` and `newStringUTF8NoRepl`, the change for constructor is 
>>> from SO.
>> 
>> I don't know how we can progress this PR if the patch includes code copied 
>> from SO. Maybe the PR should be closed, the JBS issue unassigned, and leave 
>> it to someone else to start again? Maybe you could get Amit to sign the OCA 
>> and you co-contribute/author the PR? I can't look at the patch so I don't 
>> know how significant the changes, maybe there are other options.
>
> @AlanBateman I suggest to decide first whether this should be fixed on 
> HotSpot level (which is the best option to me) or on Java level.
> 
> If we choose HotSpot then the issue should be reassigned, because I cannot do 
> the fix myself.
> 
> If we choose Java then I would revert the change for String constructor and 
> Amir would commit it into this branch having OCA signed.
> 
> What do you think?

@stsypanov I just looked at the generated code and make a guess, which may be 
not entirely true though. I think you could ask hotspot compiler mailing list 
for more profound analysis.

-

PR: https://git.openjdk.java.net/jdk/pull/6812


Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-15 Thread Сергей Цыпанов
On Tue, 14 Dec 2021 13:20:46 GMT, Alan Bateman  wrote:

>>> Originally this was spotted by by Amir Hadadi in 
>>> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor
>> 
>> Before anyone looks at this, can you confirm that the patch does not include 
>> any code or tests/benchmarks that were taken from SO?
>
>> @AlanBateman the benchmark is mine along with the changes for 
>> `translateEscapes` and `newStringUTF8NoRepl`, the change for constructor is 
>> from SO.
> 
> I don't know how we can progress this PR if the patch includes code copied 
> from SO. Maybe the PR should be closed, the JBS issue unassigned, and leave 
> it to someone else to start again? Maybe you could get Amit to sign the OCA 
> and you co-contribute/author the PR? I can't look at the patch so I don't 
> know how significant the changes, maybe there are other options.

@AlanBateman I suggest to decide first whether this should be fixed on HotSpot 
level (which is the best option to me) or on Java level.

If we choose HotSpot then the issue should be reassigned, because I cannot do 
the fix myself.

If we choose Java then I would revert the change for String constructor and 
Amir would commit it into this branch having OCA signed.

What do you think?

-

PR: https://git.openjdk.java.net/jdk/pull/6812


Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-14 Thread Сергей Цыпанов
On Tue, 14 Dec 2021 19:12:29 GMT, Mai Đặng Quân Anh  
wrote:

> The problem, at first glance, seems to be that our compiled code is trying to 
> compute this mysterious number

@merykitty how do you view it? 樂

-

PR: https://git.openjdk.java.net/jdk/pull/6812


Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-14 Thread Mai Đặng Quân Anh
On Mon, 13 Dec 2021 09:39:55 GMT, Сергей Цыпанов  wrote:

> Originally this was spotted by by Amir Hadadi in 
> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor
> 
> It looks like in the following code in `String(byte[], int, int, Charset)`
> 
> while (offset < sl) {
> int b1 = bytes[offset];
> if (b1 >= 0) {
> dst[dp++] = (byte)b1;
> offset++;  // <---
> continue;
> }
> if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) &&
> offset + 1 < sl) {
> int b2 = bytes[offset + 1];
> if (!isNotContinuation(b2)) {
> dst[dp++] = (byte)decode2(b1, b2);
> offset += 2;
> continue;
> }
> }
> // anything not a latin1, including the repl
> // we have to go with the utf16
> break;
> }
> 
> bounds check elimination is not executed when accessing byte array via 
> `bytes[offset].
> 
> The reason, I guess, is that offset variable is modified within the loop 
> (marked with arrow).
> 
> Possible fix for this could be changing:
> 
> `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)`
> 
> However the best is to invest in C2 optimization to handle all such cases.
> 
> The following benchmark demonstrates good improvement:
> 
> @State(Scope.Thread)
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> public class StringConstructorBenchmark {
>   private byte[] array;
>   private String str;
> 
>   @Setup
>   public void setup() {
> str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. 
> Я";//Latin1 ending with Russian
> array = str.getBytes(StandardCharsets.UTF_8);
>   }
> 
>   @Benchmark
>   public String newString()  {
>   return new String(array, 0, array.length, StandardCharsets.UTF_8);
>   }
> 
>   @Benchmark
>   public String translateEscapes()  {
>   return str.translateEscapes();
>   }
> }
> 
> Results:
> 
> //baseline
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op
> 
> //patched
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op
> 
> The same is observed in String.translateEscapes() for the same String as in 
> the benchmark above:
> 
> //baseline
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op
> 
> //patched
> Benchmark Mode Cnt Score Error Units
> StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op
> 
> Also I've looked into this with `LinuxPerfAsmProfiler`, full output for 
> baseline is available here 
> https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for 
> patched code here 
> https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7.
> 
> Here's the part of baseline assembly responsible for `while` loop: 
> 
>   3.62%   │││  0x7fed70eb4c1c:   mov%ebx,%ecx
>   2.29%   │││  0x7fed70eb4c1e:   mov%edx,%r9d
>   2.22%   │││  0x7fed70eb4c21:   mov(%rsp),%r8
>;*iload_2 {reexecute=0 rethrow=0 return_oop=0}
>   │││ 
>; - java.lang.String::init@107 (line 537)
>   2.32%   ↘││  0x7fed70eb4c25:   cmp%r13d,%ecx
>││  0x7fed70eb4c28:   jge
> 0x7fed70eb5388   ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>││ 
>; - java.lang.String::init@110 (line 537)
>   3.05%││  0x7fed70eb4c2e:   cmp0x8(%rsp),%ecx
>││  0x7fed70eb4c32:   jae
> 0x7fed70eb5319
>   2.38%││  0x7fed70eb4c38:   mov%r8,(%rsp)
>   2.64%││  0x7fed70eb4c3c:   movslq %ecx,%r8
>   2.46%││  0x7fed70eb4c3f:   mov%rax,%rbx
>   3.44%││  0x7fed70eb4c42:   sub%r8,%rbx
>   2.62%││  0x7fed70eb4c45:   add$0x1,%rbx
>   2.64%││  0x7fed70eb4c49:   and
> $0xfffe,%rbx
>   2.30%││  0x7fed70eb4c4d:   mov%ebx,%r8d
>   3.08%││  0x7fed70eb4c50:   add%ecx,%r8d
>   2.55%││  0x7fed70eb4c53:   movslq %r8d,%r8
>   2.45%││  0x7fed70eb4c56:   add
> $0xfffe,%r8
>   2.13%││  0x7fed70eb4c5a:   cmp(%rsp),%r8
>││  0x7fed70eb4c5e:   jae
> 0x7fed70eb5319
>   3.36%││  0x7fed70eb4c64:   mov%ecx,%edi 
>;*aload_1 

Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-14 Thread amirhadadi
On Tue, 14 Dec 2021 13:20:46 GMT, Alan Bateman  wrote:

>>> Originally this was spotted by by Amir Hadadi in 
>>> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor
>> 
>> Before anyone looks at this, can you confirm that the patch does not include 
>> any code or tests/benchmarks that were taken from SO?
>
>> @AlanBateman the benchmark is mine along with the changes for 
>> `translateEscapes` and `newStringUTF8NoRepl`, the change for constructor is 
>> from SO.
> 
> I don't know how we can progress this PR if the patch includes code copied 
> from SO. Maybe the PR should be closed, the JBS issue unassigned, and leave 
> it to someone else to start again? Maybe you could get Amit to sign the OCA 
> and you co-contribute/author the PR? I can't look at the patch so I don't 
> know how significant the changes, maybe there are other options.

@AlanBateman @stsypanov I have no problem signing the OCA.
However like @stsypanov mentioned, I don't think this is the right approach.
Adding explicit check that the offset is non negative makes no semantic 
difference, it just helps the optimizer to figure out that no bounds checking 
is needed.
A better approach is to improve the optimizer so that it can apply bounds 
checking elimination in this case (and hopefully in many other cases).

-

PR: https://git.openjdk.java.net/jdk/pull/6812


Re: RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

2021-12-14 Thread Alan Bateman
On Mon, 13 Dec 2021 09:55:36 GMT, Alan Bateman  wrote:

>> Originally this was spotted by by Amir Hadadi in 
>> https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor
>> 
>> It looks like in the following code in `String(byte[], int, int, Charset)`
>> 
>> while (offset < sl) {
>> int b1 = bytes[offset];
>> if (b1 >= 0) {
>> dst[dp++] = (byte)b1;
>> offset++;  // <---
>> continue;
>> }
>> if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) &&
>> offset + 1 < sl) {
>> int b2 = bytes[offset + 1];
>> if (!isNotContinuation(b2)) {
>> dst[dp++] = (byte)decode2(b1, b2);
>> offset += 2;
>> continue;
>> }
>> }
>> // anything not a latin1, including the repl
>> // we have to go with the utf16
>> break;
>> }
>> 
>> bounds check elimination is not executed when accessing byte array via 
>> `bytes[offset].
>> 
>> The reason, I guess, is that offset variable is modified within the loop 
>> (marked with arrow).
>> 
>> Possible fix for this could be changing:
>> 
>> `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)`
>> 
>> However the best is to invest in C2 optimization to handle all such cases.
>> 
>> The following benchmark demonstrates good improvement:
>> 
>> @State(Scope.Thread)
>> @BenchmarkMode(Mode.AverageTime)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> public class StringConstructorBenchmark {
>>   private byte[] array;
>>   private String str;
>> 
>>   @Setup
>>   public void setup() {
>> str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. 
>> Я";//Latin1 ending with Russian
>> array = str.getBytes(StandardCharsets.UTF_8);
>>   }
>> 
>>   @Benchmark
>>   public String newString()  {
>>   return new String(array, 0, array.length, StandardCharsets.UTF_8);
>>   }
>> 
>>   @Benchmark
>>   public String translateEscapes()  {
>>   return str.translateEscapes();
>>   }
>> }
>> 
>> Results:
>> 
>> //baseline
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op
>> 
>> //patched
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op
>> 
>> The same is observed in String.translateEscapes() for the same String as in 
>> the benchmark above:
>> 
>> //baseline
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op
>> 
>> //patched
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op
>> 
>> Also I've looked into this with `LinuxPerfAsmProfiler`, full output for 
>> baseline is available here 
>> https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for 
>> patched code here 
>> https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7.
>> 
>> Here's the part of baseline assembly responsible for `while` loop: 
>> 
>>   3.62%   │││  0x7fed70eb4c1c:   mov%ebx,%ecx
>>   2.29%   │││  0x7fed70eb4c1e:   mov%edx,%r9d
>>   2.22%   │││  0x7fed70eb4c21:   mov(%rsp),%r8   
>> ;*iload_2 {reexecute=0 rethrow=0 return_oop=0}
>>   │││
>> ; - java.lang.String::init@107 (line 537)
>>   2.32%   ↘││  0x7fed70eb4c25:   cmp%r13d,%ecx
>>││  0x7fed70eb4c28:   jge
>> 0x7fed70eb5388   ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>>││
>> ; - java.lang.String::init@110 (line 537)
>>   3.05%││  0x7fed70eb4c2e:   cmp
>> 0x8(%rsp),%ecx
>>││  0x7fed70eb4c32:   jae
>> 0x7fed70eb5319
>>   2.38%││  0x7fed70eb4c38:   mov%r8,(%rsp)
>>   2.64%││  0x7fed70eb4c3c:   movslq %ecx,%r8
>>   2.46%││  0x7fed70eb4c3f:   mov%rax,%rbx
>>   3.44%││  0x7fed70eb4c42:   sub%r8,%rbx
>>   2.62%││  0x7fed70eb4c45:   add$0x1,%rbx
>>   2.64%││  0x7fed70eb4c49:   and
>> $0xfffe,%rbx
>>   2.30%││  0x7fed70eb4c4d:   mov%ebx,%r8d
>>   3.08%││  0x7fed70eb4c50:   add%ecx,%r8d
>>   2.55%││  0x7fed70eb4c53:   movslq %r8d,%r8
>>   2.45%││  0x7fed70eb4c56:   add
>> $0xfffe,%r8
>>   2.13%││  0x7fed70eb4c5a:   cmp(%rsp),%r8
>>││  0x7fed70eb4c5e:   jae
>>