Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-26 Thread Jie Fu
On Tue, 26 Jan 2021 18:20:00 GMT, Paul Sandoz  wrote:

> Hi Jie, Thanks for the detailed analysis. I suspect the difference outside of 
> loops is because of expression "length - (vlen - 1)”. We need to make 
> intrinsic Objects.checkFromIndexSize. Is that something you might be 
> interested in pursuing? 

Yes, I'd like to do that in the future.
Thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/2127


Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-26 Thread Paul Sandoz
On Tue, 26 Jan 2021 13:24:57 GMT, Jie Fu  wrote:

>> The intrinsic enables C2 to more reliably elide bounds checks. I don't know 
>> the exact details but at a high-level it transforms signed values into 
>> unsigned values (and therefore the signed comparisons become unsigned 
>> comparisons), which helps C2 remove checks when there is a dominating check 
>> of, for example, an upper loop bound.
>> 
>> You say "the intrinsified Objects.checkIndex seems to generate more code 
>> than inlined if-statements". Can you present some examples of Java code and 
>> the corresponding C2 generated assembler where this happens?
>
>> The intrinsic enables C2 to more reliably elide bounds checks. I don't know 
>> the exact details but at a high-level it transforms signed values into 
>> unsigned values (and therefore the signed comparisons become unsigned 
>> comparisons), which helps C2 remove checks when there is a dominating check 
>> of, for example, an upper loop bound.
>> 
>> You say "the intrinsified Objects.checkIndex seems to generate more code 
>> than inlined if-statements". Can you present some examples of Java code and 
>> the corresponding C2 generated assembler where this happens?
> 
> Hi @PaulSandoz ,
> 
> I agree with you that let's keep the code as it is for the sake of 
> performance.
> 
> I spent some time looking into the assembly code generated by 
> Objects.checkIndex and inlined if-statements.
> Here are the test program [1], running script [2] and diff [3].
> 
>  - For testSimple [4] that I checked last week, inlined if-statements [5] is 
> better than Objects.checkIndex [6].
>  - However, for testLoop [7], Objects.checkIndex [8] is better than inlined 
> if-statements [9].
>(I'm sorry I didn't check loop methods last week.)
>AFAICS, the inlined if-statements will generate two more instructions [10] 
> to check wether idx >= 0 for each loop iteration.
> 
> It seems that the intrinsified Objects.checkIndex will be able to optimize 
> out the lower bound check for loops.
> So I also agree with you that an intrinsified method seems the right choice.
> 
> Thanks.
> Best regards,
> Jie
> 
> [1] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/Bar.java
> [2] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/1.sh
> [3] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/1.diff
> [4] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/Bar.java#L10
> [5] 
> https://github.com/DamonFool/experiment/blob/main/JDK-8259925/3-testSimple.log
> [6] 
> https://github.com/DamonFool/experiment/blob/main/JDK-8259925/2-testSimple.log
> [7] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/Bar.java#L15
> [8] 
> https://github.com/DamonFool/experiment/blob/main/JDK-8259925/2-testLoop.log
> [9] 
> https://github.com/DamonFool/experiment/blob/main/JDK-8259925/3-testLoop.log
> [10] 
> https://github.com/DamonFool/experiment/blob/main/JDK-8259925/3-testLoop.log#L135

Hi Jie,

Thanks for the detailed analysis.

I suspect the difference outside of loops is because of expression "length - 
(vlen - 1)”.

We need to make intrinsic Objects.checkFromIndexSize. Is that something you 
might be interested in pursuing?

Paul.

On Jan 26, 2021, at 5:25 AM, Jie Fu 
mailto:notificati...@github.com>> wrote:



The intrinsic enables C2 to more reliably elide bounds checks. I don't know the 
exact details but at a high-level it transforms signed values into unsigned 
values (and therefore the signed comparisons become unsigned comparisons), 
which helps C2 remove checks when there is a dominating check of, for example, 
an upper loop bound.

You say "the intrinsified Objects.checkIndex seems to generate more code than 
inlined if-statements". Can you present some examples of Java code and the 
corresponding C2 generated assembler where this happens?

Hi 
@PaulSandoz
 ,

I agree with you that let's keep the code as it is for the sake of performance.

I spent some time looking into the assembly code generated by 
Objects.checkIndex and inlined if-statements.
Here are the test program [1], running script [2] and diff [3].

  *   For testSimple [4] that I checked last week, inlined if-statements [5] is 
better than Objects.checkIndex [6].
  *   However, for testLoop [7], Objects.checkIndex [8] is better than inlined 
if-statements [9].
(I'm sorry I didn't check loop methods last week.)
AFAICS, the inlined if-statements will generate two more instructions [10] to 
check wether idx >= 0 for each loop iteration.

It seems that the intrinsified Objects.checkIndex will be able to optimize out 
the lower bound check for loops.
So I also agree with you that an intrinsified method seems the right choice.

Thanks.
Best regards,
Jie

[1] 
https://github.com/DamonFool/experiment/blob/main/JDK-8259925/Bar.java

Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-26 Thread Jie Fu
On Thu, 21 Jan 2021 16:54:36 GMT, Paul Sandoz  wrote:

> The intrinsic enables C2 to more reliably elide bounds checks. I don't know 
> the exact details but at a high-level it transforms signed values into 
> unsigned values (and therefore the signed comparisons become unsigned 
> comparisons), which helps C2 remove checks when there is a dominating check 
> of, for example, an upper loop bound.
> 
> You say "the intrinsified Objects.checkIndex seems to generate more code than 
> inlined if-statements". Can you present some examples of Java code and the 
> corresponding C2 generated assembler where this happens?

Hi @PaulSandoz ,

I agree with you that let's keep the code as it is for the sake of performance.

I spent some time looking into the assembly code generated by 
Objects.checkIndex and inlined if-statements.
Here are the test program [1], running script [2] and diff [3].

 - For testSimple [4] that I checked last week, inlined if-statements [5] is 
better than Objects.checkIndex [6].
 - However, for testLoop [7], Objects.checkIndex [8] is better than inlined 
if-statements [9].
   (I'm sorry I didn't check loop methods last week.)
   AFAICS, the inlined if-statements will generate two more instructions [10] 
to check wether idx >= 0 for each loop iteration.

It seems that the intrinsified Objects.checkIndex will be able to optimize out 
the lower bound check for loops.
So I also agree with you that an intrinsified method seems the right choice.

Thanks.
Best regards,
Jie

[1] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/Bar.java
[2] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/1.sh
[3] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/1.diff
[4] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/Bar.java#L10
[5] 
https://github.com/DamonFool/experiment/blob/main/JDK-8259925/3-testSimple.log
[6] 
https://github.com/DamonFool/experiment/blob/main/JDK-8259925/2-testSimple.log
[7] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/Bar.java#L15
[8] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/2-testLoop.log
[9] https://github.com/DamonFool/experiment/blob/main/JDK-8259925/3-testLoop.log
[10] 
https://github.com/DamonFool/experiment/blob/main/JDK-8259925/3-testLoop.log#L135

-

PR: https://git.openjdk.java.net/jdk/pull/2127


Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-22 Thread Jie Fu
On Thu, 21 Jan 2021 16:54:36 GMT, Paul Sandoz  wrote:

>>> Unfortunately it is still problematic because you have replaced the 
>>> intrinsic check (that performed by `Object.checksIndex`, or more 
>>> specifically 
>>> [here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/util/Preconditions.java#L261)).
>>> 
>>> Further, you have replaced the bounds check options, in place for 
>>> experimentation. We are not ready yet to collapse our options, further 
>>> analysis is required as bounds checks can be very sensitive in C2.
>>> 
>>> I would be open to you adding a third case, so that you can analyze the 
>>> performance without perturbing the default behavior. I suspect the correct 
>>> fix is to make intrinsic `Objects.checkFromIndexSize` in a similar manner 
>>> to `Object.checksIndex`.
>> 
>> Hi @PaulSandoz ,
>> 
>> Thanks for your kind review and comments.
>> 
>> To be honest, I didn't get your first point.
>> As for the example above, the intrinsified Objects.checkIndex seems to 
>> generate more code than inlined if-statements.
>> So I'm confused why it's a problem to replace the intrinsic check.
>> Did you mean an intrinsic is always (or for most cases) better than 
>> non-intrinc or inlined if-statements?
>> And why?
>> 
>> Could you please make it more clearer to us?
>> Thanks.
>
> The intrinsic enables C2 to more reliably elide bounds checks. I don't know 
> the exact details but at a high-level it transforms signed values into 
> unsigned values (and therefore the signed comparisons become unsigned 
> comparisons), which helps C2 remove checks when there is a dominating check 
> of, for example, an upper loop bound.
> 
> You say "the intrinsified Objects.checkIndex seems to generate more code than 
> inlined if-statements". Can you present some examples of Java code and the 
> corresponding C2 generated assembler where this happens?

Hi @PaulSandoz ,

I will show you the assembly code next week since it is already Friday night 
now.

Thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/2127


Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-21 Thread Jie Fu
On Thu, 21 Jan 2021 16:54:36 GMT, Paul Sandoz  wrote:

> The intrinsic enables C2 to more reliably elide bounds checks. I don't know 
> the exact details but at a high-level it transforms signed values into 
> unsigned values (and therefore the signed comparisons become unsigned 
> comparisons), which helps C2 remove checks when there is a dominating check 
> of, for example, an upper loop bound.

OK.
Now I can understand what you are worrying about.
Thanks for your clarification.

> You say "the intrinsified Objects.checkIndex seems to generate more code than 
> inlined if-statements". Can you present some examples of Java code and the 
> corresponding C2 generated assembler where this happens?

Will do it later.
Thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/2127


Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-21 Thread Paul Sandoz
On Thu, 21 Jan 2021 09:35:01 GMT, Jie Fu  wrote:

>> Unfortunately it is still problematic because you have replaced the 
>> intrinsic check (that performed by `Object.checksIndex`, or more 
>> specifically 
>> [here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/util/Preconditions.java#L261)).
>> 
>> Further, you have replaced the bounds check options, in place for 
>> experimentation. We are not ready yet to collapse our options, further 
>> analysis is required as bounds checks can be very sensitive in C2.
>> 
>> I would be open to you adding a third case, so that you can analyze the 
>> performance without perturbing the default behavior. I suspect the correct 
>> fix is to make intrinsic `Objects.checkFromIndexSize` in a similar manner to 
>> `Object.checksIndex`.
>
>> Unfortunately it is still problematic because you have replaced the 
>> intrinsic check (that performed by `Object.checksIndex`, or more 
>> specifically 
>> [here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/util/Preconditions.java#L261)).
>> 
>> Further, you have replaced the bounds check options, in place for 
>> experimentation. We are not ready yet to collapse our options, further 
>> analysis is required as bounds checks can be very sensitive in C2.
>> 
>> I would be open to you adding a third case, so that you can analyze the 
>> performance without perturbing the default behavior. I suspect the correct 
>> fix is to make intrinsic `Objects.checkFromIndexSize` in a similar manner to 
>> `Object.checksIndex`.
> 
> Hi @PaulSandoz ,
> 
> Thanks for your kind review and comments.
> 
> To be honest, I didn't get your first point.
> As for the example above, the intrinsified Objects.checkIndex seems to 
> generate more code than inlined if-statements.
> So I'm confused why it's a problem to replace the intrinsic check.
> Did you mean an intrinsic is always (or for most cases) better than 
> non-intrinc or inlined if-statements?
> And why?
> 
> Could you please make it more clearer to us?
> Thanks.

The intrinsic enables C2 to more reliably elide bounds checks. I don't know the 
exact details but at a high-level it transforms signed values into unsigned 
values (and therefore the signed comparisons become unsigned comparisons), 
which helps C2 remove checks when there is a dominating check of, for example, 
an upper loop bound.

You say "the intrinsified Objects.checkIndex seems to generate more code than 
inlined if-statements". Can you present some examples of Java code and the 
corresponding C2 generated assembler where this happens?

-

PR: https://git.openjdk.java.net/jdk/pull/2127


Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-21 Thread Jie Fu
On Wed, 20 Jan 2021 19:30:41 GMT, Paul Sandoz  wrote:

> Unfortunately it is still problematic because you have replaced the intrinsic 
> check (that performed by `Object.checksIndex`, or more specifically 
> [here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/util/Preconditions.java#L261)).
> 
> Further, you have replaced the bounds check options, in place for 
> experimentation. We are not ready yet to collapse our options, further 
> analysis is required as bounds checks can be very sensitive in C2.
> 
> I would be open to you adding a third case, so that you can analyze the 
> performance without perturbing the default behavior. I suspect the correct 
> fix is to make intrinsic `Objects.checkFromIndexSize` in a similar manner to 
> `Object.checksIndex`.

Hi @PaulSandoz ,

Thanks for your kind review and comments.

To be honest, I didn't get your first point.
As for the example above, the intrinsified Objects.checkIndex seems to generate 
more code than inlined if-statements.
So I'm confused why it's a problem to replace the intrinsic check.
Did you mean an intrinsic is always (or for most cases) better than non-intrinc 
or inlined if-statements?
And why?

Could you please make it more clearer to us?
Thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/2127


Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-20 Thread Paul Sandoz
On Wed, 20 Jan 2021 09:33:43 GMT, Jie Fu  wrote:

>> That change may cause performance issues. I would recommend leaving as is 
>> for now even through the error message is not great. Bounds checking is 
>> quite sensitive and WIP. Notice that we also have an option to call 
>> `Objects.checkFromIndexSize` which expresses the intent more accurately, but 
>> that is currently less optimal (at least it was when i last checked since it 
>> is non an intrinsic).
>
> Thanks @PaulSandoz for your review and comments.
> 
> Updated:
>  - The performance issue has been fixed since there is no more operation for 
> common cases.
>  - The readability of OOB exception msg has been improved by following the 
> style of Objects.checkFromIndexSize.
>  - Less code generated (several blocks of code were optimized out for the 
> Test::test method below).
> 
> import jdk.incubator.vector.ByteVector;
> import jdk.incubator.vector.VectorSpecies;
> 
> public class Test {
> static final VectorSpecies SPECIES_128 = ByteVector.SPECIES_128;
> static byte[] a = new byte[88];
> static byte[] b = new byte[88];
> 
> public static void test() {
> ByteVector av = ByteVector.fromArray(SPECIES_128, a, 0);
> av.intoArray(b, 0);
> }
> 
> public static void main(String[] args) {
> for (int i = 0; i < 10; i++) {
> test();
> }
> System.out.println("b: " + b[0]);
> }
> }
> 
> What do you think of it?
> Thanks.

Unfortunately it is still problematic because you have replaced the intrinsic 
check (that performed by `Object.checksIndex`, or more specifically 
[here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/util/Preconditions.java#L261)).

Further, you have replaced the bounds check options, in place for 
experimentation. We are not ready yet to collapse our options, further analysis 
is required as bounds checks can be very sensitive in C2.

I would be open to you adding a third case, so that you can analyze the 
performance without perturbing the default behavior. I suspect the correct fix 
is to make intrinsic `Objects.checkFromIndexSize` in a similar manner to 
`Object.checksIndex`.

-

PR: https://git.openjdk.java.net/jdk/pull/2127


Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen [v2]

2021-01-20 Thread Jie Fu
> Hi all,
> 
> For this reproducer:
> 
> import jdk.incubator.vector.ByteVector;
> import jdk.incubator.vector.VectorSpecies;
> 
> public class Test {
> static final VectorSpecies SPECIES_128 = ByteVector.SPECIES_128;
> static byte[] a = new byte[8];
> static byte[] b = new byte[8];
> 
> public static void main(String[] args) {
> ByteVector av = ByteVector.fromArray(SPECIES_128, a, 0);
> av.intoArray(b, 0);
> System.out.println("b: " + b[0]);
> }
> }
> 
> The following IndexOutOfBoundsException message (length -7) is unreasonable.
> 
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 0 out 
> of bounds for length -7
> 
> It might be better to fix it like this.
> 
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 0 out 
> of bounds for length 0
> 
> Thanks.
> Best regards,
> Jie

Jie Fu has updated the pull request incrementally with one additional commit 
since the last revision:

  Fix the performance issue

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2127/files
  - new: https://git.openjdk.java.net/jdk/pull/2127/files/ca39b482..3fd8b316

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2127&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2127&range=00-01

  Stats: 8 lines in 1 file changed: 6 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2127.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2127/head:pull/2127

PR: https://git.openjdk.java.net/jdk/pull/2127


Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-20 Thread Jie Fu
On Tue, 19 Jan 2021 21:40:24 GMT, Paul Sandoz  wrote:

>> Hi all,
>> 
>> For this reproducer:
>> 
>> import jdk.incubator.vector.ByteVector;
>> import jdk.incubator.vector.VectorSpecies;
>> 
>> public class Test {
>> static final VectorSpecies SPECIES_128 = ByteVector.SPECIES_128;
>> static byte[] a = new byte[8];
>> static byte[] b = new byte[8];
>> 
>> public static void main(String[] args) {
>> ByteVector av = ByteVector.fromArray(SPECIES_128, a, 0);
>> av.intoArray(b, 0);
>> System.out.println("b: " + b[0]);
>> }
>> }
>> 
>> The following IndexOutOfBoundsException message (length -7) is unreasonable.
>> 
>> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 0 out 
>> of bounds for length -7
>> 
>> It might be better to fix it like this.
>> 
>> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 0 out 
>> of bounds for length 0
>> 
>> Thanks.
>> Best regards,
>> Jie
>
> That change may cause performance issues. I would recommend leaving as is for 
> now even through the error message is not great. Bounds checking is quite 
> sensitive and WIP. Notice that we also have an option to call 
> `Objects.checkFromIndexSize` which expresses the intent more accurately, but 
> that is currently less optimal (at least it was when i last checked since it 
> is non an intrinsic).

Thanks @PaulSandoz for your review and comments.

Updated:
 - The performance issue has been fixed since there is no more operation for 
common cases.
 - The readability of OOB exception msg has been improved by following the 
style of Objects.checkFromIndexSize.
 - Less code generated (several blocks of code were optimized out for the 
Test::test method below).

import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorSpecies;

public class Test {
static final VectorSpecies SPECIES_128 = ByteVector.SPECIES_128;
static byte[] a = new byte[88];
static byte[] b = new byte[88];

public static void test() {
ByteVector av = ByteVector.fromArray(SPECIES_128, a, 0);
av.intoArray(b, 0);
}

public static void main(String[] args) {
for (int i = 0; i < 10; i++) {
test();
}
System.out.println("b: " + b[0]);
}
}

What do you think of it?
Thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/2127


Re: RFR: 8259925: [Vector API] Unreasonable IndexOutOfBoundsException message when length < vlen

2021-01-19 Thread Paul Sandoz
On Mon, 18 Jan 2021 13:32:24 GMT, Jie Fu  wrote:

> Hi all,
> 
> For this reproducer:
> 
> import jdk.incubator.vector.ByteVector;
> import jdk.incubator.vector.VectorSpecies;
> 
> public class Test {
> static final VectorSpecies SPECIES_128 = ByteVector.SPECIES_128;
> static byte[] a = new byte[8];
> static byte[] b = new byte[8];
> 
> public static void main(String[] args) {
> ByteVector av = ByteVector.fromArray(SPECIES_128, a, 0);
> av.intoArray(b, 0);
> System.out.println("b: " + b[0]);
> }
> }
> 
> The following IndexOutOfBoundsException message (length -7) is unreasonable.
> 
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 0 out 
> of bounds for length -7
> 
> It might be better to fix it like this.
> 
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 0 out 
> of bounds for length 0
> 
> Thanks.
> Best regards,
> Jie

That change may cause performance issues. I would recommend leaving as is for 
now even through the error message is not great. Bounds checking is quite 
sensitive and WIP. Notice that we also have an option to call 
`Objects.checkFromIndexSize` which expresses the intent more accurately, but 
that is currently less optimal (at least it was when i last checked since it is 
non an intrinsic).

-

PR: https://git.openjdk.java.net/jdk/pull/2127