On Thu, 29 Aug 2024 05:42:58 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> Hi All,
>> 
>> As per the discussion on panama-dev mailing list[1], patch adds the support 
>> for following new two vector permutation APIs.
>> 
>> 
>> Declaration:-
>>     Vector<E>.selectFrom(Vector<E> v1, Vector<E> v2)
>> 
>> 
>> Semantics:-
>>     Using index values stored in the lanes of "this" vector, assemble the 
>> values stored in first (v1) and second (v2) vector arguments. Thus, first 
>> and second vector serves as a table, whose elements are selected based on 
>> index value vector. API is applicable to all integral and floating-point 
>> types.  The result of this operation is semantically equivalent to 
>> expression v1.rearrange(this.toShuffle(), v2). Values held in index vector 
>> lanes must lie within valid two vector index range [0, 2*VLEN) else an 
>> IndexOutOfBoundException is thrown.  
>> 
>> Summary of changes:
>> -  Java side implementation of new selectFrom API.
>> -  C2 compiler IR and inline expander changes.
>> -  In absence of direct two vector permutation instruction in target ISA, a 
>> lowering transformation dismantles new IR into constituent IR supported by 
>> target platforms. 
>> -  Optimized x86 backend implementation for AVX512 and legacy target.
>> -  Function tests covering new API.
>> 
>> JMH micro included with this patch shows around 10-15x gain over existing 
>> rearrange API :-
>> Test System: Intel(R) Xeon(R) Platinum 8480+ [ Sapphire Rapids Server]
>> 
>> 
>>   Benchmark                                     (size)   Mode  Cnt      
>> Score   Error   Units
>> SelectFromBenchmark.rearrangeFromByteVector     1024  thrpt    2   2041.762  
>>         ops/ms
>> SelectFromBenchmark.rearrangeFromByteVector     2048  thrpt    2   1028.550  
>>         ops/ms
>> SelectFromBenchmark.rearrangeFromIntVector      1024  thrpt    2    962.605  
>>         ops/ms
>> SelectFromBenchmark.rearrangeFromIntVector      2048  thrpt    2    479.004  
>>         ops/ms
>> SelectFromBenchmark.rearrangeFromLongVector     1024  thrpt    2    359.758  
>>         ops/ms
>> SelectFromBenchmark.rearrangeFromLongVector     2048  thrpt    2    178.192  
>>         ops/ms
>> SelectFromBenchmark.rearrangeFromShortVector    1024  thrpt    2   1463.459  
>>         ops/ms
>> SelectFromBenchmark.rearrangeFromShortVector    2048  thrpt    2    727.556  
>>         ops/ms
>> SelectFromBenchmark.selectFromByteVector        1024  thrpt    2  33254.830  
>>         ops/ms
>> SelectFromBenchmark.selectFromByteVector        2048  thrpt    2  17313.174  
>>         ops/ms
>> SelectFromBenchmark.selectFromIntVector         1024  thrpt    2  10756.804  
>>         ops/ms
>> S...
>
> Jatin Bhateja has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Adding descriptive comments

Ok, I left a few more comments.

Generally, this looks like a nice feature, thanks for implementing it 
@jatin-bhateja ! 😊 

A few issues with code style (camelCase vs snake_case).
I'm also wondering about good naming. Why did we/you chose "select" for this? 
Why not "shuffle"? Does "select" not often get used as synonym of "blend", 
which has different semantics?

Also: I'm a little worried about the semantics change of the RearrangeNode that 
you did with the changes in `RearrangeNode::Ideal`. It looks a little "hacky", 
especially in conjunction with the `vector_indexes_needs_massaging` method. Can 
you give a clear definition of the semantics of `RearrangeNode` and 
`vector_indexes_needs_massaging`, please?

I also added some control questions for testing.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 6446:

> 6444: }
> 6445: 
> 6446: void C2_MacroAssembler::select_from_two_vector_evex(BasicType elem_bt, 
> XMMRegister dst, XMMRegister src1,

I also wonder if you could use the plural in these cases? You are selecting 
from two vectors, with the plural "s". Of course it is a bit annoying if you 
would have to name the IR node `SelectFromTwoVectors`, because we usually name 
the vector nodes `...Vector`, without the plural "s".

src/hotspot/share/opto/library_call.cpp line 749:

> 747:     return inline_vector_compress_expand();
> 748:   case vmIntrinsics::_VectorSelectFromTwoVectorOp:
> 749:     return inline_vector_select_from_two_vectors();

Interesting, here you use the correct plural "vectors".

src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java 
line 544:

> 542:         byte[] vpayload1 = ((ByteVector)v1).vec();
> 543:         byte[] vpayload2 = ((ByteVector)v2).vec();
> 544:         byte[] vpayload3 = ((ByteVector)v3).vec();

Is there a reason you are not using more descriptive names here instead of 
`vpayload1`?
I also wonder if the `selectFromHelper` should not be named more specifically: 
`selectFromTwoVector(s)Helper`?

src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java 
line 2595:

> 2593:     @ForceInline
> 2594:     final ByteVector selectFromTemplate(ByteVector v1, ByteVector v2) {
> 2595:         int twovectorlen = length() * 2;

`twovectorlen` -> `twoVectorLen`
I think in Java we are supposed to use camelCase

src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Vector.java line 
2770:

> 2768: 
> 2769:     /**
> 2770:      * Rearranges the lane elements of two vectors, selecting lanes

I have a bit of a name concern here. Why are we calling it "select" and not 
"rearrange"? Because for a single "from" vector we also call it "rearrange", 
right? Is "select" not often synonymous to "blend", which works also with two 
"from" vectors, but with a mask and not indexing for "selection/rearranging"?

test/jdk/jdk/incubator/vector/Byte128VectorTests.java line 324:

> 322:                     boolean is_exceptional_idx = (int)order[idx] >= 
> vector_len;
> 323:                     int oidx = is_exceptional_idx ? ((int)order[idx] - 
> vector_len) : (int)order[idx];
> 324:                     Assert.assertEquals(r[idx], (is_exceptional_idx ? 
> b[i + oidx] : a[i + oidx]));

I thought general Java style is camelCase? Is that not followed in the 
VectorAPI code?

test/jdk/jdk/incubator/vector/ShortMaxVectorTests.java line 1048:

> 1046:         return 
> SHORT_GENERATOR_SELECT_FROM_TRIPLES.stream().map(List::toArray).
> 1047:                 toArray(Object[][]::new);
> 1048:     }

Just a control question: does this also occasionally generate examples with 
out-of-bounds indices? Negative out of bounds and positive out of bounds?

test/jdk/jdk/incubator/vector/ShortMaxVectorTests.java line 5812:

> 5810:                 ShortVector bv = ShortVector.fromArray(SPECIES, b, i);
> 5811:                 ShortVector idxv = ShortVector.fromArray(SPECIES, idx, 
> i);
> 5812:                 idxv.selectFrom(av, bv).intoArray(r, i);

Would this test catch a bug where the backend would generate vectors that are 
too long or too short?

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/20508#pullrequestreview-2276944129
PR Review Comment: https://git.openjdk.org/jdk/pull/20508#discussion_r1741766060
PR Review Comment: https://git.openjdk.org/jdk/pull/20508#discussion_r1741773766
PR Review Comment: https://git.openjdk.org/jdk/pull/20508#discussion_r1741914524
PR Review Comment: https://git.openjdk.org/jdk/pull/20508#discussion_r1741911809
PR Review Comment: https://git.openjdk.org/jdk/pull/20508#discussion_r1741919025
PR Review Comment: https://git.openjdk.org/jdk/pull/20508#discussion_r1741920940
PR Review Comment: https://git.openjdk.org/jdk/pull/20508#discussion_r1741947885
PR Review Comment: https://git.openjdk.org/jdk/pull/20508#discussion_r1741949290

Reply via email to