Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-09 Thread Emanuel Peter
On Tue, 9 Jan 2024 06:13:44 GMT, Jatin Bhateja wrote: >> Yes, IF it is vectorized, then there is no difference between high and low >> density. My concern was more if vectorization is preferable over the scalar >> alternative in the low-density case, where branch prediction is more stable. >

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-08 Thread Jatin Bhateja
On Mon, 8 Jan 2024 07:55:00 GMT, Emanuel Peter wrote: >>> You are using `VectorMask pred = VectorMask.fromLong(ispecies, >>> maskctr++);`. That basically systematically iterates over all masks, which >>> is nice for a correctness test. But that would use different density inside >>> one test

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-08 Thread Emanuel Peter
On Fri, 5 Jan 2024 09:35:34 GMT, Emanuel Peter wrote: >> Thanks for the comment addition! > > Improvement suggestion: > For a vector with 8 ints, we get `2^8 = 256` many bit patterns for the mask. > The table has a row for each `mask` value, consisting of 8 ints, which > provide the valid
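
The table described in this message (one row per mask value, 8 ints per row) can be illustrated with a small scalar sketch. This is not the HotSpot stub code: the class and method names are hypothetical, and because the snippet is cut off, the padding value `-1` for unused row slots and the zero fill of unselected output lanes are assumptions made only for this illustration.

```java
public class CompressTableSketch {
    static final int LANES = 8; // 8 int lanes, so 2^8 = 256 possible masks

    // Hypothetical layout: row m lists the indices of the set bits of m in
    // ascending order. Unused slots are padded with -1 (an assumption).
    static int[][] buildTable() {
        int[][] table = new int[1 << LANES][LANES];
        for (int mask = 0; mask < (1 << LANES); mask++) {
            int out = 0;
            for (int lane = 0; lane < LANES; lane++) {
                if ((mask & (1 << lane)) != 0) {
                    table[mask][out++] = lane;
                }
            }
            while (out < LANES) {
                table[mask][out++] = -1;
            }
        }
        return table;
    }

    // Scalar emulation of compress: selected lanes are packed to the front,
    // remaining lanes are set to zero.
    static int[] compress(int[] src, int mask, int[][] table) {
        int[] dst = new int[LANES];
        int[] row = table[mask];
        for (int i = 0; i < LANES; i++) {
            dst[i] = row[i] >= 0 ? src[row[i]] : 0;
        }
        return dst;
    }
}
```

With mask `0b00100101`, the row starts with lane indices 0, 2, 5, so those three source lanes end up in the first three output lanes.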

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-08 Thread Emanuel Peter
On Mon, 8 Jan 2024 06:06:20 GMT, Jatin Bhateja wrote: >> You are using `VectorMask pred = VectorMask.fromLong(ispecies, >> maskctr++);`. >> That basically systematically iterates over all masks, which is nice for a >> correctness test. >> But that would use different density inside one test

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-07 Thread Jatin Bhateja
On Fri, 5 Jan 2024 09:45:11 GMT, Emanuel Peter wrote: > You are using `VectorMask pred = VectorMask.fromLong(ispecies, > maskctr++);`. That basically systematically iterates over all masks, which is > nice for a correctness test. But that would use different density inside one > test run,
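
The density concern raised here can be made concrete with a small sketch. It uses plain `long` values in place of `VectorMask`, and the helper `fixedDensityMask` is hypothetical, shown only as one possible way to hold density constant within a benchmark run. It assumes fewer than 64 lanes.

```java
import java.util.Random;

public class MaskDensitySketch {
    // Fraction of set lanes in the low `lanes` bits of `mask` (lanes < 64).
    static double density(long mask, int lanes) {
        long low = mask & ((1L << lanes) - 1);
        return (double) Long.bitCount(low) / lanes;
    }

    // Hypothetical helper: a random mask with exactly `setLanes` of `lanes` bits set.
    static long fixedDensityMask(Random rnd, int lanes, int setLanes) {
        long mask = 0;
        while (Long.bitCount(mask) < setLanes) {
            mask |= 1L << rnd.nextInt(lanes);
        }
        return mask;
    }
}
```

An incrementing counter passes through every density from 0 to 1 for an 8-lane species, whereas masks from `fixedDensityMask` keep the density the same for every iteration.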

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-05 Thread Emanuel Peter
On Fri, 5 Jan 2024 07:05:51 GMT, Jatin Bhateja wrote: >> We do have extensive functional tests for compress/expand APIs in >> [test/jdk/jdk/incubator/vector](https://github.com/openjdk/jdk/tree/master/test/jdk/jdk/incubator/vector) > >> Could there be equivalent `expand` tests? > > Here are

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-05 Thread Emanuel Peter
On Fri, 5 Jan 2024 09:37:55 GMT, Emanuel Peter wrote: >> This computes the byte offset from start of the table, both integer and long >> permute table have same row sizes, 8 int elements vs 4 long elements. > > Ah, I understand now. Maybe leave a comment for that? I would say something like
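
The row-size argument can be checked with simple arithmetic: 8 int elements and 4 long elements both occupy 32 bytes, so a left shift by 5 turns a mask value into a byte offset for either table. The sketch below only restates that arithmetic; the class and method names are hypothetical and not taken from the patch.

```java
public class PermuteRowOffsetSketch {
    // Row sizes in bytes for the two tables discussed in the thread.
    static final int INT_ROW_BYTES = 8 * Integer.BYTES; // 8 ints  = 32 bytes
    static final int LONG_ROW_BYTES = 4 * Long.BYTES;   // 4 longs = 32 bytes

    // Byte offset of the row for a given mask value: mask * 32 == mask << 5.
    static long rowByteOffset(int maskBits) {
        return ((long) maskBits) << 5;
    }
}
```

Because both row sizes equal `1 << 5`, the same shift amount applies to the integer and the long table.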

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-05 Thread Emanuel Peter
On Thu, 4 Jan 2024 13:40:19 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Updating copyright year of modified files. > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 957: > >> 955: __

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-05 Thread Emanuel Peter
On Fri, 5 Jan 2024 09:31:50 GMT, Emanuel Peter wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 957: >> >>> 955: __ align(CodeEntryAlignment); >>> 956: StubCodeMark mark(this, "StubRoutines", stub_name); >>> 957: address start = __ pc(); >> >> Could you please add some

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-05 Thread Emanuel Peter
On Fri, 5 Jan 2024 07:03:34 GMT, Jatin Bhateja wrote: >> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5307: >> >>> 5305: assert(bt == T_LONG || bt == T_DOUBLE, ""); >>> 5306: vmovmskpd(rtmp, mask, vec_enc); >>> 5307: shlq(rtmp, 5); >> >> Might this need to be 6? If I

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-04 Thread Jatin Bhateja
On Thu, 4 Jan 2024 13:41:40 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Updating copyright year of modified files. > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5307: > >> 5305:

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-04 Thread Jatin Bhateja
On Thu, 4 Jan 2024 13:30:24 GMT, Emanuel Peter wrote: >> test/micro/org/openjdk/bench/jdk/incubator/vector/ColumnFilterBenchmark.java >> line 94: >> >>> 92:IntVector vec = IntVector.fromArray(ispecies, intinCol, i); >>> 93:VectorMask pred =

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-04 Thread Jatin Bhateja
On Fri, 5 Jan 2024 07:03:26 GMT, Jatin Bhateja wrote: >> And what about some result verification? Or is there another test that does >> that? > > We do have extensive functional tests for compress/expand APIs in >

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-04 Thread Jatin Bhateja
On Thu, 4 Jan 2024 13:33:08 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Updating copyright year of modified files. > >

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-04 Thread Emanuel Peter
On Thu, 4 Jan 2024 05:39:01 GMT, Jatin Bhateja wrote: >> Hi, >> >> Patch optimizes non-subword vector compress and expand APIs for x86 AVX2 >> only targets. >> Upcoming E-core Xeons (Sierra Forest) and Hybrid CPUs only support AVX2 >> instruction set. >> These are very frequently used APIs in

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-04 Thread Emanuel Peter
On Thu, 4 Jan 2024 13:09:30 GMT, Emanuel Peter wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Updating copyright year of modified files. > >

Re: RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]

2024-01-03 Thread Jatin Bhateja
> Hi, > > Patch optimizes non-subword vector compress and expand APIs for x86 AVX2 only > targets. > Upcoming E-core Xeons (Sierra Forest) and Hybrid CPUs only support AVX2 > instruction set. > These are very frequently used operations in columnar database filter > operations. > >