On Tue, 9 Jan 2024 06:13:44 GMT, Jatin Bhateja wrote:
>> Yes, IF it is vectorized, then there is no difference between high and low
>> density. My concern was more if vectorization is preferable over the scalar
>> alternative in the low-density case, where branch prediction is more stable.
>
On Mon, 8 Jan 2024 07:55:00 GMT, Emanuel Peter wrote:
>>> You are using `VectorMask pred = VectorMask.fromLong(ispecies,
>>> maskctr++);`. That basically systematically iterates over all masks, which
>>> is nice for a correctness test. But that would use different density inside
>>> one test
On Fri, 5 Jan 2024 09:35:34 GMT, Emanuel Peter wrote:
>> Thanks for the comment addition!
>
> Improvement suggestion:
> For a vector with 8 ints, we get `2^8 = 256` bit patterns for the mask.
> The table has a row for each `mask` value, consisting of 8 ints, which
> provide the valid
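
To make the table idea above concrete, here is a minimal scalar sketch of a 256-row compress table for 8 int lanes. It is an illustration only: the class and method names are hypothetical, and the `-1` filler for unused slots is an assumption, so the layout in the actual patch may well differ.

```java
// Hypothetical sketch: one table row per 8-bit mask value, 8 ints per row.
final class CompressTableSketch {
    static final int LANES = 8;

    // Row `mask` lists the indices of the set mask bits in ascending order.
    // Unused trailing slots get -1 (an assumed filler, not taken from the patch).
    static int[][] buildTable() {
        int[][] table = new int[1 << LANES][LANES];
        for (int mask = 0; mask < table.length; mask++) {
            int out = 0;
            for (int lane = 0; lane < LANES; lane++) {
                if ((mask & (1 << lane)) != 0) {
                    table[mask][out++] = lane;
                }
            }
            while (out < LANES) {
                table[mask][out++] = -1;
            }
        }
        return table;
    }

    // Scalar model of compress: selected lanes are packed into the low lanes,
    // the remaining lanes are zero.
    static int[] compress(int[] src, int mask, int[][] table) {
        int[] dst = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            int idx = table[mask][i];
            dst[i] = idx >= 0 ? src[idx] : 0;
        }
        return dst;
    }
}
```

With `src = {10, 20, 30, 40, 50, 60, 70, 80}` and mask `0b10100101`, lanes 0, 2, 5 and 7 are selected, so this sketch returns `{10, 30, 60, 80, 0, 0, 0, 0}`.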
On Mon, 8 Jan 2024 06:06:20 GMT, Jatin Bhateja wrote:
>> You are using `VectorMask pred = VectorMask.fromLong(ispecies,
>> maskctr++);`.
>> That basically systematically iterates over all masks, which is nice for a
>> correctness test.
>> But that would use different density inside one test
On Fri, 5 Jan 2024 09:45:11 GMT, Emanuel Peter wrote:
> You are using `VectorMask pred = VectorMask.fromLong(ispecies,
> maskctr++);`. That basically systematically iterates over all masks, which is
> nice for a correctness test. But that would use different density inside one
> test run,
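
The density point can be illustrated without the Vector API. The sketch below assumes an 8-lane int species and a counter that walks through all 8-bit mask values, as `maskctr++` does, and counts how many masks have each number of set lanes. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: how mask density varies when a counter walks all masks.
final class MaskDensity {
    // hist[k] = number of mask values in [0, 2^lanes) with exactly k set lanes.
    static int[] laneCountHistogram(int lanes) {
        int[] hist = new int[lanes + 1];
        for (long maskctr = 0; maskctr < (1L << lanes); maskctr++) {
            hist[Long.bitCount(maskctr)]++;
        }
        return hist;
    }

    public static void main(String[] args) {
        // Prints [1, 8, 28, 56, 70, 56, 28, 8, 1] for 8 lanes.
        System.out.println(Arrays.toString(laneCountHistogram(8)));
    }
}
```

A single run therefore mixes every density from 0 to 8 set lanes, averaging half the lanes, which is why a benchmark that wants to compare low and high density would need to fix the density per run instead.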
On Fri, 5 Jan 2024 07:05:51 GMT, Jatin Bhateja wrote:
>> We do have extensive functional tests for compress/expand APIs in
>> [test/jdk/jdk/incubator/vector](https://github.com/openjdk/jdk/tree/master/test/jdk/jdk/incubator/vector)
>
>> Could there be equivalent `expand` tests?
>
> Here are
On Fri, 5 Jan 2024 09:37:55 GMT, Emanuel Peter wrote:
>> This computes the byte offset from the start of the table. Both the integer
>> and long permute tables have the same row size: 8 int elements vs. 4 long elements.
>
> Ah, I understand now. Maybe leave a comment for that?
I would say something like
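
Separately from the comment wording discussed here, the row-size arithmetic from the quoted reply can be checked with a small sketch. It assumes only what the thread states (8 int elements or 4 long elements per row, and the `shlq(rtmp, 5)` shift). The class and constant names are hypothetical.

```java
// Hypothetical sketch of the permute-table offset arithmetic.
final class PermuteRowOffset {
    static final int ROW_BYTES_INT = 8 * Integer.BYTES; // 8 ints  * 4 bytes = 32
    static final int ROW_BYTES_LONG = 4 * Long.BYTES;   // 4 longs * 8 bytes = 32
    static final int ROW_SHIFT = 5;                     // log2(32), matches shlq(rtmp, 5)

    // Byte offset of the row selected by the mask bits, from the table start.
    static long byteOffset(long maskBits) {
        return maskBits << ROW_SHIFT;
    }

    public static void main(String[] args) {
        System.out.println(ROW_BYTES_INT + " " + ROW_BYTES_LONG); // prints "32 32"
        System.out.println(byteOffset(0b1011));                   // prints 352
    }
}
```

Because both row sizes come to 32 bytes, a shift by 5 serves the int and the long table alike; a shift by 6 would only be right for 64-byte rows.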
On Thu, 4 Jan 2024 13:40:19 GMT, Emanuel Peter wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Updating copyright year of modified files.
>
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 957:
>
>> 955: __
On Fri, 5 Jan 2024 09:31:50 GMT, Emanuel Peter wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 957:
>>
>>> 955: __ align(CodeEntryAlignment);
>>> 956: StubCodeMark mark(this, "StubRoutines", stub_name);
>>> 957: address start = __ pc();
>>
>> Could you please add some
On Fri, 5 Jan 2024 07:03:34 GMT, Jatin Bhateja wrote:
>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5307:
>>
>>> 5305: assert(bt == T_LONG || bt == T_DOUBLE, "");
>>> 5306: vmovmskpd(rtmp, mask, vec_enc);
>>> 5307: shlq(rtmp, 5);
>>
>> Might this need to be 6? If I
On Thu, 4 Jan 2024 13:41:40 GMT, Emanuel Peter wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Updating copyright year of modified files.
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5307:
>
>> 5305:
On Thu, 4 Jan 2024 13:30:24 GMT, Emanuel Peter wrote:
>> test/micro/org/openjdk/bench/jdk/incubator/vector/ColumnFilterBenchmark.java
>> line 94:
>>
>>> 92: IntVector vec = IntVector.fromArray(ispecies, intinCol, i);
>>> 93: VectorMask pred =
On Fri, 5 Jan 2024 07:03:26 GMT, Jatin Bhateja wrote:
>> And what about some result verification? Or is there another test that does
>> that?
>
> We do have extensive functional tests for compress/expand APIs in
>
On Thu, 4 Jan 2024 13:33:08 GMT, Emanuel Peter wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Updating copyright year of modified files.
>
>
On Thu, 4 Jan 2024 05:39:01 GMT, Jatin Bhateja wrote:
>> Hi,
>>
>> Patch optimizes non-subword vector compress and expand APIs for x86 AVX2-only
>> targets.
>> Upcoming E-core Xeons (Sierra Forest) and Hybrid CPUs support only the AVX2
>> instruction set.
>> These are very frequently used APIs in
On Thu, 4 Jan 2024 13:09:30 GMT, Emanuel Peter wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Updating copyright year of modified files.
>
>
> Hi,
>
> Patch optimizes non-subword vector compress and expand APIs for x86 AVX2-only
> targets.
> Upcoming E-core Xeons (Sierra Forest) and Hybrid CPUs support only the AVX2
> instruction set.
> These are very frequently used operations in columnar database filter
> operations.
>
>