On Tue, 1 Nov 2022 06:17:16 GMT, Ludovic Henry wrote:
>> Ok, I can try rewriting as @merykitty suggests and compare. I'm running out
>> of time to spend on this right now, though, so I sort of hope we can do this
>> experiment as a follow-up RFE.
>
> @cl4es I can write the assembly and send it.
On Mon, 31 Oct 2022 22:06:20 GMT, Claes Redestad wrote:
>> No you don't need to, the vector loop can be calculated as:
>>
>> IntVector accumulation = IntVector.zero(INT_SPECIES);
>> for (int i = 0; i < bound; i += INT_SPECIES.length()) {
>> IntVector current = IntVector.load(INT_
On Mon, 31 Oct 2022 13:35:36 GMT, Quan Anh Mai wrote:
>> But doing it forward requires a `reduceLane` on each iteration. It's faster
>> to do it backward.
>
> No you don't need to, the vector loop can be calculated as:
>
> IntVector accumulation = IntVector.zero(INT_SPECIES);
> for (int
On Mon, 31 Oct 2022 21:48:37 GMT, Claes Redestad wrote:
> Continuing the work initiated by @luhenry to unroll and then intrinsify
> polynomial hash loops.
>
> I've rewired the library changes to route via a single `@IntrinsicCandidate`
> method. To make this work I've harmonized how they are invoked so that
> there's less special handling and checks
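For readers following the thread, here is a minimal scalar sketch of what "unrolling a polynomial hash loop" means. It is not the code from the PR: the class and method names are hypothetical, and the initial value of 1 is chosen only so that the result matches `Arrays.hashCode(int[])`. It shows that four steps of `h = 31 * h + a[i]` can be folded into one step with precomputed powers of 31, which is the algebraic identity that the unrolled and vectorized variants discussed above rely on.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the intrinsic or library code from the PR.
public class PolyHashSketch {

    // Straightforward loop: h = 31 * h + a[i], starting from 1
    // (the same recurrence and start value as Arrays.hashCode(int[])).
    static int hashSimple(int[] a) {
        int h = 1;
        for (int v : a) {
            h = 31 * h + v;
        }
        return h;
    }

    // Same polynomial, four elements per iteration.
    // 31^2 = 961, 31^3 = 29791, 31^4 = 923521.
    // int arithmetic wraps modulo 2^32, so both forms give identical results.
    static int hashUnrolled(int[] a) {
        int h = 1;
        int i = 0;
        for (; i + 3 < a.length; i += 4) {
            h = 923521 * h
              + 29791 * a[i]
              + 961 * a[i + 1]
              + 31 * a[i + 2]
              + a[i + 3];
        }
        // Scalar tail for the remaining 0..3 elements.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = {7, -3, 42, 0, 1_000_000, 5, 9};
        System.out.println(hashSimple(data) == hashUnrolled(data));
        System.out.println(hashUnrolled(data) == Arrays.hashCode(data));
    }
}
```

The unrolled form breaks the dependency chain on `h` into one multiply per four elements, and the four per-element multiplies are independent of each other, which is what makes the loop a candidate for SIMD lanes.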