On Tue, 20 Dec 2022 21:11:18 GMT, Claes Redestad <redes...@openjdk.org> wrote:

>>> How far off is this ...?
>> 
>> Back then it looked way too constrained (tight constraints on code shapes). 
>> But I considered it as a generally applicable optimization. 
>> 
>>>  ... do you think it'll be able to match the efficiency we see here with a 
>>> memoized coefficient table etc?
>> 
>> Yes, it is able to build the constant table at runtime when folding 
>> multiplications of constant coefficients produced during loop unrolling and 
>> then packing scalars into a constant vector.
>> 
>> Moreover, briefly looking at the code shape, the vectorizer would produce a 
>> more optimal loop shape (pre-loop would align vector accesses and would use 
>> 512-bit vectors when available; vector post-loop could help as well).
>
> Passing the constant node through as an input as suggested by @iwanowww and 
> @sviswa7 meant we could eliminate most of the `instruct` blocks, removing a 
> significant chunk of code and a little bit of complexity from the proposed 
> patch.

@cl4es Thanks for passing the constant node through; the code looks much 
cleaner now. The attached patch should handle signed bytes/shorts as well. 
Please take a look. 
[signed.patch](https://github.com/openjdk/jdk/files/10273480/signed.patch)
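
For anyone following the thread, the coefficient folding mentioned in the quoted text can be sketched in plain Java. This only illustrates the arithmetic. It is not the code in the PR, and it is not what C2 or the auto-vectorizer actually emits. The class and method names (`UnrolledHashSketch`, `scalarHash`, `unrolledHash`) and the unroll factor of 4 are assumptions made for the example. Unrolling `h = 31 * h + a[i]` four times turns the repeated multiplications by 31 into the constants 31^4, 31^3, 31^2 and 31, which is the kind of constant table that could then be packed into a vector.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the JDK or the PR.
class UnrolledHashSketch {

    // Reference loop: the 31-based polynomial hash, one element per iteration.
    static int scalarHash(byte[] a) {
        int h = 1;
        for (byte b : a) {
            h = 31 * h + b; // bytes are sign-extended to int
        }
        return h;
    }

    // Same hash with the loop body unrolled by 4 and the powers of 31 folded
    // into constants: 31^4 = 923521, 31^3 = 29791, 31^2 = 961.
    static int unrolledHash(byte[] a) {
        int h = 1;
        int i = 0;
        for (; i + 3 < a.length; i += 4) {
            h = 923521 * h
              + 29791 * a[i]
              + 961 * a[i + 1]
              + 31 * a[i + 2]
              + a[i + 3];
        }
        // Scalar tail for the remaining 0..3 elements.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        byte[] data = {-128, -1, 0, 1, 127, 42, -42};
        System.out.println(scalarHash(data) == unrolledHash(data));
        System.out.println(unrolledHash(data) == Arrays.hashCode(data));
    }
}
```

Because `int` arithmetic wraps modulo 2^32, the folded form gives the same result as the scalar loop, including for negative (signed) byte values.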

-------------

PR: https://git.openjdk.org/jdk/pull/10847
