Re: RFR: 8303762: Optimize vector slice operation with constant index using VPALIGNR instruction [v15]

Jatin Bhateja Tue, 31 Mar 2026 21:38:35 -0700

On Mon, 30 Mar 2026 06:08:33 GMT, Jatin Bhateja <[email protected]> wrote:


>> I briefly looked at the patch.
>> 
>> First of all, I suggest to separate the logic to handle intrinsification 
>> failures. It's not specific to the proposed enhancement and will improve 
>> handling of intrinsification failures for vector operations.
>> 
>> Speaking of proposed approach, it aligns well with current Vector API 
>> implementation practices. I agree it would be nice to automatically detect 
>> equivalent IR shapes and transform them accordingly, but if it means 
>> hard-coding the shape of `sliceTemplate` into the compiler, current proposal 
>> does look well-justified.
>
> Thanks @iwanowww, I agree that the approach of inlining on intrinsic failure 
> is generic enough and can benefit other vector operations as well, since it 
> may absorb boxing penalties. Because the fallback for slice and unslice is 
> written entirely with Vector API operations, it benefits the most, and that 
> is the focus of this patch.
> 
> Looking forward to your other comments on current implementation.

> @jatin-bhateja I agree with @iwanowww that the PR could be split into two: 
> One handling the intrinsification failure/fallback handling and other with 
> vector slice optimization for x86. That might help you to get reviews on this 
> work. I volunteer to review the x86 PR. Order wise, the fallback PR would 
> need to get in first though.

Hi @sviswa7,
Almost all of the fallback code, apart from a few cases (unslice, slice, etc.), 
uses a scalar loop to compute the result. A box created on the caller side 
because of a failed intrinsic will therefore not be unboxed on the callee 
side, i.e. in the fallback implementation. In that context, inlining the 
fallback saves the call overhead but does not prevent the boxing penalty or 
code bloat on the callee side, which may have other side effects. That is why 
this PR selectively enables inlining of the slice fallback, which is composed 
of Vector API operations; the code for that is part of this pull request.
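
For readers less familiar with the operation, here is a minimal scalar sketch 
of the lane semantics of slice with a constant origin, as documented for 
`Vector.slice(int origin, Vector v1)`. It is not the JDK fallback 
(`sliceTemplate`) and not the x86 code from this PR; the class and method 
names are hypothetical, and plain `int[]` arrays stand in for vectors.

```java
import java.util.Arrays;

// Hypothetical illustration only: arrays model vector lanes.
class SliceSketch {
    // Result lane i is taken from lane (origin + i) of the logical
    // concatenation [a | b], which is also the shape VPALIGNR computes
    // on bytes within a 128-bit lane.
    static int[] slice(int[] a, int[] b, int origin) {
        int vlen = a.length;
        if (b.length != vlen || origin < 0 || origin > vlen) {
            throw new IllegalArgumentException("bad shape or origin");
        }
        int[] r = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            int j = origin + i;
            r[i] = (j < vlen) ? a[j] : b[j - vlen];
        }
        return r;
    }

    public static void main(String[] args) {
        int[] r = slice(new int[] {0, 1, 2, 3}, new int[] {4, 5, 6, 7}, 1);
        System.out.println(Arrays.toString(r)); // prints [1, 2, 3, 4]
    }
}
```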

Could you please review the x86 backend implementation part of this pull 
request and share your feedback?

Best Regards

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24104#issuecomment-4167389738
