On Mon, 30 Mar 2026 06:08:33 GMT, Jatin Bhateja <[email protected]> wrote:
>> I briefly looked at the patch.
>>
>> First of all, I suggest to separate the logic to handle intrinsification
>> failures. It's not specific to the proposed enhancement and will improve
>> handling of intrinsification failures for vector operations.
>>
>> Speaking of proposed approach, it aligns well with current Vector API
>> implementation practices. I agree it would be nice to automatically detect
>> equivalent IR shapes and transform them accordingly, but if it means
>> hard-coding the shape of `sliceTemplate` into the compiler, current proposal
>> does look well-justified.
>
> Thanks @iwanowww , I agree that approach to inline on intrinsic failure is
> generic enough and can benefit other vector operations also as it may absorb
> boxing penalties. For slice and un-slice since the fallback is completely
> written in vector APIs it will give most benefits and that is the focus of
> this patch.
>
> Looking forward to your other comments on current implementation.

> @jatin-bhateja I agree with @iwanowww that the PR could be split into two:
> One handling the intrinsification failure/fallback handling and other with
> vector slice optimization for x86. That might help you to get reviews on this
> work. I volunteer to review the x86 PR. Order wise, the fallback PR would
> need to get in first though.

Hi @sviswa7,

Almost all of the fallback code, apart from a few operations (slice, unslice,
etc.), uses a scalar loop to compute the result. A box created on the caller
side because of a failed intrinsic will not be unboxed on the callee side,
i.e. in the fallback implementation. In this context, inlining the fallback
will save the call overhead, but it will not prevent the boxing penalty or
code bloat on the callee side, which may have other side effects. That is why
this PR selectively enables inlining of the slice fallback, which is composed
of Vector API calls; the code for that is part of this pull request.

May I request that you review the x86 backend implementation part of this
pull request and share your feedback?

Best Regards

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24104#issuecomment-4167389738
