On Wed, 15 Nov 2023 15:40:49 GMT, Claes Redestad <redes...@openjdk.org> wrote:

>> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8617:
>> 
>>> 8615:     lea(dst, Address(dst, tmp5, Address::times_1));
>>> 8616:     subptr(len, tmp5);
>>> 8617:     jmpb(copy_chars_loop);
>> 
>> This cause a crash if I run with `-XX:UseAVX=3 -XX:AVX3Threshold=0`:
>> 
>> 
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  Internal Error (macroAssembler_x86.hpp:122), pid=3400452, tid=3400470
>> #  guarantee(this->is8bit(imm8)) failed: Short forward jump exceeds 8-bit 
>> offset at <null>:0
>> #
>> 
>> 
>> Needs to be a `jmp(copy_chars_loop)`.
>
> Alternatively:
> 
> if (UseSSE42Intrinsics) {
>   jmpb(copy_chars_loop);
> } else {
>   jmp(copy_chars_loop);
> }
> 
> 
> More generally I do wonder if it'd make most sense to make the AVX512 and 
> SSE42 implementations exclusive, though. Especially since we shouldn't mix 
> AVX and SSE code (the code in this intrinsic seem to follow paths which are 
> either/or, but it seems fragile). Perhaps @TobiHartmann can advise?

> This cause a crash if I run with -XX:UseAVX=3 -XX:AVX3Threshold=0:

Good catch! Do we have a test for that scenario? If not, one should be added.

> Alternatively [...]

I would suggest to use `jmp(copy_chars_loop)` here for consistency with 
surrounding code.

> More generally I do wonder if it'd make most sense to make the AVX512 and 
> SSE42 implementations exclusive

We don't mix AVX and SSE code here, right? We just fall back to SSE42 when 
AVX512 is not available or when the remaining length is below a threshold. Are 
you suggesting to better encapsulate both implementations (for example, factor 
them out into separate methods) or only ever using one?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1395363072

Reply via email to