On Wed, 15 Nov 2023 15:40:49 GMT, Claes Redestad <redes...@openjdk.org> wrote:
>> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8617: >> >>> 8615: lea(dst, Address(dst, tmp5, Address::times_1)); >>> 8616: subptr(len, tmp5); >>> 8617: jmpb(copy_chars_loop); >> >> This cause a crash if I run with `-XX:UseAVX=3 -XX:AVX3Threshold=0`: >> >> >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (macroAssembler_x86.hpp:122), pid=3400452, tid=3400470 >> # guarantee(this->is8bit(imm8)) failed: Short forward jump exceeds 8-bit >> offset at <null>:0 >> # >> >> >> Needs to be a `jmp(copy_chars_loop)`. > > Alternatively: > > if (UseSSE42Intrinsics) { > jmpb(copy_chars_loop); > } else { > jmp(copy_chars_loop); > } > > > More generally I do wonder if it'd make most sense to make the AVX512 and > SSE42 implementations exclusive, though. Especially since we shouldn't mix > AVX and SSE code (the code in this intrinsic seem to follow paths which are > either/or, but it seems fragile). Perhaps @TobiHartmann can advise? > This cause a crash if I run with -XX:UseAVX=3 -XX:AVX3Threshold=0: Good catch! Do we have a test for that scenario? If not, one should be added. > Alternatively [...] I would suggest to use `jmp(copy_chars_loop)` here for consistency with surrounding code. > More generally I do wonder if it'd make most sense to make the AVX512 and > SSE42 implementations exclusive We don't mix AVX and SSE code here, right? We just fall back to SSE42 when AVX512 is not available or when the remaining length is below a threshold. Are you suggesting to better encapsulate both implementations (for example, factor them out into separate methods) or only ever using one? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1395363072