https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118279
--- Comment #4 from Arseny Kapoulkine <arseny.kapoulkine at gmail dot com> ---
Noted re: godbolt, sorry!
Applying your suggested workaround seems to make the codegen worse? Adding
[[assume(hoff >= 0 && hoff <= 2)]];
before the loop retains the guards for the interior jumps, and the first switch
dispatch in the loop now carries the guard as well (.L27 is the loop start):
.L27:
        mov     rax, r9
        shr     rax, 2
        movzx   eax, BYTE PTR [rdi+rax]
        mov     edx, eax
        and     edx, 3
        add     edx, esi
        cmp     edx, 5
        ja      .L3
        jmp     [QWORD PTR .L5[0+rdx*8]]
(this is using g++
(Compiler-Explorer-Build-gcc-a8781c4151136968ad38a40344d16940e4ccb700-binutils-2.42)
15.0.0 20250102 (experimental))
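
For readers without the original test case at hand, here is a minimal sketch of the shape of loop being discussed. It is not the code from this report: the function name decode_sketch, the 2-bit packing, and the case bodies are hypothetical, chosen only to mirror the `shr rax, 2` / `and edx, 3` / `add edx, esi` sequence above. With hoff assumed to be in [0, 2] and the masked code in [0, 3], the switch selector is in [0, 5], so every value is covered by a case and the `cmp edx, 5` / `ja` guard would be redundant. Whether GCC emits a jump table or the guard for this exact sketch has not been checked.

```cpp
#include <cstddef>

// Hypothetical reduction, NOT the test case from the bug report.
// Each byte of `data` is assumed to hold four 2-bit codes, lowest bits first.
int decode_sketch(const unsigned char* data, std::size_t count, int hoff)
{
    // C++23 assumption, placed before the loop as described in the comment.
    // Calling this with hoff outside [0, 2] is undefined behaviour.
    [[assume(hoff >= 0 && hoff <= 2)]];

    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        // Extract the i-th 2-bit code: always in [0, 3].
        int code = (data[i >> 2] >> ((i & 3) * 2)) & 3;

        // Selector is in [0, 5] given the assumption, so all values are covered.
        // Case bodies are placeholders; a compiler may lower this without a jump table.
        switch (code + hoff)
        {
        case 0: sum += 1; break;
        case 1: sum += 2; break;
        case 2: sum += 4; break;
        case 3: sum += 8; break;
        case 4: sum += 16; break;
        case 5: sum += 32; break;
        }
    }
    return sum;
}
```

The attribute requires -std=c++23 (or a compiler that accepts it as an extension in earlier modes); compilers that do not recognise it typically ignore it with a warning.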