https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64622
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Without loop header copying we generate

__strcspn_c1:
.LFB0:
        .cfi_startproc
        xorl    %eax, %eax
        jmp     .L2
        .p2align 4,,10
        .p2align 3
.L8:
        cmpl    %esi, %edx
        je      .L6
        addq    $1, %rax
.L2:
        movsbl  (%rdi,%rax), %edx
        testb   %dl, %dl
        jne     .L8
.L6:
        rep ret

so it would be interesting to investigate how they do this (whether it's a
special hack or some systematic fix). The loop header contains just the IV
increment here.
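For readers following along, a C loop of roughly this shape corresponds to the assembly above. This is only a sketch: it assumes the testcase resembles glibc's inline single-character strcspn helper, the name strcspn_c1_sketch is hypothetical, and the testcase actually attached to this PR may differ.

```c
#include <stddef.h>

/* Hypothetical stand-in for the function in the dump above.
   Returns the length of the initial part of s that contains
   neither the terminating NUL nor the character `reject`.  */
size_t
strcspn_c1_sketch (const char *s, int reject)
{
  size_t result = 0;            /* xorl %eax, %eax */

  /* First test:  .L2, load s[result] and compare against NUL.
     Second test: .L8, compare the loaded character with reject.  */
  while (s[result] != '\0' && s[result] != reject)
    ++result;                   /* addq $1, %rax: the IV increment */

  return result;                /* .L6 */
}
```

In the dump, the initial jmp .L2 enters the loop at the first exit test, and the increment sits on the path that branches back to it, so there is no copy of the exit test ahead of the loop.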