https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105429

            Bug ID: 105429
           Summary: Unnecessary moves generated by the compiler.
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mareksz1958 at wp dot pl
  Target Milestone: ---

The following C code:

>>>
#include <nmmintrin.h>
#include <stdint.h>
uint32_t crc(uint32_t current, const uint8_t *buffer, size_t size) {
    for(size_t i = 0; i < size; i++)
        current = _mm_crc32_u64(current, buffer[i]);
    return current;
}
<<<

generates inefficient assembly at every optimisation level because of an extra
`movl %eax, %eax' inside the loop. Output for -Os (first) and -O3 (second):

>>>
crc:
        movl    %edi, %eax
        xorl    %ecx, %ecx
.L2:
        cmpq    %rdx, %rcx
        je      .L5
        movzbl  (%rsi,%rcx), %edi
        movl    %eax, %eax
        incq    %rcx
        crc32q  %rdi, %rax
        jmp     .L2
.L5:
        ret

crc:
        movl    %edi, %eax
        testq   %rdx, %rdx
        je      .L6
        leaq    (%rsi,%rdx), %rcx
.L3:
        movzbl  (%rsi), %edx
        movl    %eax, %eax
        addq    $1, %rsi
        crc32q  %rdx, %rax
        cmpq    %rsi, %rcx
        jne     .L3
.L6:
        ret
<<<

The problem seems to be present in all GCC versions I have access to. The
redundant move sits on the loop-carried dependency chain through %rax, so it
noticeably worsens the performance of the generated code. When
`_mm_crc32_u64' is replaced by any other function, the problem seems to
disappear.
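For comparison, here is a sketch of a variant that may help narrow the problem down. It rests on my assumption that the extra move is the zero-extension of the 32-bit `current' to the `unsigned long long' operand of `_mm_crc32_u64', which the compiler keeps because it does not know that `crc32q' already leaves the upper 32 bits of its destination clear. `crc_wide' is a hypothetical name, and the `target' attribute is only there so the snippet builds without -msse4.2 on GCC or Clang. I have not checked the assembly on every version, so whether the move actually goes away should be confirmed with -S.

```c
#include <nmmintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Reproducer from the report: the accumulator is 32 bits wide, so every
   iteration converts it to the 64-bit operand type of _mm_crc32_u64. */
__attribute__((target("sse4.2")))
uint32_t crc(uint32_t current, const uint8_t *buffer, size_t size) {
    for (size_t i = 0; i < size; i++)
        current = (uint32_t)_mm_crc32_u64(current, buffer[i]);
    return current;
}

/* Hypothetical variant: keep the accumulator 64 bits wide inside the loop
   and truncate once on return, so the source contains no per-iteration
   32-to-64-bit conversion. */
__attribute__((target("sse4.2")))
uint32_t crc_wide(uint32_t current, const uint8_t *buffer, size_t size) {
    uint64_t acc = current;              /* zero-extend once, before the loop */
    for (size_t i = 0; i < size; i++)
        acc = _mm_crc32_u64(acc, buffer[i]);
    return (uint32_t)acc;                /* the CRC value fits in 32 bits */
}
```

Both functions should return the same value for any input, since the intrinsic only consumes the low 32 bits of the accumulator and produces a 32-bit result.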
