https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102483

            Bug ID: 102483
           Summary: Regression in codegen of reduction of 4 chars
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: david.bolvansky at gmail dot com
  Target Milestone: ---

char foo (char* p)
{
  char sum = 0;
  for (int i = 0; i != 4; i++)
    sum += p[i];
  return sum;
}

Compiled with -O3 -march=x86-64.


GCC trunk:

foo:
        mov     edx, DWORD PTR [rdi]
        movzx   eax, dh
        mov     ecx, edx
        add     eax, edx
        shr     ecx, 16
        add     eax, ecx
        shr     edx, 24
        add     eax, edx
        ret
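
For readers who prefer C to assembly, the trunk sequence above can be read roughly as the sketch below: one 32-bit load, then each byte is extracted with shifts and added in a wider register. This is only an illustration of what the generated code computes, not GCC's internal representation; the names foo_loop, foo_shifts and check_agree are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The loop from the report, kept as the reference. */
char foo_loop(const char *p)
{
    char sum = 0;
    for (int i = 0; i != 4; i++)
        sum += p[i];
    return sum;
}

/* Hypothetical C rendering of the trunk code: a single 32-bit load
   followed by shifts. Only the low 8 bits of the result matter, so the
   garbage in the upper bits of each term is harmless. */
char foo_shifts(const char *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);           /* mov edx, DWORD PTR [rdi] */
    uint32_t acc = (w >> 8) & 0xff;    /* movzx eax, dh            */
    acc += w;                          /* add eax, edx             */
    acc += w >> 16;                    /* shr ecx, 16; add         */
    acc += w >> 24;                    /* shr edx, 24; add         */
    return (char)acc;                  /* caller reads al          */
}

/* Small helper: both variants must agree on the same input. */
void check_agree(const char *p)
{
    assert(foo_loop(p) == foo_shifts(p));
}
```

The result is the same modulo 256, but the sequence needs nine instructions and a serial chain of adds where GCC 11 needed five instructions.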


GCC 11 (much better):
foo:
        movzx   eax, BYTE PTR [rdi+1]
        add     al, BYTE PTR [rdi]
        add     al, BYTE PTR [rdi+2]
        add     al, BYTE PTR [rdi+3]
        ret


Possibly the best option, at least according to llvm-mca:

foo:                                    # @foo
        movd    xmm0, dword ptr [rdi]           # xmm0 = mem[0],zero,zero,zero
        pxor    xmm1, xmm1
        psadbw  xmm1, xmm0
        movd    eax, xmm1
        ret
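
The psadbw sequence above can also be written by hand with SSE2 intrinsics, which may help when checking that it really matches the loop. psadbw sums absolute differences of unsigned bytes, so against an all-zero operand it yields the plain sum of the bytes; truncating that to 8 bits matches the wrapped char sum. The sketch below assumes an x86 target with SSE2 and GCC's wrapping conversion to char; the names foo_scalar, foo_psadbw and check_same are hypothetical.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <string.h>

/* The loop from the report, kept as the reference. */
char foo_scalar(const char *p)
{
    char sum = 0;
    for (int i = 0; i != 4; i++)
        sum += p[i];
    return sum;
}

/* Hand-written equivalent of the movd/pxor/psadbw/movd sequence. */
char foo_psadbw(const char *p)
{
    int word;
    memcpy(&word, p, sizeof word);               /* unaligned 4-byte load  */
    __m128i v = _mm_cvtsi32_si128(word);         /* movd: upper lanes zero */
    __m128i zero = _mm_setzero_si128();          /* pxor                   */
    __m128i sad = _mm_sad_epu8(zero, v);         /* psadbw: sum of bytes   */
    return (char)_mm_cvtsi128_si32(sad);         /* movd, keep low 8 bits  */
}

/* Small helper: both variants must agree on the same input. */
void check_same(const char *p)
{
    assert(foo_scalar(p) == foo_psadbw(p));
}
```

Since the four bytes sum to at most 1020, the value fits in the low 16 bits of the psadbw result and only the final truncation to char discards anything.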


https://godbolt.org/z/sT9svvj7W
