https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94071

            Bug ID: 94071
           Summary: Missed optimization with endian and alignment
                    independent memory access
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: moritz.struebe at redheads dot de
  Target Milestone: ---

When consecutive bytes are copied from memory into a local variable, GCC does
the copy bytewise, even on platforms that support unaligned access.

Code Example (Including more examples): https://godbolt.org/z/kFh6nc
Related: Bug 54733 (Same issue, but with a local variable)

Code:

    #include <stdint.h>
    uint8_t data[1024];

    uint16_t getU16(int addr) {
        return 
          (uint16_t) data[addr    ] 
        | (uint16_t) data[addr + 1] << 8;
    }

Expected: (e.g. LLVM)
        movsxd  rax, edi
        movzx   eax, word ptr [rax + data]
        ret

Actual: (g++ (Compiler-Explorer-Build) 10.0.1 20200305 (experimental))
        lea     eax, [rdi+1]
        movsx   rdi, edi
        cdqe
        movzx   eax, BYTE PTR data[rax]
        sal     eax, 8
        mov     edx, eax
        movzx   eax, BYTE PTR data[rdi]
        or      eax, edx
        ret
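
For comparison, here is a minimal sketch of the single-load form that the
"Expected" output corresponds to. The helper name getU16_wide is hypothetical
and not part of the original test case. The byte-order check relies on the
GCC/Clang predefined macro __BYTE_ORDER__ and on __builtin_bswap16, so this
assumes one of those compilers. The missed optimization is that GCC does not
turn getU16 into the equivalent of getU16_wide on its own.

```c
#include <stdint.h>
#include <string.h>

uint8_t data[1024];

/* Byte-wise form from the report: endian- and alignment-independent. */
uint16_t getU16(int addr) {
    return
      (uint16_t) data[addr    ]
    | (uint16_t) data[addr + 1] << 8;
}

/* Hypothetical helper for illustration: one 16-bit load via memcpy,
   which is safe for unaligned addresses. */
uint16_t getU16_wide(int addr) {
    uint16_t v;
    memcpy(&v, &data[addr], sizeof v);
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    /* Assumption: GCC/Clang builtin. Swap so the result is the
       little-endian interpretation that getU16 computes. */
    v = __builtin_bswap16(v);
#endif
    return v;
}
```

Both functions should return the same value for any addr in the range 0 to
1022, so a compiler is free to emit the single-load code for either one.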


I found this on ARM, where the bytewise sequence probably hurts more than on x86.
