https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32605

Jed Brown <jed at 59A2 dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jed at 59A2 dot org

--- Comment #5 from Jed Brown <jed at 59A2 dot org> ---
The missed optimization also affects code as simple as this, which should
compile to a single 32-bit load on little-endian (LE) architectures.

unsigned read_u32_le(const unsigned char arr[]) {
  return (arr[0] << 0)
    | (arr[1] << 8)
    | (arr[2] << 16)
    | (arr[3] << 24);
}

gcc-8.3/trunk -O:

read_u32_le:
  movzx eax, BYTE PTR [rdi+1]
  sal eax, 8
  movzx edx, BYTE PTR [rdi+2]
  sal edx, 16
  or eax, edx
  movzx edx, BYTE PTR [rdi]
  or eax, edx
  movzx edx, BYTE PTR [rdi+3]
  sal edx, 24
  or eax, edx
  ret

clang-8 -O:

read_u32_le: # @read_u32_le
  mov eax, dword ptr [rdi]
  ret

https://gcc.godbolt.org/z/8lGeCF
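
For comparison, here is a minimal sketch of the two forms one would expect to
be equivalent on an LE target. The names read_u32_le_bytes and read_u32_native
are hypothetical and not part of the report. The uint32_t casts are also an
addition: in the reproducer above, arr[3] is promoted to int before the shift,
so a byte value of 0x80 or more shifted left by 24 overflows a signed int. The
memcpy form is the usual portable way to ask for a plain load, and it only
matches the byte-wise form on little-endian hosts.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise little-endian read, as in the reproducer, but with each byte
   widened to uint32_t first so the shift by 24 stays in unsigned arithmetic. */
uint32_t read_u32_le_bytes(const unsigned char arr[]) {
  return ((uint32_t)arr[0] << 0)
    | ((uint32_t)arr[1] << 8)
    | ((uint32_t)arr[2] << 16)
    | ((uint32_t)arr[3] << 24);
}

/* Hypothetical helper: native-endian 32-bit load via memcpy. This returns
   the same value as read_u32_le_bytes only on little-endian targets. */
uint32_t read_u32_native(const unsigned char arr[]) {
  uint32_t v;
  memcpy(&v, arr, sizeof v);  /* avoids alignment and aliasing problems */
  return v;
}
```

On an LE target, the byte-wise function returning the same value as the memcpy
one for every input is the property that would let a compiler replace the four
byte loads with one 32-bit load.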
