https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102393
Bug ID: 102393
Summary: Failure to optimize 2 8-bit stores into a single 16-bit store
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---

#include <stdint.h>

void HeaderWriteU16LE(int offset, uint16_t value, uint8_t *RomHeader)
{
    RomHeader[offset] = value;
    RomHeader[offset + 1] = value >> 8;
}

Notwithstanding aliasing, this can be optimized to
`*(uint16_t *)(RomHeader + offset) = value`. This transformation is done by
LLVM, but not by GCC.

Sample AMD64 output for this from GCC:

HeaderWriteU16LE:
    movsx rdi, edi
    mov eax, esi
    mov BYTE PTR [rdx+rdi], sil
    mov BYTE PTR [rdx+1+rdi], ah
    ret

And from LLVM:

HeaderWriteU16LE:
    movsxd rax, edi
    mov word ptr [rdx + rax], si
    ret

PS: The equivalent pattern for 4 8-bit stores gets optimized into a single
32-bit store.
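
For comparison, here is a minimal sketch of the two forms side by side, assuming a little-endian target such as AMD64. The names HeaderWriteU16LE_merged and HeaderWriteU32LE are hypothetical and not part of the original test case. The merged variant uses memcpy rather than the pointer cast so that the sketch itself does not depend on alignment or aliasing. The 32-bit function is only a guess at what "the equivalent pattern for 4 8-bit stores" looks like.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise little-endian 16-bit store, as in the report. */
void HeaderWriteU16LE(int offset, uint16_t value, uint8_t *RomHeader)
{
    RomHeader[offset] = (uint8_t)value;
    RomHeader[offset + 1] = (uint8_t)(value >> 8);
}

/* Hypothetical hand-merged form: one 16-bit store.
 * Equivalent to the function above only on a little-endian target. */
void HeaderWriteU16LE_merged(int offset, uint16_t value, uint8_t *RomHeader)
{
    memcpy(RomHeader + offset, &value, sizeof value);
}

/* Assumed shape of the 4-store pattern mentioned in the PS
 * (hypothetical name; not taken from the report). */
void HeaderWriteU32LE(int offset, uint32_t value, uint8_t *RomHeader)
{
    RomHeader[offset] = (uint8_t)value;
    RomHeader[offset + 1] = (uint8_t)(value >> 8);
    RomHeader[offset + 2] = (uint8_t)(value >> 16);
    RomHeader[offset + 3] = (uint8_t)(value >> 24);
}
```

Compiling this with something like `gcc -O2 -S -masm=intel` should make it straightforward to compare the code generated for the byte-wise and merged 16-bit variants.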