https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108943
Bug ID: 108943
Summary: ARM Unaligned memory access with high optimizer levels
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: g...@dynamic-noise.net
Target Milestone: ---

Hello,

the following MWE program, extracted from my production code, is compiled to machine code with unaligned memory accesses for ARM Cortex-M7 with optimizer mode -O2 and -O3, but *not* with -O1:

    #include <stdint.h>

    static int calculateStuff(const uint8_t *bytes);

    int main(void)
    {
        uint8_t *frame = (uint8_t*)0x240008d0;
        return calculateStuff(&frame[1]);
    }

    int calculateStuff(const uint8_t *bytes)
    {
        return (int)((bytes[0] << 8) | bytes[1]);
    }

Compiled with:

    arm-none-eabi-gcc -c -O3 \
        -mcpu=cortex-m7 -mthumb \
        -Wall -Wextra \
        -o mwe.o mwe.c

No warnings except unused main arguments.

My production code where I found this behavior did not trigger any traps when compiling with -fsanitize=undefined.

No changes when compiling with -fno-strict-aliasing, -fno-aggressive-loop-optimizations and -fwrapv. So I do not assume there is undefined behaviour I am unaware of.

Disassembly of object file (with my annotations)

    00000000 <main>:
       ; Optimizer Level -O2
       0:   4b02        ldr     r3, [pc, #8]    ; (c <main+0xc>)
       ; Tries to load half-word from address 0x240008d1,
       ; which is not allowed for ldrh instruction.
       2:   f8b3 00d1   ldrh.w  r0, [r3, #209]  ; 0xd1
       6:   ba40        rev16   r0, r0
       8:   b280        uxth    r0, r0
       a:   4770        bx      lr
       c:   24000800    strcs   r0, [r0], #-2048    ; 0xfffff800

GCC seems to assume that a byte-order-swap is attempted and thus assumes that the pointer is aligned.

Compiling with -O1 uses two loads and shift operators.
    00000000 <main>:
       ; Optimizer Level -O1
       0:   4b03        ldr     r3, [pc, #12]   ; (10 <main+0x10>)
       2:   f893 00d1   ldrb.w  r0, [r3, #209]  ; 0xd1
       6:   f893 30d2   ldrb.w  r3, [r3, #210]  ; 0xd2
       a:   ea43 2000   orr.w   r0, r3, r0, lsl #8
       e:   4770        bx      lr
      10:   24000800    strcs   r0, [r0], #-2048    ; 0xfffff800

GCC Version and Target Information:

arm-none-eabi-gcc -v:

    Using built-in specs.
    COLLECT_GCC=arm-none-eabi-gcc
    COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-none-eabi/12.2.0/lto-wrapper
    Target: arm-none-eabi
    Configured with: /build/arm-none-eabi-gcc/src/gcc-12.2.0/configure --target=arm-none-eabi --prefix=/usr --with-sysroot=/usr/arm-none-eabi --with-native-system-header-dir=/include --libexecdir=/usr/lib --enable-languages=c,c++ --enable-plugins --disable-decimal-float --disable-libffi --disable-libgomp --disable-libmudflap --disable-libquadmath --disable-libssp --disable-libstdcxx-pch --disable-nls --disable-shared --disable-threads --disable-tls --with-gnu-as --with-gnu-ld --with-system-zlib --with-newlib --with-headers=/usr/arm-none-eabi/include --with-python-dir=share/gcc-arm-none-eabi --with-gmp --with-mpfr --with-mpc --with-isl --with-libelf --enable-gnu-indirect-function --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-pkgversion='Arch Repository' --with-bugurl=https://bugs.archlinux.org/ --with-multilib-list=rmprofile
    Thread model: single
    Supported LTO compression algorithms: zlib zstd
    gcc version 12.2.0 (Arch Repository)
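
For illustration, the difference between the two disassembly listings above can be sketched in portable C. This sketch is not taken from the report. read_be16_bytes mirrors calculateStuff, while read_be16_merged is a hypothetical helper that does one 16-bit load plus a byte swap, which is the shape of the -O2 output (ldrh, rev16, uxth). It assumes GCC or Clang (for __builtin_bswap16 and the __BYTE_ORDER__ macros) and uses memcpy so that the 16-bit load is valid C at any address. It therefore only shows that both forms compute the same value, and says nothing about whether a halfword load is permitted at 0x240008d1 on the target.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise big-endian read, same expression as calculateStuff() in the
 * report: two single-byte loads combined with a shift and an OR. */
int read_be16_bytes(const uint8_t *bytes)
{
    return (int)((bytes[0] << 8) | bytes[1]);
}

/* Hypothetical helper (not from the report): one 16-bit load followed by a
 * byte swap on little-endian hosts. memcpy keeps the load well defined at
 * any alignment; the compiler decides which instruction to emit for it. */
int read_be16_merged(const uint8_t *bytes)
{
    uint16_t half;

    memcpy(&half, bytes, sizeof half);
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    /* In-memory order is big-endian, so swap on a little-endian host. */
    half = __builtin_bswap16(half);
#endif
    return (int)half;
}
```

Given a buffer { 0x00, 0x12, 0x34, 0x00 }, both functions return 0x1234 when passed the address of element 1, which is an odd offset like &frame[1] in the MWE. GCC's ARM options also include -mno-unaligned-access, which may be relevant when the target memory does not tolerate unaligned halfword loads; its effect on the MWE is not covered by this report.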