https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821
--- Comment #18 from Uroš Bizjak <ubizjak at gmail dot com> ---
Maybe related to bswap optimization is also:

typedef __SIZE_TYPE__ size_t;

void baz (char *buf, unsigned int data)
{
  buf[0] = data >> 8;
  buf[1] = data;
}

which currently generates (-O2 -march=haswell)

        rolw    $8, %si
        movw    %si, (%rdi)

but could use "movbew %si, (%rdi)".
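For reference, the two byte stores above amount to a single big-endian 16-bit store of the low half of "data". A minimal sketch of that equivalence follows, assuming a little-endian target such as x86-64 and a compiler that provides __builtin_bswap16 (GCC and Clang do). The names baz_bswap and check are hypothetical and not part of the report, and nothing here claims which instructions a given GCC version emits for either form.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Form from the comment: two byte stores, high byte first. */
void baz(char *buf, unsigned int data)
{
    buf[0] = data >> 8;
    buf[1] = data;
}

/* Hypothetical equivalent: byte-swap the low 16 bits, then do one
   16-bit store. Only equivalent on a little-endian target. */
void baz_bswap(char *buf, unsigned int data)
{
    uint16_t v = __builtin_bswap16((uint16_t)data);
    memcpy(buf, &v, sizeof v);  /* alignment-safe 2-byte store */
}

/* Assert that both forms write the same two bytes for one input. */
void check(unsigned int data)
{
    char a[2], b[2];
    baz(a, data);
    baz_bswap(b, data);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Calling check with values such as 0x1234 or 0xABCD00FF exercises both forms. The second form is the swap-then-store pattern that a store with byte swap (movbe) implements in one instruction.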