https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104315
--- Comment #1 from Gabriel Ravier <gabravier at gmail dot com> ---
PS: I've just stumbled upon the more generic case, which would be this code:

unsigned int stb_bitreverse(unsigned int n)
{
    n = ((n & 0xAAAAAAAA) >> 1) | ((n & 0x55555555) << 1);
    n = ((n & 0xCCCCCCCC) >> 2) | ((n & 0x33333333) << 2);
    n = ((n & 0xF0F0F0F0) >> 4) | ((n & 0x0F0F0F0F) << 4);
    n = ((n & 0xFF00FF00) >> 8) | ((n & 0x00FF00FF) << 8);
    return (n >> 16) | (n << 16);
}

which GCC optimizes to this:

stb_bitreverse(unsigned int):
        lsl     w2, w0, 1
        lsr     w1, w0, 1
        and     w0, w1, 1431655765
        and     w1, w2, -1431655766
        orr     w0, w0, w1
        lsr     w1, w0, 2
        lsl     w0, w0, 2
        and     w0, w0, -858993460
        and     w1, w1, 858993459
        orr     w1, w1, w0
        lsr     w0, w1, 4
        lsl     w1, w1, 4
        and     w1, w1, -252645136
        and     w0, w0, 252645135
        orr     w0, w0, w1
        rev     w0, w0
        ret

and LLVM to this:

stb_bitreverse(unsigned int):
        rbit    w0, w0
        ret
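
For anyone who wants to double-check that the swap sequence really is a full 32-bit reversal (which is what a single rbit computes), here is a rough sketch that compares it against a naive bit-by-bit loop. The names bitreverse_naive and check_bitreverse are made up for illustration and are not part of the report; uint32_t is used instead of unsigned int on the assumption that unsigned int is 32 bits wide, as it is on AArch64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap-based reversal from the comment, with the width pinned to 32 bits. */
uint32_t stb_bitreverse(uint32_t n)
{
    n = ((n & 0xAAAAAAAAu) >> 1) | ((n & 0x55555555u) << 1);  /* swap adjacent bits  */
    n = ((n & 0xCCCCCCCCu) >> 2) | ((n & 0x33333333u) << 2);  /* swap bit pairs      */
    n = ((n & 0xF0F0F0F0u) >> 4) | ((n & 0x0F0F0F0Fu) << 4);  /* swap nibbles        */
    n = ((n & 0xFF00FF00u) >> 8) | ((n & 0x00FF00FFu) << 8);  /* swap bytes          */
    return (n >> 16) | (n << 16);                             /* swap 16-bit halves  */
}

/* Hypothetical reference helper: move bits one at a time, low bit first. */
uint32_t bitreverse_naive(uint32_t n)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (n & 1u);
        n >>= 1;
    }
    return r;
}

/* Hypothetical check: both versions must agree on a few sample inputs. */
void check_bitreverse(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x80000000u, 0x12345678u, 0xDEADBEEFu, 0xFFFFFFFFu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(stb_bitreverse(samples[i]) == bitreverse_naive(samples[i]));
}
```

Calling check_bitreverse() from a test main should pass silently. This only spot-checks a handful of values; looping over all 2^32 inputs is feasible if an exhaustive comparison is wanted.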