https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102135
--- Comment #1 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
A small change to the testcase (here, a dummy first argument, which moves the pointer into a different argument register) shows that this is highly dependent on the registers constrained by the calling convention.

uint64_t foo64(int dummy, const uint8_t *rData1)
{
  uint64_t buffer;
  buffer = (((uint64_t)rData1[7]) << 56)|((uint64_t)(rData1[6]) << 48)|
           ((uint64_t)(rData1[5]) << 40)|(((uint64_t)rData1[4]) << 32)|
           (((uint64_t)rData1[3]) << 24)|(((uint64_t)rData1[2]) << 16)|
           ((uint64_t)(rData1[1]) << 8)|rData1[0];
  return buffer;
}

Register allocation does not re-order code to reduce register conflicts, so this is not easy to fix.

This problem is also more visible in micro-testcases such as this one. In real code the register allocator usually has more freedom and can avoid issues like this.

If your programming style is to write functions like this, you would likely get better code overall by marking these very small functions as inline, so that they do not incur the call setup and call/return overhead. That overhead can be significant once you take into account the number of registers that must be saved across a function call.
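As a rough illustration of the inlining suggestion, the sketch below moves the byte-assembly code into a `static inline` helper. The names `load_le64` and `sum_two_fields` are hypothetical and do not come from the bug report. `inline` is only a hint, so whether the helper is actually expanded at a call site depends on the optimization level and the compiler's heuristics.

```c
#include <stdint.h>

/* Hypothetical helper (name not from the bug report): assembles a
   little-endian 64-bit value from eight consecutive bytes.
   'static inline' allows the compiler to expand the body at each
   call site, so the argument is not pinned to a fixed register. */
static inline uint64_t load_le64(const uint8_t *p)
{
    return ((uint64_t)p[7] << 56) | ((uint64_t)p[6] << 48) |
           ((uint64_t)p[5] << 40) | ((uint64_t)p[4] << 32) |
           ((uint64_t)p[3] << 24) | ((uint64_t)p[2] << 16) |
           ((uint64_t)p[1] << 8)  |  (uint64_t)p[0];
}

/* Hypothetical caller: reads two adjacent 64-bit fields and adds them.
   Once the helper is inlined, the register allocator sees both loads
   in one function and is free to pick registers for the intermediates. */
uint64_t sum_two_fields(const uint8_t *packet)
{
    return load_le64(packet) + load_le64(packet + 8);
}
```

Comparing the assembly for `sum_two_fields` with and without the `static inline` qualifier (for example with `gcc -O2 -S`) is one way to check whether the calling-convention constraint still shows up in a given build.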