https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70557
--- Comment #5 from Albert Cahalan <acahalan at gmail dot com> ---
This example shows the simplest form of the problem:

unsigned long long ull;
void simple64(void){
    ull = 0;
}

NOTE: In the assembly below, I might have missing/excess parentheses.
Assembler syntax varies.

gcc generates:

clr.L %d0
clr.L %d1
move.L %d0,ull
move.L %d1,ull+4

As you can see, two registers are set to the same value. It's better to
set just one, and even better to directly address memory with a clr.L
instruction. Also, given that this code was optimized for size and there
was an address register free, gcc should have put the address of ull into
a register and then used that, preferably with autoincrement addressing.
I'd like to see something like this:

movea.L #ull, %a0
clr.L (%a0)+
clr.L (%a0)

When optimizing for speed with no registers available, maybe this:

clr.L ull
clr.L ull+4

(The code is larger with those 6-byte instructions, though, and it might
actually run slower, especially considering the small cache.)
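
To put rough numbers on the size argument above, here is a minimal sketch in C that tallies the encoded size of each of the three sequences. It assumes ull is reached with 32-bit absolute long addressing, so every instruction that names ull carries a 4-byte address after its 2-byte opcode word. The enum and function names are made up for this illustration and are not part of gcc or of the report.

```c
#include <assert.h>
#include <stddef.h>

/* Encoded sizes in bytes, ASSUMING 32-bit absolute long addressing for
   `ull`. A shorter addressing mode would change these numbers. */
enum insn_bytes {
    CLR_DREG     = 2, /* clr.L %dN              : opcode word only        */
    CLR_AREG_IND = 2, /* clr.L (%aN) or (%aN)+  : opcode word only        */
    CLR_ABS_LONG = 6, /* clr.L ull              : opcode + 32-bit address */
    MOVE_TO_ABS  = 6, /* move.L %dN,ull         : opcode + 32-bit address */
    LOAD_ADDR    = 6  /* address of ull into %aN: opcode + 32-bit value   */
};

/* Sum the sizes of the instructions in one sequence. */
static int total_bytes(const int *sizes, size_t count)
{
    int total = 0;
    assert(sizes != NULL);
    for (size_t i = 0; i < count; i++)
        total += sizes[i];
    return total;
}

/* What gcc generates: clear two data registers, then store each one. */
int gcc_output_bytes(void)
{
    const int seq[] = { CLR_DREG, CLR_DREG, MOVE_TO_ABS, MOVE_TO_ABS };
    return total_bytes(seq, sizeof seq / sizeof seq[0]);
}

/* Load the address once, then clear through the address register. */
int address_register_bytes(void)
{
    const int seq[] = { LOAD_ADDR, CLR_AREG_IND, CLR_AREG_IND };
    return total_bytes(seq, sizeof seq / sizeof seq[0]);
}

/* Clear both halves directly in memory, no registers used. */
int direct_clear_bytes(void)
{
    const int seq[] = { CLR_ABS_LONG, CLR_ABS_LONG };
    return total_bytes(seq, sizeof seq / sizeof seq[0]);
}
```

Under that assumption the totals are 16 bytes for the generated code, 10 bytes for the address-register version, and 12 bytes for the two direct clr.L instructions, so both alternatives are smaller than what gcc emits and the address-register form is the smallest.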