https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69143
Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:

           What    |Removed           |Added
----------------------------------------------------------------------------
            Target |powerpc64le-linux |powerpc64le-linux,
                   |                  |aarch64-none-linux-gnu
                CC |                  |ramana at gcc dot gnu.org

--- Comment #2 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
I'm not sure whether this is target-specific, but on aarch64 (and on armhf
as well) the code is badly optimized:

        fmov    x1, d1
        fmov    x0, d0
        bfi     x0, x1, 32, 32
        ubfx    x1, x0, 0, 32
        ubfx    x0, x0, 32, 32
        fmov    s1, w0
        fmov    s0, w1
        ret

On ARM it's even more fun:

        sub     sp, sp, #24
        add     r3, sp, #8
        vstr.32 s0, [sp, #8]
        vstr.32 s1, [sp, #12]
        ldm     r3, {r0, r1}
        add     r3, sp, #24
        stmdb   r3, {r0, r1}
        vldr.32 s0, [sp, #16]
        vldr.32 s1, [sp, #20]
        add     sp, sp, #24     @ sp needed
        bx      lr

And all this does is save and restore, via the stack, the values that came
in in s0 and s1.