https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65409

--- Comment #8 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
The difference is that we go through a pseudo with my version, but the code is
optimal at -O1:

        call    _Z8copy_foo3Foo
        movq    %rax, a(%rip)
        movb    %dl, a+8(%rip)

and the change looks safe enough for all the branches.