https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|i386,x86-64                 |x86_64-*-* i?86-*-*
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC does not model CPU pipelines in such detail (not to mention that
documentation on the CPU side is insufficient). In principle, the
vmovaxx %ymmB, %ymmA should be handled at the rename stage and be 'free',
so both cases should end up with the same number of uops. Did you try
running Intel iaca on it? (Not that it is very reliable.)