https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91333

--- Comment #5 from Marc Glisse <glisse at gcc dot gnu.org> ---
With trunk (master?), compiling with -O3, h gives

        movapd  %xmm1, %xmm3
        addsd   %xmm3, %xmm1
        movapd  %xmm0, %xmm2
        addsd   %xmm2, %xmm0
        addsd   %xmm1, %xmm0

which looks good (the asm prevents GCC from emitting addsd %xmm1, %xmm1
directly).
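
For readers without the test case at hand, here is a minimal sketch of the kind
of asm barrier that would produce this pattern. It is an illustration only, not
the test case attached to this report: the names opaque and twice are made up,
and the "+x" constraint assumes an x86 target with SSE2 (a "+m" fallback is
used elsewhere).

```c
/* Hypothetical helper, not taken from the bug report.
   The empty asm has a read-write operand, so the compiler must assume
   the value of x may have changed after the statement. */
static inline double opaque(double x)
{
#if defined(__SSE2__)
    /* "x" asks for an SSE register on x86. */
    __asm__ volatile("" : "+x"(x));
#else
    /* Portable fallback for other targets: keep the value in memory. */
    __asm__ volatile("" : "+m"(x));
#endif
    return x;
}

/* Hypothetical caller. The two results are distinct values as far as the
   compiler knows, and both are live at the addition, so they need two
   registers: hence a copy followed by an add of two different registers
   rather than a single "addsd %reg, %reg". */
static double twice(double a)
{
    return opaque(a) + opaque(a);
}
```

Under these assumptions, something like twice(x) + twice(y) compiled at -O3
should need one register copy per call to twice, which matches the two movapd
instructions in the listing above.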

However, if I add -mavx, I get

        vmovapd %xmm0, %xmm2
        vmovapd %xmm1, %xmm4
        vmovapd %xmm1, %xmm0
        vaddsd  %xmm0, %xmm4, %xmm0
        vmovapd %xmm2, %xmm3
        vaddsd  %xmm2, %xmm3, %xmm2
        vaddsd  %xmm0, %xmm2, %xmm0

That's 2 extra moves compared to the non-AVX version, which seems wrong since
AVX's non-destructive 3-operand instructions give the register allocator (RA)
more freedom, not less.
Those initial moves look quite similar to the ones I get for f with gcc-9 -O3
-mno-avx, so the optimization looks fragile.
