https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89071

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |87007

--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
        vcvtsd2ss       %xmm1, %xmm1, %xmm0

is faster than

        vcvtsd2ss       %xmm1, %xmm0, %xmm0

But

        vxorps  %xmm0, %xmm0, %xmm0
        vcvtsd2ss       %xmm1, %xmm0, %xmm0

is faster than both.  I have a patch for PR 87007:

https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00298.html

which inserts a vxorps at the last possible position, so that the
vxorps is executed only once per function.
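
For reference, here is a minimal C sketch of the kind of source that
leads to this conversion.  The function names narrow and narrow_sum are
hypothetical and are not taken from the PR or the patch.  Whether GCC
emits a scalar vcvtsd2ss here, and which register it picks for the
merge operand, depends on the GCC version and flags (for example
-O2 -march=skylake-avx512), so treat it as an illustration only:

```c
#include <stddef.h>

/* Hypothetical example, not from the PR: a scalar double -> float
   conversion.  With AVX enabled this is typically a vcvtsd2ss, whose
   destination takes its upper bits from the register that is not
   being converted (the middle operand in the listings above).  */
float
narrow (double x)
{
  return (float) x;
}

/* Hypothetical loop with one conversion per iteration.  If the merge
   operand of the conversion is a register with a pending write, the
   conversion has to wait for it even though only the low 32 bits of
   the result are used.  */
float
narrow_sum (const double *src, size_t n)
{
  float sum = 0.0f;
  for (size_t i = 0; i < n; i++)
    sum += narrow (src[i]);
  return sum;
}
```

Comparing the output of "gcc -S" with and without the patch on code
like this should show which of the three sequences above is generated.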


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87007
[Bug 87007] [8/9 Regression] 10% slowdown with -march=skylake-avx512
