[Bug middle-end/79359] Squaring a complex float gives inefficient code with or without -ffast-math

drraph at gmail dot com Mon, 06 Feb 2017 06:20:14 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79359


--- Comment #2 from Raphael C <drraph at gmail dot com> ---
As an additional data point in relation to Part 2 (that is, without
-ffast-math): in gcc 7, -O3 -ffinite-math-only gives

f:
        movq    QWORD PTR [rsp-16], xmm0
        movss   xmm3, DWORD PTR [rsp-12]
        movss   xmm2, DWORD PTR [rsp-16]
        movaps  xmm1, xmm3
        movaps  xmm0, xmm2
        jmp     __mulsc3

whereas in clang trunk it gives

f:                                      # @f
        movaps  xmm1, xmm0
        shufps  xmm1, xmm1, 229         # xmm1 = xmm1[1,1,2,3]
        movaps  xmm2, xmm0
        mulss   xmm2, xmm1
        addss   xmm2, xmm2
        mulss   xmm0, xmm0
        mulss   xmm1, xmm1
        subss   xmm0, xmm1
        unpcklps        xmm0, xmm2      # xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1]
        ret

I am no longer convinced ICC is handling NaN and Inf correctly, so I have
posted a query to their forum. However, it looks like gcc is not optimising as
well as it could when -ffinite-math-only is enabled.
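
For reference, here is a minimal sketch of what the two listings compute. The C source of f is not quoted in this comment, so the one-line definition below is an assumption based on the bug title, and square_finite is a hypothetical name used only for illustration. It spells out the arithmetic clang emits inline (real part a*a - b*b, imaginary part a*b + a*b), which matches the full complex multiply only when no NaN or Inf recovery is needed; that is the case -ffinite-math-only permits the compiler to assume.

```c
#include <complex.h>

/* Assumed shape of the test case: squaring a complex float. */
float complex f(float complex x)
{
    return x * x;
}

/* Hypothetical hand-expanded version mirroring the clang listing.
 * It performs no NaN/Inf recovery, so it is only equivalent to f
 * for finite inputs and finite results. */
float complex square_finite(float complex x)
{
    float a = crealf(x);   /* real part */
    float b = cimagf(x);   /* imaginary part */
    float ab = a * b;      /* mulss */

    /* (a + bi)^2 = (a*a - b*b) + (ab + ab)i */
    return (a * a - b * b) + (ab + ab) * I;
}
```

The gcc 7 listing instead tail-calls __mulsc3, the libgcc routine for single-precision complex multiplication, which carries the extra checks needed to recover correct infinities when the naive formula produces NaN.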
