https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #36 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Richard Biener from comment #34)
> GCC definitely fails to see the FMA use as opportunity in
> ix86_emit_swsqrtsf, the a == 0 checking is because of the missing
> expander w/o avx512er where we could still use the NR sequence
> with the other instruction.  HJ?

Like this?

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e0d7c74fcec..0bbe3772ab7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -44855,14 +44855,22 @@ void ix86_emit_swsqrtsf (rtx res, rtx a, machine_mode mode, bool recip)
        }
     }

+  mthree = force_reg (mode, mthree);
+
   /* e0 = x0 * a */
   emit_insn (gen_rtx_SET (e0, gen_rtx_MULT (mode, x0, a)));
-  /* e1 = e0 * x0 */
-  emit_insn (gen_rtx_SET (e1, gen_rtx_MULT (mode, e0, x0)));

-  /* e2 = e1 - 3. */
-  mthree = force_reg (mode, mthree);
-  emit_insn (gen_rtx_SET (e2, gen_rtx_PLUS (mode, e1, mthree)));
+  if (TARGET_FMA || TARGET_AVX512F)
+    emit_insn (gen_rtx_SET (e2,
+                           gen_rtx_FMA (mode, e0, x0, mthree)));
+  else
+    {
+      /* e1 = e0 * x0 */
+      emit_insn (gen_rtx_SET (e1, gen_rtx_MULT (mode, e0, x0)));
+
+      /* e2 = e1 - 3. */
+      emit_insn (gen_rtx_SET (e2, gen_rtx_PLUS (mode, e1, mthree)));
+    }

   mhalf = force_reg (mode, mhalf);
   if (recip)
