On May 7, 2020 4:25:45 PM GMT+02:00, Jakub Jelinek <ja...@redhat.com> wrote:
>On Thu, May 07, 2020 at 10:04:35AM +0200, Richard Biener wrote:
>> On Thu, 7 May 2020, Jakub Jelinek wrote:
>> > The ffs expanders on several targets (x86, ia64, aarch64 at least)
>> > emit a conditional move or similar code to handle the case when the
>> > argument is 0, which makes the code longer.
>> > If we know from VRP that the argument will not be zero, we can (if the
>> > target also has a ctz expander) just use ctz, which is undefined at zero,
>> > and thus the expander doesn't need to deal with that.
>> > 
>> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>> 
>> can you use direct_internal_fn_supported_p (IFN_CTZ, type, 
>> OPTIMIZE_FOR_SPEED)?
>
>Only if it is guarded with #if GIMPLE (because otherwise the fn
>isn't declared).
>Though, restricting this to GIMPLE seems like a good idea anyway to me.
>
>Ok for trunk if it passes bootstrap/regtest?

OK. 

Richard. 

>2020-05-07  Jakub Jelinek  <ja...@redhat.com>
>
>       PR tree-optimization/94956
>       * match.pd (FFS): Optimize __builtin_ffs* of non-zero argument into
>       __builtin_ctz* + 1 if direct IFN_CTZ is supported.
>
>       * gcc.target/i386/pr94956.c: New test.
>
>--- gcc/match.pd.jj    2020-05-06 15:03:51.618058839 +0200
>+++ gcc/match.pd       2020-05-07 16:16:48.466970168 +0200
>@@ -5986,6 +5986,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>       && direct_internal_fn_supported_p (IFN_POPCOUNT, type,
>                                          OPTIMIZE_FOR_BOTH))
>     (convert (IFN_POPCOUNT:type @0)))))
>+
>+/* __builtin_ffs needs to deal on many targets with the possible zero
>+   argument.  If we know the argument is always non-zero, __builtin_ctz + 1
>+   should lead to better code.  */
>+(simplify
>+ (FFS tree_expr_nonzero_p@0)
>+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>+      && direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@0),
>+                                       OPTIMIZE_FOR_SPEED))
>+  (plus (CTZ:type @0) { build_one_cst (type); })))
> #endif
> 
> /* Simplify:
>--- gcc/testsuite/gcc.target/i386/pr94956.c.jj 2020-05-06 16:35:47.085876237 +0200
>+++ gcc/testsuite/gcc.target/i386/pr94956.c    2020-05-06 16:39:52.927140038 +0200
>@@ -0,0 +1,28 @@
>+/* PR tree-optimization/94956 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2" } */
>+/* { dg-final { scan-assembler-not "\tcmovne\t" } } */
>+/* { dg-final { scan-assembler-not "\tsete\t" } } */
>+
>+int
>+foo (unsigned x)
>+{
>+  if (x == 0) __builtin_unreachable ();
>+  return __builtin_ffs (x) - 1;
>+}
>+
>+int
>+bar (unsigned long x)
>+{
>+  if (x == 0) __builtin_unreachable ();
>+  return __builtin_ffsl (x) - 1;
>+}
>+
>+#ifdef __x86_64__
>+int
>+baz (unsigned long long x)
>+{
>+  if (x == 0) __builtin_unreachable ();
>+  return __builtin_ffsll (x) - 1;
>+}
>+#endif
>
>
>       Jakub
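
For readers following the thread, the identity the new match.pd pattern relies on can be checked in plain C. This is only a sketch and not part of the patch: the helper names (ffs_reference, ffs_via_ctz, check_nonzero_range) are made up for illustration, and it assumes a compiler that provides __builtin_ffs and __builtin_ctz (GCC or Clang). For any non-zero x, ffs (x) equals ctz (x) + 1; at zero, ffs returns 0 while ctz is undefined, which is why the rewrite is only valid when the argument is known to be non-zero.

```c
#include <assert.h>

/* Reference definition of ffs: 1-based index of the least significant
   set bit, or 0 when no bit is set.  */
static int
ffs_reference (unsigned x)
{
  if (x == 0)
    return 0;
  int pos = 1;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      pos++;
    }
  return pos;
}

/* The form the pattern rewrites to.  Only valid for x != 0, because
   __builtin_ctz (0) is undefined.  */
static int
ffs_via_ctz (unsigned x)
{
  return __builtin_ctz (x) + 1;
}

/* Hypothetical helper: assert that all three forms agree for every
   non-zero value in [lo, hi].  Zero is skipped on purpose.  */
static void
check_nonzero_range (unsigned lo, unsigned hi)
{
  for (unsigned x = lo ? lo : 1u; x != 0 && x <= hi; x++)
    {
      assert (__builtin_ffs ((int) x) == ffs_via_ctz (x));
      assert (ffs_reference (x) == ffs_via_ctz (x));
    }
}
```

Calling check_nonzero_range (1, 100000) from a test driver exercises the identity. The zero case is excluded because that is exactly the input the ffs expanders need the extra conditional move for.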
