Hi Jeff,

On 29/07/15 23:38, Jeff Law wrote:
On 07/29/2015 07:49 AM, Kyrill Tkachov wrote:
Hi all,

This patch improves RTL if-conversion on sequences that perform a
conditional select on integer constants.
Most of the smart logic to do that already exists in the
noce_try_store_flag_constants function.
However, currently that function is tried after noce_try_cmove.
noce_try_cmove is a simple catch-all function that just loads the two
immediates and performs a conditional
select between them. It returns true and then the caller
noce_process_if_block doesn't try any other transformations,
completely skipping the more aggressive transformations that
noce_try_store_flag_constants allows!

Calling noce_try_store_flag_constants before noce_try_cmove allows for
the smarter if-conversion transformations
to be used. An example that occurs a lot in the gcc code itself is for
the C code:
int
foo (int a, int b)
{
    return ((a & (1 << 25)) ? 5 : 4);
}

i.e. test a bit in a and return 5 or 4. Currently on aarch64 this
generates the naive:
          and     w2, w0, 33554432  // mask away all bits except bit 25
          mov     w1, 4
          cmp     w2, wzr
          mov     w0, 5
          csel    w0, w0, w1, ne


whereas with this patch this can be transformed into the much better:
          ubfx    x0, x0, 25, 1  // extract bit 25
          add     w0, w0, 4
I suspect the PA would benefit from this as well, probably several
other architectures with reasonable bitfield extraction capabilities.

This helps on arm as well but isn't enabled there at the moment
because this function is gated on !have_conditional_execution ().
We can look at enabling this for conditional execution
as well as a separate piece of work.


Another issue I encountered is that the code that claims to perform the
transformation:
        /* if (test) x = 3; else x = 4;
       =>   x = 3 + (test == 0);  */

doesn't seem to do exactly that in all cases. In fact for that case it
will try something like:
x = 4 - (test == 0)
which is suboptimal for targets like aarch64 which have a conditional
increment operation.
I vaguely recall targets that don't have const - X insns, but do have X
+ const style insns.  And more generally I think we're better off
generating the PLUS version.

Now that I think of it, aarch64 does have the CSETM instruction that
conditionally sets a register to 0 or -1, but I didn't see it being utilised
as frequently as I'd like to. Anyway, I do prefer generating the PLUS version
when possible too, which is what this patch tries to do.



This patch tweaks that code to always try to generate an addition of the
condition rather than
a subtraction.

Anyway, for code:
int
fooinc (int x)
{
    return x ? 1025 : 1026;
}

we currently generate:
          mov     w2, 1025
          mov     w1, 1026
          cmp     w0, wzr
          csel    w0, w2, w1, ne

whereas with this patch we will generate:
          cmp     w0, wzr
          cset    w0, eq
          add     w0, w0, 1025

Bootstrapped and tested on arm, aarch64, x86_64.
Ok for trunk?

Thanks,
Kyrill

P.S. noce_try_store_flag_constants is currently gated on
!targetm.have_conditional_execution () but I don't see
any reason to restrict it on targets with conditional execution. For
example, I think the first example above
would show a benefit on arm if it was enabled there. But that can be a
separate investigation.

2015-07-29  Kyrylo Tkachov <kyrylo.tkac...@arm.com>

      * ifcvt.c (noce_try_store_flag_constants): Reverse when diff is
      -STORE_FLAG and condition is reversable.  Prefer to add to the
      flag value.
      (noce_process_if_block): Try noce_try_store_flag_constants before
      noce_try_cmove.

2015-07-29  Kyrylo Tkachov <kyrylo.tkac...@arm.com>

      * gcc.target/aarch64/csel_bfx_1.c: New test.
      * gcc.target/aarch64/csel_imms_inc_1.c: Likewise.

ifcvt-csel-imms.patch


commit 0164ef164483bdf0b2f73e267e2ff1df7800dd6d
Author: Kyrylo Tkachov<kyrylo.tkac...@arm.com>
Date:   Tue Jul 28 14:59:46 2015 +0100

      [RTL-ifcvt] Improve conditional increment ops on immediates

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index a57d78c..80d0285 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1222,7 +1222,7 @@ noce_try_store_flag_constants (struct noce_if_info 
*if_info)

         reversep = 0;
         if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE)
-       normalize = 0;
+       normalize = 0, reversep = (diff == -STORE_FLAG_VALUE) && can_reverse;
We generally avoid using a ',' operator like this.  However, I can see
that you're just following existing convention in that code.  So I won't
object.

Ok, I've respun the patch to convert this part to use a more conventional
style and also took the liberty of converting reversep and can_reverse into
bool variables.




         else if (ifalse == 0 && exact_log2 (itrue) >= 0
               && (STORE_FLAG_VALUE == 1
                   || if_info->branch_cost >= 2))
@@ -1261,10 +1261,13 @@ noce_try_store_flag_constants (struct noce_if_info 
*if_info)
         =>   x = 3 + (test == 0);  */
         if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE)
        {
-         target = expand_simple_binop (mode,
-                                       (diff == STORE_FLAG_VALUE
-                                        ? PLUS : MINUS),
-                                       gen_int_mode (ifalse, mode), target,
+         rtx_code code = reversep ? PLUS :
+                                   (diff == STORE_FLAG_VALUE ? PLUS
+                                                              : MINUS);
+         HOST_WIDE_INT to_add = reversep ? MIN (ifalse, itrue) : ifalse;
+
+         target = expand_simple_binop (mode, code,
+                                       gen_int_mode (to_add, mode), target,
                                        if_info->x, 0, OPTAB_WIDEN);
        }
Note that STORE_FLAG_VALUE can potentially be any value.  Most targets
use "1", but others use -1 (m68k for example, there are others).  I'm
concerned that the reversep ? MIN (ifalse, itrue) won't do what we want
on targets such as the m68k.

The logic here is also somewhat convoluted.  When reversep is true, we
always use PLUS, but is that actually correct?  Normally we'd compute
the code, then invert the code.  ie, if we normally want PLUS, then
we've flip to MINUS and vice-versa.

I agree that it's hard to follow. In this respin I have
explicitly laid out the different combinations of diff
and STORE_FLAG_VALUE. It is more verbose now, but hopefully
it's easier to follow.



@@ -3120,13 +3123,14 @@ noce_process_if_block (struct noce_if_info *if_info)
       goto success;
     if (noce_try_abs (if_info))
       goto success;
+  if (!targetm.have_conditional_execution ()
+      && noce_try_store_flag_constants (if_info))
+    goto success;
     if (HAVE_conditional_move
         && noce_try_cmove (if_info))
       goto success;
     if (! targetm.have_conditional_execution ())
       {
-      if (noce_try_store_flag_constants (if_info))
-       goto success;
         if (noce_try_addcc (if_info))
        goto success;
         if (noce_try_store_flag_mask (if_info))
This part seems fine and ought to be able to go forward independently.
Consider this hunk and its associated testcase approved if you want to
test it independently and get it installed while we iterate on the other
hunks,

This respin includes that hunk, however if this patch goes through any more
respins then I'll separate it out into a separate patch.

This version has been bootstrapped and tested on x86_64 and aarch64.

Thanks for the feedback!
Kyrill

2015-07-30  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    * ifcvt.c (noce_try_store_flag_constants): Make logic of the case
    when diff == STORE_FLAG_VALUE or diff == -STORE_FLAG_VALUE more
    explicit.  Prefer to add the flag whenever possible.
    (noce_process_if_block): Try noce_try_store_flag_constants before
    noce_try_cmove.

2015-07-30  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    * gcc.target/aarch64/csel_bfx_1.c: New test.
    * gcc.target/aarch64/csel_imms_inc_1.c: Likewise.




jeff



commit bd720b6a5a298449ad850517e0fc29e825c702d9
Author: Kyrylo Tkachov <kyrylo.tkac...@arm.com>
Date:   Tue Jul 28 14:59:46 2015 +0100

    [RTL-ifcvt] Improve conditional increment ops on immediates

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index a57d78c..909c674 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1194,9 +1194,10 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 {
   rtx target;
   rtx_insn *seq;
-  int reversep;
+  bool reversep;
   HOST_WIDE_INT itrue, ifalse, diff, tmp;
-  int normalize, can_reverse;
+  int normalize;
+  bool can_reverse;
   machine_mode mode;
 
   if (!noce_simple_bbs (if_info))
@@ -1208,6 +1209,7 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
       mode = GET_MODE (if_info->x);
       ifalse = INTVAL (if_info->a);
       itrue = INTVAL (if_info->b);
+      bool subtract_flag_p = false;
 
       diff = (unsigned HOST_WIDE_INT) itrue - ifalse;
       /* Make sure we can represent the difference between the two values.  */
@@ -1220,23 +1222,61 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
       can_reverse = (reversed_comparison_code (if_info->cond, if_info->jump)
 		     != UNKNOWN);
 
-      reversep = 0;
+      reversep = false;
       if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE)
-	normalize = 0;
+	{
+	  normalize = 0;
+	  /* We could collapse these cases but it is easier to follow the
+	     diff/STORE_FLAG_VALUE combinations when they are listed
+	     explicitly.  */
+
+	  /* test ? 3 : 4
+	     => 4 + (test != 0).  */
+	  if (diff < 0 && STORE_FLAG_VALUE < 0)
+	      reversep = false;
+	  /* test ? 4 : 3
+	     => can_reverse  | 4 + (test == 0)
+		!can_reverse | 3 - (test != 0).  */
+	  else if (diff > 0 && STORE_FLAG_VALUE < 0)
+	    {
+	      reversep = can_reverse;
+	      subtract_flag_p = !can_reverse;
+	    }
+	  /* test ? 3 : 4
+	     => can_reverse  | 3 + (test == 0)
+		!can_reverse | 4 - (test != 0).  */
+	  else if (diff < 0 && STORE_FLAG_VALUE > 0)
+	    {
+	      reversep = can_reverse;
+	      subtract_flag_p = !can_reverse;
+	    }
+	  /* test ? 4 : 3
+	     => 4 + (test != 0).  */
+	  else if (diff > 0 && STORE_FLAG_VALUE > 0)
+	    reversep = false;
+	  else
+	    gcc_unreachable ();
+	}
       else if (ifalse == 0 && exact_log2 (itrue) >= 0
 	       && (STORE_FLAG_VALUE == 1
 		   || if_info->branch_cost >= 2))
 	normalize = 1;
       else if (itrue == 0 && exact_log2 (ifalse) >= 0 && can_reverse
 	       && (STORE_FLAG_VALUE == 1 || if_info->branch_cost >= 2))
-	normalize = 1, reversep = 1;
+	{
+	  normalize = 1;
+	  reversep = true;
+	}
       else if (itrue == -1
 	       && (STORE_FLAG_VALUE == -1
 		   || if_info->branch_cost >= 2))
 	normalize = -1;
       else if (ifalse == -1 && can_reverse
 	       && (STORE_FLAG_VALUE == -1 || if_info->branch_cost >= 2))
-	normalize = -1, reversep = 1;
+	{
+	  normalize = -1;
+	  reversep = true;
+	}
       else if ((if_info->branch_cost >= 2 && STORE_FLAG_VALUE == -1)
 	       || if_info->branch_cost >= 3)
 	normalize = -1;
@@ -1261,9 +1301,9 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 	 =>   x = 3 + (test == 0);  */
       if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE)
 	{
-	  target = expand_simple_binop (mode,
-					(diff == STORE_FLAG_VALUE
-					 ? PLUS : MINUS),
+	  /* Always use ifalse here.  It should have been swapped with itrue
+	     when appropriate when reversep is true.  */
+	  target = expand_simple_binop (mode, subtract_flag_p ? MINUS : PLUS,
 					gen_int_mode (ifalse, mode), target,
 					if_info->x, 0, OPTAB_WIDEN);
 	}
@@ -3120,13 +3160,14 @@ noce_process_if_block (struct noce_if_info *if_info)
     goto success;
   if (noce_try_abs (if_info))
     goto success;
+  if (!targetm.have_conditional_execution ()
+      && noce_try_store_flag_constants (if_info))
+    goto success;
   if (HAVE_conditional_move
       && noce_try_cmove (if_info))
     goto success;
   if (! targetm.have_conditional_execution ())
     {
-      if (noce_try_store_flag_constants (if_info))
-	goto success;
       if (noce_try_addcc (if_info))
 	goto success;
       if (noce_try_store_flag_mask (if_info))
diff --git a/gcc/testsuite/gcc.target/aarch64/csel_bfx_1.c b/gcc/testsuite/gcc.target/aarch64/csel_bfx_1.c
new file mode 100644
index 0000000..c20597f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/csel_bfx_1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-save-temps -O2" } */
+
+int
+foo (int a, int b)
+{
+  return ((a & (1 << 25)) ? 5 : 4);
+}
+
+/* { dg-final { scan-assembler "ubfx\t\[xw\]\[0-9\]*.*" } } */
+/* { dg-final { scan-assembler-not "csel\tw\[0-9\]*.*" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/csel_imms_inc_1.c b/gcc/testsuite/gcc.target/aarch64/csel_imms_inc_1.c
new file mode 100644
index 0000000..2ae434d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/csel_imms_inc_1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-save-temps -O2 -fno-inline" } */
+
+extern void abort (void);
+
+int
+fooinc (int x)
+{
+  if (x)
+    return 1025;
+  else
+    return 1026;
+}
+
+int
+fooinc2 (int x)
+{
+  if (x)
+    return 1026;
+  else
+    return 1025;
+}
+
+int
+main (void)
+{
+  if (fooinc (0) != 1026)
+    abort ();
+
+  if (fooinc (1) != 1025)
+    abort ();
+
+  if (fooinc2 (0) != 1025)
+    abort ();
+
+  if (fooinc2 (1) != 1026)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "csel\tw\[0-9\]*.*" } } */

Reply via email to