Re: [PATCH 2/6] combine: If recog fails, try again with zero_ext{ract,end} simplified

2015-05-12 Thread Segher Boessenkool
On Tue, May 12, 2015 at 09:20:18AM +0100, Kyrill Tkachov wrote:
> Does this patch means we can remove any patterns in
> the backend that look like:
> 
> -  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
> -   (zero_extract:SI (match_operand:SI 1 "gpc_reg_operand" "r")
> -(match_operand:SI 2 "const_int_operand" "i")
> -(match_operand:SI 3 "const_int_operand" "i")))]
> 
> 
> as long as we have an equivalent and-with-mask pattern?

Yes, exactly.  But you can also keep them, it will find either.

Older / CISC targets will usually want the zero_ext*, newer / RISC
targets will more often write something as AND (and LSHIFTRT).

You can also have both in the same target: for example, many targets
will want to write their plain zero_extend patterns as that, because
you want to allow both mem and reg in the same pattern, for reload.


Segher


Re: [PATCH 2/6] combine: If recog fails, try again with zero_ext{ract,end} simplified

2015-05-12 Thread Kyrill Tkachov

Hi Segher,

On 10/05/15 17:13, Segher Boessenkool wrote:

Combine has its own ideas of what is "canonical" RTL, forcing all
backends to have special patterns in their machine description for the
"more simplified" patterns combine often creates, even though the
backend already has patterns for a more general form.  Backends that
do not implement those patterns get less well optimised code.

This patch lifts that burden for two cases: combine often converts
an AND (with, say, 0xff) to a ZERO_EXTEND of a SUBREG; and an LSHIFTRT
followed by an AND to a ZERO_EXTRACT.  This is perfectly helpful for
e.g. MEMs, but not nice if you have instructions to do more generic
masking (like PowerPC rlwinm, and similar on some other archs).

With this patch, if recog_for_combine fails, and there are any
ZERO_EXT* in the pattern to be matched, it tries again with those
expressed as AND etc.  If that also fails it rolls back the changes,
because it might still match after e.g. splitting, and we want to
try the ZERO_EXT* for that as well.

Tested on powerpc-linux, before and after removing many patterns
from the machine description, and checked that the only changes in
the bootstrapped compiler are new and removed functions.



Does this patch means we can remove any patterns in
the backend that look like:

-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
-   (zero_extract:SI (match_operand:SI 1 "gpc_reg_operand" "r")
-(match_operand:SI 2 "const_int_operand" "i")
-(match_operand:SI 3 "const_int_operand" "i")))]


as long as we have an equivalent and-with-mask pattern?

Thanks,
Kyrill




I'll also test on x86_64-linux before committing.


Segher


2015-05-10  Segher Boessenkool   

* combine.c (recog_for_combine_1): New function, factored out
from recog_for_combine.
(change_zero_ext): New function.
(recog_for_combine): If recog fails, try again with the pattern
modified by change_zero_ext; if that still fails, restore the
pattern.

---
  gcc/combine.c | 120 +++---
  1 file changed, 107 insertions(+), 13 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 1e4d65e..896d9d2 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -10849,21 +10849,11 @@ simplify_shift_const (rtx x, enum rtx_code code, 
machine_mode result_mode,
  }
  
  

-/* Like recog, but we receive the address of a pointer to a new pattern.
-   We try to match the rtx that the pointer points to.
-   If that fails, we may try to modify or replace the pattern,
-   storing the replacement into the same pointer object.
-
-   Modifications include deletion or addition of CLOBBERs.
-
-   PNOTES is a pointer to a location where any REG_UNUSED notes added for
-   the CLOBBERs are placed.
-
-   The value is the final insn code from the pattern ultimately matched,
-   or -1.  */
+/* A subroutine of recog_for_combine.  See there for arguments and
+   return value.  */
  
  static int

-recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
+recog_for_combine_1 (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
  {
rtx pat = *pnewpat;
rtx pat_without_clobbers;
@@ -11010,6 +11000,110 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx 
*pnotes)
  
return insn_code_number;

  }
+
+/* Change every ZERO_EXTRACT and ZERO_EXTEND of a SUBREG that can be
+   expressed as an AND and maybe an LSHIFTRT, to that formulation.
+   Return whether anything was so changed.  */
+
+static bool
+change_zero_ext (rtx *src)
+{
+  bool changed = false;
+
+  subrtx_ptr_iterator::array_type array;
+  FOR_EACH_SUBRTX_PTR (iter, array, src, NONCONST)
+{
+  rtx x = **iter;
+  machine_mode mode = GET_MODE (x);
+  int size;
+
+  if (GET_CODE (x) == ZERO_EXTRACT
+ && CONST_INT_P (XEXP (x, 1))
+ && CONST_INT_P (XEXP (x, 2))
+ && GET_MODE (XEXP (x, 0)) == mode)
+   {
+ size = INTVAL (XEXP (x, 1));
+
+ int start = INTVAL (XEXP (x, 2));
+ if (BITS_BIG_ENDIAN)
+   start = GET_MODE_PRECISION (mode) - size - start;
+
+ x = gen_rtx_LSHIFTRT (mode, XEXP (x, 0), GEN_INT (start));
+   }
+  else if (GET_CODE (x) == ZERO_EXTEND
+  && GET_CODE (XEXP (x, 0)) == SUBREG
+  && GET_MODE (SUBREG_REG (XEXP (x, 0))) == mode
+  && subreg_lowpart_p (XEXP (x, 0)))
+   {
+ size = GET_MODE_PRECISION (GET_MODE (XEXP (x, 0)));
+ x = SUBREG_REG (XEXP (x, 0));
+   }
+  else
+   continue;
+
+  unsigned HOST_WIDE_INT mask = 1;
+  mask <<= size;
+  mask--;
+
+  x = gen_rtx_AND (mode, x, GEN_INT (mask));
+
+  SUBST (**iter, x);
+  changed = true;
+}
+
+  return changed;
+}
+
+/* Like recog, but we receive the address of a pointer to a new pattern.
+   We try to match the rtx that the pointer points to.
+   If that fails, we may try to modify or replace the pattern,
+   stori

Re: [PATCH 2/6] combine: If recog fails, try again with zero_ext{ract,end} simplified

2015-05-10 Thread Segher Boessenkool
On Sun, May 10, 2015 at 10:15:34PM -0600, Jeff Law wrote:
> > (recog_for_combine): If recog fails, try again with the pattern
> > modified by change_zero_ext; if that still fails, restore the
> > pattern.
> I like it.  Attacking the extensions are the most obvious candidates, 
> but I wonder if there's others (like the whole "ASHIFT vs MULT" stuff 
> that we were recently looking at for ARM).

Yeah, I thought about that as well.  But that case, MULT instead of shift,
outside of MEM, is simply non-canonical RTL; and combine created it on
purpose.  It just shouldn't.

There certainly will be other cases though; hopefully not overlapping,
I don't want to recog an exponentially expanding number of times ;-)


Segher


Re: [PATCH 2/6] combine: If recog fails, try again with zero_ext{ract,end} simplified

2015-05-10 Thread Jeff Law

On 05/10/2015 10:13 AM, Segher Boessenkool wrote:

Combine has its own ideas of what is "canonical" RTL, forcing all
backends to have special patterns in their machine description for the
"more simplified" patterns combine often creates, even though the
backend already has patterns for a more general form.  Backends that
do not implement those patterns get less well optimised code.

This patch lifts that burden for two cases: combine often converts
an AND (with, say, 0xff) to a ZERO_EXTEND of a SUBREG; and an LSHIFTRT
followed by an AND to a ZERO_EXTRACT.  This is perfectly helpful for
e.g. MEMs, but not nice if you have instructions to do more generic
masking (like PowerPC rlwinm, and similar on some other archs).

With this patch, if recog_for_combine fails, and there are any
ZERO_EXT* in the pattern to be matched, it tries again with those
expressed as AND etc.  If that also fails it rolls back the changes,
because it might still match after e.g. splitting, and we want to
try the ZERO_EXT* for that as well.

Tested on powerpc-linux, before and after removing many patterns
from the machine description, and checked that the only changes in
the bootstrapped compiler are new and removed functions.

I'll also test on x86_64-linux before committing.


Segher


2015-05-10  Segher Boessenkool   

* combine.c (recog_for_combine_1): New function, factored out
from recog_for_combine.
(change_zero_ext): New function.
(recog_for_combine): If recog fails, try again with the pattern
modified by change_zero_ext; if that still fails, restore the
pattern.
I like it.  Attacking the extensions are the most obvious candidates, 
but I wonder if there's others (like the whole "ASHIFT vs MULT" stuff 
that we were recently looking at for ARM).



jeff