On Thu, May 19, 2016 at 11:14:37AM -0600, Jeff Law wrote:
> On 05/19/2016 05:18 AM, Dominik Vogt wrote:
> >On Mon, May 16, 2016 at 01:09:36PM -0600, Jeff Law wrote:
> >>> On 05/11/2016 02:52 AM, Dominik Vogt wrote:
> >>>> >On Wed, May 11, 2016 at 10:40:11AM +0200, Bernd Schmidt wrote:
> >>>> >That's what I mentioned somewhere during the discussion.  The s390
> >>>> >backend just uses COSTS_N_INSNS(1) for AND as well as ZERO_EXTEND,
> >>>> >so this won't ever trigger.  I just left the rtx_cost call in the
> >>>> >patch for further discussion as Jeff said he liked the approach.
> >>>> >We don't need it to achieve the behaviour we want for s390.
> >>> I liked it, just based on the general theory that we should be
> >>> comparing costs of a transform to the original much more often than
> >>> we currently do.
> >>>
> >>> If Bernd prefers it gone and you don't need it to achieve your
> >>> goals, then I won't object to the costing stuff going away.
> >All right, third version attached, without the rtx_vost call;
> >bootstrapped and regression tested on s390, s390x, x86_64.
> >
> >On Wed, Apr 27, 2016 at 09:20:05AM +0100, Dominik Vogt wrote:
> >>> The attached patch is a result of discussing an S/390 issue with
> >>> "and with complement" in some cases.
> >>>
> >>>   https://gcc.gnu.org/ml/gcc/2016-03/msg00163.html
> >>>   https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01586.html
> >>>
> >>> Combine would merge a ZERO_EXTEND and a SET taking the known zero
> >>> bits into account, resulting in an AND.  Later on,
> >>> make_compound_operation() fails to replace that with a ZERO_EXTEND
> >>> which we get for free on S/390 but leaves the AND, eventually
> >>> resulting in two consecutive AND instructions.
> >>>
> >>> The current code in make_compound_operation() that detects
> >>> opportunities for ZERO_EXTEND does not work here because it does
> >>> not take the known zero bits into account:
> >>>
> >>>       /* If the constant is one less than a power of two, this might be
> >>>        representable by an extraction even if no shift is present.
> >>>        If it doesn't end up being a ZERO_EXTEND, we will ignore it unless
> >>>        we are in a COMPARE.  */
> >>>       else if ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0)
> >>>       new_rtx = make_extraction (mode,
> >>>                              make_compound_operation (XEXP (x, 0),
> >>>                                                       next_code),
> >>>                              0, NULL_RTX, i, 1, 0, in_code == COMPARE);
> >>>
> >>> An attempt to use the zero bits in the above conditions resulted
> >>> in many situations that generated worse code, so the patch tries
> >>> to fix this in a more conservative way.  While the effect is
> >>> completely positive on S/390, this will very likely have
> >>> unforeseeable consequences on other targets.
> >>>
> >>> Bootstrapped and regression tested on s390 and s390x only at the
> >>> moment.
> >Ciao
> >
> >Dominik ^_^  ^_^
> >
> >-- Dominik Vogt IBM Germany
> >
> >
> >0001-v3-ChangeLog
> >
> >
> >gcc/ChangeLog
> >
> >     * combine.c (make_compound_operation): Take known zero bits into
> >     account when checking for possible zero_extend.
> >gcc/testsuite/ChangeLog
> >
> >     * gcc.dg/zero_bits_compound-1.c: New test.
> >     * gcc.dg/zero_bits_compound-2.c: New test.
> I'm a little worried about the tests.  They check for lp64, but the
> tests actually need a stronger set of conditions to pass.
> 
> I'm thinking that the tests ought to be opt-in as I don't think we
> have a set of effective-target tests we can use.
> 
> So OK with the tests restricted to the targets where you've verified
> they work.

Attached.

(Are opt-in tests usually preferred over opt-out tests or is there
no fixed ruled to decide that?)

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

        * combine.c (make_compound_operation): Take known zero bits into
        account when checking for possible zero_extend.
gcc/testsuite/ChangeLog

        * gcc.dg/zero_bits_compound-1.c: New test.
        * gcc.dg/zero_bits_compound-2.c: New test.
>From 0fc3622dcee3430ac78452cfeafa3cba27cf68fa Mon Sep 17 00:00:00 2001
From: Dominik Vogt <v...@linux.vnet.ibm.com>
Date: Tue, 12 Apr 2016 09:53:46 +0100
Subject: [PATCH] Take known zero bits into account when checking
 extraction.

Allows AND Insns with a const_int operand to be expressed as ZERO_EXTEND if the
operand ist a power of 2 - 1 even with the known zero bits masked out.
---
 gcc/combine.c                               | 28 +++++++++++++++++++
 gcc/testsuite/gcc.dg/zero_bits_compound-1.c | 42 +++++++++++++++++++++++++++++
 gcc/testsuite/gcc.dg/zero_bits_compound-2.c | 39 +++++++++++++++++++++++++++
 3 files changed, 109 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/zero_bits_compound-1.c
 create mode 100644 gcc/testsuite/gcc.dg/zero_bits_compound-2.c

diff --git a/gcc/combine.c b/gcc/combine.c
index 3554f51..97d59d7 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7988,6 +7988,34 @@ make_compound_operation (rtx x, enum rtx_code in_code)
 							next_code),
 			       i, NULL_RTX, 1, 1, 0, 1);
 
+      /* If the one operand is a paradoxical subreg of a register or memory and
+	 the constant (limited to the smaller mode) has only zero bits where
+	 the sub expression has known zero bits, this can be expressed as
+	 a zero_extend.  */
+      else if (GET_CODE (XEXP (x, 0)) == SUBREG)
+	{
+	  rtx sub;
+
+	  sub = XEXP (XEXP (x, 0), 0);
+	  machine_mode sub_mode = GET_MODE (sub);
+	  if ((REG_P (sub) || MEM_P (sub))
+	      && GET_MODE_PRECISION (sub_mode) < mode_width)
+	    {
+	      unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode);
+	      unsigned HOST_WIDE_INT mask;
+
+	      /* original AND constant with all the known zero bits set */
+	      mask = UINTVAL (XEXP (x, 1)) | (~nonzero_bits (sub, sub_mode));
+	      if ((mask & mode_mask) == mode_mask)
+		{
+		  new_rtx = make_compound_operation (sub, next_code);
+		  new_rtx = make_extraction (mode, new_rtx, 0, 0,
+					     GET_MODE_PRECISION (sub_mode),
+					     1, 0, in_code == COMPARE);
+		}
+	    }
+	}
+
       break;
 
     case LSHIFTRT:
diff --git a/gcc/testsuite/gcc.dg/zero_bits_compound-1.c b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
new file mode 100644
index 0000000..d78dc43
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
@@ -0,0 +1,42 @@
+/* Test whether an AND mask or'ed with the know zero bits that equals a mode
+   mask is a candidate for zero extendion.  */
+
+/* Note: This test requires that char, int and long have different sizes and the
+   target has a way to do 32 -> 64 bit zero extension other than AND.  */
+
+/* { dg-do compile { target x86_64-*-* s390*-*-* } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O3 -dP" } */
+
+unsigned long foo (unsigned char c)
+{
+  unsigned long l;
+  unsigned int i;
+
+  i = ((unsigned int)c) << 8;
+  i |= ((unsigned int)c) << 20;
+  asm volatile ("":::);
+  i = i & 0x0ff0ff00;
+  asm volatile ("":::);
+  l = (unsigned long)i;
+
+  return l;
+}
+
+unsigned long bar (unsigned char c)
+{
+  unsigned long l;
+  unsigned int i;
+
+  i = ((unsigned int)c) << 8;
+  i |= ((unsigned int)c) << 20;
+  asm volatile ("":::);
+  i = i & 0x0ffffff0;
+  asm volatile ("":::);
+  l = (unsigned long)i;
+
+  return l;
+}
+
+/* Check that no pattern containing an AND expression was used.  */
+/* { dg-final { scan-assembler-not "\\(and:" } } */
diff --git a/gcc/testsuite/gcc.dg/zero_bits_compound-2.c b/gcc/testsuite/gcc.dg/zero_bits_compound-2.c
new file mode 100644
index 0000000..80fd363
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/zero_bits_compound-2.c
@@ -0,0 +1,39 @@
+/* Test whether an AND mask or'ed with the know zero bits that equals a mode
+   mask is a candidate for zero extendion.  */
+
+/* { dg-do compile { target x86_64-*-* s390*-*-* } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O3 -dP" } */
+
+unsigned long foo (unsigned char c)
+{
+  unsigned long l;
+  unsigned int i;
+
+  i = ((unsigned int)c) << 8;
+  i |= ((unsigned int)c) << 20;
+  asm volatile ("":::);
+  i = i & 0x0fe0fe00;
+  asm volatile ("":::);
+  l = (unsigned long)i;
+
+  return l;
+}
+
+unsigned long bar (unsigned char c)
+{
+  unsigned long l;
+  unsigned int i;
+
+  i = ((unsigned int)c) << 8;
+  i |= ((unsigned int)c) << 20;
+  asm volatile ("":::);
+  i = i & 0x07f007f0;
+  asm volatile ("":::);
+  l = (unsigned long)i;
+
+  return l;
+}
+
+/* Check that an AND expression was used.  */
+/* { dg-final { scan-assembler-times "\\(and:" 2 } } */
-- 
2.3.0

Reply via email to