On Thu, May 19, 2016 at 11:14:37AM -0600, Jeff Law wrote: > On 05/19/2016 05:18 AM, Dominik Vogt wrote: > >On Mon, May 16, 2016 at 01:09:36PM -0600, Jeff Law wrote: > >>> On 05/11/2016 02:52 AM, Dominik Vogt wrote: > >>>> >On Wed, May 11, 2016 at 10:40:11AM +0200, Bernd Schmidt wrote: > >>>> >That's what I mentioned somewhere during the discussion. The s390 > >>>> >backend just uses COSTS_N_INSNS(1) for AND as well as ZERO_EXTEND, > >>>> >so this won't ever trigger. I just left the rtx_cost call in the > >>>> >patch for further discussion as Jeff said he liked the approach. > >>>> >We don't need it to achieve the behaviour we want for s390. > >>> I liked it, just based on the general theory that we should be > >>> comparing costs of a transform to the original much more often than > >>> we currently do. > >>> > >>> If Bernd prefers it gone and you don't need it to achieve your > >>> goals, then I won't object to the costing stuff going away. > >All right, third version attached, without the rtx_vost call; > >bootstrapped and regression tested on s390, s390x, x86_64. > > > >On Wed, Apr 27, 2016 at 09:20:05AM +0100, Dominik Vogt wrote: > >>> The attached patch is a result of discussing an S/390 issue with > >>> "and with complement" in some cases. > >>> > >>> https://gcc.gnu.org/ml/gcc/2016-03/msg00163.html > >>> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01586.html > >>> > >>> Combine would merge a ZERO_EXTEND and a SET taking the known zero > >>> bits into account, resulting in an AND. Later on, > >>> make_compound_operation() fails to replace that with a ZERO_EXTEND > >>> which we get for free on S/390 but leaves the AND, eventually > >>> resulting in two consecutive AND instructions. > >>> > >>> The current code in make_compound_operation() that detects > >>> opportunities for ZERO_EXTEND does not work here because it does > >>> not take the known zero bits into account: > >>> > >>> /* If the constant is one less than a power of two, this might be > >>> representable by an extraction even if no shift is present. > >>> If it doesn't end up being a ZERO_EXTEND, we will ignore it unless > >>> we are in a COMPARE. */ > >>> else if ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0) > >>> new_rtx = make_extraction (mode, > >>> make_compound_operation (XEXP (x, 0), > >>> next_code), > >>> 0, NULL_RTX, i, 1, 0, in_code == COMPARE); > >>> > >>> An attempt to use the zero bits in the above conditions resulted > >>> in many situations that generated worse code, so the patch tries > >>> to fix this in a more conservative way. While the effect is > >>> completely positive on S/390, this will very likely have > >>> unforeseeable consequences on other targets. > >>> > >>> Bootstrapped and regression tested on s390 and s390x only at the > >>> moment. > >Ciao > > > >Dominik ^_^ ^_^ > > > >-- Dominik Vogt IBM Germany > > > > > >0001-v3-ChangeLog > > > > > >gcc/ChangeLog > > > > * combine.c (make_compound_operation): Take known zero bits into > > account when checking for possible zero_extend. > >gcc/testsuite/ChangeLog > > > > * gcc.dg/zero_bits_compound-1.c: New test. > > * gcc.dg/zero_bits_compound-2.c: New test. > I'm a little worried about the tests. They check for lp64, but the > tests actually need a stronger set of conditions to pass. > > I'm thinking that the tests ought to be opt-in as I don't think we > have a set of effective-target tests we can use. > > So OK with the tests restricted to the targets where you've verified > they work.
Attached. (Are opt-in tests usually preferred over opt-out tests or is there no fixed ruled to decide that?) Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany
gcc/ChangeLog * combine.c (make_compound_operation): Take known zero bits into account when checking for possible zero_extend. gcc/testsuite/ChangeLog * gcc.dg/zero_bits_compound-1.c: New test. * gcc.dg/zero_bits_compound-2.c: New test.
>From 0fc3622dcee3430ac78452cfeafa3cba27cf68fa Mon Sep 17 00:00:00 2001 From: Dominik Vogt <v...@linux.vnet.ibm.com> Date: Tue, 12 Apr 2016 09:53:46 +0100 Subject: [PATCH] Take known zero bits into account when checking extraction. Allows AND Insns with a const_int operand to be expressed as ZERO_EXTEND if the operand ist a power of 2 - 1 even with the known zero bits masked out. --- gcc/combine.c | 28 +++++++++++++++++++ gcc/testsuite/gcc.dg/zero_bits_compound-1.c | 42 +++++++++++++++++++++++++++++ gcc/testsuite/gcc.dg/zero_bits_compound-2.c | 39 +++++++++++++++++++++++++++ 3 files changed, 109 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/zero_bits_compound-1.c create mode 100644 gcc/testsuite/gcc.dg/zero_bits_compound-2.c diff --git a/gcc/combine.c b/gcc/combine.c index 3554f51..97d59d7 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7988,6 +7988,34 @@ make_compound_operation (rtx x, enum rtx_code in_code) next_code), i, NULL_RTX, 1, 1, 0, 1); + /* If the one operand is a paradoxical subreg of a register or memory and + the constant (limited to the smaller mode) has only zero bits where + the sub expression has known zero bits, this can be expressed as + a zero_extend. */ + else if (GET_CODE (XEXP (x, 0)) == SUBREG) + { + rtx sub; + + sub = XEXP (XEXP (x, 0), 0); + machine_mode sub_mode = GET_MODE (sub); + if ((REG_P (sub) || MEM_P (sub)) + && GET_MODE_PRECISION (sub_mode) < mode_width) + { + unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode); + unsigned HOST_WIDE_INT mask; + + /* original AND constant with all the known zero bits set */ + mask = UINTVAL (XEXP (x, 1)) | (~nonzero_bits (sub, sub_mode)); + if ((mask & mode_mask) == mode_mask) + { + new_rtx = make_compound_operation (sub, next_code); + new_rtx = make_extraction (mode, new_rtx, 0, 0, + GET_MODE_PRECISION (sub_mode), + 1, 0, in_code == COMPARE); + } + } + } + break; case LSHIFTRT: diff --git a/gcc/testsuite/gcc.dg/zero_bits_compound-1.c b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c new file mode 100644 index 0000000..d78dc43 --- /dev/null +++ b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c @@ -0,0 +1,42 @@ +/* Test whether an AND mask or'ed with the know zero bits that equals a mode + mask is a candidate for zero extendion. */ + +/* Note: This test requires that char, int and long have different sizes and the + target has a way to do 32 -> 64 bit zero extension other than AND. */ + +/* { dg-do compile { target x86_64-*-* s390*-*-* } } */ +/* { dg-require-effective-target lp64 } */ +/* { dg-options "-O3 -dP" } */ + +unsigned long foo (unsigned char c) +{ + unsigned long l; + unsigned int i; + + i = ((unsigned int)c) << 8; + i |= ((unsigned int)c) << 20; + asm volatile ("":::); + i = i & 0x0ff0ff00; + asm volatile ("":::); + l = (unsigned long)i; + + return l; +} + +unsigned long bar (unsigned char c) +{ + unsigned long l; + unsigned int i; + + i = ((unsigned int)c) << 8; + i |= ((unsigned int)c) << 20; + asm volatile ("":::); + i = i & 0x0ffffff0; + asm volatile ("":::); + l = (unsigned long)i; + + return l; +} + +/* Check that no pattern containing an AND expression was used. */ +/* { dg-final { scan-assembler-not "\\(and:" } } */ diff --git a/gcc/testsuite/gcc.dg/zero_bits_compound-2.c b/gcc/testsuite/gcc.dg/zero_bits_compound-2.c new file mode 100644 index 0000000..80fd363 --- /dev/null +++ b/gcc/testsuite/gcc.dg/zero_bits_compound-2.c @@ -0,0 +1,39 @@ +/* Test whether an AND mask or'ed with the know zero bits that equals a mode + mask is a candidate for zero extendion. */ + +/* { dg-do compile { target x86_64-*-* s390*-*-* } } */ +/* { dg-require-effective-target lp64 } */ +/* { dg-options "-O3 -dP" } */ + +unsigned long foo (unsigned char c) +{ + unsigned long l; + unsigned int i; + + i = ((unsigned int)c) << 8; + i |= ((unsigned int)c) << 20; + asm volatile ("":::); + i = i & 0x0fe0fe00; + asm volatile ("":::); + l = (unsigned long)i; + + return l; +} + +unsigned long bar (unsigned char c) +{ + unsigned long l; + unsigned int i; + + i = ((unsigned int)c) << 8; + i |= ((unsigned int)c) << 20; + asm volatile ("":::); + i = i & 0x07f007f0; + asm volatile ("":::); + l = (unsigned long)i; + + return l; +} + +/* Check that an AND expression was used. */ +/* { dg-final { scan-assembler-times "\\(and:" 2 } } */ -- 2.3.0