Re: [PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

Segher Boessenkool Tue, 30 Nov 2021 10:12:30 -0800

Hi!

On Tue, Nov 30, 2021 at 04:46:34PM +0800, HAO CHEN GUI wrote:
>     This patch modifies the combine pattern with a helper - 
> change_pseudo_and_mask when recog fails. The helper converts a single pseudo 
> to the pseudo and with a mask if the outer operator is IOR/XOR/PLUS and the 
> inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match shift + ior 
> pattern.
> 
>     Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. 
> Is this okay for trunk? Any recommendations? Thanks a lot.


(Please use shorter lines in email; 70 characters is usual.)

> gcc/
>         * combine.c (change_pseudo_and_mask): New.
>         (recog_for_combine): If recog fails, try again with the pattern
>         modified by change_pseudo_and_mask.
> 
> gcc/testsuite/
>         * gcc.target/powerpc/20050603-3.c: Modify the dump check conditions.
>         * gcc.target/powerpc/rlwimi-2.c: Likewise.

> +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is
> +   ASHIFT/LSHIFTRT/AND, convert a pseudo to pseudo AND with a mask if its
> +   nonzero_bits is less than its mode mask.  */

Please add some words on *why* we do this: namely, because you cannot
use nonzero_bits during combine and again after combine and expect to
get the same answer both times.
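
To illustrate the idea, here is a toy sketch in plain C.  It is not GCC
code: struct toy_pseudo and all the toy_* functions are made up for this
example, and the 8-bit shift is arbitrary.  If every bit outside a mask
is known to be zero, ANDing with that mask does not change the value,
but it does write that knowledge into the expression itself, so nothing
has to ask nonzero_bits again later.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for what is known about a pseudo: every bit outside
   NONZERO is known to be zero.  Hypothetical type, not a GCC structure.  */
struct toy_pseudo
{
  uint32_t value;
  uint32_t nonzero;
};

/* The rewrite under discussion, in miniature: replace the bare pseudo
   by (and pseudo nonzero).  The assert checks the precondition that the
   mask really covers every bit that is set.  */
static uint32_t
toy_with_explicit_mask (struct toy_pseudo p)
{
  assert ((p.value & ~p.nonzero) == 0);
  return p.value & p.nonzero;
}

/* Shape of the original source: (ior (ashift a 8) b).  */
static uint32_t
toy_insert_bare (uint32_t a, struct toy_pseudo b)
{
  return (a << 8) | b.value;
}

/* Same computation with the mask spelled out:
   (ior (ashift a 8) (and b nonzero)).  */
static uint32_t
toy_insert_masked (uint32_t a, struct toy_pseudo b)
{
  return (a << 8) | toy_with_explicit_mask (b);
}
```

Both toy_insert_* functions return the same value whenever the
precondition holds; only the masked form carries the mask with it.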

> +static bool
> +change_pseudo_and_mask (rtx pat)
> +{
> +  bool changed = false;
> +
> +  rtx src = SET_SRC (pat);
> +  if ((GET_CODE (src) == IOR
> +       || GET_CODE (src) == XOR
> +       || GET_CODE (src) == PLUS)
> +      && (((GET_CODE (XEXP (src, 0)) == ASHIFT
> +           || GET_CODE (XEXP (src, 0)) == LSHIFTRT
> +           || GET_CODE (XEXP (src, 0)) == AND)
> +          && REG_P (XEXP (src, 1)))
> +         || ((GET_CODE (XEXP (src, 1)) == ASHIFT
> +              || GET_CODE (XEXP (src, 1)) == LSHIFTRT
> +              || GET_CODE (XEXP (src, 1)) == AND)
> +             && REG_P (XEXP (src, 0)))))

If one arm is a pseudo and the other is compound, the compound one
always comes first, so you only need the first half of this test.  This
is one of those canonicalisations that simplifies a lot of code --
including this new code :-)
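
As a rough sketch of how much shorter the test gets, here is a toy
model in plain C.  It uses a made-up struct toy_expr instead of real
rtx, and every toy_* name is hypothetical.  It assumes binary nodes
always have both operands and that the operands are already in
canonical order (compound first, register second).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; these only mimic the rtx codes named in the patch.  */
enum toy_code
{
  TOY_REG, TOY_ASHIFT, TOY_LSHIFTRT, TOY_AND, TOY_IOR, TOY_XOR, TOY_PLUS
};

/* Toy binary expression node.  Leaves (TOY_REG) have null operands.  */
struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op0;
  const struct toy_expr *op1;
};

static bool
toy_is_outer (enum toy_code c)
{
  return c == TOY_IOR || c == TOY_XOR || c == TOY_PLUS;
}

static bool
toy_is_inner (enum toy_code c)
{
  return c == TOY_ASHIFT || c == TOY_LSHIFTRT || c == TOY_AND;
}

/* With canonical operand order the register can only be operand 1, so
   one half of the original condition is enough.  */
static bool
toy_wants_mask (const struct toy_expr *src)
{
  assert (src != NULL);
  return toy_is_outer (src->code)
	 && toy_is_inner (src->op0->code)
	 && src->op1->code == TOY_REG;
}
```

The register operand is then always operand 1, so there is no need to
pick between the two arms afterwards either.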

> +    {
> +      rtx *reg = REG_P (XEXP (src, 0))
> +                ? &XEXP (SET_SRC (pat), 0)
> +                : &XEXP (SET_SRC (pat), 1);

This is indented wrongly.  But, in fact, it looks like all tabs were
changed to spaces in your patch?

> @@ -11586,7 +11622,14 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
>             }
>         }
>        else
> -       changed = change_zero_ext (pat);
> +       {
> +         if (change_pseudo_and_mask (pat))
> +           {
> +             maybe_swap_commutative_operands (SET_SRC (pat));
> +             changed = true;
> +           }
> +         changed |= change_zero_ext (pat);
> +       }
>      }
>    else if (GET_CODE (pat) == PARALLEL)
>      {


I would write this as something like:

  changed = change_zero_ext (pat);
  if (!changed)
    changed = change_pseudo_and_mask (pat);

  if (changed)
    maybe_swap_commutative_operands (SET_SRC (pat));


> --- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> @@ -12,7 +12,7 @@ void rotins (unsigned int x)
>    b.y = (x<<12) | (x>>20);
>  }
> 
> -/* { dg-final { scan-assembler-not {\mrlwinm} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */
>  /* { dg-final { scan-assembler-not {\mrldic} } } */
>  /* { dg-final { scan-assembler-not {\mrot[lr]} } } */
>  /* { dg-final { scan-assembler-not {\ms[lr][wd]} } } */

Please show the -m32 code before and after the change?  Why is it okay
to get an rlwinm there?

> diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> index bafa371db73..ffb5f9e450f 100644
> --- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> @@ -2,14 +2,14 @@
>  /* { dg-options "-O2" } */
> 
>  /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 14121 { target ilp32 } } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 20217 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 21279 { target lp64 } } } */

No, it is not okay to generate worse code.  In what cases do you see
more insns now, and why?

>  /* { dg-final { scan-assembler-times {(?n)^\s+blr} 6750 } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 643 { target ilp32 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 11 { target lp64 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 7790 { target lp64 } } } */
> 
>  /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target lp64 } } } */
> 
>  /* { dg-final { scan-assembler-times {(?n)^\s+mulli} 5036 } } */

Are the new rlwimi's good to have, or can we do those with simpler or
fewer insns?


Segher
