Re: [PATCH v2, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-01-20 Thread HAO CHEN GUI via Gcc-patches
Thanks so much for your advice. Please see my comments.

On 21/1/2022 上午 5:42, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jan 20, 2022 at 01:46:48PM -0500, David Edelsohn wrote:
>> On Thu, Jan 20, 2022 at 2:36 AM HAO CHEN GUI  wrote:
>>>This patch adds a combine pattern for "CA minus one". As CA only has two
>>> values (0 or 1), we could convert following pattern
>>>   (sign_extend:DI (plus:SI (reg:SI 98 ca)
>>> (const_int -1 [0x]
>>> to
>>>(plus:DI (reg:DI 98 ca)
>>> (const_int -1 [0x])))
>>>With this patch, one unnecessary sign extend is eliminated.
>>>
>>>Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
>>> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> There are ten gazillion similar things we could make extra backend
> patterns for, and we still would not cover a majority of cases.
> 
> If instead we got some generic way to handle this we could cover many
> more cases, for much less effort.
Could we add an additional pass to exam the finally generated instructions
and its used registers to decide which extension is unnecessary?
> 
> We need both widening modes from SI to DI, amd narrowing modes from DI
> to SI.  Both are useful in certain cases; it is not like using wider
> modes is always better, in some cases narrower modes is better (in cases
> where we can let the generated code then generate whatever bits in the
> high half of the word, for example; a typical example is addition in an
> unsigned int).
Just for this case, converting CA from DI to SI is supported in simplify_rtx.
The original comparison result is in DI mode. But it's truncated to SI mode as
C standard requires.

Trying 8 -> 11:
8: {r127:DI=ca:DI-0x1;clobber ca:DI;}
  REG_DEAD ca:DI
  REG_UNUSED ca:DI
   11: r128:SI=r127:DI#0
  REG_DEAD r127:DI
Successfully matched this instruction:
(set (reg:SI 128)
(plus:SI (reg:SI 98 ca)
(const_int -1 [0x])))
allowing combination of insns 8 and 11
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i311: {r128:SI=ca:SI-0x1;clobber ca:SI;}
  REG_UNUSED ca:SI
deferring rescan insn with uid = 11.

The C standard type promotion requirement and 64-bit return value are the
root cause of such problem, I think.
> 
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c
>>> @@ -0,0 +1,10 @@
>>> +/* PR target/95737 */
>>> +/* { dg-do compile { target lp64 } } */
>>> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
>>
>> Why does the testcase force power8? This testcase is not specific to
>> Power8 or later.
> 
> Yes, and we should generate the same code on older machines.
> 
>>> +/* { dg-final { scan-assembler-not {\mextsw\M} } } */
>>> +
>>> +
>>> +unsigned long long negativeLessThan (unsigned long long a, unsigned long 
>>> long b)
>>> +{
>>> +   return -(a < b);
>>> +}
>>
>> If you're only testing for lp64, the testcase could use "long" instead
>> of "long long".
> 
> The testcase really needs "powerpc64", if that would mean "test if
> -mpowerpc64 is (implicitly) used".  But that is not what it currently
> means (it is something akin to "powerpc64_hw", instead).
> 
> So we test lp64, which is set if and only if -m64 was used.  It is
> reasonable coverage, no one cares much for -m32 -mpowerpc64 .
> 
> 
> Segher


Re: [PATCH v2, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-01-20 Thread Segher Boessenkool
Hi!

On Thu, Jan 20, 2022 at 01:46:48PM -0500, David Edelsohn wrote:
> On Thu, Jan 20, 2022 at 2:36 AM HAO CHEN GUI  wrote:
> >This patch adds a combine pattern for "CA minus one". As CA only has two
> > values (0 or 1), we could convert following pattern
> >   (sign_extend:DI (plus:SI (reg:SI 98 ca)
> > (const_int -1 [0x]
> > to
> >(plus:DI (reg:DI 98 ca)
> > (const_int -1 [0x])))
> >With this patch, one unnecessary sign extend is eliminated.
> >
> >Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> > Is this okay for trunk? Any recommendations? Thanks a lot.

There are ten gazillion similar things we could make extra backend
patterns for, and we still would not cover a majority of cases.

If instead we got some generic way to handle this we could cover many
more cases, for much less effort.

We need both widening modes from SI to DI, amd narrowing modes from DI
to SI.  Both are useful in certain cases; it is not like using wider
modes is always better, in some cases narrower modes is better (in cases
where we can let the generated code then generate whatever bits in the
high half of the word, for example; a typical example is addition in an
unsigned int).

> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c
> > @@ -0,0 +1,10 @@
> > +/* PR target/95737 */
> > +/* { dg-do compile { target lp64 } } */
> > +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
> 
> Why does the testcase force power8? This testcase is not specific to
> Power8 or later.

Yes, and we should generate the same code on older machines.

> > +/* { dg-final { scan-assembler-not {\mextsw\M} } } */
> > +
> > +
> > +unsigned long long negativeLessThan (unsigned long long a, unsigned long 
> > long b)
> > +{
> > +   return -(a < b);
> > +}
> 
> If you're only testing for lp64, the testcase could use "long" instead
> of "long long".

The testcase really needs "powerpc64", if that would mean "test if
-mpowerpc64 is (implicitly) used".  But that is not what it currently
means (it is something akin to "powerpc64_hw", instead).

So we test lp64, which is set if and only if -m64 was used.  It is
reasonable coverage, no one cares much for -m32 -mpowerpc64 .


Segher


Re: [PATCH v2, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-01-20 Thread David Edelsohn via Gcc-patches
On Thu, Jan 20, 2022 at 2:36 AM HAO CHEN GUI  wrote:
>
> Hi,
>This patch adds a combine pattern for "CA minus one". As CA only has two
> values (0 or 1), we could convert following pattern
>   (sign_extend:DI (plus:SI (reg:SI 98 ca)
> (const_int -1 [0x]
> to
>(plus:DI (reg:DI 98 ca)
> (const_int -1 [0x])))
>With this patch, one unnecessary sign extend is eliminated.
>
>Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
>
> ChangeLog
> 2022-01-20 Haochen Gui 
>
> gcc/
> * config/rs6000/rs6000.md (extenddi_ca_minus_one): Define.
>
> gcc/testsuite/
> * gcc.target/powerpc/pr95737.c: New.
>
>
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 6ecb0bd6142..1d8b212962f 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -2358,6 +2358,19 @@ (define_insn "subf3_carry_in_xx"
>"subfe %0,%0,%0"
>[(set_attr "type" "add")])
>
> +(define_insn_and_split "*extenddi_ca_minus_one"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +   (sign_extend:DI (plus:SI (reg:SI CA_REGNO)
> +(const_int -1]
> +  ""
> +  "#"
> +  ""
> +  [(parallel [(set (match_dup 0)
> +  (plus:DI (reg:DI CA_REGNO)
> +   (const_int -1)))
> + (clobber (reg:DI CA_REGNO))])]
> +  ""
> +)
>
>  (define_insn "@neg2"
>[(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr95737.c 
> b/gcc/testsuite/gcc.target/powerpc/pr95737.c
> new file mode 100644
> index 000..94320f23423
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c
> @@ -0,0 +1,10 @@
> +/* PR target/95737 */
> +/* { dg-do compile { target lp64 } } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */

Why does the testcase force power8? This testcase is not specific to
Power8 or later.

> +/* { dg-final { scan-assembler-not {\mextsw\M} } } */
> +
> +
> +unsigned long long negativeLessThan (unsigned long long a, unsigned long 
> long b)
> +{
> +   return -(a < b);
> +}

If you're only testing for lp64, the testcase could use "long" instead
of "long long".

This is okay with those changes.

Thanks, David


[PATCH v2, rs6000] Add a combine pattern for CA minus one [PR95737]

2022-01-19 Thread HAO CHEN GUI via Gcc-patches
Hi,
   This patch adds a combine pattern for "CA minus one". As CA only has two
values (0 or 1), we could convert following pattern
  (sign_extend:DI (plus:SI (reg:SI 98 ca)
(const_int -1 [0x]
to
   (plus:DI (reg:DI 98 ca)
(const_int -1 [0x])))
   With this patch, one unnecessary sign extend is eliminated.

   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-01-20 Haochen Gui 

gcc/
* config/rs6000/rs6000.md (extenddi_ca_minus_one): Define.

gcc/testsuite/
* gcc.target/powerpc/pr95737.c: New.


patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6ecb0bd6142..1d8b212962f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -2358,6 +2358,19 @@ (define_insn "subf3_carry_in_xx"
   "subfe %0,%0,%0"
   [(set_attr "type" "add")])

+(define_insn_and_split "*extenddi_ca_minus_one"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+   (sign_extend:DI (plus:SI (reg:SI CA_REGNO)
+(const_int -1]
+  ""
+  "#"
+  ""
+  [(parallel [(set (match_dup 0)
+  (plus:DI (reg:DI CA_REGNO)
+   (const_int -1)))
+ (clobber (reg:DI CA_REGNO))])]
+  ""
+)

 (define_insn "@neg2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr95737.c 
b/gcc/testsuite/gcc.target/powerpc/pr95737.c
new file mode 100644
index 000..94320f23423
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c
@@ -0,0 +1,10 @@
+/* PR target/95737 */
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+/* { dg-final { scan-assembler-not {\mextsw\M} } } */
+
+
+unsigned long long negativeLessThan (unsigned long long a, unsigned long long 
b)
+{
+   return -(a < b);
+}