Re: PING^3: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant

2023-09-03 Thread Stefan Schulze Frielinghaus via Gcc-patches
Ping.

On Thu, Aug 24, 2023 at 11:31:32AM +0800, Xi Ruoyao wrote:
> Ping again.
> 
> On Fri, 2023-08-18 at 13:04 +0200, Stefan Schulze Frielinghaus via 
> Gcc-patches wrote:
> > > Ping.  Since this fixes bootstrap problem PR110939 for LoongArch, I'm
> > > pinging this one earlier.
> > 
> > On Thu, Aug 10, 2023 at 03:04:03PM +0200, Stefan Schulze Frielinghaus wrote:
> > > > In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I
> > > > completely missed the fact that the normal form of a generated constant
> > > > for a mode with fewer bits than in HOST_WIDE_INT is a sign extended
> > > > version of the actual constant.  This even holds true for unsigned
> > > > constants.
> > > 
> > > Fixed by masking out the upper bits for the incoming constant and sign
> > > extending the resulting unsigned constant.
> > > 
> > > Bootstrapped and regtested on x64 and s390x.  Ok for mainline?
> > > 
> > > While reading existing optimizations in combine I stumbled across two
> > > optimizations where either my intuition about the representation of
> > > unsigned integers via a const_int rtx is wrong, which then in turn would
> > > probably also mean that this patch is wrong, or that the optimizations
> > > are missed sometimes.  In other words in the following I would assume
> > > that the upper bits are masked out:
> > > 
> > > diff --git a/gcc/combine.cc b/gcc/combine.cc
> > > index 468b7fde911..80c4ff0fbaf 100644
> > > --- a/gcc/combine.cc
> > > +++ b/gcc/combine.cc
> > > @@ -11923,7 +11923,7 @@ simplify_compare_const (enum rtx_code code, 
> > > machine_mode mode,
> > >    /* (unsigned) < 0x8000 is equivalent to >= 0.  */
> > >    else if (is_a  (mode, _mode)
> > >    && GET_MODE_PRECISION (int_mode) - 1 < 
> > > HOST_BITS_PER_WIDE_INT
> > > -  && ((unsigned HOST_WIDE_INT) const_op
> > > +  && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK 
> > > (int_mode))
> > >    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) 
> > > - 1)))
> > >     {
> > >   const_op = 0;
> > > @@ -11962,7 +11962,7 @@ simplify_compare_const (enum rtx_code code, 
> > > machine_mode mode,
> > >    /* (unsigned) >= 0x8000 is equivalent to < 0.  */
> > >    else if (is_a  (mode, _mode)
> > >    && GET_MODE_PRECISION (int_mode) - 1 < 
> > > HOST_BITS_PER_WIDE_INT
> > > -  && ((unsigned HOST_WIDE_INT) const_op
> > > +  && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK 
> > > (int_mode))
> > >    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) 
> > > - 1)))
> > >     {
> > >   const_op = 0;
> > > 
> > > For example, while bootstrapping on x64 the optimization is missed since
> > > a LTU comparison in QImode is done and the constant equals
> > > 0xff80.
> > > 
> > > Sorry for inlining another patch, but I would really like to make sure
> > > that my understanding is correct, now, before I come up with another
> > > patch.  Thus it would be great if someone could shed some light on this.
> > > 
> > > gcc/ChangeLog:
> > > 
> > > * combine.cc (simplify_compare_const): Properly handle unsigned
> > > constants while narrowing comparison of memory and constants.
> > > ---
> > >  gcc/combine.cc | 19 ++-
> > >  1 file changed, 10 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/gcc/combine.cc b/gcc/combine.cc
> > > index e46d202d0a7..468b7fde911 100644
> > > --- a/gcc/combine.cc
> > > +++ b/gcc/combine.cc
> > > @@ -12003,14 +12003,15 @@ simplify_compare_const (enum rtx_code code, 
> > > machine_mode mode,
> > >    && !MEM_VOLATILE_P (op0)
> > >    /* The optimization makes only sense for constants which are big 
> > > enough
> > >  so that we have a chance to chop off something at all.  */
> > > -  && (unsigned HOST_WIDE_INT) const_op > 0xff
> > > -  /* Bail out, if the constant does not fit into INT_MODE.  */
> > > -  && (unsigned HOST_WIDE_INT) const_op
> > > -    < ((HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1) << 
> > > 1) - 1)
> > > +  && ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode)) 
> > > > 0xff
> > >    /* Ensure that we do not overflow during normalization.  */
> > > -  && (code != GTU || (unsigned HOST_WIDE_INT) const_op < 
> > > HOST_WIDE_INT_M1U))
> > > +  && (code != GTU
> > > + || ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK 
> > > (int_mode))
> > > +    < HOST_WIDE_INT_M1U)
> > > +  && trunc_int_for_mode (const_op, int_mode) == const_op)
> > >  {
> > > -  unsigned HOST_WIDE_INT n = (unsigned HOST_WIDE_INT) const_op;
> > > +  unsigned HOST_WIDE_INT n
> > > +   = (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode);
> > >    enum rtx_code adjusted_code;
> > >  
> > >    /* Normalize code to either LEU or GEU.  */
> > > @@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code, 

[PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-03 Thread HAO CHEN GUI via Gcc-patches
Hi,
  This patch enables SImode in FP registers on P7. Instruction "fctiw"
stores its integer output in an FP register, so SImode in FP registers
needs to be enabled on P7 if we want to support "fctiw" there.

  The test case is in the second patch which implements 32bit inline
lrint.

  Compared to the last version, the main change is to remove the
disparagement on the "fmr" alternatives. Testing shows that this doesn't
cause regressions.
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628435.html

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.


ChangeLog
rs6000: enable SImode in FP register on P7

gcc/
PR target/88558
* config/rs6000/rs6000.cc (rs6000_hard_regno_mode_ok_uncached):
Enable SImode in FP registers on P7.
* config/rs6000/rs6000.md (*movsi_internal1): Add fmr for SImode
move between FP registers.  Set attribute isa of stfiwx to "*"
and attribute of stxsiwx to "p7".

patch.diff
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 44b448d2ba6..99085c2cdd7 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1903,7 +1903,7 @@ rs6000_hard_regno_mode_ok_uncached (int regno, 
machine_mode mode)
  if(GET_MODE_SIZE (mode) == UNITS_PER_FP_WORD)
return 1;

- if (TARGET_P8_VECTOR && (mode == SImode))
+ if (TARGET_POPCNTD && mode == SImode)
return 1;

  if (TARGET_P9_VECTOR && (mode == QImode || mode == HImode))
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index cdab49fbb91..edf49bd74e3 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7566,7 +7566,7 @@ (define_split

 (define_insn "*movsi_internal1"
   [(set (match_operand:SI 0 "nonimmediate_operand"
- "=r, r,
+ "=r, r,  d,
   r,  d,  v,
   m,  ?Z, ?Z,
   r,  r,  r,  r,
@@ -7575,7 +7575,7 @@ (define_insn "*movsi_internal1"
   wa, r,
   r,  *h, *h")
(match_operand:SI 1 "input_operand"
- "r,  U,
+ "r,  U,  d,
   m,  ?Z, ?Z,
   r,  d,  v,
   I,  L,  eI, n,
@@ -7588,6 +7588,7 @@ (define_insn "*movsi_internal1"
   "@
mr %0,%1
la %0,%a1
+   fmr %0,%1
lwz%U1%X1 %0,%1
lfiwzx %0,%y1
lxsiwzx %x0,%y1
@@ -7611,7 +7612,7 @@ (define_insn "*movsi_internal1"
mt%0 %1
nop"
   [(set_attr "type"
- "*,  *,
+ "*,  *,  fpsimple,
   load,   fpload, fpload,
   store,  fpstore,fpstore,
   *,  *,  *,  *,
@@ -7620,7 +7621,7 @@ (define_insn "*movsi_internal1"
   mtvsr,  mfvsr,
   *,  *,  *")
(set_attr "length"
- "*,  *,
+ "*,  *,  *,
   *,  *,  *,
   *,  *,  *,
   *,  *,  *,  8,
@@ -7629,9 +7630,9 @@ (define_insn "*movsi_internal1"
   *,  *,
   *,  *,  *")
(set_attr "isa"
- "*,  *,
-  *,  p8v,p8v,
-  *,  p8v,p8v,
+ "*,  *,  *,
+  *,  p7, p8v,
+  *,  *,  p8v,
   *,  *,  p10,*,
   p8v,p9v,p9v,p8v,
   p9v,p8v,p9v,



[PATCH-2v2, rs6000] Implement 32bit inline lrint [PR88558]

2023-09-03 Thread HAO CHEN GUI via Gcc-patches
Hi,
  This patch implements 32bit inline lrint by "fctiw". It depends on
patch 1 to do SImode moves from FP registers on P7.

  Compared to the last version, the main change is to add tests for "lrintf"
and adjust the count of corresponding instructions.
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628436.html

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.

Thanks
Gui Haochen

ChangeLog
rs6000: support 32bit inline lrint

gcc/
PR target/88558
* config/rs6000/rs6000.md (lrintdi2): Remove TARGET_FPRND
from insn condition.
(lrintsi2): New insn pattern for 32bit lrint.

gcc/testsuite/
PR target/106769
* gcc.target/powerpc/pr88558.h: New.
* gcc.target/powerpc/pr88558-p7.c: New.
* gcc.target/powerpc/pr88558-p8.c: New.

patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index edf49bd74e3..a41898e0e08 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -6655,10 +6655,18 @@ (define_insn "lrintdi2"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d")
(unspec:DI [(match_operand:SFDF 1 "gpc_reg_operand" "")]
   UNSPEC_FCTID))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT"
   "fctid %0,%1"
   [(set_attr "type" "fp")])

+(define_insn "lrintsi2"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=d")
+   (unspec:SI [(match_operand:SFDF 1 "gpc_reg_operand" "")]
+  UNSPEC_FCTIW))]
+  "TARGET_HARD_FLOAT && TARGET_POPCNTD"
+  "fctiw %0,%1"
+  [(set_attr "type" "fp")])
+
 (define_insn "btrunc2"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88558-p7.c 
b/gcc/testsuite/gcc.target/powerpc/pr88558-p7.c
new file mode 100644
index 000..f302491c4d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr88558-p7.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-math-errno -mdejagnu-cpu=power7" } */
+
+/* -fno-math-errno is required to make {i,l,ll}rint inlined */
+
+#include "pr88558.h"
+
+/* { dg-final { scan-assembler-times {\mfctid\M} 3 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {\mfctid\M} 1 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times {\mfctiw\M} 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {\mfctiw\M} 3 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M} 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M} 3 { target ilp32 } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88558-p8.c 
b/gcc/testsuite/gcc.target/powerpc/pr88558-p8.c
new file mode 100644
index 000..33398aa74c2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr88558-p8.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O2 -fno-math-errno -mdejagnu-cpu=power8" } */
+
+/* -fno-math-errno is required to make {i,l,ll}rint inlined */
+
+#include "pr88558.h"
+
+/* { dg-final { scan-assembler-times {\mfctid\M} 3 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {\mfctid\M} 1 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times {\mfctiw\M} 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {\mfctiw\M} 3 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times {\mmfvsrwz\M} 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {\mmfvsrwz\M} 3 { target ilp32 } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88558.h 
b/gcc/testsuite/gcc.target/powerpc/pr88558.h
new file mode 100644
index 000..698640c0ef7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr88558.h
@@ -0,0 +1,19 @@
+long int test1 (double a)
+{
+  return __builtin_lrint (a);
+}
+
+long long test2 (double a)
+{
+  return __builtin_llrint (a);
+}
+
+int test3 (double a)
+{
+  return __builtin_irint (a);
+}
+
+long int test4 (float a)
+{
+  return __builtin_lrintf (a);
+}


[PATCH] RISC-V: Add conditional sqrt autovec pattern

2023-09-03 Thread Lehua Ding
This patch adds a combined pattern for combining vfsqrt.v and vcond_mask.

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*cond_):
Add sqrt + vcond_mask combine pattern.
* config/riscv/autovec.md (2):
Change define_expand to define_insn_and_split.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-2.c: New test.

---
 gcc/config/riscv/autovec-opt.md   | 20 +
 gcc/config/riscv/autovec.md   |  7 +++--
 .../riscv/rvv/autovec/cond/cond_sqrt-1.c  | 24 +++
 .../riscv/rvv/autovec/cond/cond_sqrt-2.c  | 24 +++
 .../riscv/rvv/autovec/cond/cond_sqrt_run-1.c  | 29 +++
 .../riscv/rvv/autovec/cond/cond_sqrt_run-2.c  | 29 +++
 6 files changed, 131 insertions(+), 2 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-2.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 1ca5ce97193..d9863c76654 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -730,6 +730,26 @@
   DONE;
 })
 
+;; Combine vfsqrt.v and cond_mask
+(define_insn_and_split "*cond_"
+  [(set (match_operand:VF 0 "register_operand")
+ (if_then_else:VF
+   (match_operand: 1 "register_operand")
+   (any_float_unop:VF
+ (match_operand:VF 2 "register_operand"))
+   (match_operand:VF 3 "register_operand")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  insn_code icode = code_for_pred (, mode);
+  rtx ops[] = {operands[0], operands[1], operands[2], operands[3],
+   gen_int_mode (GET_MODE_NUNITS (mode), Pmode)};
+  riscv_vector::expand_cond_len_unop (icode, ops);
+  DONE;
+})
+
 ;; Combine vlmax neg and UNSPEC_VCOPYSIGN
 (define_insn_and_split "*copysign_neg"
   [(set (match_operand:VF 0 "register_operand")
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 0f9d1fe2c8e..c220fda312e 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -994,11 +994,14 @@
 ;; Includes:
 ;; - vfsqrt.v
 ;; 
---
-(define_expand "2"
+(define_insn_and_split "2"
   [(set (match_operand:VF 0 "register_operand")
 (any_float_unop:VF
  (match_operand:VF 1 "register_operand")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
 {
   insn_code icode = code_for_pred (, mode);
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_DYN, 
operands);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
new file mode 100644
index 000..21219b43d9d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math" } */
+
+#include 
+
+#define DEF_LOOP(TYPE, OP)                                                     \
+  void __attribute__ ((noipa))                                                 \
+  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,                  \
+                      TYPE *__restrict pred, int n)                            \
+  {                                                                            \
+    for (int i = 0; i < n; ++i)                                                \
+      r[i] = pred[i] ? OP (a[i]) : a[i];                                       \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (_Float16, __builtin_sqrtf16)                                              \
+  T (float, __builtin_sqrtf)                                                   \
+  T (double, __builtin_sqrt)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tvfsqrt\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
+
+/* { dg-final { scan-assembler {\tvsetvli\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
new file mode 100644
index 000..2fcdc339e70
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { 

Re: [PATCH V6] Optimize '(X - N * M) / N' to 'X / N - M' if valid

2023-09-03 Thread Jiufu Guo via Gcc-patches


Hi,

Richard Biener  writes:

> On Fri, 1 Sep 2023, Jiufu Guo wrote:
>
>> Hi,
>> 
>> Integer expression "(X - N * M) / N" can be optimized to "X / N - M" under
>> the following conditions:
>> 1. There is no wrap/overflow/underflow.
>>    Wrap/overflow/underflow breaks the arithmetic operation.
>> 2. "X - N * M" and "X" are not of opposite sign.
>>    Here, the operation "/" is "trunc_div": the fractional part is
>>    discarded towards zero. If "X - N * M" and "X" have different signs,
>>    then trunc_div discards the fractional parts (of /N) in different
>>    directions.
>> 
>> Compared with the previous version:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624801.html
>> this patch adds comments and tightens the pattern on "(t + C)".
>> 
>> Bootstrap & regtest pass on ppc64{,le} and x86_64.
>> Is this patch ok for trunk?
>> 
>> BR,
>> Jeff (Jiufu Guo)
>> 
>>  PR tree-optimization/108757
>> 
>> gcc/ChangeLog:
>> 
>>  * match.pd ((X - N * M) / N): New pattern.
>>  ((X + N * M) / N): New pattern.
>>  ((X + C) div_rshift N): New pattern.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.dg/pr108757-1.c: New test.
>>  * gcc.dg/pr108757-2.c: New test.
>>  * gcc.dg/pr108757.h: New test.
>> 
>> ---
>>  gcc/match.pd  |  78 ++
>>  gcc/testsuite/gcc.dg/pr108757-1.c |  18 +++
>>  gcc/testsuite/gcc.dg/pr108757-2.c |  19 +++
>>  gcc/testsuite/gcc.dg/pr108757.h   | 233 ++
>>  4 files changed, 348 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.dg/pr108757-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/pr108757-2.c
>>  create mode 100644 gcc/testsuite/gcc.dg/pr108757.h
>> 
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 
>> fa598d5ca2e470f9cc3b82469e77d743b12f107e..863bc7299cdefc622a7806a4d32e37268c50d453
>>  100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -959,6 +959,84 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>  #endif
>> 
>>  
>> +#if GIMPLE
>> +(for div (trunc_div exact_div)
>> + /* Simplify (X + M*N) / N -> X / N + M.  */
>> + (simplify
>> +  (div (plus:c@4 @0 (mult:c@3 @1 @2)) @2)
>> +  (with {value_range vr0, vr1, vr2, vr3, vr4;}
>> +  (if (INTEGRAL_TYPE_P (type)
>> +   && get_range_query (cfun)->range_of_expr (vr1, @1)
>> +   && get_range_query (cfun)->range_of_expr (vr2, @2)
>> +   /* "N*M" doesn't overflow.  */
>> +   && range_op_handler (MULT_EXPR).overflow_free_p (vr1, vr2)
>> +   && get_range_query (cfun)->range_of_expr (vr0, @0)
>> +   && get_range_query (cfun)->range_of_expr (vr3, @3)
>> +   /* "X+(N*M)" doesn't overflow.  */
>> +   && range_op_handler (PLUS_EXPR).overflow_free_p (vr0, vr3)
>> +   && get_range_query (cfun)->range_of_expr (vr4, @4)
>> +   /* "X+N*M" is not with opposite sign as "X".  */
>> +   && (TYPE_UNSIGNED (type)
>> +   || (vr0.nonnegative_p () && vr4.nonnegative_p ())
>> +   || (vr0.nonpositive_p () && vr4.nonpositive_p ())))
>> +  (plus (div @0 @2) @1))))
>> +
>> + /* Simplify (X - M*N) / N -> X / N - M.  */
>> + (simplify
>> +  (div (minus@4 @0 (mult:c@3 @1 @2)) @2)
>> +  (with {value_range vr0, vr1, vr2, vr3, vr4;}
>> +  (if (INTEGRAL_TYPE_P (type)
>> +   && get_range_query (cfun)->range_of_expr (vr1, @1)
>> +   && get_range_query (cfun)->range_of_expr (vr2, @2)
>> +   /* "N * M" doesn't overflow.  */
>> +   && range_op_handler (MULT_EXPR).overflow_free_p (vr1, vr2)
>> +   && get_range_query (cfun)->range_of_expr (vr0, @0)
>> +   && get_range_query (cfun)->range_of_expr (vr3, @3)
>> +   /* "X - (N*M)" doesn't overflow.  */
>> +   && range_op_handler (MINUS_EXPR).overflow_free_p (vr0, vr3)
>> +   && get_range_query (cfun)->range_of_expr (vr4, @4)
>> +   /* "X-N*M" is not with opposite sign as "X".  */
>> +   && (TYPE_UNSIGNED (type)
>> +   || (vr0.nonnegative_p () && vr4.nonnegative_p ())
>> +   || (vr0.nonpositive_p () && vr4.nonpositive_p ())))
>> +  (minus (div @0 @2) @1)))))
>> +
>> +/* Simplify
>> +   (X + C) / N -> X / N + C / N where C is multiple of N.
>> +   (X + C) >> N -> X >> N + C>>N if low N bits of C is 0.  */
>> +(for op (trunc_div exact_div rshift)
>> + (simplify
>> +  (op (plus@3 @0 INTEGER_CST@1) INTEGER_CST@2)
>> +   (with
>> +{
>> +  wide_int c = wi::to_wide (@1);
>> +  wide_int n = wi::to_wide (@2);
>> +  bool shift = op == RSHIFT_EXPR;
>> +  #define plus_op1(v) (shift ? wi::rshift (v, n, TYPE_SIGN (type)) \
>> + : wi::div_trunc (v, n, TYPE_SIGN (type)))
>> +  #define exact_mod(v) (shift ? wi::ctz (v) >= n.to_shwi () \
>> +  : wi::multiple_of_p (v, n, TYPE_SIGN (type)))
>
> please indent these full left
>
>> +  value_range vr0, vr1, vr3;
>> +}
>> +(if (INTEGRAL_TYPE_P (type)
>> + && get_range_query (cfun)->range_of_expr (vr0, @0))
>> + (if (exact_mod (c)
>> +  && get_range_query (cfun)->range_of_expr (vr1, @1)
>> + 
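
The two conditions stated at the top of this patch can be checked with a
small brute-force C sketch.  It assumes C99 truncating division and a
small search box so that nothing overflows; the function names
(same_sign_or_zero, identity_holds, count_guarded_mismatches) are made
up for illustration and do not appear in match.pd.

```c
#include <assert.h>

/* Condition 2 from the patch description: A and B are not of opposite
   sign (zero is compatible with either sign).  */
static int same_sign_or_zero(int a, int b)
{
    return (a >= 0 && b >= 0) || (a <= 0 && b <= 0);
}

/* Does (X - N*M) / N equal X / N - M for these particular values?
   N must be non-zero.  */
static int identity_holds(int x, int n, int m)
{
    return (x - n * m) / n == x / n - m;
}

/* Count triples in [-limit, limit]^3 (N != 0) where the sign condition
   holds but the identity fails.  The box is small, so condition 1
   (no overflow) is satisfied automatically.  */
static int count_guarded_mismatches(int limit)
{
    assert(limit > 0 && limit <= 64);
    int mismatches = 0;
    for (int x = -limit; x <= limit; x++)
        for (int n = -limit; n <= limit; n++) {
            if (n == 0)
                continue;
            for (int m = -limit; m <= limit; m++)
                if (same_sign_or_zero(x, x - n * m)
                    && !identity_holds(x, n, m))
                    mismatches++;
        }
    return mismatches;
}
```

With X = 1, N = 2, M = 1 the operands have opposite signs: (1 - 2) / 2
truncates to 0 while 1 / 2 - 1 is -1, so the rewrite would be wrong
there.  Inside the guarded region the search finds no mismatch.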

[PATCH v2] LoongArch: initial ada support on linux

2023-09-03 Thread Yang Yujie
gcc/ChangeLog:

* ada/Makefile.rtl: Add LoongArch support.
* ada/libgnarl/s-linux__loongarch.ads: New.
* ada/libgnat/system-linux-loongarch.ads: New.
* config/loongarch/loongarch.h: Mark normalized options
passed from driver to gnat1 as explicit for multilib.
---
 gcc/ada/Makefile.rtl   |  49 +++
 gcc/ada/libgnarl/s-linux__loongarch.ads| 134 +++
 gcc/ada/libgnat/system-linux-loongarch.ads | 145 +
 gcc/config/loongarch/loongarch.h   |   4 +-
 4 files changed, 330 insertions(+), 2 deletions(-)
 create mode 100644 gcc/ada/libgnarl/s-linux__loongarch.ads
 create mode 100644 gcc/ada/libgnat/system-linux-loongarch.ads

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 96306f8cc9a..852a3324388 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2111,6 +2111,55 @@ ifeq ($(strip $(filter-out cygwin% mingw32% 
pe,$(target_os))),)
   LIBRARY_VERSION := $(LIB_VERSION)
 endif
 
+# LoongArch Linux
+ifeq ($(strip $(filter-out loongarch% linux%,$(target_cpu) $(target_os))),)
+  LIBGNAT_TARGET_PAIRS = \
+  a-exetim.adbhttp://www.gnu.org/licenses/>.  --
+--  --
+--
+
+--  This is the LoongArch version of this package
+
+--  This package encapsulates cpu specific differences between implementations
+--  of GNU/Linux, in order to share s-osinte-linux.ads.
+
+--  PLEASE DO NOT add any with-clauses to this package or remove the pragma
+--  Preelaborate. This package is designed to be a bottom-level (leaf) package
+
+with Interfaces.C;
+with System.Parameters;
+
+package System.Linux is
+   pragma Preelaborate;
+
+   ----------
+   -- Time --
+   ----------
+
+   subtype int         is Interfaces.C.int;
+   subtype long        is Interfaces.C.long;
+   subtype suseconds_t is Interfaces.C.long;
+   type time_t is range -2 ** (System.Parameters.time_t_bits - 1)
+     .. 2 ** (System.Parameters.time_t_bits - 1) - 1;
+   subtype clockid_t   is Interfaces.C.int;
+
+   type timespec is record
+  tv_sec  : time_t;
+  tv_nsec : long;
+   end record;
+   pragma Convention (C, timespec);
+
+   type timeval is record
+  tv_sec  : time_t;
+  tv_usec : suseconds_t;
+   end record;
+   pragma Convention (C, timeval);
+
+   -----------
+   -- Errno --
+   -----------
+
+   EAGAIN: constant := 11;
+   EINTR : constant := 4;
+   EINVAL: constant := 22;
+   ENOMEM: constant := 12;
+   EPERM : constant := 1;
+   ETIMEDOUT : constant := 110;
+
+   -------------
+   -- Signals --
+   -------------
+
+   SIGHUP : constant := 1; --  hangup
+   SIGINT : constant := 2; --  interrupt (rubout)
+   SIGQUIT: constant := 3; --  quit (ASCD FS)
+   SIGILL : constant := 4; --  illegal instruction (not reset)
+   SIGTRAP: constant := 5; --  trace trap (not reset)
+   SIGIOT : constant := 6; --  IOT instruction
+   SIGABRT: constant := 6; --  used by abort, replace SIGIOT in the  future
+   SIGBUS : constant := 7; --  bus error
+   SIGFPE : constant := 8; --  floating point exception
+   SIGKILL: constant := 9; --  kill (cannot be caught or ignored)
+   SIGUSR1: constant := 10; --  user defined signal 1
+   SIGSEGV: constant := 11; --  segmentation violation
+   SIGUSR2: constant := 12; --  user defined signal 2
+   SIGPIPE: constant := 13; --  write on a pipe with no one to read it
+   SIGALRM: constant := 14; --  alarm clock
+   SIGTERM: constant := 15; --  software termination signal from kill
+   SIGSTKFLT  : constant := 16; --  coprocessor stack fault (Linux)
+   SIGCLD : constant := 17; --  alias for SIGCHLD
+   SIGCHLD: constant := 17; --  child status change
+   SIGCONT: constant := 18; --  stopped process has been continued
+   SIGSTOP: constant := 19; --  stop (cannot be caught or ignored)
+   SIGTSTP: constant := 20; --  user stop requested from tty
+   SIGTTIN: constant := 21; --  background tty read attempted
+   SIGTTOU: constant := 22; --  background tty write attempted
+   SIGURG : constant := 23; --  urgent condition on IO channel
+   SIGXCPU: constant := 24; --  CPU time limit exceeded
+   SIGXFSZ: constant := 25; --  filesize limit exceeded
+   SIGVTALRM  : constant := 26; --  virtual timer expired
+   SIGPROF: constant := 27; --  profiling timer expired
+   SIGWINCH   : constant := 28; --  window size change
+   SIGPOLL: constant := 29; --  pollable event occurred
+   SIGIO  : constant := 29; --  I/O now possible (4.2 BSD)
+   SIGPWR : constant := 30; --  power-fail restart
+   SIGSYS : constant := 31; --  bad system call
+   SIG32  : constant := 32; --  glibc internal signal
+   SIG33  : constant := 33; --  glibc internal signal
+   SIG34  : constant := 

Re: [PING][PATCH] LoongArch: initial ada support on linux

2023-09-03 Thread Yang Yujie
On Fri, Sep 01, 2023 at 01:52:16PM +, Arnaud Charlet wrote:

> A small nit above: I'd suggest using += instead of := $(XXX) to make things
> clearer.

Ok, will fix in v2.



Re:[pushed] [PATCH 1/2] LoongArch: Optimize switch with sign-extended index.

2023-09-03 Thread chenglulu

Pushed to r14-3642.

The description information was modified and XLEN was changed to GRLEN.

Thanks!:-)


On 2023/9/2 at 4:09 PM, WANG Xuerui wrote:

On 9/2/23 14:24, Lulu Cheng wrote:

This patch follows the RISC-V commit
7bbce9b50302959286381d9177818642bceaf301.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_extend_comparands):
In unsigned QImode test, check for sign extended subreg and/or
constant operands, and do a sign extend in that case.

"do a sign extension"

* config/loongarch/loongarch.md (TARGET_64BIT): Define
template cbranchqi4.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/switch-qi.c: New test.
---
  gcc/config/loongarch/loongarch.cc  | 14 --
  gcc/config/loongarch/loongarch.md  |  8 ++--
  gcc/testsuite/gcc.target/loongarch/switch-qi.c | 16 
  3 files changed, 34 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/switch-qi.c

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc

index c72229cad87..7e300c826cf 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4228,8 +4228,18 @@ loongarch_extend_comparands (rtx_code code, 
rtx *op0, rtx *op1)
    /* Comparisons consider all XLEN bits, so extend sub-XLEN 
values.  */

    if (GET_MODE_SIZE (word_mode) > GET_MODE_SIZE (GET_MODE (*op0)))
  {
-  /* TODO: checkout It is more profitable to zero-extend QImode 
values.  */
-  if (unsigned_condition (code) == code && GET_MODE (*op0) == 
QImode)
+  /* It is more profitable to zero-extend QImode values. But not if the
+ first operand has already been sign-extended, and the second one
+ is a constant or has already been sign-extended also.  */
+  if (unsigned_condition (code) == code
+  && (GET_MODE (*op0) == QImode
+  && ! (GET_CODE (*op0) == SUBREG
+    && SUBREG_PROMOTED_VAR_P (*op0)
+    && SUBREG_PROMOTED_SIGNED_P (*op0)
+    && (CONST_INT_P (*op1)
+    || (GET_CODE (*op1) == SUBREG
+    && SUBREG_PROMOTED_VAR_P (*op1)
+    && SUBREG_PROMOTED_SIGNED_P (*op1))))))
  {
    *op0 = gen_rtx_ZERO_EXTEND (word_mode, *op0);
    if (CONST_INT_P (*op1))
diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md

index b37e070660f..1bb4e461b38 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2733,11 +2733,15 @@ (define_insn "*branch_equality_inverted"
    [(set_attr "type" "branch")])
    +;; Branches operate on XLEN-sized quantities, but for 
LoongArch64 we accept


LoongArch literature refers to "XLEN" as "GRLEN".

Otherwise the patch is fine, thanks ;-)


+;; QImode values so we can force zero-extension.
+(define_mode_iterator BR [(QI "TARGET_64BIT") SI (DI "TARGET_64BIT")])
+
  (define_expand "cbranch4"
    [(set (pc)
  (if_then_else (match_operator 0 "comparison_operator"
-    [(match_operand:GPR 1 "register_operand")
- (match_operand:GPR 2 "nonmemory_operand")])
+    [(match_operand:BR 1 "register_operand")
+ (match_operand:BR 2 "nonmemory_operand")])
    (label_ref (match_operand 3 ""))
    (pc)))]
    ""
diff --git a/gcc/testsuite/gcc.target/loongarch/switch-qi.c 
b/gcc/testsuite/gcc.target/loongarch/switch-qi.c

new file mode 100644
index 000..dd192fd497f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/switch-qi.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-march=loongarch64 -mabi=lp64d" } */
+/* { dg-final { scan-assembler-not "bstrpick" } } */
+
+/* Test for loongarch_extend_comparands patch.  */
+extern void asdf (int);
+void
+foo (signed char x) {
+    switch (x) {
+  case 0: asdf (10); break;
+  case 1: asdf (11); break;
+  case 2: asdf (12); break;
+  case 3: asdf (13); break;
+  case 4: asdf (14); break;
+    }
+}
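
A small C sketch of why skipping the zero-extension can be safe when
both comparands are already sign-extended, as the new condition in
loongarch_extend_comparands requires.  It assumes 64-bit registers and
8-bit (QImode-sized) values; zext8, sext8 and count_ltu_disagreements
are illustrative names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend an 8-bit value to 64 bits.  */
static uint64_t zext8(uint8_t v)
{
    return (uint64_t)v;
}

/* Sign-extend an 8-bit value to 64 bits, using only unsigned
   arithmetic so the result is fully defined.  */
static uint64_t sext8(uint8_t v)
{
    return ((uint64_t)v ^ 0x80u) - 0x80u;
}

/* Count pairs (a, b) where an unsigned 64-bit "less than" gives a
   different answer on sign-extended operands than on zero-extended
   ones.  */
static int count_ltu_disagreements(void)
{
    int disagreements = 0;
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++) {
            int on_zext = zext8((uint8_t)a) < zext8((uint8_t)b);
            int on_sext = sext8((uint8_t)a) < sext8((uint8_t)b);
            if (on_zext != on_sext)
                disagreements++;
        }
    return disagreements;
}

/* Self-check: both extensions must order all 8-bit pairs identically.  */
static void check_extension_order(void)
{
    assert(count_ltu_disagreements() == 0);
}
```

Sign extension keeps the unsigned order of 8-bit values, so the
comparison result is unchanged as long as both operands use the same
extension.  Mixing them is not safe: sext8 (0x80) is not below
zext8 (0xff), although 0x80 is below 0xff as unsigned bytes.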




Re:[pushed] [PATCH v2] LoongArch: Support storing floating-point zero into MEM[base + index].

2023-09-03 Thread chenglulu

pushed to r14-3643.

On 2023/9/2 at 3:02 PM, Guo Jie wrote:

v2: Modify commit message.

gcc/ChangeLog:

* config/loongarch/loongarch.md: Support 'G' -> 'k' in
movsf_hardfloat and movdf_hardfloat.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/const-double-zero-stx.c: New test.

---
  gcc/config/loongarch/loongarch.md  | 12 ++--
  .../loongarch/const-double-zero-stx.c  | 18 ++
  2 files changed, 24 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index b37e070660f..6f47c23a79c 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -1915,13 +1915,13 @@ (define_expand "movsf"
  })
  
  (define_insn "*movsf_hardfloat"

-  [(set (match_operand:SF 0 "nonimmediate_operand" 
"=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
-   (match_operand:SF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*G*r,*m,*r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" 
"=f,f,f,m,f,k,m,k,*f,*r,*r,*r,*m")
+   (match_operand:SF 1 "move_operand" "f,G,m,f,k,f,G,G,*r,*f,*G*r,*m,*r"))]
"TARGET_HARD_FLOAT
 && (register_operand (operands[0], SFmode)
 || reg_or_0_operand (operands[1], SFmode))"
{ return loongarch_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" 
"fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+  [(set_attr "move_type" 
"fmove,mgtf,fpload,fpstore,fpload,fpstore,store,store,mgtf,mftg,move,load,store")
 (set_attr "mode" "SF")])
  
  (define_insn "*movsf_softfloat"

@@ -1946,13 +1946,13 @@ (define_expand "movdf"
  })
  
  (define_insn "*movdf_hardfloat"

-  [(set (match_operand:DF 0 "nonimmediate_operand" 
"=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
-   (match_operand:DF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*r*G,*m,*r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" 
"=f,f,f,m,f,k,m,k,*f,*r,*r,*r,*m")
+   (match_operand:DF 1 "move_operand" "f,G,m,f,k,f,G,G,*r,*f,*r*G,*m,*r"))]
"TARGET_DOUBLE_FLOAT
 && (register_operand (operands[0], DFmode)
 || reg_or_0_operand (operands[1], DFmode))"
{ return loongarch_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" 
"fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+  [(set_attr "move_type" 
"fmove,mgtf,fpload,fpstore,fpload,fpstore,store,store,mgtf,mftg,move,load,store")
 (set_attr "mode" "DF")])
  
  (define_insn "*movdf_softfloat"

diff --git a/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c 
b/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c
new file mode 100644
index 000..8fb04be8ff5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times {stx\..\t\$r0} 2 } } */
+
+extern float arr_f[];
+extern double arr_d[];
+
+void
+test_f (int base, int index)
+{
+  arr_f[base + index] = 0.0;
+}
+
+void
+test_d (int base, int index)
+{
+  arr_d[base + index] = 0.0;
+}




[PATCH] MATCH: Add `(x | c) & ~(y | c)` and `x & ~(y | x)` patterns [PR98710]

2023-09-03 Thread Andrew Pinski via Gcc-patches
Adding some more simple bit_and/bit_ior patterns.
How often these show up, I have no idea.

This was tested on top of
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629174.html .

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/98710
* match.pd (`(x | c) & ~(y | c)`, `(x & c) | ~(y & c)`): New patterns.
(`x & ~(y | x)`, `x | ~(y & x)`): New patterns.

gcc/testsuite/ChangeLog:

PR tree-optimization/98710
* gcc.dg/tree-ssa/andor-7.c: New test.
* gcc.dg/tree-ssa/andor-8.c: New test.
---
 gcc/match.pd| 14 +-
 gcc/testsuite/gcc.dg/tree-ssa/andor-7.c | 16 
 gcc/testsuite/gcc.dg/tree-ssa/andor-8.c | 19 +++
 3 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/andor-7.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/andor-8.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 3495f9451d1..a3f507a1e2e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1995,7 +1995,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   /* (x & y) | (x | z) -> (x | z) */
  (simplify
   (bitop:c (rbitop:c @0 @1) (bitop:c@3 @0 @2))
-  @3))
+  @3)
+ /* (x | c) & ~(y | c) -> x & ~(y | c) */
+ /* (x & c) | ~(y & c) -> x | ~(y & c) */
+ (simplify
+  (bitop:c (rbitop:c @0 @1) (bit_not@3 (rbitop:c @1 @2)))
+  (bitop @0 @3))
+ /* x & ~(y | x) -> 0 */
+ /* x | ~(y & x) -> -1 */
+ (simplify
+  (bitop:c @0 (bit_not (rbitop:c @0 @1)))
+  (if (bitop == BIT_AND_EXPR)
+   { build_zero_cst (type); }
+   { build_minus_one_cst (type); })))
 
 /* ((x | y) & z) | x -> (z & y) | x
((x ^ y) & z) | x -> (z & y) | x  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/andor-7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/andor-7.c
new file mode 100644
index 000..63b70fa7888
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/andor-7.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+/* PR tree-optimization/98710 */
+
+signed foo(signed x, signed y, signed z)
+{
+return (x | z) & ~(y | z); // x & ~(y | z);
+}
+// Note . here is `(` or `)`
+/* { dg-final { scan-tree-dump "return x \& ~.y \\| z.;|return ~.y \\| z. \& 
x;" "original" } } */
+
+signed foo_or(signed a, signed b, signed c)
+{
+return (a & c) | ~(b & c); // a | ~(b & c);
+}
+/* { dg-final { scan-tree-dump "return a \\| ~.b \& c.;|return ~.b \& c. \\| 
a;" "original" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/andor-8.c 
b/gcc/testsuite/gcc.dg/tree-ssa/andor-8.c
new file mode 100644
index 000..0c2eb4c1a00
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/andor-8.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+/* PR tree-optimization/98710 */
+
+signed foo2(signed a, signed b, signed c)
+{
+return (a & ~(b | a)) & c; // 0
+}
+/* { dg-final { scan-tree-dump "return 0;" "original" } } */
+signed foo2_or(signed x, signed y, signed z)
+{
+return (x | ~(y & x)) & z; // -1 & z -> z
+}
+
+/* { dg-final { scan-tree-dump "return z;" "original" } } */
+/* All | and & should have been removed. */
+/* { dg-final { scan-tree-dump-not "~" "original" } } */
+/* { dg-final { scan-tree-dump-not " \& " "original" } } */
+/* { dg-final { scan-tree-dump-not " \\| " "original" } } */
-- 
2.31.1



Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-09-03 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 1, 2023 at 7:03 PM Richard Sandiford via Gcc-patches
 wrote:
>
> Uros Bizjak via Gcc-patches  writes:
> > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> >  wrote:
> >>
> >> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches 
> >> wrote:
> >> > From: Kong Lingling 
> >> >
> >> > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> >> > usage by default from mapping the common reg/mem constraint to non-EGPR
> >> > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage
> >> > for inline asm.
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> >   * config/i386/i386.cc (INCLUDE_STRING): Add include for
> >> >   ix86_md_asm_adjust.
> >> >   (ix86_md_asm_adjust): When APX EGPR enabled without specifying the
> >> >   target option, map reg/mem constraints to non-EGPR constraints.
> >> >   * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32.
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> >
> >> >   * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> >> > ---
> >> >  gcc/config/i386/i386.cc   |  44 +++
> >> >  gcc/config/i386/i386.opt  |   5 +
> >> >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 ++
> >> >  3 files changed, 156 insertions(+)
> >> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> >> >
> >> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> >> > index d26d9ab0d9d..9460ebbfda4 100644
> >> > --- a/gcc/config/i386/i386.cc
> >> > +++ b/gcc/config/i386/i386.cc
> >> > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General 
> >> > Public License
> >> >  along with GCC; see the file COPYING3.  If not see
> >> >  .  */
> >> >
> >> > +#define INCLUDE_STRING
> >> >  #define IN_TARGET_CODE 1
> >> >
> >> >  #include "config.h"
> >> > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec , vec 
> >> > & /*inputs*/,
> >> >bool saw_asm_flag = false;
> >> >
> >> >start_sequence ();
> >> > +  /* TODO: Here we just mapped the general r/m constraints to non-EGPR
> >> > +   constraints, will eventually map all the usable constraints in the 
> >> > future. */
> >>
> >> I think there should be some constraint which explicitly has all the 32
> >> GPRs, like there is one for just all 16 GPRs (h), so that regardless of
> >> -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants.
> >>
> >> Also, what about the "g" constraint?  Shouldn't there be another for "g"
> >> without r16..r31?  What about the various other memory
> >> constraints ("<", "o", ...)?
> >
> > I think we should leave all existing constraints as they are, so "r"
> > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > introduce "h" to instructions that have the ability to handle EGPR.
>
> Yeah.  I'm jumping in without having read the full thread, sorry,
> but the current mechanism for handling this is TARGET_MEM_CONSTRAINT
> (added for s390).  That is, TARGET_MEM_CONSTRAINT can be defined to some
Thanks for the comments.
> new constraint that is more general than the traditional "m" constraint.
> This constraint is then the one that is associated with memory_operand
> etc.  "m" can then be defined explicitly to the old definition,
> so that existing asms continue to work.
>
> So if the port wants generic internal memory addresses to use the
> EGPR set (sounds reasonable), then TARGET_MEM_CONSTRAINT would be
> a new constraint that maps to those addresses.
But we would still need to enhance the current reload infrastructure to
support selective base_reg_class/index_reg_class; see [1].
The good thing about using TARGET_MEM_CONSTRAINT is that we don't have
to remap memory constraints for inline asm, but the bad thing about
it is that we need to modify the backend patterns a lot, because only
5% of the instructions don't support gpr32, and 95% of them need to be
changed to the new memory constraint.
It feels like the cons outweigh the pros.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629040.html

>
> Thanks,
> Richard



-- 
BR,
Hongtao


Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-09-03 Thread Hongtao Liu via Gcc-patches
On Fri, Sep 1, 2023 at 7:27 PM Uros Bizjak  wrote:
>
> On Fri, Sep 1, 2023 at 12:36 PM Hongtao Liu  wrote:
> >
> > On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches
> >  wrote:
> > >
> > > On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang  
> > > wrote:
> > > >
> > > > > Uros Bizjak via Gcc-patches  wrote on Thu, Aug 31, 2023 at 18:01:
> > > > >
> > > > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> > > > >  wrote:
> > > > > >
> > > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via 
> > > > > > Gcc-patches wrote:
> > > > > > > From: Kong Lingling 
> > > > > > >
> > > > > > > In inline asm, we do not know if the insn can use EGPR, so 
> > > > > > > disable EGPR
> > > > > > > usage by default from mapping the common reg/mem constraint to 
> > > > > > > non-EGPR
> > > > > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR 
> > > > > > > usage
> > > > > > > for inline asm.
> > > > > > >
> > > > > > > gcc/ChangeLog:
> > > > > > >
> > > > > > >   * config/i386/i386.cc (INCLUDE_STRING): Add include for
> > > > > > >   ix86_md_asm_adjust.
> > > > > > >   (ix86_md_asm_adjust): When APX EGPR enabled without 
> > > > > > > specifying the
> > > > > > >   target option, map reg/mem constraints to non-EGPR 
> > > > > > > constraints.
> > > > > > >   * config/i386/i386.opt: Add option 
> > > > > > > mapx-inline-asm-use-gpr32.
> > > > > > >
> > > > > > > gcc/testsuite/ChangeLog:
> > > > > > >
> > > > > > >   * gcc.target/i386/apx-inline-gpr-norex2.c: New test.
> > > > > > > ---
> > > > > > >  gcc/config/i386/i386.cc   |  44 +++
> > > > > > >  gcc/config/i386/i386.opt  |   5 +
> > > > > > >  .../gcc.target/i386/apx-inline-gpr-norex2.c   | 107 
> > > > > > > ++
> > > > > > >  3 files changed, 156 insertions(+)
> > > > > > >  create mode 100644 
> > > > > > > gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c
> > > > > > >
> > > > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > > > > > index d26d9ab0d9d..9460ebbfda4 100644
> > > > > > > --- a/gcc/config/i386/i386.cc
> > > > > > > +++ b/gcc/config/i386/i386.cc
> > > > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU 
> > > > > > > General Public License
> > > > > > >  along with GCC; see the file COPYING3.  If not see
> > > > > > >  .  */
> > > > > > >
> > > > > > > +#define INCLUDE_STRING
> > > > > > >  #define IN_TARGET_CODE 1
> > > > > > >
> > > > > > >  #include "config.h"
> > > > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec , 
> > > > > > > vec & /*inputs*/,
> > > > > > >bool saw_asm_flag = false;
> > > > > > >
> > > > > > >start_sequence ();
> > > > > > > +  /* TODO: Here we just mapped the general r/m constraints to 
> > > > > > > non-EGPR
> > > > > > > +   constraints, will eventually map all the usable constraints 
> > > > > > > in the future. */
> > > > > >
> > > > > > I think there should be some constraint which explicitly has all 
> > > > > > the 32
> > > > > > GPRs, like there is one for just all 16 GPRs (h), so that 
> > > > > > regardless of
> > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm 
> > > > > > wants.
> > > > > >
> > > > > > Also, what about the "g" constraint?  Shouldn't there be another 
> > > > > > for "g"
> > > > > > without r16..r31?  What about the various other memory
> > > > > > constraints ("<", "o", ...)?
> > > > >
> > > > > I think we should leave all existing constraints as they are, so "r"
> > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > > > > introduce "h" to instructions that have the ability to handle EGPR.
> > > > > This would be somehow similar to the SSE -> AVX512F transition, where
> > > > > we still have "x" for SSE16 and "v" was introduced as a separate
> > > > > register class for EVEX SSE registers. This way, asm will be
> > > > > compatible, when "r", "m", "o" and "g" are used. The new memory
> > > > > constraint "Bt", should allow new registers, and should be added to
> > > > > the constraint string as a separate constraint, and conditionally
> > > > > enabled by relevant "isa" (AKA "enabled") attribute.
> > > >
> > > > The extended constraint can work for registers, but for memory it is 
> > > > more
> > > > complicated.
> > >
> > > Yes, unfortunately. The compiler assumes that an unchangeable register
> > > class is used for BASE/INDEX registers. I have hit this limitation
> > > when trying to implement memory support for instructions involving
> > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
> > > registers, also inside memory operand. (You can see the "hack" in e.g.
> > > *extzvqi_mem_rex64" and corresponding peephole2 with the original
> > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
> > > register class is the major limitation in the compiler, so perhaps the
> > > strategy on how to 

[PATCH] MATCH: Add `~MAX(~X, Y)` pattern: [PR96694]

2023-09-03 Thread Andrew Pinski via Gcc-patches
This adds `~MAX(~X, Y)` and `~MIN(~X, Y)` patterns
that are like the `~(~a & b)` and `~(~a | b)` patterns
and reduce the number of ~ by 1.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/96694

gcc/ChangeLog:

* match.pd (`~MAX(~X, Y)`, `~MIN(~X, Y)`): New patterns.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/minmax-24.c: New test.
---
 gcc/match.pd  |  7 -
 gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c | 31 +++
 2 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c

diff --git a/gcc/match.pd b/gcc/match.pd
index e9ce48ea7fa..604c2c2360c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3786,7 +3786,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  maxmin (max min)
  (simplify
   (minmax (bit_not:s@2 @0) (bit_not:s@3 @1))
-  (bit_not (maxmin @0 @1
+  (bit_not (maxmin @0 @1)))
+/* ~MAX(~X, Y) --> MIN(X, ~Y) */
+/* ~MIN(~X, Y) --> MAX(X, ~Y) */
+ (simplify
+  (bit_not (minmax:cs (bit_not @0) @1))
+  (maxmin @0 (bit_not @1
 
 /* MIN (X, Y) == X -> X <= Y  */
 (for minmax (min min max max)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c
new file mode 100644
index 000..2b21f94eecf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-24.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* PR tree-optimization/96694 */
+
+static inline int min(int a, int b)
+{
+  return a < b ? a : b;
+}
+
+static inline int max(int a, int b)
+{
+  return a > b ? a : b;
+}
+
+int max_not(int x, int y)
+{
+  return ~max(~x, y); // min (x, ~y)
+}
+/* { dg-final { scan-tree-dump "~y_\[0-9\]+.D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "~x_\[0-9\]+.D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump "MIN_EXPR |MIN_EXPR 
<_\[0-9\]+, x_\[0-9\]+.D.>" "optimized" } } */
+
+int min_not(int c, int d)
+{
+  return ~min(~c, d); // max (c, ~d)
+}
+/* { dg-final { scan-tree-dump "~d_\[0-9\]+.D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "~c_\[0-9\]+.D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump "MAX_EXPR |MIN_EXPR 
<_\[0-9\]+, c_\[0-9\]+.D.>" "optimized" } } */
+
+/* { dg-final { scan-tree-dump-times "~" 2 "optimized" } } */
-- 
2.31.1



[PATCH] RISC-V: Fix Zicond ICE on large constants

2023-09-03 Thread Tsukasa OI via Gcc-patches
From: Tsukasa OI 

Large constant cons and/or alt will trigger ICEs when building GCC target
libraries (libgomp and libatomic) with the 'Zicond' extension enabled.

For instance, zicond-ice-2.c (new test case in this commit) will cause
an ICE when SOME_NUMBER is 0x1000 or larger.  While the negated values
of cons/alt (the two temp2 variables) are checked, cons/alt
themselves are not, which causes 2 ICEs when building
GCC target libraries as of this writing:

1.  gcc/libatomic/config/posix/lock.c
2.  gcc/libgomp/fortran.c

Coercing a large value into a register will fix the issue.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_expand_conditional_move): Force
large constant cons/alt into a register.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zicond-ice-2.c: New test.  This is based on
an ICE at libat_lock_n func on gcc/libatomic/config/posix/lock.c
but heavily minimized.
---
 gcc/config/riscv/riscv.cc | 16 ++--
 gcc/testsuite/gcc.target/riscv/zicond-ice-2.c | 11 +++
 2 files changed, 21 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-ice-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 8d8f7b4f16ed..cfaa4b6a7720 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3940,11 +3940,13 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx 
cons, rtx alt)
  rtx temp1 = gen_reg_rtx (mode);
  rtx temp2 = gen_int_mode (-1 * INTVAL (cons), mode);
 
- /* TEMP2 might not fit into a signed 12 bit immediate suitable
-for an addi instruction.  If that's the case, force it into
-a register.  */
+ /* TEMP2 and/or CONS might not fit into a signed 12 bit immediate
+suitable for an addi instruction.  If that's the case, force it
+into a register.  */
  if (!SMALL_OPERAND (INTVAL (temp2)))
temp2 = force_reg (mode, temp2);
+ if (!SMALL_OPERAND (INTVAL (cons)))
+   cons = force_reg (mode, cons);
 
  riscv_emit_binary (PLUS, temp1, alt, temp2);
  emit_insn (gen_rtx_SET (dest,
@@ -3986,11 +3988,13 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx 
cons, rtx alt)
  rtx temp1 = gen_reg_rtx (mode);
  rtx temp2 = gen_int_mode (-1 * INTVAL (alt), mode);
 
- /* TEMP2 might not fit into a signed 12 bit immediate suitable
-for an addi instruction.  If that's the case, force it into
-a register.  */
+ /* TEMP2 and/or ALT might not fit into a signed 12 bit immediate
+suitable for an addi instruction.  If that's the case, force it
+into a register.  */
  if (!SMALL_OPERAND (INTVAL (temp2)))
temp2 = force_reg (mode, temp2);
+ if (!SMALL_OPERAND (INTVAL (alt)))
+   alt = force_reg (mode, alt);
 
  riscv_emit_binary (PLUS, temp1, cons, temp2);
  emit_insn (gen_rtx_SET (dest,
diff --git a/gcc/testsuite/gcc.target/riscv/zicond-ice-2.c 
b/gcc/testsuite/gcc.target/riscv/zicond-ice-2.c
new file mode 100644
index ..ffd8dcb5814e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zicond-ice-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64d" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zicond -mabi=ilp32d" { target { rv32 } } } */
+
+#define SOME_NUMBER 0x1000
+
+unsigned long
+d (unsigned long n)
+{
+  return n > SOME_NUMBER ? SOME_NUMBER : n;
+}

base-commit: 78f636d979530c8a649262dbd44914bdfb6f7290
-- 
2.42.0



[PATCH] MATCH: Add pattern for `(x | y) & (x & z)`

2023-09-03 Thread Andrew Pinski via Gcc-patches
Like the pattern already there for `(x | y) & x`,
this adds a simple pattern to optimize `(x | y) & (x & z)`
to just `x & z`.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/103536
* match.pd (`(x | y) & (x & z)`,
`(x & y) | (x | z)`): New patterns.

gcc/testsuite/ChangeLog:

PR tree-optimization/103536
* gcc.dg/tree-ssa/andor-6.c: New test.
* gcc.dg/tree-ssa/andor-bool-1.c: New test.
---
 gcc/match.pd |  7 ++-
 gcc/testsuite/gcc.dg/tree-ssa/andor-6.c  | 19 +++
 gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c | 13 +
 3 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/andor-6.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 3efc971f7f6..3495f9451d1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1990,7 +1990,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (with { bool wascmp; }
(if (bitwise_inverted_equal_p (@0, @2, wascmp)
&& (!wascmp || element_precision (type) == 1))
-(bitop @0 @1)
+(bitop @0 @1
+  /* (x | y) & (x & z) -> (x & z) */
+  /* (x & y) | (x | z) -> (x | z) */
+ (simplify
+  (bitop:c (rbitop:c @0 @1) (bitop:c@3 @0 @2))
+  @3))
 
 /* ((x | y) & z) | x -> (z & y) | x
((x ^ y) & z) | x -> (z & y) | x  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/andor-6.c 
b/gcc/testsuite/gcc.dg/tree-ssa/andor-6.c
new file mode 100644
index 000..32e11730f98
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/andor-6.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+/* PR tree-optimization/103536 */
+
+int
+orand(int a, int b, int c)
+{
+return (a | b) & (a & c); // a & c
+}
+
+/* { dg-final { scan-tree-dump "return a \& c;" "original" } } */
+
+int
+andor(int d, int e, int f)
+{
+return (d & e) | (d | f); // d | f
+}
+
+/* { dg-final { scan-tree-dump "return d \\| f;" "original" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c
new file mode 100644
index 000..a1b974f3859
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/andor-bool-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* PR tree-optimization/103536 */
+
+_Bool
+src_1 (_Bool a, _Bool b)
+{
+return (a || b) && (a && b);
+}
+
+/* { dg-final { scan-tree-dump "a_\[0-9\]+.D. \& b_\[0-9\]+.D." "optimized" } 
} */
+/* { dg-final { scan-tree-dump-not "a_\[0-9\]+.D. \\\| b_\[0-9\]+.D." 
"optimized" } } */
+/* { dg-final { scan-tree-dump-not "if " "optimized" } } */
-- 
2.31.1



[PATCH v3] mklog: handle Signed-off-by, minor cleanup

2023-09-03 Thread Marc Poulhiès via Gcc-patches
Richard Sandiford via Gcc-patches  writes:
>> +# this regex matches the first line of the "end" in the initial commit 
>> message
>> +FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by|co-authored-by|#): ')
>
> The current code only requires an initial "#", rather than an initial "#: ".
> Is that a deliberate change?
>
> The patch LGTM apart from that.

Hello Richard,

Thanks for the review, and sorry for the delayed answer; I was away for
the past few weeks. This issue was caught earlier this month
(https://github.com/Rust-GCC/gccrs/pull/2504), but I didn't want to send
something here before leaving. Here's a fixed patch.

Ok for master?

Thanks,
Marc

---
 contrib/mklog.py   | 34 +-
 contrib/prepare-commit-msg | 20 ++--
 2 files changed, 39 insertions(+), 15 deletions(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 26230b9b4f2..496780883fb 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -41,7 +41,34 @@ from unidiff import PatchSet
 
 LINE_LIMIT = 100
 TAB_WIDTH = 8
-CO_AUTHORED_BY_PREFIX = 'co-authored-by: '
+
+# Initial commit:
+#   +--+
+#   | gccrs: Some title|
+#   |  | This is the "start"
+#   | This is some text explaining the commit. |
+#   | There can be several lines.  |
+#   |  |<--->
+#   | Signed-off-by: My Name  | This is the "end"
+#   +--+
+#
+# Results in:
+#   +--+
+#   | gccrs: Some title|
+#   |  |
+#   | This is some text explaining the commit. | This is the "start"
+#   | There can be several lines.  |
+#   |  |<--->
+#   | gcc/rust/ChangeLog:  |
+#   |  | This is the generated
+#   | * some_file (bla):   | ChangeLog part
+#   | (foo):   |
+#   |  |<--->
+#   | Signed-off-by: My Name  | This is the "end"
+#   +--+
+
+# this regex matches the first line of the "end" in the initial commit message
+FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by:|co-authored-by:|#) ')
 
 pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?PPR [a-z+-]+\/[0-9]+)')
 prnum_regex = re.compile(r'PR (?P[a-z+-]+)/(?P[0-9]+)')
@@ -330,10 +357,7 @@ def update_copyright(data):
 
 
 def skip_line_in_changelog(line):
-if line.lower().startswith(CO_AUTHORED_BY_PREFIX) or line.startswith('#'):
-return False
-return True
-
+return FIRST_LINE_OF_END_RE.match(line) == None
 
 if __name__ == '__main__':
 extra_args = os.getenv('GCC_MKLOG_ARGS')
diff --git a/contrib/prepare-commit-msg b/contrib/prepare-commit-msg
index 48c9dad3c6f..1e94706ba40 100755
--- a/contrib/prepare-commit-msg
+++ b/contrib/prepare-commit-msg
@@ -32,11 +32,11 @@ if ! [ -f "$COMMIT_MSG_FILE" ]; then exit 0; fi
 # Don't do anything unless requested to.
 if [ -z "$GCC_FORCE_MKLOG" ]; then exit 0; fi
 
-if [ -z "$COMMIT_SOURCE" ] || [ $COMMIT_SOURCE = template ]; then
+if [ -z "$COMMIT_SOURCE" ] || [ "$COMMIT_SOURCE" = template ]; then
 # No source or "template" means new commit.
 cmd="diff --cached"
 
-elif [ $COMMIT_SOURCE = message ]; then
+elif [ "$COMMIT_SOURCE" = message ]; then
 # "message" means -m; assume a new commit if there are any changes staged.
 if ! git diff --cached --quiet; then
cmd="diff --cached"
@@ -44,23 +44,23 @@ elif [ $COMMIT_SOURCE = message ]; then
cmd="diff --cached HEAD^"
 fi
 
-elif [ $COMMIT_SOURCE = commit ]; then
+elif [ "$COMMIT_SOURCE" = commit ]; then
 # The message of an existing commit.  If it's HEAD, assume --amend;
 # otherwise, assume a new commit with -C.
-if [ $SHA1 = HEAD ]; then
+if [ "$SHA1" = HEAD ]; then
cmd="diff --cached HEAD^"
if [ "$(git config gcc-config.mklog-hook-type)" = "smart-amend" ]; then
# Check if the existing message still describes the staged changes.
f=$(mktemp /tmp/git-commit.XX) || exit 1
-   git log -1 --pretty=email HEAD > $f
-   printf '\n---\n\n' >> $f
-   git $cmd >> $f
+   git log -1 --pretty=email HEAD > "$f"
+   printf '\n---\n\n' >> "$f"
+   git $cmd >> "$f"
if contrib/gcc-changelog/git_email.py "$f" >/dev/null 2>&1; then
# Existing commit message is still OK for amended commit.
-   rm $f
+   rm "$f"
exit 0

[PATCH] MATCH: Transform `(1 >> X) !=/== 0` into `X ==/!= 0`

2023-09-03 Thread Andrew Pinski via Gcc-patches
We currently have a pattern for handling `(C >> X) & D == 0`
but if C is 1 and D is 1, the `& 1` might have been removed.

gcc/ChangeLog:

PR tree-optimization/105832
* match.pd (`(1 >> X) != 0`): New pattern.

gcc/testsuite/ChangeLog:

PR tree-optimization/105832
* gcc.dg/tree-ssa/pr105832-1.c: New test.
* gcc.dg/tree-ssa/pr105832-2.c: New test.
* gcc.dg/tree-ssa/pr105832-3.c: New test.
---
 gcc/match.pd   | 10 -
 gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c | 25 
 gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c | 30 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c | 46 ++
 4 files changed, 109 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 5270e4104ac..e9ce48ea7fa 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4026,7 +4026,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* Simplify ((C << x) & D) != 0 where C and D are power of two constants,
either to false if D is smaller (unsigned comparison) than C, or to
-   x == log2 (D) - log2 (C).  Similarly for right shifts.  */
+   x == log2 (D) - log2 (C).  Similarly for right shifts.
+   Note for `(1 >> x)`, the & 1 has been removed so matching that separately. 
*/
 (for cmp (ne eq)
  icmp (eq ne)
  (simplify
@@ -4043,7 +4044,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
int c2 = wi::clz (wi::to_wide (@2)); }
  (if (c1 > c2)
   { constant_boolean_node (cmp == NE_EXPR ? false : true, type); }
-  (icmp @0 { build_int_cst (TREE_TYPE (@0), c2 - c1); }))
+  (icmp @0 { build_int_cst (TREE_TYPE (@0), c2 - c1); })
+ /* `(1 >> X) != 0` -> `X == 0` */
+ /* `(1 >> X) == 0` -> `X != 0` */
+ (simplify
+  (cmp (rshift integer_onep @0) integer_zerop)
+   (icmp @0 { build_zero_cst (TREE_TYPE (@0)); })))
 
 /* (CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1)
(CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c
new file mode 100644
index 000..d7029d39c85
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-1.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+/* PR tree-optimization/105832 */
+
+void foo(void);
+
+static struct {
+short a;
+signed char b;
+} c;
+
+static signed char d;
+
+int main() {
+signed char g = c.b > 4U ? c.b : c.b << 2;
+for (int h = 0; h < 5; h++) {
+d = (g >= 2 || 1 >> g) ? g : g << 1;
+if (d && 1 == g)
+foo();
+c.a = 0;
+}
+}
+
+/* The call of foo should have been removed. */
+/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c
new file mode 100644
index 000..2d2a33e2755
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-2.c
@@ -0,0 +1,30 @@
+/* PR tree-optimization/105832 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+/* { dg-final { scan-tree-dump "return a == 0;" "original" } } */
+/* { dg-final { scan-tree-dump "return b == 0;" "original" } } */
+/* { dg-final { scan-tree-dump "return c != 0;" "original" } } */
+/* { dg-final { scan-tree-dump "return d != 0;" "original" } } */
+
+int
+f1 (int a)
+{
+  return (1 >> a) != 0;
+}
+
+int
+f2 (int b)
+{
+  return ((1 >> b) & 1) != 0;
+}
+int
+f3 (int c)
+{
+  return (1 >> c) == 0;
+}
+
+int
+f4 (int d)
+{
+  return ((1 >> d) & 1) == 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c
new file mode 100644
index 000..2bdd9afcbc7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr105832-3.c
@@ -0,0 +1,46 @@
+/* PR tree-optimization/105832 */
+/* { dg-do compile } */
+/* Disable the first forwprop1 as that will catch f2/f4 even though `&1`
+   will be removed during evrp. */
+/* { dg-options "-O2 -fdisable-tree-forwprop1 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump "a_\[0-9]+\\(D\\) == 0" "optimized" } } */
+/* { dg-final { scan-tree-dump "b_\[0-9]+\\(D\\) == 0" "optimized" } } */
+/* { dg-final { scan-tree-dump "c_\[0-9]+\\(D\\) != 0" "optimized" } } */
+/* { dg-final { scan-tree-dump "d_\[0-9]+\\(D\\) != 0" "optimized" } } */
+
+int g(void);
+int h(void);
+
+int
+f1 (int a)
+{
+  int t = 1 >> a;
+  if (t != 0) return g();
+  return h();
+}
+
+int
+f2 (int b)
+{
+  int t = 1 >> b;
+  t &= 1;
+  if (t != 0) return g();
+  return h();
+}
+
+int
+f3 (int c)
+{
+  int t = 1 >> c;
+  if (t == 0) return g();
+  return h();
+}
+
+int
+f4 (int d)
+{
+  int t = 1 >> d;
+  t &= 1;
+  if (t == 0) return g();
+  return h();
+}
-- 
2.31.1



[PATCH] Improve rewrite_to_defined_overflow for lhs already the correct type

2023-09-03 Thread Andrew Pinski via Gcc-patches
This improves rewrite_to_defined_overflow slightly if we already
have an unsigned type. The only place where this seems to show up
is ifcombine. It removes one extra statement which gets added and
then later on removed.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/111276
* gimple-fold.cc (rewrite_to_defined_overflow): Don't
add a new lhs if we already have the unsigned type.
---
 gcc/gimple-fold.cc | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index fd01810581a..2fcafeada37 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -8721,10 +8721,19 @@ rewrite_to_defined_overflow (gimple *stmt, bool 
in_place /* = false */)
op = gimple_convert (, type, op);
gimple_set_op (stmt, i, op);
   }
-  gimple_assign_set_lhs (stmt, make_ssa_name (type, stmt));
+  bool needs_cast_back = false;
+  if (!useless_type_conversion_p (type, TREE_TYPE (lhs)))
+{
+  gimple_assign_set_lhs (stmt, make_ssa_name (type, stmt));
+  needs_cast_back = true;
+}
+
   if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
 gimple_assign_set_rhs_code (stmt, PLUS_EXPR);
-  gimple_set_modified (stmt, true);
+
+  if (needs_cast_back || stmts)
+gimple_set_modified (stmt, true);
+
   if (in_place)
 {
   gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
@@ -8734,6 +8743,10 @@ rewrite_to_defined_overflow (gimple *stmt, bool in_place 
/* = false */)
 }
   else
 gimple_seq_add_stmt (, stmt);
+
+  if (!needs_cast_back)
+return stmts;
+
   gimple *cvt = gimple_build_assign (lhs, NOP_EXPR, gimple_assign_lhs (stmt));
   if (in_place)
 {
-- 
2.31.1



Re: Testsuite: fix contructor priority test

2023-09-03 Thread FX Coudert via Gcc-patches
Hi,

I was about to ping the attached patch, and realised it bordered on obvious, so 
I pushed it directly.

FX



> On 19 Aug 2023, at 22:40, FX Coudert  wrote:
> 
> Bordering on obvious, tested on darwin where the test case fails before (and 
> now passes).
> 
> OK to commit?
> FX
> 
> <0001-Testsuite-fix-contructor-priority-test.patch>


0001-Testsuite-fix-contructor-priority-test.patch
Description: Binary data