Re: [PATCH] Fix spelling of ones' complement.

2021-11-15 Thread Aldy Hernandez via Gcc-patches
On Tue, Nov 16, 2021, 03:20 Marek Polacek via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> On Tue, Nov 16, 2021 at 02:01:47AM +, Koning, Paul via Gcc-patches
> wrote:
> >
> >
> > > On Nov 15, 2021, at 8:48 PM, Marek Polacek via Gcc-patches <
> > > gcc-patches@gcc.gnu.org> wrote:
> > >
> > > Nitpicking time.  It's spelled "ones' complement" rather than "one's
> > > complement".
> >
> > Is that so?  I see Wikipedia claims it is, but there are no sources for
> > that claim.  (There is an assertion that it is "discussed at length on the
> > talk page" of an article about number representation, but in fact there is
> > no discussion there at all.)
> >
> > I have never seen this spelling before, and I very much doubt its
> > validity.  For one thing, why then have "two's complement"?  For another,
> > to pick one random authority, J.E. Thornton in "Design of a computer -- the
> > Control Data 6600" refers to "one's complement" to describe the well known
> > mode used by that machine and its relatives.
>
> Knuth, The Art of Computer Programming Volume 2, page 203-4:
>
> "A two's complement number is complemented with respect to a single
> power of 2, while a ones' complement number is complemented with respect
> to a long sequence of 1s."
>

I think you get to do a mic drop when you pull out Knuth.

:-)

>
>


[PATCH v2] configure: define TARGET_LIBC_GNUSTACK on musl

2021-11-15 Thread Ilya Lipnitskiy via Gcc-patches
musl only uses PT_GNU_STACK to set default thread stack size and has no
executable stack support[0], so there is no reason not to emit the
.note.GNU-stack section on musl builds.

[0]: 
https://lore.kernel.org/all/20190423192534.gn23...@brightrain.aerifal.cx/T/#u

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: define TARGET_LIBC_GNUSTACK on musl

Signed-off-by: Ilya Lipnitskiy 
---
 gcc/configure    | 3 +++
 gcc/configure.ac | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/gcc/configure b/gcc/configure
index 74b9d9be4c85..7091a838aefa 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -31275,6 +31275,9 @@ fi
 # Check if the target LIBC handles PT_GNU_STACK.
 gcc_cv_libc_gnustack=unknown
 case "$target" in
+  mips*-*-linux-musl*)
+gcc_cv_libc_gnustack=yes
+;;
   mips*-*-linux*)
 
 if test $glibc_version_major -gt 2 \
diff --git a/gcc/configure.ac b/gcc/configure.ac
index c9ee1fb8919e..8a2d34179a75 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -6961,6 +6961,9 @@ fi
 # Check if the target LIBC handles PT_GNU_STACK.
 gcc_cv_libc_gnustack=unknown
 case "$target" in
+  mips*-*-linux-musl*)
+gcc_cv_libc_gnustack=yes
+;;
   mips*-*-linux*)
 GCC_GLIBC_VERSION_GTE_IFELSE([2], [31], [gcc_cv_libc_gnustack=yes], )
 ;;
-- 
2.33.1



Re: [PATCH] configure: define TARGET_LIBC_GNUSTACK on musl

2021-11-15 Thread Ilya Lipnitskiy via Gcc-patches
On Mon, Nov 15, 2021 at 2:50 PM Jeff Law  wrote:
>
>
>
> On 11/15/2021 1:25 AM, Ilya Lipnitskiy via Gcc-patches wrote:
> > musl only uses PT_GNU_STACK to set default thread stack size and has no
> > executable stack support[0], so there is no reason not to emit the
> > .note.GNU-stack section on musl builds.
> >
> > [0]: 
> > https://lore.kernel.org/all/20190423192534.gn23...@brightrain.aerifal.cx/T/#u
> >
> > gcc/ChangeLog:
> >
> >   * configure: Regenerate.
> >   * configure.ac: define TARGET_LIBC_GNUSTACK on musl
> If musl has no executable stack support, then wouldn't we want this
> change to apply to all musl platforms, not just mips?
The original change was MIPS-specific[0] and TARGET_LIBC_GNUSTACK is
only used by mips code today. We could change both cases to be more
generic or keep it MIPS-specific. Dragan, what do you think?

I also need to re-spin my change: my case is a more specific match than
the existing mips*-*-linux* case but comes after it, so it never executes
with my patch.

[0]: 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=54b3d52c3cca836c7c4c08cc9c02eda6c096372a
>
> jeff
>
Ilya
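
The ordering problem mentioned above can be shown outside of configure.  A
shell case statement runs the first arm whose pattern matches, so a specific
pattern listed after a more general one is never reached.  The sketch below
models that with POSIX fnmatch(); first_match is a hypothetical helper, not
anything in GCC's configure, and the target triplets used with it are made-up
examples.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Return the index of the first pattern that matches TARGET, or -1 if
   none does.  This mirrors how a shell case statement selects an arm:
   patterns are tried in order and the first match wins.  */
int first_match(const char *const patterns[], size_t n, const char *target)
{
  for (size_t i = 0; i < n; i++)
    if (fnmatch(patterns[i], target, 0) == 0)
      return (int)i;
  return -1;
}
```

With the patterns ordered as { "mips*-*-linux*", "mips*-*-linux-musl*" }, a
target such as mips-unknown-linux-musl selects index 0 and the musl arm is
dead; putting the musl pattern first, as v2 of the patch does, lets it match.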


[PATCH] Fix PR tree-optimization/103228 and 55177: folding of (type) X op CST where type is a nop convert

2021-11-15 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

Currently we fold (type) X op CST into (type) (X op ((type-x) CST)) when the
conversion widens but not when the conversion is a nop.  We hoist the widening
conversion because doing so may let us remove an extra conversion; the same
reasoning applies when the conversion is a nop, so we should do the fold there
too.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/103228
PR tree-optimization/55177

gcc/ChangeLog:

* match.pd ((type) X bitop CST): Also do this
transformation for nop conversions.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr103228-1.c: New test.
* gcc.dg/tree-ssa/pr55177-1.c: New test.
---
 gcc/match.pd   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c | 11 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c  | 14 ++
 3 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index a0e9a82e4c4..dc3d5054583 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1615,7 +1615,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& (bitop != BIT_AND_EXPR || GIMPLE)
&& (/* That's a good idea if the conversion widens the operand, thus
  after hoisting the conversion the operation will be narrower.  */
-  TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
+  TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
   /* It's also a good idea if the conversion is to a non-integer
  mode.  */
   || GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
new file mode 100644
index 000..a7539819cf2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+int f(int a, int b)
+{
+  b|=1u;
+  b|=2;
+  return b;
+}
+/* { dg-final { scan-tree-dump-times "\\\| 3" 1 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "\\\| 1" 0 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "\\\| 2" 0 "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
new file mode 100644
index 000..de1a264345c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+extern int x;
+
+void foo(void)
+{
+  int a = __builtin_bswap32(x);
+  a &= 0x5a5b5c5d;
+  x = __builtin_bswap32(a);
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_bswap32" 0 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "& 1566333786" 1 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "& 1515936861" 0 "optimized"} } */
-- 
2.17.1
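
For readers who want to check the two new tests by hand, here is a small
sketch of the source-level equivalences they rely on.  All function names are
made up for illustration, and bswap32 is a portable stand-in for
__builtin_bswap32 rather than the builtin itself.  The first pair assumes the
usual GCC behaviour of wrapping on unsigned-to-int conversion.

```c
#include <stdint.h>

/* Shape of pr103228-1.c: the 1u operand forces an int -> unsigned -> int
   round trip (a nop conversion) between the two ORs.  Converting an
   out-of-range unsigned value back to int is implementation-defined;
   GCC wraps modulo 2^N.  */
int or_with_nop_convert(int b)
{
  b |= 1u;
  b |= 2;
  return b;
}

/* The form the test expects in the optimized dump: a single "| 3".  */
int or_folded(int b)
{
  return b | 3;
}

/* Portable stand-in for __builtin_bswap32.  */
uint32_t bswap32(uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Shape of pr55177-1.c: swap, mask, swap back.  */
uint32_t mask_between_swaps(uint32_t x)
{
  return bswap32(bswap32(x) & 0x5a5b5c5du);
}

/* Equivalent form with the constant byte-swapped instead of the value.
   0x5d5c5b5a is 1566333786 in decimal, the constant the test scans for.  */
uint32_t mask_swapped_constant(uint32_t x)
{
  return x & 0x5d5c5b5au;
}
```

Since AND is applied bytewise, swapping the value, masking, and swapping back
is the same as masking with the byte-swapped constant, which is why both
bswap calls can disappear.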



Re: [PATCH] Add a missing return transforming atomic bit test and operations

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 8:18 PM, H.J. Lu via Gcc-patches wrote:

When failing to transform equivalent, but slightly different cases of
atomic bit test and operations to their canonical forms, return
immediately.

gcc/

PR middle-end/103268
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Add a missing
return.

gcc/testsuite/

PR middle-end/103268
* gcc.dg/pr103268-1.c: New test.
* gcc.dg/pr103268-2.c: Likewise.

OK
jeff



[PATCH] Add a missing return transforming atomic bit test and operations

2021-11-15 Thread H.J. Lu via Gcc-patches
When failing to transform equivalent, but slightly different cases of
atomic bit test and operations to their canonical forms, return
immediately.

gcc/

PR middle-end/103268
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Add a missing
return.

gcc/testsuite/

PR middle-end/103268
* gcc.dg/pr103268-1.c: New test.
* gcc.dg/pr103268-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/pr103268-1.c | 10 ++
 gcc/testsuite/gcc.dg/pr103268-2.c | 12 
 gcc/tree-ssa-ccp.c                |  2 ++
 3 files changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr103268-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr103268-2.c

diff --git a/gcc/testsuite/gcc.dg/pr103268-1.c b/gcc/testsuite/gcc.dg/pr103268-1.c
new file mode 100644
index 000..6d583d55d6d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103268-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+extern int si;
+long
+test_types (void)
+{
+  unsigned int u2 = __atomic_fetch_xor (&si, 0, 5);
+  return u2;
+}
diff --git a/gcc/testsuite/gcc.dg/pr103268-2.c b/gcc/testsuite/gcc.dg/pr103268-2.c
new file mode 100644
index 000..12283bb43d9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103268-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+ 
+extern long pscc_a_2_3;
+extern int pscc_a_1_4;
+
+void
+pscc (void)
+{
+  pscc_a_1_4 = __sync_fetch_and_and (&pscc_a_2_3, 1);
+}
+
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 0666dc652d0..18d57729d8a 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -3638,6 +3638,8 @@ optimize_atomic_bit_test_and (gimple_stmt_iterator *gsip,
  use_stmt = use_nop_stmt;
}
}
+  else
+   return;
 
   if (!bit)
{
-- 
2.33.1



Re: [PATCH] tree-optimization: [PR103245] Improve detection of abs pattern using multiplication

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 7:53 PM, apinski--- via Gcc-patches wrote:

From: Andrew Pinski 

So while working on PR 103228 (and a few others), I noticed the testcase for
PR 94785 was failing. The problem is that the nop_convert moved from being
inside the IOR to being outside of it. I also noticed the patch for PR 103228
was not needed to reproduce the issue.
This patch combines the two patterns for the abs match using multiplication
and makes nop_convert optional in a few places.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/103245

gcc/ChangeLog:

* match.pd: Combine the abs pattern matching using multiplication.
Adding optional nop_convert too.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr103245-1.c: New test.

OK
jeff



[PATCH] tree-optimization: [PR103245] Improve detection of abs pattern using multiplication

2021-11-15 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

So while working on PR 103228 (and a few others), I noticed the testcase for
PR 94785 was failing. The problem is that the nop_convert moved from being
inside the IOR to being outside of it. I also noticed the patch for PR 103228
was not needed to reproduce the issue.
This patch combines the two patterns for the abs match using multiplication
and makes nop_convert optional in a few places.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/103245

gcc/ChangeLog:

* match.pd: Combine the abs pattern matching using multiplication.
Adding optional nop_convert too.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr103245-1.c: New test.
---
 gcc/match.pd   | 22 +--
 gcc/testsuite/gcc.dg/tree-ssa/pr103245-1.c | 25 ++
 2 files changed, 36 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103245-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 3b9d13aa24c..dc3d5054583 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1488,21 +1488,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (absu tree_expr_nonnegative_p@0)
  (convert @0))
 
-/* Simplify (-(X < 0) | 1) * X into abs (X).  */
+/* Simplify (-(X < 0) | 1) * X into abs (X) or absu(X).  */
 (simplify
- (mult:c (bit_ior (negate (convert? (lt @0 integer_zerop))) integer_onep) @0)
- (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))
-  (abs @0)))
-
-/* Similarly (-(X < 0) | 1U) * X into absu (X).  */
-(simplify
- (mult:c (bit_ior (nop_convert (negate (convert? (lt @0 integer_zerop
- integer_onep) (nop_convert @0))
+ (mult:c (nop_convert1?
+ (bit_ior (nop_convert2? (negate (convert? (lt @0 integer_zerop
+   integer_onep))
+(nop_convert3? @0))
  (if (INTEGRAL_TYPE_P (type)
-  && TYPE_UNSIGNED (type)
   && INTEGRAL_TYPE_P (TREE_TYPE (@0))
   && !TYPE_UNSIGNED (TREE_TYPE (@0)))
-  (absu @0)))
+  (if (TYPE_UNSIGNED (type))
+   (absu @0)
+   (abs @0)
+  )
+ )
+)
 
 /* A few cases of fold-const.c negate_expr_p predicate.  */
 (match negate_expr_p
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103245-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103245-1.c
new file mode 100644
index 000..68ddeadb799
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103245-1.c
@@ -0,0 +1,25 @@
+/* PR tree-optimization/103245 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times " = ABSU_EXPR ;" 1 "optimized" } } */
+
+unsigned
+f1 (int v)
+{
+  unsigned int d_6;
+  int b_5;
+  int a_4;
+  _Bool _1;
+  unsigned int v1_2;
+  unsigned int _7;
+  int _9;
+
+  _1 = v < 0;
+  a_4 = (int) _1;
+  b_5 = -a_4;
+  _9 = b_5 | 1;
+  d_6 = (unsigned int) _9;
+  v1_2 = (unsigned int) v;
+  _7 = v1_2 * d_6;
+  return _7;
+}
-- 
2.17.1



Re: [PATCH] PCH: Make the save and restore diagnostics more robust.

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/13/2021 6:10 AM, Iain Sandoe via Gcc-patches wrote:

When saving, if we cannot obtain a suitable memory segment there
is no point in continuing, so exit with an error.

When reading in the PCH, we have a situation that the read-in
data will replace the line tables used by the diagnostics output.
However, the state of the read-in line tables is indeterminate
at some points where diagnostics might be needed.

To make this more robust, we save the existing line tables at
the start and, once we have read in the pointer to the new one,
put that to one side and restore the original table.  This
avoids compiler hangs if the read or memory acquisition code
issues an assert, fatal_error, segv etc.

Once the read is complete, we swap in the new line table that
came from the PCH.

If the read-in PCH is corrupted then we still have a broken
compilation w.r.t any future diagnostics - but there is little
that can be done about that without more careful validation of
the file.

I've tested this by hacking and rebuilding the compiler to
produce various kinds of failure.  At present, it is hard to
see how to make testcases to do this.  Now reg-testing on more
systems.

OK for master if reg-tests pass?
thanks
Iain

Signed-off-by: Iain Sandoe 

gcc/ChangeLog:

* ggc-common.c (gt_pch_save): If we cannot find a suitable
memory segment for save, then error-out, do not try to
continue.
(gt_pch_restore): Save the existing line table, and when
the replacement is being read, use that when constructing
diagnostics.

OK if it passes.

Jeff



Re: [PATCH] Avoid pathological function redeclarations when checking access sizes [PR102759]

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 1:31 PM, Martin Sebor via Gcc-patches wrote:

Declaring a function with a prototype at block scope and
then redeclaring it without a prototype at file scope results
in losing the prototype but not the attribute access that was
implicitly added to the function decl based on the prototype.
The middle end code that checks function calls for out-of-bounds
accesses based on the attribute is unprepared for this case and
fails with an ICE.  The attached patch corrects this by having
it ignore these pathological cases.  In addition, the change
also improves the format of the informational note printed
after these warnings to reflect the form of the argument
(e.g., to print int[7] rather than int * if the former was
the form used in the declaration).

Tested on x86_64-linux.

Martin

gcc-102759.diff

Avoid pathological function redeclarations when checking access sizes [PR102759].

Resolves:
PR tree-optimization/102759 - ICE: Segmentation fault in maybe_check_access_sizes since r12-2976-gb48d4e6818674898

PR tree-optimization/102759

gcc/ChangeLog:

PR tree-optimization/102759
* gimple-array-bounds.cc (build_printable_array_type): Move...
* gimple-ssa-warn-access.cc (build_printable_array_type): Avoid
pathological function redeclarations that remove a previously
declared prototype.
Improve formatting of function arguments in informational notes.
* pointer-query.cc (build_printable_array_type): ...to here.
* pointer-query.h (build_printable_array_type): Declared.

gcc/testsuite/ChangeLog:

PR tree-optimization/102759
* gcc.dg/Warray-parameter-10.c: New test.
* gcc.dg/Wstringop-overflow-82.c: New test.

OK
jeff


Re: [PATCH] Remove MAY_HAVE_DEBUG_MARKER_STMTS and MAY_HAVE_DEBUG_BIND_STMTS.

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/12/2021 7:37 AM, Martin Liška wrote:

@Alexandre: PING

On 10/18/21 12:05, Richard Biener wrote:

On Mon, Oct 18, 2021 at 10:54 AM Martin Liška  wrote:


The macros correspond 1:1 to option flags and make it harder
to find all usages of the flags.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?


Hmm, they were introduced on purpose - since you leave around
MAY_HAVE_DEBUG_STMTS they conceptually make the code
easier to understand.

So I'm not sure if we want this change.  CCed Alex so maybe he
can weigh in.
I'd give it another 48hrs, then go ahead and commit.  My recollection is 
those were in place to allow the bulk of the work to go in independent 
of the switches that activated the SFN debugging bits.



Jeff


Re: [PATCH] Fix spelling of ones' complement.

2021-11-15 Thread Marek Polacek via Gcc-patches
On Tue, Nov 16, 2021 at 02:01:47AM +, Koning, Paul via Gcc-patches wrote:
> 
> 
> > On Nov 15, 2021, at 8:48 PM, Marek Polacek via Gcc-patches 
> >  wrote:
> > 
> > Nitpicking time.  It's spelled "ones' complement" rather than "one's
> > complement". 
> 
> Is that so?  I see Wikipedia claims it is, but there are no sources for that 
> claim.  (There is an assertion that it is "discussed at length on the talk 
> page" of an article about number representation, but in fact there is no 
> discussion there at all.)
> 
> I have never seen this spelling before, and I very much doubt its validity.  
> For one thing, why then have "two's complement"?  For another, to pick one 
> random authority, J.E. Thornton in "Design of a computer -- the Control Data 
> 6600" refers to "one's complement" to describe the well known mode used by 
> that machine and its relatives.

Knuth, The Art of Computer Programming Volume 2, page 203-4:

"A two's complement number is complemented with respect to a single
power of 2, while a ones' complement number is complemented with respect
to a long sequence of 1s."

Marek
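
To make the Knuth quote concrete, here is a small sketch for an 8-bit word.
The function names are made up for illustration.  The ones' complement is
taken with respect to a run of eight 1 bits (2^8 - 1), while the two's
complement is taken with respect to the single power of two 2^8.

```c
#include <stdint.h>

/* Ones' complement: subtract from 11111111 (binary), i.e. 2^8 - 1.
   This is the same as flipping every bit.  */
uint8_t ones_complement8(uint8_t x)
{
  return (uint8_t)(0xFFu - x);
}

/* Two's complement: subtract from 2^8, reduced modulo 2^8.  */
uint8_t twos_complement8(uint8_t x)
{
  return (uint8_t)(0x100u - x);
}
```

For any input the two results differ by exactly one (modulo 2^8), which is
the familiar "invert the bits and add one" rule for negation.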



[PATCH, rs6000] Optimization for vec_xl_sext

2021-11-15 Thread HAO CHEN GUI via Gcc-patches
Hi,

   The patch optimizes the code generation for vec_xl_sext builtin. Now all the
sign extensions are done on VSX registers directly.

   Bootstrapped and tested on powerpc64le-linux with no regressions. Is this
okay for trunk? Any recommendations? Thanks a lot.

ChangeLog

2021-11-16 Haochen Gui 

gcc/
    * config/rs6000/rs6000-call.c (altivec_expand_lxvr_builtin): Modify
    the expansion for sign extension. All extensions are done on VSX
    registers.

gcc/testsuite/
    * gcc.target/powerpc/p10_vec_xl_sext.c: New test.

patch.diff

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index b4e13af4dc6..587e9fa2a2a 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl

   if (sign_extend)
 {
-  rtx discratch = gen_reg_rtx (DImode);
+  rtx discratch = gen_reg_rtx (V2DImode);
   rtx tiscratch = gen_reg_rtx (TImode);

   /* Emit the lxvr*x insn.  */
@@ -9788,20 +9788,31 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl
    return 0;
   emit_insn (pat);

-  /* Emit a sign extension from QI,HI,WI to double (DI).  */
-  rtx scratch = gen_lowpart (smode, tiscratch);
+  /* Emit a sign extension from V16QI,V8HI,V4SI to V2DI.  */
+  rtx temp1, temp2;
   if (icode == CODE_FOR_vsx_lxvrbx)
-   emit_insn (gen_extendqidi2 (discratch, scratch));
+   {
+ temp1  = simplify_gen_subreg (V16QImode, tiscratch, TImode, 0);
+ emit_insn (gen_vsx_sign_extend_qi_v2di (discratch, temp1));
+   }
   else if (icode == CODE_FOR_vsx_lxvrhx)
-   emit_insn (gen_extendhidi2 (discratch, scratch));
+   {
+ temp1  = simplify_gen_subreg (V8HImode, tiscratch, TImode, 0);
+ emit_insn (gen_vsx_sign_extend_hi_v2di (discratch, temp1));
+   }
   else if (icode == CODE_FOR_vsx_lxvrwx)
-   emit_insn (gen_extendsidi2 (discratch, scratch));
-  /*  Assign discratch directly if scratch is already DI.  */
-  if (icode == CODE_FOR_vsx_lxvrdx)
-   discratch = scratch;
+   {
+ temp1  = simplify_gen_subreg (V4SImode, tiscratch, TImode, 0);
+ emit_insn (gen_vsx_sign_extend_si_v2di (discratch, temp1));
+   }
+  else if (icode == CODE_FOR_vsx_lxvrdx)
+   discratch = simplify_gen_subreg (V2DImode, tiscratch, TImode, 0);
+  else
+   gcc_unreachable ();

-  /* Emit the sign extension from DI (double) to TI (quad).  */
-  emit_insn (gen_extendditi2 (target, discratch));
+  /* Emit the sign extension from V2DI (double) to TI (quad).  */
+  temp2 = simplify_gen_subreg (TImode, discratch, V2DImode, 0);
+  emit_insn (gen_extendditi2_vector (target, temp2));

   return target;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
new file mode 100644
index 000..78e72ac5425
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <altivec.h>
+
+vector signed __int128
+foo1 (signed long a, signed char *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+vector signed __int128
+foo2 (signed long a, signed short *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+vector signed __int128
+foo3 (signed long a, signed int *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+vector signed __int128
+foo4 (signed long a, signed long *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+/* { dg-final { scan-assembler-times {\mvextsd2q\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */


Re: [PATCH] Fix spelling of ones' complement.

2021-11-15 Thread Koning, Paul via Gcc-patches



> On Nov 15, 2021, at 8:48 PM, Marek Polacek via Gcc-patches 
>  wrote:
> 
> Nitpicking time.  It's spelled "ones' complement" rather than "one's
> complement". 

Is that so?  I see Wikipedia claims it is, but there are no sources for that 
claim.  (There is an assertion that it is "discussed at length on the talk 
page" of an article about number representation, but in fact there is no 
discussion there at all.)

I have never seen this spelling before, and I very much doubt its validity.  
For one thing, why then have "two's complement"?  For another, to pick one 
random authority, J.E. Thornton in "Design of a computer -- the Control Data 
6600" refers to "one's complement" to describe the well known mode used by that 
machine and its relatives.

paul




[PATCH] Fix spelling of ones' complement.

2021-11-15 Thread Marek Polacek via Gcc-patches
Nitpicking time.  It's spelled "ones' complement" rather than "one's
complement".  I didn't go into config/.

Ok for trunk?

gcc/ChangeLog:

* doc/implement-c.texi: Fix spelling.
* doc/md.texi: Likewise.
* expmed.c (emit_store_flag_int): Likewise.
* optabs.c (expand_abs): Likewise.
(expand_one_cmpl_abs_nojump): Likewise.
* optabs.h (expand_abs): Likewise.
* tree-ssa-ccp.c (gimple_nop_atomic_bit_test_and_p): Likewise.
---
 gcc/doc/implement-c.texi | 2 +-
 gcc/doc/md.texi          | 2 +-
 gcc/expmed.c             | 2 +-
 gcc/optabs.c             | 4 ++--
 gcc/optabs.h             | 2 +-
 gcc/tree-ssa-ccp.c       | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/doc/implement-c.texi b/gcc/doc/implement-c.texi
index b656ac8ec4f..5dcd79004de 100644
--- a/gcc/doc/implement-c.texi
+++ b/gcc/doc/implement-c.texi
@@ -236,7 +236,7 @@ GCC does not support any extended integer types.
 
 @item
 @cite{Whether signed integer types are represented using sign and magnitude,
-two's complement, or one's complement, and whether the extraordinary value
+two's complement, or ones' complement, and whether the extraordinary value
 is a trap representation or an ordinary value (C99 and C11 6.2.6.2).}
 
 GCC supports only two's complement integer types, and all bit patterns
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 41f1850bf6e..64a4cc0834e 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -1930,7 +1930,7 @@ A 3-bit unsigned integer constant.
 A 6-bit unsigned integer constant.
 
 @item CnL
-One's complement of a 6-bit unsigned integer constant.
+Ones' complement of a 6-bit unsigned integer constant.
 
 @item CmL
 Two's complement of a 6-bit unsigned integer constant.
diff --git a/gcc/expmed.c b/gcc/expmed.c
index 4abce11b647..6f19da6ef92 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -5940,7 +5940,7 @@ emit_store_flag_int (rtx target, rtx subtarget, enum rtx_code code, rtx op0,
}
 
   /* If we couldn't do it that way, for NE we can "or" the two's complement
-of the value with itself.  For EQ, we take the one's complement of
+of the value with itself.  For EQ, we take the ones' complement of
 that "or", which is an extra insn, so we only handle EQ if branches
 are expensive.  */
 
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 019bbb62882..d3d801fb96d 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -3742,7 +3742,7 @@ expand_abs (machine_mode mode, rtx op0, rtx target,
   return target;
 }
 
-/* Emit code to compute the one's complement absolute value of OP0
+/* Emit code to compute the ones' complement absolute value of OP0
(if (OP0 < 0) OP0 = ~OP0), with result to TARGET if convenient.
(TARGET may be NULL_RTX.)  The return value says where the result
actually is to be found.
@@ -3775,7 +3775,7 @@ expand_one_cmpl_abs_nojump (machine_mode mode, rtx op0, rtx target)
   delete_insns_since (last);
 }
 
-  /* If this machine has expensive jumps, we can do one's complement
+  /* If this machine has expensive jumps, we can do ones' complement
  absolute value of X as (((signed) x >> (W-1)) ^ x).  */
 
   scalar_int_mode int_mode;
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 3bbceff92d9..9ed96944ebe 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -219,7 +219,7 @@ extern rtx expand_unop (machine_mode, optab, rtx, rtx, int);
 extern rtx expand_abs_nojump (machine_mode, rtx, rtx, int);
 extern rtx expand_abs (machine_mode, rtx, rtx, int, int);
 
-/* Expand the one's complement absolute value operation.  */
+/* Expand the ones' complement absolute value operation.  */
 extern rtx expand_one_cmpl_abs_nojump (machine_mode, rtx, rtx);
 
 /* Expand the copysign operation.  */
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 0666dc652d0..8d3f68b1367 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -3341,7 +3341,7 @@ extern bool gimple_nop_atomic_bit_test_and_p (tree, tree *,
in there), and/or if mask_2 is a power of 2 constant.
Similarly for xor instead of or, use ATOMIC_BIT_TEST_AND_COMPLEMENT
in that case.  And similarly for and instead of or, except that
-   the second argument to the builtin needs to be one's complement
+   the second argument to the builtin needs to be ones' complement
of the mask instead of mask.  */
 
 static void

base-commit: a031aaa2ac9d4c74994df085a0d8c79bd55792c9
-- 
2.33.1



[PATCH] Update my email address.

2021-11-15 Thread Jim Wilson via Gcc-patches
I've left SiFive and have a new gmail account because it is convenient
to use with git send-email.  I'm planning to use this for my RISC-V
work.  My tuliptree address still works, it just isn't as convenient.

* MAINTAINERS: Update my address.
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4f610bca9c0..1b70fce3784 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -104,7 +104,7 @@ pru portDimitar Dimitrov

 riscv port Kito Cheng  
 riscv port Palmer Dabbelt  
 riscv port Andrew Waterman 
-riscv port Jim Wilson  
+riscv port Jim Wilson  
 rs6000/powerpc portDavid Edelsohn  
 rs6000/powerpc portSegher Boessenkool  
 rs6000 vector extnsAldy Hernandez  
-- 
2.25.1



Re: [PATCH] c++: designated init of char array by string constant [PR55227]

2021-11-15 Thread Marek Polacek via Gcc-patches
Hi,

thanks for the patch and sorry for the delay in reviewing.

On Sat, Nov 06, 2021 at 08:17:23PM -0400, Will Wray via Gcc-patches wrote:
> This patch aims to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55227.
> 
> There are two underlying bugs in the designated initialization of char array
> fields by string literals that cause:
> 
> (1) Rejection of valid cases with:
>   (a) brace-enclosed string literal initializer (of any valid size), or
>   (b) unbraced string literal shorter than the target char array field.
> 
> (2) Acceptance of invalid cases with designators appearing within the braces
> of a braced string literal, in which case the bogus 'designator' was being
> entirely ignored and the string literal treated as a positional initializer.

I also noticed the C++ FE rejects

  struct A { char x[4]; };
  struct B { struct A a; };
  struct B b = { .a.x = "abc" };
 
but the C FE accepts it.  But that's for another time.
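
For reference, here is how the forms under discussion look when compiled as
C rather than C++.  This is only a sketch: the variable names are made up,
and the last two declarations are C spellings of cases (1)(a) and (1)(b)
from the patch description, not copies of the new desig20.C test.

```c
struct A { char x[4]; };
struct B { struct A a; };

/* Nested designator with an unbraced string literal: the case mentioned
   above that the C front end accepts.  */
struct B b_nested = { .a.x = "abc" };

/* Case (1)(a): a brace-enclosed string literal as the initializer of a
   designated char array field.  */
struct A a_braced = { .x = { "abc" } };

/* Case (1)(b): an unbraced string literal shorter than the array; the
   remaining elements are zero-initialized.  */
struct A a_short = { .x = "ab" };
```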

> Please review these changes carefully; there are likely errors of omission,
> logic or an anon anomaly.
> 
> The fixes above make it possible to address a FIXME in cp_complete_array_type:
> 
>   /* FIXME: this code is duplicated from reshape_init.
>  Probably we should just call reshape_init here?  */
> 
> I believe that this was obstructed by the designator bugs (see comment here
> https://patchwork.ozlabs.org/project/gcc/list/?series=199783)
> 
> Bootstraps/regtests on x86_64-pc-linux-gnu.
> 
> PR c++/55227
> 
> gcc/cp/ChangeLog:
> 
> * decl.c (reshape_init_r): restrict has_designator_check,
  
I think something like "Only call has_designator_check when first_initializer_p
or for the inner constructor element." would be better.

> (cp_complete_array_type): do reshape_init on braced-init-list.

s/do/Do/

> gcc/testsuite/ChangeLog:
> 
> * g++.dg/cpp2a/desig20.C: New test.
> ---
>  gcc/cp/decl.c| 28 ++--
>  gcc/testsuite/g++.dg/cpp2a/desig20.C | 48 
>  2 files changed, 57 insertions(+), 19 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/desig20.C
> 
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 947bbfc6637..f01655c5c14 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -6820,6 +6820,7 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p,
>  {
>tree str_init = init;
>tree stripped_str_init = stripped_init;
> +  reshape_iter stripd = {};

Since the previous variables spell it "stripped" maybe call it stripped_iter.
  
>/* Strip one level of braces if and only if they enclose a single
>element (as allowed by [dcl.init.string]).  */
> @@ -6827,7 +6828,8 @@ reshape_init_r (tree type, reshape_iter *d, tree 
> first_initializer_p,
> && TREE_CODE (stripped_str_init) == CONSTRUCTOR
> && CONSTRUCTOR_NELTS (stripped_str_init) == 1)
>   {
> -   str_init = (*CONSTRUCTOR_ELTS (stripped_str_init))[0].value;
> +   stripd.cur = CONSTRUCTOR_ELT (stripped_str_init, 0);
> +   str_init = stripd.cur->value;
> stripped_str_init = tree_strip_any_location_wrapper (str_init);
>   }
>  
> @@ -6836,7 +6838,8 @@ reshape_init_r (tree type, reshape_iter *d, tree 
> first_initializer_p,
>array types (one value per array element).  */
>if (TREE_CODE (stripped_str_init) == STRING_CST)
>   {
> -   if (has_designator_problem (d, complain))

So the logic here is that...

> +   if ((first_initializer_p && has_designator_problem (d, complain))

this will complain about a designator in the outermost { }:

  char arr[] = { [0] = "foo" };

and...

> +   || (stripd.cur && has_designator_problem (&stripd, complain)))

this checks the { } which is at least one level deep, contains a STRING_CST,
and this line makes sure that we don't have a designator that would refer to
some element of the array we're initializing with the STRING_CST, yes?  I.e.,
something like

  struct S { char arr[10]; };
  S s = { { [2] = "lol" } };

>   return error_mark_node;
> d->cur++;
> return str_init;
> @@ -9538,23 +9541,10 @@ cp_complete_array_type (tree *ptype, tree 
> initial_value, bool do_default)
>  
>if (initial_value)
>  {
> -  /* An array of character type can be initialized from a
> -  brace-enclosed string constant.

Nice to finally remove this, but let's keep this part of the comment.

> -  FIXME: this code is duplicated from reshape_init. Probably
> -  we should just call reshape_init here?  */
> -  if (char_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (*ptype)))
> -   && TREE_CODE (initial_value) == CONSTRUCTOR
> -   && !vec_safe_is_empty (CONSTRUCTOR_ELTS (initial_value)))
> - {
> -   vec<constructor_elt, va_gc> *v = CONSTRUCTOR_ELTS (initial_value);
> -   tree value = (*v)[0].value;
> -   STRIP_ANY_LOCATION_WRAPPER (value);
> -
> -   if (TREE_CODE (value) == STRING_CST
> -   && v->length () == 1)
> - 

[PATCH] libstdc++: Merge latest Ryu sources

2021-11-15 Thread Patrick Palka via Gcc-patches
The only source change is a speedup to pow5Factor.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

libstdc++-v3/ChangeLog:

* src/c++17/ryu/MERGE: Update the commit hash.
* src/c++17/ryu/d2s_intrinsics.h: Merge from Ryu's master
branch.
---
 libstdc++-v3/src/c++17/ryu/MERGE| 2 +-
 libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h | 9 -
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/src/c++17/ryu/MERGE b/libstdc++-v3/src/c++17/ryu/MERGE
index 0ea65add2ff..9035502846e 100644
--- a/libstdc++-v3/src/c++17/ryu/MERGE
+++ b/libstdc++-v3/src/c++17/ryu/MERGE
@@ -1,4 +1,4 @@
-22c887c017a8512abcd4a8f6bfcdb2742829cd7d
+150d0c87830756d34e76c42f7f33f811d89903a8
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
diff --git a/libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h 
b/libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h
index bbac4dfd48f..e3206b5d7dd 100644
--- a/libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h
+++ b/libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h
@@ -181,15 +181,14 @@ static inline uint32_t mod1e9(const uint64_t x) {
 #endif // defined(RYU_32_BIT_PLATFORM)
 
 static inline uint32_t pow5Factor(uint64_t value) {
+  const uint64_t m_inv_5 = 14757395258967641293u; // 5 * m_inv_5 = 1 (mod 2^64)
+  const uint64_t n_div_5 = 3689348814741910323u;  // #{ n | n = 0 (mod 2^64) } 
= 2^64 / 5
   uint32_t count = 0;
   for (;;) {
 assert(value != 0);
-const uint64_t q = div5(value);
-const uint32_t r = ((uint32_t) value) - 5 * ((uint32_t) q);
-if (r != 0) {
+value *= m_inv_5;
+if (value > n_div_5)
   break;
-}
-value = q;
 ++count;
   }
   return count;
-- 
2.34.0
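
For readers following the pow5Factor change: the new loop appears to replace
the divide-and-check-remainder step with a multiplication by the inverse of 5
modulo 2^64.  Below is a standalone sketch of that idea, assuming 64-bit
wraparound arithmetic.  The name pow5_factor and the constexpr form are
illustrative only and are not part of Ryu or of this patch.

```cpp
#include <cassert>
#include <cstdint>

// Sketch: count how many times 5 divides `value` (which must be nonzero),
// using a multiply-by-inverse test instead of a division per iteration.
constexpr std::uint32_t pow5_factor(std::uint64_t value)
{
  // 5 * m_inv_5 == 1 (mod 2^64), i.e. m_inv_5 is the inverse of 5.
  constexpr std::uint64_t m_inv_5 = 14757395258967641293u;
  // (2^64 - 1) / 5: the largest quotient a 64-bit multiple of 5 can have.
  constexpr std::uint64_t n_div_5 = 3689348814741910323u;

  assert(value != 0);  // zero would never leave the loop
  std::uint32_t count = 0;
  for (;;)
    {
      value *= m_inv_5;      // wraps modulo 2^64
      if (value > n_div_5)   // previous value was not a multiple of 5
        break;
      ++count;               // previous value was 5 * (new value)
    }
  return count;
}
```

If value is a multiple of 5, value * m_inv_5 wraps to exactly value / 5, which
is at most (2^64 - 1) / 5.  Multiplication by an invertible constant is a
bijection modulo 2^64, so every non-multiple lands above that bound, and a
single compare replaces the remainder computation.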



[PATCH 1/5] libstdc++: Import the fast_float library

2021-11-15 Thread Patrick Palka via Gcc-patches
This copies the fast_float library[1] into the compiled-in library
sources.  We're going to use this library in our floating-point
std::from_chars implementation for faster and more portable parsing of
binary32/64 decimal strings.

[1]: https://github.com/fastfloat/fast_float

Series tested on x86_64, i686, ppc64, ppc64le and aarch64, does it
look OK for trunk?

libstdc++-v3/ChangeLog:

* src/c++17/fast_float/LICENSE-APACHE: New file.
* src/c++17/fast_float/LICENSE-MIT: New file.
* src/c++17/fast_float/LOCAL_PATCHES: New file.
* src/c++17/fast_float/MERGE: New file.
* src/c++17/fast_float/ascii_number.h,
src/c++17/fast_float/bigint.h,
src/c++17/fast_float/decimal_to_binary.h,
src/c++17/fast_float/digit_comparison.h,
src/c++17/fast_float/fast_float.h,
src/c++17/fast_float/fast_table.h,
src/c++17/fast_float/float_common.h,
src/c++17/fast_float/parse_number.h: Import these files from the
fast_float library.
---
 .../src/c++17/fast_float/LICENSE-APACHE   | 190 +
 libstdc++-v3/src/c++17/fast_float/LICENSE-MIT |  27 +
 .../src/c++17/fast_float/LOCAL_PATCHES|   0
 libstdc++-v3/src/c++17/fast_float/MERGE   |   4 +
 .../src/c++17/fast_float/ascii_number.h   | 231 ++
 libstdc++-v3/src/c++17/fast_float/bigint.h| 590 +++
 .../src/c++17/fast_float/decimal_to_binary.h  | 194 +
 .../src/c++17/fast_float/digit_comparison.h   | 423 +++
 .../src/c++17/fast_float/fast_float.h |  63 ++
 .../src/c++17/fast_float/fast_table.h | 699 ++
 .../src/c++17/fast_float/float_common.h   | 362 +
 .../src/c++17/fast_float/parse_number.h   | 113 +++
 12 files changed, 2896 insertions(+)
 create mode 100644 libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
 create mode 100644 libstdc++-v3/src/c++17/fast_float/LICENSE-MIT
 create mode 100644 libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
 create mode 100644 libstdc++-v3/src/c++17/fast_float/MERGE
 create mode 100644 libstdc++-v3/src/c++17/fast_float/ascii_number.h
 create mode 100644 libstdc++-v3/src/c++17/fast_float/bigint.h
 create mode 100644 libstdc++-v3/src/c++17/fast_float/decimal_to_binary.h
 create mode 100644 libstdc++-v3/src/c++17/fast_float/digit_comparison.h
 create mode 100644 libstdc++-v3/src/c++17/fast_float/fast_float.h
 create mode 100644 libstdc++-v3/src/c++17/fast_float/fast_table.h
 create mode 100644 libstdc++-v3/src/c++17/fast_float/float_common.h
 create mode 100644 libstdc++-v3/src/c++17/fast_float/parse_number.h

diff --git a/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE 
b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
new file mode 100644
index 000..26f4398f249
--- /dev/null
+++ b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
@@ -0,0 +1,190 @@
+ Apache License
+   Version 2.0, January 2004
+http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+  "License" shall mean the terms and conditions for use, reproduction,
+  and distribution as defined by Sections 1 through 9 of this document.
+
+  "Licensor" shall mean the copyright owner or entity authorized by
+  the copyright owner that is granting the License.
+
+  "Legal Entity" shall mean the union of the acting entity and all
+  other entities that control, are controlled by, or are under common
+  control with that entity. For the purposes of this definition,
+  "control" means (i) the power, direct or indirect, to cause the
+  direction or management of such entity, whether by contract or
+  otherwise, or (ii) ownership of fifty percent (50%) or more of the
+  outstanding shares, or (iii) beneficial ownership of such entity.
+
+  "You" (or "Your") shall mean an individual or Legal Entity
+  exercising permissions granted by this License.
+
+  "Source" form shall mean the preferred form for making modifications,
+  including but not limited to software source code, documentation
+  source, and configuration files.
+
+  "Object" form shall mean any form resulting from mechanical
+  transformation or translation of a Source form, including but
+  not limited to compiled object code, generated documentation,
+  and conversions to other media types.
+
+  "Work" shall mean the work of authorship, whether in Source or
+  Object form, made available under the License, as indicated by a
+  copyright notice that is included in or attached to the work
+  (an example is provided in the Appendix below).
+
+  "Derivative Works" shall mean any work, whether in Source or Object
+  form, that is based on (or derived from) the Work and for which the
+  editorial revisions, annotations, elaborations, or other modifications
+  represent, as a whole, an 

[PATCH 4/5] libstdc++: Use fast_float in std::from_chars for binary32/64

2021-11-15 Thread Patrick Palka via Gcc-patches
This makes our std::from_chars implementation use fast_float for parsing
chars_format::scientific/fixed/general input into binary32/64 values.
For chars_format::hex (and long double) we still use the fallback
implementation that goes through the strtod family of functions.

libstdc++-v3/ChangeLog:

* src/c++17/floating_from_chars.cc (USE_LIB_FAST_FLOAT):
Conditionally define, and use it to conditionally include
fast_float.
(from_chars): Use fast_float for float and double when
USE_LIB_FAST_FLOAT.
---
 libstdc++-v3/src/c++17/floating_from_chars.cc | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/libstdc++-v3/src/c++17/floating_from_chars.cc 
b/libstdc++-v3/src/c++17/floating_from_chars.cc
index aa074869872..13834f54d38 100644
--- a/libstdc++-v3/src/c++17/floating_from_chars.cc
+++ b/libstdc++-v3/src/c++17/floating_from_chars.cc
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -52,6 +53,18 @@
 extern "C" __ieee128 __strtoieee128(const char*, char**);
 #endif
 
+#if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64
+# define USE_LIB_FAST_FLOAT 1
+#endif
+
+#if USE_LIB_FAST_FLOAT
+# define FASTFLOAT_DEBUG_ASSERT __glibcxx_assert
+namespace
+{
+# include "fast_float/fast_float.h"
+} // anon namespace
+#endif
+
 #if _GLIBCXX_HAVE_USELOCALE
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -406,6 +419,11 @@ from_chars_result
 from_chars(const char* first, const char* last, float& value,
   chars_format fmt) noexcept
 {
+#if USE_LIB_FAST_FLOAT
+  if (fmt != chars_format::hex)
+return fast_float::from_chars(first, last, value, fmt);
+#endif
+
   errc ec = errc::invalid_argument;
 #if _GLIBCXX_USE_CXX11_ABI
   buffer_resource mr;
@@ -432,6 +450,11 @@ from_chars_result
 from_chars(const char* first, const char* last, double& value,
   chars_format fmt) noexcept
 {
+#if USE_LIB_FAST_FLOAT
+  if (fmt != chars_format::hex)
+return fast_float::from_chars(first, last, value, fmt);
+#endif
+
   errc ec = errc::invalid_argument;
 #if _GLIBCXX_USE_CXX11_ABI
   buffer_resource mr;
-- 
2.34.0



[PATCH 3/5] libstdc++: Adjust fast_float's over/underflow behavior for conformance

2021-11-15 Thread Patrick Palka via Gcc-patches
This makes fast_float handle the situation where std::from_chars is
specified to return result_out_of_range, i.e. when the parsed value
is outside the representable range of the floating-point type.

libstdc++-v3/ChangeLog:

* src/c++17/fast_float/LOCAL_PATCHES: Update.
* src/c++17/fast_float/parse_number.h (from_chars_advanced): In
case of over/underflow, return errc::result_out_of_range and don't
modify 'value'.
---
 libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES  |  1 +
 libstdc++-v3/src/c++17/fast_float/parse_number.h | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES 
b/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
index e9d7bba6195..1f90f9d1d85 100644
--- a/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
+++ b/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
@@ -1 +1,2 @@
 r12-
+r12-
diff --git a/libstdc++-v3/src/c++17/fast_float/parse_number.h 
b/libstdc++-v3/src/c++17/fast_float/parse_number.h
index 86dea2287b4..57b3585c2fe 100644
--- a/libstdc++-v3/src/c++17/fast_float/parse_number.h
+++ b/libstdc++-v3/src/c++17/fast_float/parse_number.h
@@ -99,6 +99,16 @@ from_chars_result from_chars_advanced(const char *first, 
const char *last,
   // If we called compute_float<binary_format<T>>(pns.exponent, pns.mantissa) 
and we have an invalid power (am.power2 < 0),
   // then we need to go the long way around again. This is very uncommon.
   if(am.power2 < 0) { am = digit_comp(pns, am); }
+
+  if((pns.mantissa != 0 && am.mantissa == 0 && am.power2 == 0) || am.power2 == 
binary_format<T>::infinite_power()) {
+// In case of over/underflow, return result_out_of_range and don't modify 
value,
+// as per [charconv.from.chars]/1.
+//
+// If LWG 3081 gets adopted, then we'll need to call to_float in this case 
too.
+answer.ec = std::errc::result_out_of_range;
+return answer;
+  }
+
   to_float(pns.negative, am, value);
   return answer;
 }
-- 
2.34.0
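
The user-visible effect of the over/underflow handling can be sketched with
the public std::from_chars interface.  This is a minimal illustration, not
part of the patch; it assumes a standard library with floating-point
std::from_chars support, and the names parse_outcome and parse_double are
hypothetical helpers.

```cpp
#include <cassert>
#include <charconv>
#include <cstring>
#include <system_error>

// Hypothetical helper type: the error code plus whatever ended up in the
// output variable after the call.
struct parse_outcome
{
  std::errc ec;
  double value;
};

// Parse a NUL-terminated string as a double.  `sentinel` is stored in the
// output variable beforehand, so the caller can see whether it was modified.
inline parse_outcome parse_double(const char* text, double sentinel)
{
  assert(text != nullptr);
  double value = sentinel;
  const char* last = text + std::strlen(text);
  const std::from_chars_result res
    = std::from_chars(text, last, value, std::chars_format::general);
  return { res.ec, value };
}
```

With this helper, an input such as "1e9999" is expected to yield
std::errc::result_out_of_range with the sentinel left in place, while "1.5"
parses normally.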



[PATCH 2/5] libstdc++: Apply modifications to our local copy of fast_float

2021-11-15 Thread Patrick Palka via Gcc-patches
This performs the following modifications to our local copy of fast_float
in order to make it more readily usable in our std::from_chars
implementation:

  * Remove system #includes
  * Replace stray call to assert
  * Use the standard library chars_format and from_chars_result types

libstdc++-v3/ChangeLog:

* src/c++17/fast_float/LOCAL_PATCHES: Update.
* src/c++17/fast_float/ascii_number.h,
src/c++17/fast_float/bigint.h,
src/c++17/fast_float/decimal_to_binary.h,
src/c++17/fast_float/digit_comparison.h,
src/c++17/fast_float/fast_float.h,
src/c++17/fast_float/fast_table.h,
src/c++17/fast_float/float_common.h,
src/c++17/fast_float/parse_number.h: Apply local modifications.
---
 libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES   |  1 +
 libstdc++-v3/src/c++17/fast_float/ascii_number.h  | 11 +++
 libstdc++-v3/src/c++17/fast_float/bigint.h|  5 -
 .../src/c++17/fast_float/decimal_to_binary.h  |  6 --
 .../src/c++17/fast_float/digit_comparison.h   |  5 -
 libstdc++-v3/src/c++17/fast_float/fast_float.h| 15 ++-
 libstdc++-v3/src/c++17/fast_float/fast_table.h|  2 --
 libstdc++-v3/src/c++17/fast_float/float_common.h  |  7 +--
 libstdc++-v3/src/c++17/fast_float/parse_number.h  |  5 -
 9 files changed, 7 insertions(+), 50 deletions(-)

diff --git a/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES 
b/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
index e69de29bb2d..e9d7bba6195 100644
--- a/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
+++ b/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
@@ -0,0 +1 @@
+r12-
diff --git a/libstdc++-v3/src/c++17/fast_float/ascii_number.h 
b/libstdc++-v3/src/c++17/fast_float/ascii_number.h
index 3e6bb3e9ef3..c2f69326b2f 100644
--- a/libstdc++-v3/src/c++17/fast_float/ascii_number.h
+++ b/libstdc++-v3/src/c++17/fast_float/ascii_number.h
@@ -1,11 +1,6 @@
 #ifndef FASTFLOAT_ASCII_NUMBER_H
 #define FASTFLOAT_ASCII_NUMBER_H
 
-#include 
-#include 
-#include 
-#include 
-
 #include "float_common.h"
 
 namespace fast_float {
@@ -144,7 +139,7 @@ parsed_number_string parse_number_string(const char *p, 
const char *pend, parse_
 return answer;
   }
   int64_t exp_number = 0;// explicit exponential part
-  if ((fmt & chars_format::scientific) && (p != pend) && (('e' == *p) || ('E' 
== *p))) {
+  if (bool(fmt & chars_format::scientific) && (p != pend) && (('e' == *p) || 
('E' == *p))) {
 const char * location_of_e = p;
 ++p;
 bool neg_exp = false;
@@ -155,7 +150,7 @@ parsed_number_string parse_number_string(const char *p, 
const char *pend, parse_
   ++p;
 }
 if ((p == pend) || !is_integer(*p)) {
-  if(!(fmt & chars_format::fixed)) {
+  if(!bool(fmt & chars_format::fixed)) {
 // We are in error.
 return answer;
   }
@@ -174,7 +169,7 @@ parsed_number_string parse_number_string(const char *p, 
const char *pend, parse_
 }
   } else {
 // If it scientific and not fixed, we have to bail out.
-if((fmt & chars_format::scientific) && !(fmt & chars_format::fixed)) { 
return answer; }
+if(bool(fmt & chars_format::scientific) && !bool(fmt & 
chars_format::fixed)) { return answer; }
   }
   answer.lastmatch = p;
   answer.valid = true;
diff --git a/libstdc++-v3/src/c++17/fast_float/bigint.h 
b/libstdc++-v3/src/c++17/fast_float/bigint.h
index b56cb9b03b3..5c9552cab4c 100644
--- a/libstdc++-v3/src/c++17/fast_float/bigint.h
+++ b/libstdc++-v3/src/c++17/fast_float/bigint.h
@@ -1,11 +1,6 @@
 #ifndef FASTFLOAT_BIGINT_H
 #define FASTFLOAT_BIGINT_H
 
-#include 
-#include 
-#include 
-#include 
-
 #include "float_common.h"
 
 namespace fast_float {
diff --git a/libstdc++-v3/src/c++17/fast_float/decimal_to_binary.h 
b/libstdc++-v3/src/c++17/fast_float/decimal_to_binary.h
index 6da6c66a3ac..26343c4cd20 100644
--- a/libstdc++-v3/src/c++17/fast_float/decimal_to_binary.h
+++ b/libstdc++-v3/src/c++17/fast_float/decimal_to_binary.h
@@ -3,12 +3,6 @@
 
 #include "float_common.h"
 #include "fast_table.h"
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
 
 namespace fast_float {
 
diff --git a/libstdc++-v3/src/c++17/fast_float/digit_comparison.h 
b/libstdc++-v3/src/c++17/fast_float/digit_comparison.h
index 7ffe874303b..4af465420c9 100644
--- a/libstdc++-v3/src/c++17/fast_float/digit_comparison.h
+++ b/libstdc++-v3/src/c++17/fast_float/digit_comparison.h
@@ -1,11 +1,6 @@
 #ifndef FASTFLOAT_DIGIT_COMPARISON_H
 #define FASTFLOAT_DIGIT_COMPARISON_H
 
-#include 
-#include 
-#include 
-#include 
-
 #include "float_common.h"
 #include "bigint.h"
 #include "ascii_number.h"
diff --git a/libstdc++-v3/src/c++17/fast_float/fast_float.h 
b/libstdc++-v3/src/c++17/fast_float/fast_float.h
index 3c483803af3..a4b967f5dd7 100644
--- a/libstdc++-v3/src/c++17/fast_float/fast_float.h
+++ b/libstdc++-v3/src/c++17/fast_float/fast_float.h
@@ -1,21 +1,10 @@
 #ifndef FASTFLOAT_FAST_FLOAT_H
 #define FASTFLOAT_FAST_FLOAT_H
 

Re: [PATCH 1/4] Generate off-stack nested function trampolines

2021-11-15 Thread Joseph Myers
On Sat, 13 Nov 2021, Maxim Blinov wrote:

> the target specifically requires it, or you manually provide
> --enable-off-stack-trampolines when configuring gcc/libgcc.

If you're adding a new configure option, it needs documenting in 
install.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Combine malloc + memset to calloc

2021-11-15 Thread Iain Buclaw via Gcc-patches
Excerpts from Seija K. via Gcc-patches's message of November 12, 2021 9:29 pm:
> diff --git a/gcc/d/dmd/ctfeexpr.c b/gcc/d/dmd/ctfeexpr.c
> index a8e97833ad0..401ed748f43 100644
> --- a/gcc/d/dmd/ctfeexpr.c
> +++ b/gcc/d/dmd/ctfeexpr.c
> @@ -1350,8 +1350,7 @@ int ctfeRawCmp(Loc loc, Expression *e1, Expression
> *e2)
>  if (es2->keys->length != dim)
>  return 1;
> 
> -bool *used = (bool *)mem.xmalloc(sizeof(bool) * dim);
> -memset(used, 0, sizeof(bool) * dim);
> +bool *used = (bool *)mem.xcalloc(dim, sizeof(bool));
> 
>  for (size_t i = 0; i < dim; ++i)
>  {

Hi,

Thanks, however all changes to the dmd front-end should go through
upstream first.  But as this file is about to be dropped, I don't
immediately see a need to keep this part in the patch.

Iain.
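
For reference, the transformation under discussion can be sketched with the
standard allocation functions.  This is only an illustration: the dmd code
uses mem.xmalloc/mem.xcalloc, whose failure behavior is not shown here, and
the helper names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Two-step form: allocate, then clear the buffer by hand.
inline bool* make_flags_malloc(std::size_t dim)
{
  bool* used = static_cast<bool*>(std::malloc(sizeof(bool) * dim));
  if (used != nullptr)
    std::memset(used, 0, sizeof(bool) * dim);
  return used;
}

// One-step form: calloc returns zero-initialized storage and takes the
// element count and element size as separate arguments.
inline bool* make_flags_calloc(std::size_t dim)
{
  return static_cast<bool*>(std::calloc(dim, sizeof(bool)));
}

// True if every flag in the buffer is false.
inline bool all_clear(const bool* flags, std::size_t dim)
{
  assert(flags != nullptr);
  for (std::size_t i = 0; i < dim; ++i)
    if (flags[i])
      return false;
  return true;
}
```

Both helpers hand back a buffer of dim cleared flags; the caller releases it
with std::free.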


[pushed] c++: split_nonconstant_init and flexarrays

2021-11-15 Thread Jason Merrill via Gcc-patches
split_nonconstant_init was doing the wrong thing for both the initialization
and cleanup here; we know the size from the initializer, and we can pass it
along.  This doesn't make the testcase work, since the y destructor is still
broken, but it removes the wrong error for the aggregate initialization.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* typeck2.c (split_nonconstant_init_1): Handle flexarrays better.

gcc/testsuite/ChangeLog:

* g++.dg/ext/flexary37.C: Remove expected error.
---
 gcc/cp/typeck2.c | 9 +
 gcc/testsuite/g++.dg/ext/flexary37.C | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index c01f2f8ced4..e98fbf7f5fa 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -484,6 +484,15 @@ split_nonconstant_init_1 (tree dest, tree init, bool 
nested)
   && TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type))
  || vla_type_p (type))
{
+ if (!TYPE_DOMAIN (type)
+ && TREE_CODE (init) == CONSTRUCTOR
+ && CONSTRUCTOR_NELTS (init))
+   {
+ /* Flexible array.  */
+ cp_complete_array_type (, init, /*default*/true);
+ dest = build1 (VIEW_CONVERT_EXPR, type, dest);
+   }
+
  /* For an array, we only need/want a single cleanup region rather
 than one per element.  */
  tree code = build_vec_init (dest, NULL_TREE, init, false, 1,
diff --git a/gcc/testsuite/g++.dg/ext/flexary37.C 
b/gcc/testsuite/g++.dg/ext/flexary37.C
index ceb5053de2e..5cd48c1f773 100644
--- a/gcc/testsuite/g++.dg/ext/flexary37.C
+++ b/gcc/testsuite/g++.dg/ext/flexary37.C
@@ -12,4 +12,4 @@ public:
 
 struct y { // { dg-error "unknown array size in delete" }
 int a; C b[];
-} y = { 1, { { 2, 3 } } }; // { dg-error "unknown array size in delete" }
+} y = { 1, { { 2, 3 } } };

base-commit: 323026c7dfe23e1093e80f7db5f4851d1a867b62
-- 
2.27.0



Re: [PATCH] c++: __builtin_bit_cast To C array target type [PR103140]

2021-11-15 Thread will wray via Gcc-patches
Yes - direct use of any builtin is not to be encouraged, in user code.

This __builtin_bit_cast patch is intended to encourage experimentation
with array copy semantics now, on trunk, in preparation for P1997.

The builtin bit_cast is strictly more powerful than the std::bit_cast
library function that it helps implement, is available in any -std mode
and might also be useful in C, independent of any standardization effort.

The semantics of bit_cast are clear - it's just that the resulting rvalue array
itself is unfamiliar and tricky to handle within current language rules.

On Mon, Nov 15, 2021 at 12:21 PM Jakub Jelinek  wrote:
>
> On Mon, Nov 15, 2021 at 12:12:22PM -0500, will wray via Gcc-patches wrote:
> > One motivation for allowing builtin bit_cast to builtin array is that
> > it enables direct bitwise constexpr comparisons via memcmp:
> >
> > template<class A, class B>
> > constexpr int bit_equal(A const& a, B const& b)
> > {
> >   static_assert( sizeof a == sizeof b,
> >   "bit_equal(a,b) requires same sizeof" );
> >   using bytes = unsigned char[sizeof(A)];
> >   return __builtin_memcmp(
> >  __builtin_bit_cast(bytes,a),
> >  __builtin_bit_cast(bytes,b),
> >  sizeof(A)) == 0;
> > }
>
> IMNSHO people shouldn't use this builtin directly, and we shouldn't
> encourage such uses, the standard interface is std::bit_cast.
>
> For the above, I don't see a reason to do it that way, you can
> instead portably:
>   struct bytes { unsigned char data[sizeof(A)]; };
>   bytes ab = std::bit_cast<bytes>(a);
>   bytes bb = std::bit_cast<bytes>(b);
>   for (size_t i = 0; i < sizeof(A); ++i)
> if (ab.data[i] != bb.data[i])
>   return false;
>   return true;
> - __builtin_memcmp isn't portable either and memcmp isn't constexpr.
>
> If P1997 is in, it is easy to support it in std::bit_cast and easy to
> explain what __builtin_bit_cast does for array types, but otherwise
> it is quite unclear what it exactly does...
>
> Jakub
>
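
A self-contained sketch of the struct-wrapper approach suggested above, for
anyone wanting to try it.  The names byte_image and bits_of are hypothetical.
The fallback to __builtin_bit_cast outside C++20 mode is an assumption made
only so that the sketch compiles in older -std modes; std::bit_cast remains
the standard interface.  Types with padding bits are not handled.

```cpp
#include <bit>      // std::bit_cast when compiled as C++20
#include <cstddef>

// Class-type wrapper, so the cast target is not a bare array type.
template <std::size_t N>
struct byte_image
{
  unsigned char data[N];
};

// Thin shim: prefer the standard function, fall back to the builtin
// (assumption: a GCC or Clang new enough to provide it).
template <class To, class From>
constexpr To bits_of(const From& from) noexcept
{
#if defined(__cpp_lib_bit_cast)
  return std::bit_cast<To>(from);
#else
  return __builtin_bit_cast(To, from);
#endif
}

// Bytewise comparison of two objects of equal size, usable in constant
// expressions.
template <class A, class B>
constexpr bool bit_equal(const A& a, const B& b) noexcept
{
  static_assert(sizeof(A) == sizeof(B),
                "bit_equal(a, b) requires same sizeof");
  using bytes = byte_image<sizeof(A)>;
  const bytes ab = bits_of<bytes>(a);
  const bytes bb = bits_of<bytes>(b);
  for (std::size_t i = 0; i < sizeof(A); ++i)
    if (ab.data[i] != bb.data[i])
      return false;
  return true;
}
```

For example, bit_equal(1, 1u) holds, whereas bit_equal(0.0f, -0.0f) does not,
even though the two floats compare equal with ==.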


Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-15 Thread David Malcolm via Gcc-patches
> On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
> > but changing the name is a trivial operation. 
> 
> Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
> changes.

Thanks for implementing this.

> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> -- >8 --
> From a link below:
> "An issue was discovered in the Bidirectional Algorithm in the Unicode
> Specification through 14.0. It permits the visual reordering of
> characters via control sequences, which can be used to craft source code
> that renders different logic than the logical ordering of tokens
> ingested by compilers and interpreters. Adversaries can leverage this to
> encode source code for compilers accepting Unicode such that targeted
> vulnerabilities are introduced invisibly to human reviewers."
> 
> More info:
> https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> https://trojansource.codes/
> 
> This is not a compiler bug.  However, to mitigate the problem, this patch
> implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> misleading Unicode bidirectional characters the preprocessor may encounter.
> 
> The default is =unpaired, which warns about improperly terminated
> bidirectional characters; e.g. a LRE without its appertaining PDF.  The

I like the default.

Wording nit: maybe use "corresponding" rather than "appertaining"; I
believe the latter has a sense that one is part of the other, when they
are more like peers.

> level =any warns about any use of bidirectional characters.

Terminology nit:
The patch is referring to "bidirectional characters", but I think the
term "bidirectional control characters" would be better.

For example, a passage of text containing both numbers and characters
in a right-to-left script could be considered "bidirectional", since
the numbers are written from left-to-right.

Specifically, the patch looks for these specific characters:
  * U+202A LEFT-TO-RIGHT EMBEDDING
  * U+202B RIGHT-TO-LEFT EMBEDDING
  * U+202C POP DIRECTIONAL FORMATTING
  * U+202D LEFT-TO-RIGHT OVERRIDE
  * U+202E RIGHT-TO-LEFT OVERRIDE
  * U+2066 LEFT-TO-RIGHT ISOLATE
  * U+2067 RIGHT-TO-LEFT ISOLATE
  * U+2068 FIRST STRONG ISOLATE
  * U+2069 POP DIRECTIONAL ISOLATE

However, the following characters could also be considered as
"bidirectional control characters":
  * U+200E ‎LEFT-TO-RIGHT MARK (UTF-8: E2 80 8E)
  * U+200F ‎RIGHT-TO-LEFT MARK (UTF-8: E2 80 8F)
but aren't checked for in the patch.  Should they be?  I can imagine
ways in which they could be abused, so I think so.
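
To make the two character sets above concrete, here is a rough sketch of a
scanner over UTF-8 text.  It is not the libcpp implementation: it only counts
occurrences (roughly what the =any level would flag) and does no pairing
analysis.  The function names and the include_marks switch are hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// The nine code points listed above: U+202A..U+202E and U+2066..U+2069.
constexpr bool is_bidi_control(char32_t cp)
{
  return (cp >= 0x202A && cp <= 0x202E) || (cp >= 0x2066 && cp <= 0x2069);
}

// The two additional candidates: U+200E (LRM) and U+200F (RLM).
constexpr bool is_bidi_mark(char32_t cp)
{
  return cp == 0x200E || cp == 0x200F;
}

// Count bidi control characters in UTF-8 text.  All of the code points of
// interest encode as three bytes starting with 0xE2, so only those
// sequences are decoded.
constexpr std::size_t count_bidi_controls(std::string_view utf8,
                                          bool include_marks)
{
  std::size_t count = 0;
  for (std::size_t i = 0; i + 2 < utf8.size(); ++i)
    {
      const unsigned char b0 = static_cast<unsigned char>(utf8[i]);
      const unsigned char b1 = static_cast<unsigned char>(utf8[i + 1]);
      const unsigned char b2 = static_cast<unsigned char>(utf8[i + 2]);
      if (b0 != 0xE2 || (b1 & 0xC0) != 0x80 || (b2 & 0xC0) != 0x80)
        continue;
      // Decode the three-byte sequence into a code point.
      const char32_t cp = (static_cast<char32_t>(b0 & 0x0F) << 12)
                          | (static_cast<char32_t>(b1 & 0x3F) << 6)
                          | static_cast<char32_t>(b2 & 0x3F);
      if (is_bidi_control(cp) || (include_marks && is_bidi_mark(cp)))
        {
          ++count;
          i += 2;  // skip the two continuation bytes just consumed
        }
    }
  return count;
}
```

Passing include_marks = true corresponds to also treating LRM and RLM as
bidirectional control characters, as suggested above.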

 
[...snip...]

> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 06457ac739e..b047df0f125 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -374,6 +374,30 @@ Wbad-function-cast
>  C ObjC Var(warn_bad_function_cast) Warning
>  Warn about casting functions to incompatible types.
>  
> +Wbidi-chars
> +C ObjC C++ ObjC++ Warning Alias(Wbidi-chars=,any,none)
> +;
> +
> +Wbidi-chars=
> +C ObjC C++ ObjC++ RejectNegative Joined Warning CPP(cpp_warn_bidirectional) 
> CppReason(CPP_W_BIDIRECTIONAL) Var(warn_bidirectional) 
> Init(bidirectional_unpaired) Enum(cpp_bidirectional_level)
> +-Wbidi-chars=[none|unpaired|any] Warn about UTF-8 bidirectional characters.

"control characters"

[...snip...]

>  
> +@item -Wbidi-chars=@r{[}none@r{|}unpaired@r{|}any@r{]}
> +@opindex Wbidi-chars=
> +@opindex Wbidi-chars
> +@opindex Wno-bidi-chars
> +Warn about possibly misleading UTF-8 bidirectional characters in comments,

(and here again)

> +string literals, character constants, and identifiers.  Such characters can
> +change left-to-right writing direction into right-to-left (and vice versa),
> +which can cause confusion between the logical order and visual order.  This
> +may be dangerous; for instance, it may seem that a piece of code is not
> +commented out, whereas it in fact is.
> +
> +There are three levels of warning supported by GCC@.  The default is
> +@option{-Wbidi-chars=unpaired}, which warns about improperly terminated
> +bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> +@option{-Wbidi-chars=any} warns about any use of bidirectional characters.

(and again)

[...snip...]


> diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-4.c 
> b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
> new file mode 100644
> index 000..9fd4bc535ca
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
> @@ -0,0 +1,166 @@
> +/* PR preprocessor/103026 */
> +/* { dg-do compile } */
> +/* { dg-options "-Wbidi-chars=any -Wno-multichar -Wno-overflow" } */
> +/* Test all bidi chars in various contexts (identifiers, comments,
> +   string literals, character constants), both UCN and UTF-8.  The bidi
> +   chars here are properly terminated, except for the character constants.  
> */
> +
> +/* a b c LRE‪ 1 2 3 PDF‬ x y z */
> +/* { dg-warning "U\\+202A" "" { target *-*-* } .-1 } */
> +/* 

Re: [PATCH v2 2/3] gimple-fold: Use ranges to simplify _chk calls

2021-11-15 Thread Siddhesh Poyarekar

On 11/16/21 01:55, Jeff Law wrote:



On 11/15/2021 10:33 AM, Siddhesh Poyarekar wrote:

Instead of comparing LEN and SIZE only if they are constants, use their
ranges to decide if LEN will always be lower than or same as SIZE.

This change ends up putting the stringop-overflow warning line number
against the strcpy implementation, so adjust the warning check to be
line number agnostic.

gcc/ChangeLog:

* gimple-fold.c (known_lower): New function.
(gimple_fold_builtin_strncat_chk,
gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk,
gimple_fold_builtin_snprintf_chk,
gimple_fold_builtin_sprintf_chk): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/Wobjsize-1.c: Make warning change line agnostic.
* gcc.dg/builtin-chk-fold.c: New test.




@@ -3024,39 +3040,24 @@ gimple_fold_builtin_memory_chk 
(gimple_stmt_iterator *gsi,

  }
  }
-  if (! tree_fits_uhwi_p (size))
-    return false;
-
    tree maxlen = get_maxval_strlen (len, SRK_INT_VALUE);
-  if (! integer_all_onesp (size))
+  if (! integer_all_onesp (size)
+  && !known_lower (stmt, len, size) && !known_lower (stmt, 
maxlen, size))

Formatting nit.  Move the trailing && !known_lower (...) to its own line.

OK with the formatting nit fixed.


Thanks, I fixed the nit and pushed the series.

Siddhesh


Re: [PATCH] configure: define TARGET_LIBC_GNUSTACK on musl

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 1:25 AM, Ilya Lipnitskiy via Gcc-patches wrote:

musl only uses PT_GNU_STACK to set default thread stack size and has no
executable stack support[0], so there is no reason not to emit the
.note.GNU-stack section on musl builds.

[0]: 
https://lore.kernel.org/all/20190423192534.gn23...@brightrain.aerifal.cx/T/#u

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: define TARGET_LIBC_GNUSTACK on musl

If musl has no executable stack support, then wouldn't we want this
change to apply to all musl platforms, not just mips?


jeff



[GCC-11 PATCH] aarch64: enable Ampere-1 CPU (backport to GCC11)

2021-11-15 Thread Philipp Tomsich
This adds support and a basic tuning model for the Ampere Computing
"Ampere-1" CPU.

The Ampere-1 implements the ARMv8.6 architecture in A64 mode and is
modelled as a 4-wide issue (as with all modern micro-architectures,
the chosen issue rate is a compromise between the maximum dispatch
rate and the maximum rate of uops issued to the scheduler).

This adds the -mcpu=ampere1 command-line option and the relevant cost
information/tuning tables for the Ampere-1.

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): New Ampere-1
core.
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/aarch64-cost-tables.h: Add extra costs for
Ampere-1.
* config/aarch64/aarch64.c: Add tuning structures for Ampere-1.

(cherry picked from 67b0d47e20e655c0dd53a76ea88aab60fafb2059)

---
This is a backport from master and only affects the AArch64 backend.

OK for GCC-11?

 gcc/config/aarch64/aarch64-cores.def |   3 +-
 gcc/config/aarch64/aarch64-cost-tables.h | 104 +++
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  78 +
 gcc/doc/invoke.texi  |   2 +-
 5 files changed, 186 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index b2aa1670561..4643e0e2795 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -68,7 +68,8 @@ AARCH64_CORE("octeontx83",octeontxt83,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH
 AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
0x0a2, -1)
 AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
0x0a3, -1)
 
-/* Ampere Computing cores. */
+/* Ampere Computing ('\xC0') cores. */
+AARCH64_CORE("ampere1", ampere1, cortexa57, 8_6A, AARCH64_FL_FOR_ARCH8_6, 
ampere1, 0xC0, 0xac3, -1)
 /* Do not swap around "emag" and "xgene1",
this order is required to handle variant correctly. */
 AARCH64_CORE("emag",emag,  xgene1,8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3)
diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
b/gcc/config/aarch64/aarch64-cost-tables.h
index dd2e7e7cbb1..4b7e4e034a2 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -650,4 +650,108 @@ const struct cpu_cost_table a64fx_extra_costs =
   }
 };
 
+const struct cpu_cost_table ampere1_extra_costs =
+{
+  /* ALU */
+  {
+0, /* arith.  */
+0, /* logical.  */
+0, /* shift.  */
+COSTS_N_INSNS (1), /* shift_reg.  */
+0, /* arith_shift.  */
+COSTS_N_INSNS (1), /* arith_shift_reg.  */
+0, /* log_shift.  */
+COSTS_N_INSNS (1), /* log_shift_reg.  */
+0, /* extend.  */
+COSTS_N_INSNS (1), /* extend_arith.  */
+0, /* bfi.  */
+0, /* bfx.  */
+0, /* clz.  */
+0, /* rev.  */
+0, /* non_exec.  */
+true   /* non_exec_costs_exec.  */
+  },
+  {
+/* MULT SImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  COSTS_N_INSNS (3),   /* flag_setting.  */
+  COSTS_N_INSNS (3),   /* extend.  */
+  COSTS_N_INSNS (4),   /* add.  */
+  COSTS_N_INSNS (4),   /* extend_add.  */
+  COSTS_N_INSNS (18)   /* idiv.  */
+},
+/* MULT DImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  0,   /* flag_setting (N/A).  */
+  COSTS_N_INSNS (3),   /* extend.  */
+  COSTS_N_INSNS (4),   /* add.  */
+  COSTS_N_INSNS (4),   /* extend_add.  */
+  COSTS_N_INSNS (34)   /* idiv.  */
+}
+  },
+  /* LD/ST */
+  {
+COSTS_N_INSNS (4), /* load.  */
+COSTS_N_INSNS (4), /* load_sign_extend.  */
+0, /* ldrd (n/a).  */
+0, /* ldm_1st.  */
+0, /* ldm_regs_per_insn_1st.  */
+0, /* ldm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (5), /* loadf.  */
+COSTS_N_INSNS (5), /* loadd.  */
+COSTS_N_INSNS (5), /* load_unaligned.  */
+0, /* store.  */
+0, /* strd.  */
+0, /* stm_1st.  */
+0, /* stm_regs_per_insn_1st.  */
+0, /* stm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (2), /* storef.  */
+COSTS_N_INSNS (2), /* stored.  */
+COSTS_N_INSNS (2), /* store_unaligned.  */
+COSTS_N_INSNS (3), /* loadv.  */
+

Re: [PATCH] simplify get_range_strlen interface

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 3:05 PM, Martin Sebor via Gcc-patches wrote:

The deeply nested PHI handling in get_range_strlen_dynamic makes
the code bigger and harder to follow than it would be if done in
its own function.  The attached patch does that.

In addition, the get_range_strlen family of functions use a bitmap
to avoid infinite recursion.  Rather than dynamically allocating
and freeing it on demand the attached patch simplifies the code
by using an instance of auto_bitmap.  This avoids the risk of
neglecting to deallocate the bitmap.

Tested on x86_64-linux.

Martin

gcc-get_range_strlen_dynamic.diff

Slightly simplify get_range_strlen interface.

gcc/ChangeLog:

* gimple-fold.c (get_range_strlen): Take bitmap as an argument rather
than a pointer to it.
(get_range_strlen_tree): Same.  Remove bitmap allocation.  Use
an auto_bitmap.
(get_maxval_strlen): Use an auto_bitmap.
* tree-ssa-strlen.c (get_range_strlen_dynamic): Factor out PHI
handling...
(get_range_strlen_phi): ...into this function.
(printf_strlen_execute): Dump pointer query cache contents when
details are requested.

Not really a bugfix, but I'll go ahead and ACK for the trunk. Obviously 
the bar is going to be going up now that we're in stage3 :-)


jeff


Re: [PATCH 2/6] Add returns_zero_on_success/failure attributes

2021-11-15 Thread David Malcolm via Gcc-patches
On Mon, 2021-11-15 at 15:45 +0100, Peter Zijlstra wrote:
> On Mon, Nov 15, 2021 at 12:33:16PM +0530, Prathamesh Kulkarni wrote:
> > On Sun, 14 Nov 2021 at 02:07, David Malcolm via Gcc-patches
> 
> > > +/* Handle "returns_zero_on_failure" and "returns_zero_on_success"
> > > attributes;
> > > +   arguments as in struct attribute_spec.handler.  */
> > > +
> > > +static tree
> > > +handle_returns_zero_on_attributes (tree *node, tree name, tree,
> > > int,
> > > +  bool *no_add_attrs)
> > > +{
> > > +  if (!INTEGRAL_TYPE_P (TREE_TYPE (*node)))
> > > +    {
> > > +  error ("%qE attribute on a function not returning an
> > > integral type",
> > > +    name);
> > > +  *no_add_attrs = true;
> > > +    }
> > > +  return NULL_TREE;
> > Hi David,
> > Just curious if a warning should be emitted if the function is marked
> > with the attribute but its return value isn't actually 0 ?
> > 
> > There are other constants like -1 or 1 that are often used to
> > indicate
> > error, so maybe tweak the attribute to
> > take the integer as an argument ?
> > Sth like returns_int_on_success(cst) / returns_int_on_failure(cst) ?
> > 
> > Also, would it make sense to extend it for pointers too for returning
> > NULL on success / failure ?
> 
> Please also consider that in Linux we use the 'last' page for error
> code
> returns. That is, a function returning a pointer could return '(void
> *)-EFAULT' also see linux/err.h
> 

Thanks.

Am I right in thinking that such functions return non-NULL, giving
something like:

  __attribute__((returns_ptr_in_range_on_success (0x1, NULL - 4096)))
  __attribute__((returns_ptr_in_range_on_failure (NULL - 4096, NULL - 1)))
  __attribute__((returns_non_null))

as attributes?  (I have no idea if the above will parse, and I admit
these look ugly as-is, though I suppose they could be hidden behind a
macro).

Looking at include/linux/err.h I see functions:

static inline bool __must_check IS_ERR(__force const void *ptr)
{
return IS_ERR_VALUE((unsigned long)ptr);
}

static inline bool __must_check IS_ERR_OR_NULL(__force const void *ptr)
{
return unlikely(!ptr) || IS_ERR_VALUE((unsigned long)ptr);
}

so maybe attribute could refer to predicate functions, something like
this:

  __attribute__((return_value_success_predicate(FUNCTION_DECL)))
  __attribute__((return_value_failure_predicate(FUNCTION_DECL)))

where this case could use something like:

  __attribute__((return_value_failure_predicate(IS_ERR)))

to express the idea "this function can succeed or fail, and the given
function decl expresses whether a given return value is a failure" - or
somesuch.  The predicate function would probably have to be pure.
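
To make the pointer-encoding convention concrete, here is a minimal sketch of how an error code can live in the last page of the address space and be detected by a pure predicate. The names `err_ptr`, `is_err`, `ptr_err` and `max_errno` are hypothetical stand-ins modelled on the kernel's ERR_PTR/IS_ERR/PTR_ERR helpers, and the 4095 limit is assumed to match the kernel's MAX_ERRNO; none of this is part of the proposed attributes.

```cpp
#include <cstdint>

// Hypothetical stand-ins modelled on the kernel helpers in linux/err.h.
// Assumption: error codes are small negative values in [-max_errno, -1].
constexpr std::uintptr_t max_errno = 4095;

// Encode a negative error code as a pointer into the last page.
inline void *err_ptr(long error) {
    return reinterpret_cast<void *>(static_cast<std::intptr_t>(error));
}

// Pure predicate: true if PTR lies in the top max_errno addresses.
inline bool is_err(const void *ptr) {
    return reinterpret_cast<std::uintptr_t>(ptr)
           >= static_cast<std::uintptr_t>(0) - max_errno;
}

// Recover the error code from an error pointer.
inline long ptr_err(const void *ptr) {
    return static_cast<long>(reinterpret_cast<std::intptr_t>(ptr));
}
```

A predicate of this shape has no side effects and depends only on its argument, which is roughly what a "failure predicate" attribute would need.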

Obviously I'm just brainstorming here; as noted in my reply to
Prathamesh, all I need for the initial implementation of the trust
boundary work is just being able to express that zero vs non-zero
return is the success vs failure condition for a function.

Dave




Re: [PATCH 2/6] Add returns_zero_on_success/failure attributes

2021-11-15 Thread David Malcolm via Gcc-patches
On Mon, 2021-11-15 at 12:33 +0530, Prathamesh Kulkarni wrote:
> On Sun, 14 Nov 2021 at 02:07, David Malcolm via Gcc-patches
>  wrote:
> > 
> > This patch adds two new attributes.  The followup patch makes use of
> > the attributes in -fanalyzer.

[...snip...]

> > +/* Handle "returns_zero_on_failure" and "returns_zero_on_success"
> > attributes;
> > +   arguments as in struct attribute_spec.handler.  */
> > +
> > +static tree
> > +handle_returns_zero_on_attributes (tree *node, tree name, tree,
> > int,
> > +  bool *no_add_attrs)
> > +{
> > +  if (!INTEGRAL_TYPE_P (TREE_TYPE (*node)))
> > +    {
> > +  error ("%qE attribute on a function not returning an
> > integral type",
> > +    name);
> > +  *no_add_attrs = true;
> > +    }
> > +  return NULL_TREE;
> Hi David,

Thanks for the ideas.

> Just curious if a warning should be emitted if the function is marked
> with the attribute but its return value isn't actually 0 ?

That sounds like a worthwhile extension of the idea.  It should be
possible to identify functions that can't return zero or non-zero that
have been marked as being able to.

That said:

(a) if you apply the attribute to a function pointer for a callback,
you could have an implementation of the callback that always fails and
returns, say, -1; should the warning complain that the function has the
"returns_zero_on_success" property and is always returning -1?

(b) the attributes introduce a concept of "success" vs "failure", which
might be hard for a machine to determine.  It's only used later on in
terms of the events presented to the user, so that -fanalyzer can emit
e.g. "when 'copy_from_user' fails", which IMHO is easier to read than
"when 'copy_from_user' returns non-zero".

> 
> There are other constants like -1 or 1 that are often used to indicate
> error, so maybe tweak the attribute to
> take the integer as an argument ?
> Sth like returns_int_on_success(cst) / returns_int_on_failure(cst) ?

Those could work nicely; I like the idea of them being supplementary to
the returns_zero_on_* ones.

I got the urge to bikeshed about wording; some ideas:
  success_return_value(CST)
  failure_return_value(CST)
or maybe additionally:
  success_return_range(LOWER_BOUND_CST, UPPER_BOUND_CST)
  failure_return_range(LOWER_BOUND_CST, UPPER_BOUND_CST)

I can also imagine a
  sets_errno_on_failure
attribute being useful (and perhaps a "doesnt_touch_errno"???)

> Also, would it make sense to extend it for pointers too for returning
> NULL on success / failure ?

Possibly expressible by generalizing it to allow pointer types, or by
adding this pair:

  returns_null_on_failure
  returns_null_on_success

or by using the "range" idea above.

In terms of scope, for the trust boundary stuff, I want to be able to
express the idea that a call can succeed vs fail, what the success vs
failure is in terms of nonzero vs zero, and to be able to wire up the
heuristic that if it looks like a "copy function" (use of access
attributes and a size), that success/failure can mean "copies all of
it" vs "copies none of it" (which seems to get decent test coverage on
the Linux kernel with the copy_from/to_user fns).
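
As a rough illustration of the "zero on success, copies all or none" shape described above, a toy function might look like the following. `toy_copy` is hypothetical and is not the kernel's copy_from_user; the attribute itself is deliberately left out because it only exists in the proposed patch.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical copy function in the style discussed above: it returns
// the number of bytes NOT copied, so zero means success and any
// non-zero value means failure.
std::size_t toy_copy(void *dst, std::size_t dst_size,
                     const void *src, std::size_t n) {
    if (dst == nullptr || src == nullptr || n > dst_size)
        return n;              // failure: nothing was copied
    std::memcpy(dst, src, n);  // success: everything was copied
    return 0;
}
```

With this convention a caller only has to test the result against zero, which is the one distinction the initial attributes are meant to express.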

Thanks
Dave


> 
> Thanks,
> Prathamesh

[...snip...]



[PATCH] simplify get_range_strlen interface

2021-11-15 Thread Martin Sebor via Gcc-patches

The deeply nested PHI handling in get_range_strlen_dynamic makes
the code bigger and harder to follow than it would be if done in
its own function.  The attached patch does that.

In addition, the get_range_strlen family of functions use a bitmap
to avoid infinite recursion.  Rather than dynamically allocating
and freeing it on demand the attached patch simplifies the code
by using an instance of auto_bitmap.  This avoids the risk of
neglecting to deallocate the bitmap.
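
The general pattern can be sketched outside GCC as follows. This is only an analogy under stated assumptions: `toy_graph`, `max_len_1` and `max_len` are hypothetical, and `std::vector<bool>` stands in for auto_bitmap as a visited set with automatic storage that is passed to the recursive worker by reference.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy graph: node i has edges to succs[i] and carries len[i].
struct toy_graph {
    std::vector<std::vector<std::size_t>> succs;
    std::vector<unsigned> len;
};

// Recursive worker.  VISITED is owned by the caller and only marked here.
static unsigned max_len_1(const toy_graph &g, std::size_t node,
                          std::vector<bool> &visited) {
    if (visited[node])
        return 0;              // already seen: break the cycle
    visited[node] = true;
    unsigned best = g.len[node];
    for (std::size_t s : g.succs[node]) {
        unsigned m = max_len_1(g, s, visited);
        if (m > best)
            best = m;
    }
    return best;
}

// Entry point.  The visited set has automatic storage, so it is released
// on every return path without an explicit free.
unsigned max_len(const toy_graph &g, std::size_t start) {
    std::vector<bool> visited(g.succs.size(), false);
    return max_len_1(g, start, visited);
}
```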

Tested on x86_64-linux.

Martin

Slightly simplify get_range_strlen interface.

gcc/ChangeLog:

	* gimple-fold.c (get_range_strlen): Take bitmap as an argument rather
	than a pointer to it.
	(get_range_strlen_tree): Same.  Remove bitmap allocation.  Use
	an auto_bitmap.
	(get_maxval_strlen): Use an auto_bitmap.
	* tree-ssa-strlen.c (get_range_strlen_dynamic): Factor out PHI
	handling...
	(get_range_strlen_phi): ...into this function.
	(printf_strlen_execute): Dump pointer query cache contents when
	details are requested.


diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 6e25a7c05db..87c211c5e3f 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -86,7 +86,7 @@ enum strlen_range_kind {
 };
 
 static bool
-get_range_strlen (tree, bitmap *, strlen_range_kind, c_strlen_data *, unsigned);
+get_range_strlen (tree, bitmap, strlen_range_kind, c_strlen_data *, unsigned);
 
 /* Return true when DECL can be referenced from current unit.
FROM_DECL (if non-null) specify constructor of variable DECL was taken from.
@@ -1525,7 +1525,7 @@ gimple_fold_builtin_memset (gimple_stmt_iterator *gsi, tree c, tree len)
 /* Helper of get_range_strlen for ARG that is not an SSA_NAME.  */
 
 static bool
-get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
+get_range_strlen_tree (tree arg, bitmap visited, strlen_range_kind rkind,
 		   c_strlen_data *pdata, unsigned eltsize)
 {
   gcc_assert (TREE_CODE (arg) != SSA_NAME);
@@ -1849,7 +1849,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
Return true if *PDATA was successfully populated and false otherwise.  */
 
 static bool
-get_range_strlen (tree arg, bitmap *visited,
+get_range_strlen (tree arg, bitmap visited,
 		  strlen_range_kind rkind,
 		  c_strlen_data *pdata, unsigned eltsize)
 {
@@ -1863,9 +1863,7 @@ get_range_strlen (tree arg, bitmap *visited,
 return false;
 
   /* If we were already here, break the infinite cycle.  */
-  if (!*visited)
-*visited = BITMAP_ALLOC (NULL);
-  if (!bitmap_set_bit (*visited, SSA_NAME_VERSION (arg)))
+  if (!bitmap_set_bit (visited, SSA_NAME_VERSION (arg)))
 return true;
 
   tree var = arg;
@@ -1962,10 +1960,10 @@ get_range_strlen (tree arg, bitmap *visited,
 bool
 get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)
 {
-  bitmap visited = NULL;
+  auto_bitmap visited;
   tree maxbound = pdata->maxbound;
 
-  if (!get_range_strlen (arg, &visited, SRK_LENRANGE, pdata, eltsize))
+  if (!get_range_strlen (arg, visited, SRK_LENRANGE, pdata, eltsize))
 {
   /* On failure extend the length range to an impossible maximum
 	 (a valid MAXLEN must be less than PTRDIFF_MAX - 1).  Other
@@ -1981,9 +1979,6 @@ get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)
   if (maxbound && pdata->maxbound == maxbound)
 pdata->maxbound = build_all_ones_cst (size_type_node);
 
-  if (visited)
-BITMAP_FREE (visited);
-
   return !integer_all_onesp (pdata->maxlen);
 }
 
@@ -2005,19 +2000,16 @@ get_maxval_strlen (tree arg, strlen_range_kind rkind, tree *nonstr = NULL)
   /* ARG must have an integral type when RKIND says so.  */
   gcc_assert (rkind != SRK_INT_VALUE || INTEGRAL_TYPE_P (TREE_TYPE (arg)));
 
-  bitmap visited = NULL;
+  auto_bitmap visited;
 
   /* Reset DATA.MAXLEN if the call fails or when DATA.MAXLEN
  is unbounded.  */
   c_strlen_data lendata = { };
-  if (!get_range_strlen (arg, &visited, rkind, &lendata, /* eltsize = */1))
+  if (!get_range_strlen (arg, visited, rkind, &lendata, /* eltsize = */1))
 lendata.maxlen = NULL_TREE;
   else if (lendata.maxlen && integer_all_onesp (lendata.maxlen))
 lendata.maxlen = NULL_TREE;
 
-  if (visited)
-BITMAP_FREE (visited);
-
   if (nonstr)
 {
   /* For callers prepared to handle unterminated arrays set
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index c0ec7d20a60..536f796f82b 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -193,6 +193,8 @@ struct laststmt_struct
 } laststmt;
 
 static int get_stridx_plus_constant (strinfo *, unsigned HOST_WIDE_INT, tree);
+static bool get_range_strlen_dynamic (tree, gimple *s, c_strlen_data *,
+  bitmap, range_query *, unsigned *);
 
 /* Sets MINMAX to either the constant value or the range VAL is in
and returns either the constant value or VAL on success or null
@@ -1087,6 +1089,76 @@ dump_strlen_info (FILE *fp, gimple *stmt, range_query *rvals)
 }
 }
 
+/* Helper of get_range_strlen_dynamic().  See below.  */
+
+static bool

[PATCH] PR fortran/99061 - [10/11/12 Regression] ICE in gfc_conv_intrinsic_atan2d, at fortran/trans-intrinsic.c:4728

2021-11-15 Thread Harald Anlauf via Gcc-patches
Dear Fortranners,

the attached patch fixes the handling of the DEC trigonometric intrinsics
for different argument kinds.  It is based on the original patch by Steve,
which fixes the lookup for the needed intrinsics.

Regtested on x86_64-pc-linux-gnu.  OK for affected branches?

Thanks,
Harald

From e979db00b8e84333c53bc0b8f1c89cd8ce18d72c Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Mon, 15 Nov 2021 22:32:17 +0100
Subject: [PATCH] Fortran: fix lookup for gfortran builtin math intrinsics used
 by DEC extensions

gcc/fortran/ChangeLog:

	PR fortran/99061
	* trans-intrinsic.c (gfc_lookup_intrinsic): Helper function for
	looking up gfortran builtin intrinsics.
	(gfc_conv_intrinsic_atrigd): Use it.
	(gfc_conv_intrinsic_cotan): Likewise.
	(gfc_conv_intrinsic_cotand): Likewise.
	(gfc_conv_intrinsic_atan2d): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/99061
	* gfortran.dg/dec_math_5.f90: New test.
---
 gcc/fortran/trans-intrinsic.c|  66 ---
 gcc/testsuite/gfortran.dg/dec_math_5.f90 | 103 +++
 2 files changed, 138 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/dec_math_5.f90

diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 3f867911af5..bd67f4f44da 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -4555,6 +4555,18 @@ rad2deg (int kind)
 }


+static gfc_intrinsic_map_t *
+gfc_lookup_intrinsic (gfc_isym_id id)
+{
+  gfc_intrinsic_map_t *m = gfc_intrinsic_map;
+  for (; m->id != GFC_ISYM_NONE || m->double_built_in != END_BUILTINS; m++)
+if (id == m->id)
+  break;
+  gcc_assert (id == m->id);
+  return m;
+}
+
+
 /* ACOSD(x) is translated into ACOS(x) * 180 / pi.
ASIND(x) is translated into ASIN(x) * 180 / pi.
ATAND(x) is translated into ATAN(x) * 180 / pi.  */
@@ -4565,20 +4577,27 @@ gfc_conv_intrinsic_atrigd (gfc_se * se, gfc_expr * expr, gfc_isym_id id)
   tree arg;
   tree atrigd;
   tree type;
+  gfc_intrinsic_map_t *m;

   type = gfc_typenode_for_spec (&expr->ts);

   gfc_conv_intrinsic_function_args (se, expr, &arg, 1);

-  if (id == GFC_ISYM_ACOSD)
-atrigd = gfc_builtin_decl_for_float_kind (BUILT_IN_ACOS, expr->ts.kind);
-  else if (id == GFC_ISYM_ASIND)
-atrigd = gfc_builtin_decl_for_float_kind (BUILT_IN_ASIN, expr->ts.kind);
-  else if (id == GFC_ISYM_ATAND)
-atrigd = gfc_builtin_decl_for_float_kind (BUILT_IN_ATAN, expr->ts.kind);
-  else
-gcc_unreachable ();
-
+  switch (id)
+{
+case GFC_ISYM_ACOSD:
+  m = gfc_lookup_intrinsic (GFC_ISYM_ACOS);
+  break;
+case GFC_ISYM_ASIND:
+  m = gfc_lookup_intrinsic (GFC_ISYM_ASIN);
+  break;
+case GFC_ISYM_ATAND:
+  m = gfc_lookup_intrinsic (GFC_ISYM_ATAN);
+  break;
+default:
+  gcc_unreachable ();
+}
+  atrigd = gfc_get_intrinsic_lib_fndecl (m, expr);
   atrigd = build_call_expr_loc (input_location, atrigd, 1, arg);

   se->expr = fold_build2_loc (input_location, MULT_EXPR, type, atrigd,
@@ -4614,13 +4633,9 @@ gfc_conv_intrinsic_cotan (gfc_se *se, gfc_expr *expr)
   mpfr_clear (pio2);

   /* Find tan builtin function.  */
-  m = gfc_intrinsic_map;
-  for (; m->id != GFC_ISYM_NONE || m->double_built_in != END_BUILTINS; m++)
-	if (GFC_ISYM_TAN == m->id)
-	  break;
-
-  tmp = fold_build2_loc (input_location, PLUS_EXPR, type, arg, tmp);
+  m = gfc_lookup_intrinsic (GFC_ISYM_TAN);
   tan = gfc_get_intrinsic_lib_fndecl (m, expr);
+  tmp = fold_build2_loc (input_location, PLUS_EXPR, type, arg, tmp);
   tan = build_call_expr_loc (input_location, tan, 1, tmp);
   se->expr = fold_build1_loc (input_location, NEGATE_EXPR, type, tan);
 }
@@ -4630,20 +4645,12 @@ gfc_conv_intrinsic_cotan (gfc_se *se, gfc_expr *expr)
   tree cos;

   /* Find cos builtin function.  */
-  m = gfc_intrinsic_map;
-  for (; m->id != GFC_ISYM_NONE || m->double_built_in != END_BUILTINS; m++)
-	if (GFC_ISYM_COS == m->id)
-	  break;
-
+  m = gfc_lookup_intrinsic (GFC_ISYM_COS);
   cos = gfc_get_intrinsic_lib_fndecl (m, expr);
   cos = build_call_expr_loc (input_location, cos, 1, arg);

   /* Find sin builtin function.  */
-  m = gfc_intrinsic_map;
-  for (; m->id != GFC_ISYM_NONE || m->double_built_in != END_BUILTINS; m++)
-	if (GFC_ISYM_SIN == m->id)
-	  break;
-
+  m = gfc_lookup_intrinsic (GFC_ISYM_SIN);
   sin = gfc_get_intrinsic_lib_fndecl (m, expr);
   sin = build_call_expr_loc (input_location, sin, 1, arg);

@@ -4675,11 +4682,7 @@ gfc_conv_intrinsic_cotand (gfc_se *se, gfc_expr *expr)
   mpfr_clear (ninety);

   /* Find tand.  */
-  gfc_intrinsic_map_t *m = gfc_intrinsic_map;
-  for (; m->id != GFC_ISYM_NONE || m->double_built_in != END_BUILTINS; m++)
-if (GFC_ISYM_TAND == m->id)
-  break;
-
+  gfc_intrinsic_map_t *m = gfc_lookup_intrinsic (GFC_ISYM_TAND);
   tree tand = gfc_get_intrinsic_lib_fndecl (m, expr);
   tand = build_call_expr_loc (input_location, tand, 1, arg);

@@ 

[PATCH v2] Check optab before transforming atomic bit test and operations

2021-11-15 Thread H.J. Lu via Gcc-patches
On Mon, Nov 15, 2021 at 11:14 AM Jeff Law  wrote:
>
>
>
> On 11/15/2021 12:05 PM, H.J. Lu wrote:
> > On Mon, Nov 15, 2021 at 10:59 AM Jeff Law  wrote:
> >>
> >>
> >> On 11/15/2021 6:39 AM, H.J. Lu via Gcc-patches wrote:
> >>> Check optab before transforming equivalent, but slightly different cases
> >>> of atomic bit test and operations to their canonical forms.
> >>>
> >>> gcc/
> >>>
> >>>PR middle-end/103184
> >>>* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab
> >>>before transforming equivalent, but slightly different cases to
> >>>their canonical forms.
> >>>
> >>> gcc/testsuite/
> >>>
> >>>PR middle-end/103184
> >>>* gcc.dg/pr103184-1.c: New test.
> >>>* gcc.dg/pr103184-2.c: Likewise.
> >>>}
> >>>}
> >>>
> >>> -  switch (fn)
> >>> -{
> >>> -case IFN_ATOMIC_BIT_TEST_AND_SET:
> >>> -  optab = atomic_bit_test_and_set_optab;
> >>> -  break;
> >>> -case IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT:
> >>> -  optab = atomic_bit_test_and_complement_optab;
> >>> -  break;
> >>> -case IFN_ATOMIC_BIT_TEST_AND_RESET:
> >>> -  optab = atomic_bit_test_and_reset_optab;
> >>> -  break;
> >>> -default:
> >>> -  return;
> >>> -}
> >>> -
> >>>  if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs))) == 
> >>> CODE_FOR_nothing)
> >>>return;
> >> Shouldn't the test of the return value of optab_handler here just go
> >> away since we're testing it earlier?  OK with that fix.
> >>
> > The earlier check is predicated on if (rhs_code != BIT_AND_EXPR):
> >
> >if (rhs_code != BIT_AND_EXPR)
> >  {
> >if (rhs_code != NOP_EXPR && rhs_code != BIT_NOT_EXPR)
> >  return;
> >
> >tree use_lhs = gimple_assign_lhs (use_stmt);
> >if (TREE_CODE (use_lhs) == SSA_NAME
> >&& SSA_NAME_OCCURS_IN_ABNORMAL_PHI (use_lhs))
> >  return;
> >
> >tree use_rhs = gimple_assign_rhs1 (use_stmt);
> >if (lhs != use_rhs)
> >  return;
> >
> >if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)))
> >== CODE_FOR_nothing)
> >  return;
> >
> > I can add an "else"
> >
> > else  if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)))
> >  == CODE_FOR_nothing)
> >  return;
> >
> > Will it be OK?
> Sure.  Thanks.
> jeff

This is the patch I am checking in.

Thanks.
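
For context, the kind of source this optimization is aimed at is an atomic fetch-and-or whose result is masked with the same single bit, as in pr103184-2.c. A minimal C++ sketch of that shape, with the hypothetical helper name `toy_test_and_set_bit`, could be:

```cpp
#include <atomic>

// Hypothetical helper: atomically set bit N of WORD and report whether
// it was already set.  Only the fetch_or result masked with the same
// bit is used, which is the "bit test and set" shape.
inline bool toy_test_and_set_bit(std::atomic<unsigned> &word, unsigned n) {
    const unsigned mask = 1u << n;
    return (word.fetch_or(mask) & mask) != 0;
}
```

Whether a target can turn this into a single bit-test instruction depends on it providing the corresponding optab, which is the condition the patch checks.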

-- 
H.J.

From 357d8a03b7ee32a29a78ac9be5711e6b0cd99cc4 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Fri, 12 Nov 2021 07:21:43 -0800
Subject: [PATCH v2] Check optab before transforming atomic bit test and
 operations

Check optab before transforming equivalent, but slightly different cases
of atomic bit test and operations to their canonical forms.

gcc/

	PR middle-end/103184
	* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab
	before transforming equivalent, but slightly different cases to
	their canonical forms.

gcc/testsuite/

	PR middle-end/103184
	* gcc.dg/pr103184-1.c: New test.
	* gcc.dg/pr103184-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/pr103184-1.c | 43 +++
 gcc/testsuite/gcc.dg/pr103184-2.c | 12 +
 gcc/tree-ssa-ccp.c| 38 +++
 3 files changed, 76 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103184-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr103184-2.c

diff --git a/gcc/testsuite/gcc.dg/pr103184-1.c b/gcc/testsuite/gcc.dg/pr103184-1.c
new file mode 100644
index 000..e567f95f63f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103184-1.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+extern char foo;
+extern unsigned char bar;
+
+int
+foo1 (void)
+{
+  return __sync_fetch_and_and (&foo, ~1) & 1;
+}
+
+int
+foo2 (void)
+{
+  return __sync_fetch_and_or (&foo, 1) & 1;
+}
+
+int
+foo3 (void)
+{
+  return __sync_fetch_and_xor (&foo, 1) & 1;
+}
+
+unsigned short
+bar1 (void)
+{
+  return __sync_fetch_and_and (&bar, ~1) & 1;
+}
+
+unsigned short
+bar2 (void)
+{
+  return __sync_fetch_and_or (&bar, 1) & 1;
+}
+
+unsigned short
+bar3 (void)
+{
+  return __sync_fetch_and_xor (&bar, 1) & 1;
+}
+
+/* { dg-final { scan-assembler-times "lock;?\[ \t\]*cmpxchgb" 6 { target { x86_64-*-* i?86-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/pr103184-2.c b/gcc/testsuite/gcc.dg/pr103184-2.c
new file mode 100644
index 000..499761fdbfd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103184-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include <stdatomic.h>
+
+int
+tbit0 (_Atomic int* a, int n)
+{
+#define BIT (0x1 << n)
+  return atomic_fetch_or (a, BIT) & BIT;
+#undef BIT
+}
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 0f79e9f05bd..0666dc652d0 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -3366,6 +3366,21 @@ optimize_atomic_bit_test_and (gimple_stmt_iterator *gsip,
   || !gimple_vdef (call))
 return;
 
+  switch (fn)
+{
+case IFN_ATOMIC_BIT_TEST_AND_SET:
+  optab = 

Re: Basic kill analysis for modref

2021-11-15 Thread Jan Hubicka via Gcc-patches
> > > +  if (always_executed
> > > +  && callee_summary->kills.length ()
> > > +  && (!cfun->can_throw_non_call_exceptions
> > > + || !stmt_could_throw_p (cfun, stmt)))
> > > +{
> > > +  /* Watch for self recursive updates.  */
> > > +  auto_vec saved_kills;
> > > +
> > > +  saved_kills.reserve_exact (callee_summary->kills.length ());
> > > +  saved_kills.splice (callee_summary->kills);
> > > +  for (auto kill : saved_kills)
> > > +   {
> > > + if (kill.parm_index >= (int)parm_map.length ())
> > > +   continue;
> > > + modref_parm_map &m
> > > + = kill.parm_index == MODREF_STATIC_CHAIN_PARM
> > > +   ? chain_map
> >     chain_map  isn't initialized.
> > 
> > This caused:
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262
> Yup.  It's causing heartburn in various ways in the tester.  I was just
> tracking it down with valgrind...
> jeff

Oops, either me or the patch must have mislocated the change within the
function when updating to new tree.  I am testing the following fix and
will cook up a testcase verifying that merging of kills works as
expected.

Thanks!
Honza

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index df4612bbff9..4784f68f585 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -964,38 +980,6 @@ merge_call_side_effects (modref_summary *cur_summary,
   if (flags & (ECF_CONST | ECF_NOVOPS))
 return changed;
 
-  if (always_executed
-  && callee_summary->kills.length ()
-  && (!cfun->can_throw_non_call_exceptions
- || !stmt_could_throw_p (cfun, stmt)))
-{
-  /* Watch for self recursive updates.  */
-  auto_vec saved_kills;
-
-  saved_kills.reserve_exact (callee_summary->kills.length ());
-  saved_kills.splice (callee_summary->kills);
-  for (auto kill : saved_kills)
-   {
- if (kill.parm_index >= (int)parm_map.length ())
-   continue;
- modref_parm_map &m
- = kill.parm_index == MODREF_STATIC_CHAIN_PARM
-   ? chain_map
-   : parm_map[kill.parm_index];
- if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
- || m.parm_index == MODREF_UNKNOWN_PARM
- || m.parm_index == MODREF_RETSLOT_PARM
- || !m.parm_offset_known)
-   continue;
- modref_access_node n = kill;
- n.parm_index = m.parm_index;
- n.parm_offset += m.parm_offset;
- if (modref_access_node::insert_kill (cur_summary->kills, n,
-  record_adjustments))
-   changed = true;
-   }
-}
-
   /* We can not safely optimize based on summary of callee if it does
  not always bind to current def: it is possible that memory load
  was optimized out earlier which may not happen in the interposed
@@ -1043,6 +1027,38 @@ merge_call_side_effects (modref_summary *cur_summary,
   if (dump_file)
 fprintf (dump_file, "\n");
 
+  if (always_executed
+  && callee_summary->kills.length ()
+  && (!cfun->can_throw_non_call_exceptions
+ || !stmt_could_throw_p (cfun, stmt)))
+{
+  /* Watch for self recursive updates.  */
+  auto_vec saved_kills;
+
+  saved_kills.reserve_exact (callee_summary->kills.length ());
+  saved_kills.splice (callee_summary->kills);
+  for (auto kill : saved_kills)
+   {
+ if (kill.parm_index >= (int)parm_map.length ())
+   continue;
+ modref_parm_map &m
+ = kill.parm_index == MODREF_STATIC_CHAIN_PARM
+   ? chain_map
+   : parm_map[kill.parm_index];
+ if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
+ || m.parm_index == MODREF_UNKNOWN_PARM
+ || m.parm_index == MODREF_RETSLOT_PARM
+ || !m.parm_offset_known)
+   continue;
+ modref_access_node n = kill;
+ n.parm_index = m.parm_index;
+ n.parm_offset += m.parm_offset;
+ if (modref_access_node::insert_kill (cur_summary->kills, n,
+  record_adjustments))
+   changed = true;
+   }
+}
+
   /* Merge with callee's summary.  */
   changed |= cur_summary->loads->merge (callee_summary->loads, &parm_map,
					 &chain_map, record_adjustments);


[PATCH] Avoid pathological function redeclarations when checking access sizes [PR102759]

2021-11-15 Thread Martin Sebor via Gcc-patches

Declaring a function with a prototype at block scope and
then redeclaring it without a prototype at file scope results
in losing the prototype but not the attribute access that was
implicitly added to the function decl based on the prototype.
The middle end code that checks function calls for out-of-bounds
accesses based on the attribute is unprepared for this case and
fails with an ICE.  The attached patch corrects this by having
it ignore these pathological cases.  In addition, the change
also improves the format of the informational note printed
after these warnings to reflect the form of the argument
(e.g., to print int[7] rather than int * if the former was
the form used in the declaration).

Tested on x86_64-linux.

Martin

Avoid pathological function redeclarations when checking access sizes [PR102759].

Resolves:
PR tree-optimization/102759 - ICE: Segmentation fault in maybe_check_access_sizes since r12-2976-gb48d4e6818674898

	PR tree-optimization/102759

gcc/ChangeLog:

	PR tree-optimization/102759
	* gimple-array-bounds.cc (build_printable_array_type): Move...
	* gimple-ssa-warn-access.cc (build_printable_array_type): Avoid
	pathological function redeclarations that remove a previously
	declared prototype.
	Improve formatting of function arguments in informational notes.
	* pointer-query.cc (build_printable_array_type): ...to here.
	* pointer-query.h (build_printable_array_type): Declared.

gcc/testsuite/ChangeLog:

	PR tree-optimization/102759
	* gcc.dg/Warray-parameter-10.c: New test.
	* gcc.dg/Wstringop-overflow-82.c: New test.

diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index a3535598998..ddb99d263d1 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -372,31 +372,6 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
   return warned;
 }
 
-/* Wrapper around build_array_type_nelts that makes sure the array
-   can be created at all and handles zero sized arrays specially.  */
-
-static tree
-build_printable_array_type (tree eltype, unsigned HOST_WIDE_INT nelts)
-{
-  if (TYPE_SIZE_UNIT (eltype)
-  && TREE_CODE (TYPE_SIZE_UNIT (eltype)) == INTEGER_CST
-  && !integer_zerop (TYPE_SIZE_UNIT (eltype))
-  && TYPE_ALIGN_UNIT (eltype) > 1
-  && wi::zext (wi::to_wide (TYPE_SIZE_UNIT (eltype)),
-		   ffs_hwi (TYPE_ALIGN_UNIT (eltype)) - 1) != 0)
-eltype = TYPE_MAIN_VARIANT (eltype);
-
-  if (nelts)
-return build_array_type_nelts (eltype, nelts);
-
-  tree idxtype = build_range_type (sizetype, size_zero_node, NULL_TREE);
-  tree arrtype = build_array_type (eltype, idxtype);
-  arrtype = build_distinct_type_copy (TYPE_MAIN_VARIANT (arrtype));
-  TYPE_SIZE (arrtype) = bitsize_zero_node;
-  TYPE_SIZE_UNIT (arrtype) = size_zero_node;
-  return arrtype;
-}
-
 /* Checks one MEM_REF in REF, located at LOCATION, for out-of-bounds
references to string constants.  If VRP can determine that the array
subscript is a constant, check if it is outside valid range.
diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 073f122af31..8248a5b38a1 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -2976,10 +2976,16 @@ pass_waccess::maybe_check_access_sizes (rdwr_map *rwm, tree fndecl, tree fntype,
 	continue;
 
   tree ptrtype = fntype_argno_type (fntype, ptridx);
+  if (!ptrtype)
+	/* A function with a prototype was redeclared without one and
+	   the prototype has been lost.  See pr102759.  Avoid dealing
+	   with this pathological case.  */
+	return;
+
   tree argtype = TREE_TYPE (ptrtype);
 
-  /* The size of the access by the call.  */
-  tree access_size;
+  /* The size of the access by the call in elements.  */
+  tree access_nelts;
   if (sizidx == -1)
 	{
 	  /* If only the pointer attribute operand was specified and
@@ -2989,17 +2995,17 @@ pass_waccess::maybe_check_access_sizes (rdwr_map *rwm, tree fndecl, tree fntype,
 	 if the pointer is also declared with attribute nonnull.  */
 	  if (access.second.minsize
 	  && access.second.minsize != HOST_WIDE_INT_M1U)
-	access_size = build_int_cstu (sizetype, access.second.minsize);
+	access_nelts = build_int_cstu (sizetype, access.second.minsize);
 	  else
-	access_size = size_one_node;
+	access_nelts = size_one_node;
 	}
   else
-	access_size = rwm->get (sizidx)->size;
+	access_nelts = rwm->get (sizidx)->size;
 
   /* Format the value or range to avoid an explosion of messages.  */
   char sizstr[80];
   tree sizrng[2] = { size_zero_node, build_all_ones_cst (sizetype) };
-  if (get_size_range (m_ptr_qry.rvals, access_size, stmt, sizrng, 1))
+  if (get_size_range (m_ptr_qry.rvals, access_nelts, stmt, sizrng, 1))
 	{
 	  char *s0 = print_generic_expr_to_str (sizrng[0]);
 	  if (tree_int_cst_equal (sizrng[0], sizrng[1]))
@@ -3057,6 +3063,8 @@ pass_waccess::maybe_check_access_sizes (rdwr_map *rwm, tree fndecl, tree fntype,
 	}
 	}
 

Re: [PATCH v2 2/3] gimple-fold: Use ranges to simplify _chk calls

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 10:33 AM, Siddhesh Poyarekar wrote:

Instead of comparing LEN and SIZE only if they are constants, use their
ranges to decide if LEN will always be lower than or same as SIZE.

This change ends up putting the stringop-overflow warning line number
against the strcpy implementation, so adjust the warning check to be
line number agnostic.
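
As a rough illustration of the kind of call this is about (a sketch only; `bounded_buf` and `copy_bounded` are made-up names, not from the patch): the length below is not a constant, but masking bounds it to 0..15, which never exceeds the 16-byte destination.  When glibc's fortification (_FORTIFY_SOURCE with optimization) maps memcpy to the checked builtin, a range-aware fold should be able to prove the check can never fail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example, not taken from the patch.  */
char bounded_buf[16];

/* LEN is not a compile-time constant, but its range is [0, 15], so it
   is always lower than or the same as sizeof (bounded_buf).  SRC must
   point to at least 15 readable bytes.  */
char *
copy_bounded (const char *src, size_t n)
{
  size_t len = n & 15;
  return memcpy (bounded_buf, src, len);
}
```

With a constant-only comparison the fold would not fire here, because neither `len` nor `n` is an INTEGER_CST.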

gcc/ChangeLog:

* gimple-fold.c (known_lower): New function.
(gimple_fold_builtin_strncat_chk,
gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk,
gimple_fold_builtin_snprintf_chk,
gimple_fold_builtin_sprintf_chk): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/Wobjsize-1.c: Make warning change line agnostic.
* gcc.dg/builtin-chk-fold.c: New test.





@@ -3024,39 +3040,24 @@ gimple_fold_builtin_memory_chk (gimple_stmt_iterator *gsi,
}
  }
  
-  if (! tree_fits_uhwi_p (size))

-return false;
-
tree maxlen = get_maxval_strlen (len, SRK_INT_VALUE);
-  if (! integer_all_onesp (size))
+  if (! integer_all_onesp (size)
+  && !known_lower (stmt, len, size) && !known_lower (stmt, maxlen, size))

Formatting nit.  Move the trailing && !known_lower (...) to its own line.

OK with the formatting nit fixed.

jeff



[pushed] configure, Darwin: Check ld64 support for -platform-version.

2021-11-15 Thread Iain Sandoe via Gcc-patches
Newer versions of ld64 allow specifying the OS target (e.g.
macos or ios), the version, and the SDK version all in a single
command.  This checks the availability of the command for the
current toolchain.

tested on *-darwin*, x86_64-linux-gnu,
pushed to master, thanks
Iain
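
For readers who do not speak autoconf, the version-based part of the check in the configure.ac hunk can be sketched in C as follows.  The thresholds (236 for -export_dynamic, 512 for -platform_version) are the ones in the patch; the type and function names are made up for the illustration, and the real test is of course done in shell at configure time.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only.  */
typedef struct
{
  bool export_dynamic;
  bool platform_version;
} ld64_features;

/* VERSION is a string such as "609.8"; only the major number before
   the first '.' is compared, as the sed expression in configure.ac
   does.  */
ld64_features
ld64_features_for_version (const char *version)
{
  long major = strtol (version, NULL, 10);	/* Stops at the first '.'.  */
  ld64_features f = { major >= 236, major >= 512 };
  return f;
}
```

When no version is specified and the linker can be run, the patch probes the linker directly instead of comparing version numbers.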

Signed-off-by: Iain Sandoe 

gcc/ChangeLog:

* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Test ld64 for -platform-version support.
---
 gcc/config.in|  6 ++
 gcc/configure| 21 -
 gcc/configure.ac | 16 +++-
 3 files changed, 41 insertions(+), 2 deletions(-)


diff --git a/gcc/configure.ac b/gcc/configure.ac
index 065080a4b39..c9ee1fb8919 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -6253,6 +6253,7 @@ if test x"$ld64_flag" = x"yes"; then
 
   # Set defaults for possibly untestable items.
   gcc_cv_ld64_export_dynamic=0
+  gcc_cv_ld64_platform_version=0
 
   if test "$build" = "$host"; then
 darwin_try_test=1
@@ -6274,9 +6275,12 @@ if test x"$ld64_flag" = x"yes"; then
 AC_MSG_CHECKING(ld64 specified version)
 gcc_cv_ld64_major=`echo "$gcc_cv_ld64_version" | sed -e 's/\..*//'`
 AC_MSG_RESULT($gcc_cv_ld64_major)
-   if test "$gcc_cv_ld64_major" -ge 236; then
+if test "$gcc_cv_ld64_major" -ge 236; then
   gcc_cv_ld64_export_dynamic=1
 fi
+if test "$gcc_cv_ld64_major" -ge 512; then
+  gcc_cv_ld64_platform_version=1
+fi
   elif test -x "$gcc_cv_ld" -a "$darwin_try_test" -eq 1; then
 # If the version was not specified, try to find it.
 AC_MSG_CHECKING(linker version)
@@ -6291,6 +6295,13 @@ if test x"$ld64_flag" = x"yes"; then
   gcc_cv_ld64_export_dynamic=0
 fi
 AC_MSG_RESULT($gcc_cv_ld64_export_dynamic)
+
+AC_MSG_CHECKING(linker for -platform_version support)
+gcc_cv_ld64_platform_version=1
+if $gcc_cv_ld -platform_version macos 10.5 0.0 < /dev/null 2>&1 | grep 'unknown option' > /dev/null; then
+  gcc_cv_ld64_platform_version=0
+fi
+AC_MSG_RESULT($gcc_cv_ld64_platform_version)
   fi
 
   if test x"${gcc_cv_ld64_version}" != x; then
@@ -6300,6 +6311,9 @@ if test x"$ld64_flag" = x"yes"; then
 
   AC_DEFINE_UNQUOTED(LD64_HAS_EXPORT_DYNAMIC, $gcc_cv_ld64_export_dynamic,
   [Define to 1 if ld64 supports '-export_dynamic'.])
+
+  AC_DEFINE_UNQUOTED(LD64_HAS_PLATFORM_VERSION, $gcc_cv_ld64_platform_version,
+  [Define to 1 if ld64 supports '-platform_version'.])
 fi
 
 if test x"$dsymutil_flag" = x"yes"; then
-- 
2.24.3 (Apple Git-128)



Re: [PATCH v2 3/3] gimple-fold: Use ranges to simplify strncat and snprintf

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 10:33 AM, Siddhesh Poyarekar wrote:

Use ranges for lengths and object sizes in strncat and snprintf to
determine if they can be transformed into simpler operations.
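
A small sketch of the strncat case, with made-up function names: the bound is not a constant, but its range is [3, SIZE_MAX], which is never smaller than the length of the source string, so the bound can never truncate and the call behaves exactly like strcat.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example.  After the clamp, N is known to be at least
   strlen ("abc"), so the bound is irrelevant.  DST must have room for
   three more characters and the terminating NUL.  */
char *
append_abc (char *dst, size_t n)
{
  if (n < 3)
    n = 3;
  return strncat (dst, "abc", n);
}

/* What the call above is equivalent to.  */
char *
append_abc_plain (char *dst)
{
  return strcat (dst, "abc");
}
```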

gcc/ChangeLog:

* gimple-fold.c (gimple_fold_builtin_strncat): Use ranges to
determine if it is safe to transform to strcat.
(gimple_fold_builtin_snprintf): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-stringops-2.c: Define size_t.
(safe1): Adjust.
(safe4): New test.
* gcc.dg/fold-stringops-3.c: New test.

OK
jeff



Re: [PATCH] Add TSVC tests.

2021-11-15 Thread Iain Sandoe
Hi folks,

> On 6 Nov 2021, at 08:05, Martin Liška  wrote:

> Sorry for issue related to portability.
> 
> On 11/6/21 03:45, David Edelsohn wrote:
>> I just noticed that Iain adjusted the tsvc.h for Darwin in the same
>> way that I need to adjust it for AIX.  Are we trying to keep the
>> testcase directory pristine and in sync with its upstream source or
>> can we fix it locally?
> 
> We can fix it locally as the source files are split from the original
> all-in-one tsvc.c file.

I needed one small addition - posix_memalign is not available on the
earliest Darwin versions that are regularly tested - but the malloc there
is guaranteed to be sufficiently aligned for the largest vectors in use.

tested on *-darwin*, x86_64-linux-gnu and powerpc-aix.
pushed to master, thanks
Iain

[pushed] testsuite, Darwin: In tsvc.h, use malloc for Darwin <= 9.

Earlier Darwin versions do not have posix_memalign() but the
malloc implementation is guaranteed to produce memory suitably
aligned for the largest vector type.
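
The shape of the fallback, pulled out into a standalone helper, looks roughly like this.  This is a sketch rather than the tsvc.h code: `alloc_aligned` and `HAVE_POSIX_MEMALIGN` are hypothetical names, and the malloc branch is only valid where the platform guarantees sufficient alignment, as described above for older Darwin.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdlib.h>

/* Hypothetical macro; a real build would derive it from a platform
   check such as the one in the patch.  */
#ifndef HAVE_POSIX_MEMALIGN
#define HAVE_POSIX_MEMALIGN 1
#endif

/* Return SIZE bytes aligned to ALIGNMENT, or NULL on failure.  */
void *
alloc_aligned (size_t alignment, size_t size)
{
#if HAVE_POSIX_MEMALIGN
  void *p = NULL;
  if (posix_memalign (&p, alignment, size) != 0)
    return NULL;
  return p;
#else
  /* No aligned allocator: rely on malloc's guaranteed alignment.  */
  (void) alignment;
  return malloc (size);
#endif
}
```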

Signed-off-by: Iain Sandoe 

gcc/testsuite/ChangeLog:

* gcc.dg/vect/tsvc/tsvc.h: Use malloc for Darwin 9 and
earlier.
---
 gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h b/gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h
index 63ea1e2601f..665ca747f8e 100644
--- a/gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h
+++ b/gcc/testsuite/gcc.dg/vect/tsvc/tsvc.h
@@ -193,8 +193,16 @@ void init(int** ip, real_t* s1, real_t* s2){
 xx = (real_t*) memalign(ARRAY_ALIGNMENT, LEN_1D*sizeof(real_t));
 *ip = (int *) memalign(ARRAY_ALIGNMENT, LEN_1D*sizeof(real_t));
 #else
+# if defined (__APPLE__) \
+&& __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ < 1060
+/* We have no aligned allocator, but malloc is guaranteed to return
+   alignment suitable for the largest vector item.  */
+xx = (real_t*) malloc (LEN_1D*sizeof(real_t));
+*ip = (int *) malloc (LEN_1D*sizeof(real_t));
+# else
 posix_memalign ((void*)&xx, ARRAY_ALIGNMENT, LEN_1D*sizeof(real_t));
 posix_memalign ((void*)ip, ARRAY_ALIGNMENT, LEN_1D*sizeof(real_t));
+# endif
 #endif
 
 for (int i = 0; i < LEN_1D; i = i+5){
-- 
2.24.3 (Apple Git-128)




Re: [PATCH 2/3] elf: Introduce GLRO (dl_libc_freeres), called from __libc_freeres

2021-11-15 Thread Adhemerval Zanella via Gcc-patches
Maybe add a comment explaining why this will be used.

Reviewed-by: Adhemerval Zanella  

On 03/11/2021 13:27, Florian Weimer via Gcc-patches wrote:
> ---
>  elf/Makefile   |  2 +-
>  elf/dl-libc_freeres.c  | 24 
>  elf/rtld.c |  1 +
>  malloc/set-freeres.c   |  5 +
>  sysdeps/generic/ldsodefs.h |  7 +++
>  5 files changed, 38 insertions(+), 1 deletion(-)
>  create mode 100644 elf/dl-libc_freeres.c
> 
> diff --git a/elf/Makefile b/elf/Makefile
> index cb9bcfb799..1c768bdf47 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -68,7 +68,7 @@ elide-routines.os = $(all-dl-routines) dl-support enbl-secure dl-origin \
>  rtld-routines= rtld $(all-dl-routines) dl-sysdep dl-environ dl-minimal \
>dl-error-minimal dl-conflict dl-hwcaps dl-hwcaps_split dl-hwcaps-subdirs \
>dl-usage dl-diagnostics dl-diagnostics-kernel dl-diagnostics-cpu \
> -  dl-mutex
> +  dl-mutex dl-libc_freeres
>  all-rtld-routines = $(rtld-routines) $(sysdep-rtld-routines)
>  
>  CFLAGS-dl-runtime.c += -fexceptions -fasynchronous-unwind-tables

Ok.

> diff --git a/elf/dl-libc_freeres.c b/elf/dl-libc_freeres.c
> new file mode 100644
> index 00..68f305a6f9
> --- /dev/null
> +++ b/elf/dl-libc_freeres.c
> @@ -0,0 +1,24 @@
> +/* Deallocating malloc'ed memory from the dynamic loader.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#include 
> +
> +void
> +__rtld_libc_freeres (void)
> +{
> +}

Ok.

> diff --git a/elf/rtld.c b/elf/rtld.c
> index be2d5d8e74..847141e21d 100644
> --- a/elf/rtld.c
> +++ b/elf/rtld.c
> @@ -378,6 +378,7 @@ struct rtld_global_ro _rtld_global_ro attribute_relro =
>  ._dl_catch_error = _rtld_catch_error,
>  ._dl_error_free = _dl_error_free,
>  ._dl_tls_get_addr_soft = _dl_tls_get_addr_soft,
> +._dl_libc_freeres = __rtld_libc_freeres,
>  #ifdef HAVE_DL_DISCOVER_OSVERSION
>  ._dl_discover_osversion = _dl_discover_osversion
>  #endif

Ok.

> diff --git a/malloc/set-freeres.c b/malloc/set-freeres.c
> index 5c19a2725c..856ff7831f 100644
> --- a/malloc/set-freeres.c
> +++ b/malloc/set-freeres.c
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "../nss/nsswitch.h"
>  #include "../libio/libioP.h"
> @@ -67,6 +68,10 @@ __libc_freeres (void)
>  
>call_function_static_weak (__libc_dlerror_result_free);
>  
> +#ifdef SHARED
> +  GLRO (dl_libc_freeres) ();
> +#endif
> +
>for (p = symbol_set_first_element (__libc_freeres_ptrs);
> !symbol_set_end_p (__libc_freeres_ptrs, p); ++p)
>  free (*p);

OK.

> diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
> index 1318c36dce..c26860430c 100644
> --- a/sysdeps/generic/ldsodefs.h
> +++ b/sysdeps/generic/ldsodefs.h
> @@ -712,6 +712,10 @@ struct rtld_global_ro
>   namespace.  */
>void (*_dl_error_free) (void *);
>void *(*_dl_tls_get_addr_soft) (struct link_map *);
> +
> +  /* Called from __libc_shared to deallocate malloc'ed memory.  */
> +  void (*_dl_libc_freeres) (void);
> +
>  #ifdef HAVE_DL_DISCOVER_OSVERSION
>int (*_dl_discover_osversion) (void);
>  #endif
> @@ -1416,6 +1420,9 @@ __rtld_mutex_init (void)
>  }
>  #endif /* !PTHREAD_IN_LIBC */
>  
> +/* Implementation of GL (dl_libc_freeres).  */
> +void __rtld_libc_freeres (void) attribute_hidden;
> +
>  void __thread_gscope_wait (void) attribute_hidden;
>  # define THREAD_GSCOPE_WAIT() __thread_gscope_wait ()
>  
> 

OK.


Re: [PATCH 1/3] nptl: Extract from pthread_cond_common.c

2021-11-15 Thread Adhemerval Zanella via Gcc-patches



On 03/11/2021 13:27, Florian Weimer via Libc-alpha wrote:
> And make it an installed header.  This addresses a few aliasing
> violations (which do not seem to result in miscompilation due to
> the use of atomics), and also enables use of wide counters in other
> parts of the library.
> 
> The debug output in nptl/tst-cond22 has been adjusted to print
> the 32-bit values instead because it avoids a big-endian/little-endian
> difference.

LGTM, thanks.

Reviewed-by: Adhemerval Zanella  

> ---
>  bits/atomic_wide_counter.h  |  35 
>  include/atomic_wide_counter.h   |  89 +++
>  include/bits/atomic_wide_counter.h  |   1 +
>  misc/Makefile   |   3 +-
>  misc/atomic_wide_counter.c  | 127 +++
>  nptl/Makefile   |  13 +-
>  nptl/pthread_cond_common.c  | 204 
>  nptl/tst-cond22.c   |  14 +-
>  sysdeps/nptl/bits/thread-shared-types.h |  22 +--
>  9 files changed, 310 insertions(+), 198 deletions(-)
>  create mode 100644 bits/atomic_wide_counter.h
>  create mode 100644 include/atomic_wide_counter.h
>  create mode 100644 include/bits/atomic_wide_counter.h
>  create mode 100644 misc/atomic_wide_counter.c
> 
> diff --git a/bits/atomic_wide_counter.h b/bits/atomic_wide_counter.h
> new file mode 100644
> index 00..0687eb554e
> --- /dev/null
> +++ b/bits/atomic_wide_counter.h
> @@ -0,0 +1,35 @@
> +/* Monotonically increasing wide counters (at least 62 bits).
> +   Copyright (C) 2016-2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#ifndef _BITS_ATOMIC_WIDE_COUNTER_H
> +#define _BITS_ATOMIC_WIDE_COUNTER_H
> +
> +/* Counter that is monotonically increasing (by less than 2**31 per
> +   increment), with a single writer, and an arbitrary number of
> +   readers.  */
> +typedef union
> +{
> +  __extension__ unsigned long long int __value64;
> +  struct
> +  {
> +unsigned int __low;
> +unsigned int __high;
> +  } __value32;
> +} __atomic_wide_counter;
> +
> +#endif /* _BITS_ATOMIC_WIDE_COUNTER_H */

Ok, it would be included in multiple places so we can't tie to a specific
header.
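
For anyone unfamiliar with the idea, here is a minimal sketch of the 64-bit-atomics case using C11 <stdatomic.h>.  It is an assumption-laden stand-in, not the glibc code: glibc uses its own internal atomic macros and the union above, and the names below are deliberately different from the ones in the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the counter when 64-bit atomics are
   available.  */
typedef struct
{
  _Atomic uint64_t value64;
} wide_counter;

/* Read the current value with relaxed ordering.  */
static inline uint64_t
wide_counter_load_relaxed (wide_counter *c)
{
  return atomic_load_explicit (&c->value64, memory_order_relaxed);
}

/* Add VAL and return the value the counter had before the addition.  */
static inline uint64_t
wide_counter_fetch_add_relaxed (wide_counter *c, unsigned int val)
{
  return atomic_fetch_add_explicit (&c->value64, val, memory_order_relaxed);
}
```

Without 64-bit atomics the two 32-bit halves have to be updated separately, which is where the single-writer restriction in the comment above comes from.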

> diff --git a/include/atomic_wide_counter.h b/include/atomic_wide_counter.h
> new file mode 100644
> index 00..31f009d5e6
> --- /dev/null
> +++ b/include/atomic_wide_counter.h
> @@ -0,0 +1,89 @@
> +/* Monotonically increasing wide counters (at least 62 bits).
> +   Copyright (C) 2016-2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#ifndef _ATOMIC_WIDE_COUNTER_H
> +#define _ATOMIC_WIDE_COUNTER_H
> +
> +#include 
> +#include 
> +
> +#if __HAVE_64B_ATOMICS
> +
> +static inline uint64_t
> +__atomic_wide_counter_load_relaxed (__atomic_wide_counter *c)
> +{
> +  return atomic_load_relaxed (&c->__value64);
> +}
> +
> +static inline uint64_t
> +__atomic_wide_counter_fetch_add_relaxed (__atomic_wide_counter *c,
> + unsigned int val)
> +{
> +  return atomic_fetch_add_relaxed (&c->__value64, val);
> +}
> +
> +static inline uint64_t
> +__atomic_wide_counter_fetch_add_acquire (__atomic_wide_counter *c,
> + unsigned int val)
> +{
> +  return atomic_fetch_add_acquire (&c->__value64, val);
> +}
> +
> +static inline void
> +__atomic_wide_counter_add_relaxed (__atomic_wide_counter *c,
> +   unsigned int val)
> +{
> +  atomic_store_relaxed (&c->__value64,
> +  

Re: [PATCH v2 1/3] gimple-fold: Transform stp*cpy_chk to str*cpy directly

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 10:33 AM, Siddhesh Poyarekar wrote:

Avoid going through another folding cycle and use the ignore flag to
directly transform BUILT_IN_STPCPY_CHK to BUILT_IN_STRCPY when set,
likewise for BUILT_IN_STPNCPY_CHK to BUILT_IN_STRNCPY.

Dump the transformation in dump_file so that we can verify in tests that
the direct transformation actually happened.
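
To make the "ignore flag" case concrete, here is a small sketch with made-up function names.  stpcpy differs from strcpy only in its return value (a pointer to the terminating NUL in the destination), so when that value is discarded the two calls are interchangeable.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <string.h>

/* Hypothetical example.  The result of stpcpy is not used, so the
   call can be treated as strcpy.  */
void
set_name_stpcpy (char *dst, const char *src)
{
  stpcpy (dst, src);
}

/* The equivalent call.  */
void
set_name_strcpy (char *dst, const char *src)
{
  strcpy (dst, src);
}
```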

gcc/ChangeLog:

* gimple-fold.c (dump_transformation): New function.
(gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk): Use it.  Simplify to
BUILT_IN_STRNCPY if return value is not used.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-stringops.c: New test.

OK
jeff



Re: [PATCH] Check optab before transforming atomic bit test and operations

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 12:05 PM, H.J. Lu wrote:

On Mon, Nov 15, 2021 at 10:59 AM Jeff Law  wrote:



On 11/15/2021 6:39 AM, H.J. Lu via Gcc-patches wrote:

Check optab before transforming equivalent, but slightly different cases
of atomic bit test and operations to their canonical forms.
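
For context, the source-level idiom behind "atomic bit test and operations" looks roughly like the sketch below (the function name is made up).  On targets that provide the corresponding optab, GCC can turn the fetch-or plus mask test into a single bit-test-and-set instruction; the patch is about not rewriting the near-equivalent variants into this canonical form when the target has no such pattern.

```c
#include <stdbool.h>

/* Hypothetical example.  Atomically set bit BIT of *WORD and report
   whether it was already set.  BIT must be smaller than the width of
   unsigned int.  */
bool
test_and_set_bit (unsigned int *word, unsigned int bit)
{
  unsigned int mask = 1u << bit;
  return (__atomic_fetch_or (word, mask, __ATOMIC_SEQ_CST) & mask) != 0;
}
```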

gcc/

   PR middle-end/103184
   * tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab
   before transforming equivalent, but slightly different cases to
   their canonical forms.

gcc/testsuite/

   PR middle-end/103184
   * gcc.dg/pr103184-1.c: New test.
   * gcc.dg/pr103184-2.c: Likewise.
   }
   }

-  switch (fn)
-{
-case IFN_ATOMIC_BIT_TEST_AND_SET:
-  optab = atomic_bit_test_and_set_optab;
-  break;
-case IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT:
-  optab = atomic_bit_test_and_complement_optab;
-  break;
-case IFN_ATOMIC_BIT_TEST_AND_RESET:
-  optab = atomic_bit_test_and_reset_optab;
-  break;
-default:
-  return;
-}
-
 if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs))) == CODE_FOR_nothing)
   return;

Shouldn't the test of the return value of optab_handler here just go
away since we're testing it earlier?  OK with that fix.


The earlier check is predicated on if (rhs_code != BIT_AND_EXPR):

   if (rhs_code != BIT_AND_EXPR)
 {
   if (rhs_code != NOP_EXPR && rhs_code != BIT_NOT_EXPR)
 return;

   tree use_lhs = gimple_assign_lhs (use_stmt);
   if (TREE_CODE (use_lhs) == SSA_NAME
   && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (use_lhs))
 return;

   tree use_rhs = gimple_assign_rhs1 (use_stmt);
   if (lhs != use_rhs)
 return;

   if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)))
   == CODE_FOR_nothing)
 return;

I can add an "else"

else  if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)))
 == CODE_FOR_nothing)
 return;

Will it be OK?

Sure.  Thanks.
jeff


Re: [PATCH] Check optab before transforming atomic bit test and operations

2021-11-15 Thread H.J. Lu via Gcc-patches
On Mon, Nov 15, 2021 at 10:59 AM Jeff Law  wrote:
>
>
>
> On 11/15/2021 6:39 AM, H.J. Lu via Gcc-patches wrote:
> > Check optab before transforming equivalent, but slightly different cases
> > of atomic bit test and operations to their canonical forms.
> >
> > gcc/
> >
> >   PR middle-end/103184
> >   * tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab
> >   before transforming equivalent, but slightly different cases to
> >   their canonical forms.
> >
> > gcc/testsuite/
> >
> >   PR middle-end/103184
> >   * gcc.dg/pr103184-1.c: New test.
> >   * gcc.dg/pr103184-2.c: Likewise.
>
> >   }
> >   }
> >
> > -  switch (fn)
> > -{
> > -case IFN_ATOMIC_BIT_TEST_AND_SET:
> > -  optab = atomic_bit_test_and_set_optab;
> > -  break;
> > -case IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT:
> > -  optab = atomic_bit_test_and_complement_optab;
> > -  break;
> > -case IFN_ATOMIC_BIT_TEST_AND_RESET:
> > -  optab = atomic_bit_test_and_reset_optab;
> > -  break;
> > -default:
> > -  return;
> > -}
> > -
> > if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs))) == CODE_FOR_nothing)
> >   return;
> Shouldn't the test of the return value of optab_handler here just go
> away since we're testing it earlier?  OK with that fix.
>

The earlier check is predicated on if (rhs_code != BIT_AND_EXPR):

  if (rhs_code != BIT_AND_EXPR)
{
  if (rhs_code != NOP_EXPR && rhs_code != BIT_NOT_EXPR)
return;

  tree use_lhs = gimple_assign_lhs (use_stmt);
  if (TREE_CODE (use_lhs) == SSA_NAME
  && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (use_lhs))
return;

  tree use_rhs = gimple_assign_rhs1 (use_stmt);
  if (lhs != use_rhs)
return;

  if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)))
  == CODE_FOR_nothing)
return;

I can add an "else"

else  if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)))
== CODE_FOR_nothing)
return;

Will it be OK?

Thanks.

-- 
H.J.


Re: Basic kill analysis for modref

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 11:51 AM, H.J. Lu via Gcc-patches wrote:

On Sun, Nov 14, 2021 at 10:53 AM Jan Hubicka via Gcc-patches
 wrote:

I think you want get_addr_base_and_unit_offset here.  All
variable indexed addresses are in separate stmts.  That also means
you can eventually work with just byte sizes/offsets?

Will do.  The access range in modref summary is bit based (since we want
to disambiguate bitfields like we do in rest of alias oracle) but indeed
this part can be in bytes.

Actually after the unification I can just use get_ao_ref which will call
ao_ref_init_from_ptr_and_range that has all the logic I need in there.
I also noticed that I ended up duplicating the code matching bases
and ranges which is done already twice in the function - once for store
targets and once for MEMSET and friends.  The latter copy lacked overflow
checks so I took the first copy and moved it to a helper function.  This
makes the gimple part of the patch really straightforward: just build ao_ref
if possible and then pass it to this function.
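
The range half of that helper can be pictured with the simplified model below.  This is only a sketch under stated assumptions, not the code in tree-ssa-alias.c: it assumes the bases have already been matched, represents an access as a plain bit offset and bit size, and uses made-up names throughout.  The point is the overflow check before comparing end points.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an access relative to an already matched
   base: offset and size in bits, with a negative size meaning
   "unknown".  */
typedef struct
{
  int64_t offset;
  int64_t size;
} bit_range;

/* Return true if STORE is known to overwrite all of REF.  */
bool
store_covers_ref (bit_range store, bit_range ref)
{
  int64_t store_end, ref_end;

  if (store.size < 0 || ref.size < 0)
    return false;

  /* Give up rather than compare wrapped-around end points.  */
  if (__builtin_add_overflow (store.offset, store.size, &store_end)
      || __builtin_add_overflow (ref.offset, ref.size, &ref_end))
    return false;

  return store.offset <= ref.offset && ref_end <= store_end;
}
```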

I also added statistics.

I have bootstrapped/regtested on x86_64-linux the updated patch and
committed it so I can break out the patches that depend on it.
I have a patch improving the kill tracking at modref side and also the
kill oracle itself can use fnspec and does not need to special case
mem* functions.

For cc1plus LTO link I now get:

Alias oracle query stats:
   refs_may_alias_p: 76106130 disambiguations, 100928932 queries
   on_includes: 12539931 disambiguations, 39864841 queries
   ref_maybe_used_by_call_p: 625857 disambiguations, 77138089 queries
   call_may_clobber_ref_p: 366420 disambiguations, 369293 queries
   stmt_kills_ref_p: 107503 kills, 5699589 queries
   nonoverlapping_component_refs_p: 0 disambiguations, 26176 queries
   nonoverlapping_refs_since_match_p: 30339 disambiguations, 65400 must overlaps, 96698 queries
   aliasing_component_refs_p: 57500 disambiguations, 15464678 queries
   TBAA oracle: 28248334 disambiguations 104710521 queries
15220245 are in alias set 0
8905994 queries asked about the same object
98 queries asked about the same alias set
0 access volatile
50371110 are dependent in the DAG
1964740 are aritificially in conflict with void *

Modref stats:
   modref kill: 52 kills, 6655 queries
   modref use: 25204 disambiguations, 692151 queries
   modref clobber: 2309709 disambiguations, 21877806 queries
   5320532 tbaa queries (0.243193 per modref query)
   761785 base compares (0.034820 per modref query)

PTA query stats:
   pt_solution_includes: 12539931 disambiguations, 39864841 queries
   pt_solutions_intersect: 1713075 disambiguations, 14023484 queries

Newly we get statistics of the kill oracle itself:
   stmt_kills_ref_p: 107503 kills, 5699589 queries
and the modref part:
   modref kill: 52 kills, 6655 queries
So an improvement over 1 kill using modref I had before. Still not
really great.

Honza

gcc/ChangeLog:

 * ipa-modref-tree.c (modref_access_node::update_for_kills): New
 member function.
 (modref_access_node::merge_for_kills): Likewise.
 (modref_access_node::insert_kill): Likewise.
 * ipa-modref-tree.h (modref_access_node::update_for_kills,
 modref_access_node::merge_for_kills, modref_access_node::insert_kill):
 Declare.
 (modref_access_node::useful_for_kill): New member function.
 * ipa-modref.c (modref_summary::useful_p): Release useless kills.
 (lto_modref_summary): Add kills.
 (modref_summary::dump): Dump kills.
 (record_access): Add modref_access_node parameter.
 (record_access_lto): Likewise.
 (merge_call_side_effects): Merge kills.
 (analyze_call): Add ALWAYS_EXECUTED param and pass it around.
 (struct summary_ptrs): Add always_executed field.
 (analyze_load): Update.
 (analyze_store): Update; record kills.
 (analyze_stmt): Add always_executed; record kills in clobbers.
 (analyze_function): Track always_executed.
 (modref_summaries::duplicate): Duplicate kills.
 (update_signature): Release kills.
 * ipa-modref.h (struct modref_summary): Add kills.
 * tree-ssa-alias.c (alias_stats): Add kill stats.
 (dump_alias_stats): Dump kill stats.
 (store_kills_ref_p): Break out from ...
 (stmt_kills_ref_p): Use it; handle modref info based kills.

 gcc/testsuite/ChangeLog:

 2021-11-14  Jan Hubicka  

 * gcc.dg/tree-ssa/modref-dse-3.c: New test.


diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c
index 6fc2b7298f4..bbe23a5a211 100644
--- a/gcc/ipa-modref-tree.c
+++ b/gcc/ipa-modref-tree.c
@@ -638,6 +638,185 @@ modref_access_node::get_ao_ref (const gcall *stmt, ao_ref *ref) const
return true;
  }

+/* Return true A is 

Re: [PATCH] Check optab before transforming atomic bit test and operations

2021-11-15 Thread Jeff Law via Gcc-patches




On 11/15/2021 6:39 AM, H.J. Lu via Gcc-patches wrote:

Check optab before transforming equivalent, but slightly different cases
of atomic bit test and operations to their canonical forms.

gcc/

PR middle-end/103184
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab
before transforming equivalent, but slightly different cases to
their canonical forms.

gcc/testsuite/

PR middle-end/103184
* gcc.dg/pr103184-1.c: New test.
* gcc.dg/pr103184-2.c: Likewise.



}
  }
  
-  switch (fn)

-{
-case IFN_ATOMIC_BIT_TEST_AND_SET:
-  optab = atomic_bit_test_and_set_optab;
-  break;
-case IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT:
-  optab = atomic_bit_test_and_complement_optab;
-  break;
-case IFN_ATOMIC_BIT_TEST_AND_RESET:
-  optab = atomic_bit_test_and_reset_optab;
-  break;
-default:
-  return;
-}
-
if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs))) == CODE_FOR_nothing)
  return;
Shouldn't the test of the return value of optab_handler here just go 
away since we're testing it earlier?  OK with that fix.


Jeff


Re: Basic kill analysis for modref

2021-11-15 Thread H.J. Lu via Gcc-patches
On Sun, Nov 14, 2021 at 10:53 AM Jan Hubicka via Gcc-patches
 wrote:
>
> > >
> > > I think you want get_addr_base_and_unit_offset here.  All
> > > variable indexed addresses are in separate stmts.  That also means
> > > you can eventually work with just byte sizes/offsets?
> >
> > Will do.  The access range in modref summary is bit based (since we want
> > to disambiguate bitfields like we do in rest of alias oracle) but indeed
> > this part can be in bytes.
>
> Actually after the unification I can just use get_ao_ref which will call
> ao_ref_init_from_ptr_and_range that has all the logic I need in there.
> I also noticed that I ended up duplicating the code matching bases
> and ranges which is done already twice in the function - once for store
> targets and once for MEMSET and friends.  The latter copy lacked overflow
> checks so I took the first copy and moved it to a helper function.  This
> makes the gimple part of the patch really straightforward: just build ao_ref
> if possible and then pass it to this function.
>
> I also added statistics.
>
> I have bootstrapped/regtested on x86_64-linux the updated patch and
> committed it so I can break out the patches that depend on it.
> I have a patch improving the kill tracking at modref side and also the
> kill oracle itself can use fnspec and does not need to special case
> mem* functions.
>
> For cc1plus LTO link I now get:
>
> Alias oracle query stats:
>   refs_may_alias_p: 76106130 disambiguations, 100928932 queries
>   on_includes: 12539931 disambiguations, 39864841 queries
>   ref_maybe_used_by_call_p: 625857 disambiguations, 77138089 queries
>   call_may_clobber_ref_p: 366420 disambiguations, 369293 queries
>   stmt_kills_ref_p: 107503 kills, 5699589 queries
>   nonoverlapping_component_refs_p: 0 disambiguations, 26176 queries
>   nonoverlapping_refs_since_match_p: 30339 disambiguations, 65400 must overlaps, 96698 queries
>   aliasing_component_refs_p: 57500 disambiguations, 15464678 queries
>   TBAA oracle: 28248334 disambiguations 104710521 queries
>15220245 are in alias set 0
>8905994 queries asked about the same object
>98 queries asked about the same alias set
>0 access volatile
>50371110 are dependent in the DAG
>1964740 are aritificially in conflict with void *
>
> Modref stats:
>   modref kill: 52 kills, 6655 queries
>   modref use: 25204 disambiguations, 692151 queries
>   modref clobber: 2309709 disambiguations, 21877806 queries
>   5320532 tbaa queries (0.243193 per modref query)
>   761785 base compares (0.034820 per modref query)
>
> PTA query stats:
>   pt_solution_includes: 12539931 disambiguations, 39864841 queries
>   pt_solutions_intersect: 1713075 disambiguations, 14023484 queries
>
> Newly we get statistics of the kill oracle itself:
>   stmt_kills_ref_p: 107503 kills, 5699589 queries
> and the modref part:
>   modref kill: 52 kills, 6655 queries
> So an improvement over 1 kill using modref I had before. Still not
> really great.
>
> Honza
>
> gcc/ChangeLog:
>
> * ipa-modref-tree.c (modref_access_node::update_for_kills): New
> member function.
> (modref_access_node::merge_for_kills): Likewise.
> (modref_access_node::insert_kill): Likewise.
> * ipa-modref-tree.h (modref_access_node::update_for_kills,
> modref_access_node::merge_for_kills, modref_access_node::insert_kill):
> Declare.
> (modref_access_node::useful_for_kill): New member function.
> * ipa-modref.c (modref_summary::useful_p): Release useless kills.
> (lto_modref_summary): Add kills.
> (modref_summary::dump): Dump kills.
> (record_access): Add modref_access_node parameter.
> (record_access_lto): Likewise.
> (merge_call_side_effects): Merge kills.
> (analyze_call): Add ALWAYS_EXECUTED param and pass it around.
> (struct summary_ptrs): Add always_executed field.
> (analyze_load): Update.
> (analyze_store): Update; record kills.
> (analyze_stmt): Add always_executed; record kills in clobbers.
> (analyze_function): Track always_executed.
> (modref_summaries::duplicate): Duplicate kills.
> (update_signature): Release kills.
> * ipa-modref.h (struct modref_summary): Add kills.
> * tree-ssa-alias.c (alias_stats): Add kill stats.
> (dump_alias_stats): Dump kill stats.
> (store_kills_ref_p): Break out from ...
> (stmt_kills_ref_p): Use it; handle modref info based kills.
>
> gcc/testsuite/ChangeLog:
>
> 2021-11-14  Jan Hubicka  
>
> * gcc.dg/tree-ssa/modref-dse-3.c: New test.
>
>
> diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c
> index 6fc2b7298f4..bbe23a5a211 100644
> --- a/gcc/ipa-modref-tree.c
> +++ b/gcc/ipa-modref-tree.c
> @@ -638,6 +638,185 @@ 

PING: [PATCH] c++: designated init of char array by string constant [PR55227]

2021-11-15 Thread will wray via Gcc-patches
The fixes test out, as does the FIXME that's fixed based on the fixes...

Note that the bug causes bogus rejection of any designated initialization
of char array from a string literal, except for the singular case where the
string literal initializer size exactly matches the target char array size
and is not enclosed in optional braces:

  typedef struct
  C { char id[4]; } C;

  C a = {.id = "abc"};   // g++ accepts iff sizeof(C::id) == sizeof("abc")

  C b = {.id = {"abc"}}; // g++ rejects valid (gcc accepts)
  C c = {.id = "a"}; // g++ rejects valid (gcc accepts)

I'd expect this to be common in C code bases, so the bug would be hit in
any attempt to compile with g++. From the bugzilla comments, it seems that
the following 'workaround' is being used:

  C d = {{.id = "a"}};   // g++ accepts invalid (gcc rejects)

which 'works' in this case but is completely borked; consider:

  struct name {char first[32], second[32], third[32];};
  name DMR {{.first = "Dennis"}, {.third = "Ritchie"}};

Only g++ accepts this: it ignores the designators, interprets the
initializers as positional, and generates correspondingly invalid output:

DMR:
.string "Dennis"
.zero   25
.string "Ritchie"
.zero   24
.zero   32
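
For contrast, here is a minimal C sketch of what the designators are meant to do; it is an illustration written for this note, not code from the patch or its testsuite. Each string lands in the member that its designator names, unmentioned members are zero-filled, and the optional braces around the string literal are accepted. The helper name designators_respected is made up for the example.

```c
#include <assert.h>
#include <string.h>

struct name { char first[32], second[32], third[32]; };
struct C { char id[4]; };

/* Designators pick the member to initialize; members that are not
   mentioned (here: second) are zero-initialized.  */
static const struct name dmr = { .first = "Dennis", .third = "Ritchie" };

/* A string literal initializer may be enclosed in optional braces, and it
   may be shorter than the array; the remaining bytes are zero.  */
static const struct C braced = { .id = {"abc"} };
static const struct C shorter = { .id = "a" };

/* Hypothetical helper: returns 1 if every string ended up in the member
   named by its designator.  */
int designators_respected (const struct name *n)
{
  assert (n != NULL);
  return strcmp (n->first, "Dennis") == 0
         && n->second[0] == '\0'
         && strcmp (n->third, "Ritchie") == 0;
}
```

With a C compiler, designators_respected (&dmr) holds, which is the layout the g++ 'workaround' above fails to produce.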


[PATCH v5 1/1] [ARM] Add support for TLS register based stack protector canary access

2021-11-15 Thread Ard Biesheuvel via Gcc-patches
Add support for accessing the stack canary value via the TLS register,
so that multiple threads running in the same address space can use
distinct canary values. This is intended for the Linux kernel running in
SMP mode, where processes entering the kernel are essentially threads
running the same program concurrently: using a global variable for the
canary in that context is problematic because it can never be rotated,
and so the OS is forced to use the same value as long as it remains up.

Using the TLS register to index the stack canary helps with this, as it
allows each CPU to context switch the TLS register along with the rest
of the process, permitting each process to use its own value for the
stack canary.

2021-11-15 Ard Biesheuvel 

* config/arm/arm-opts.h (enum stack_protector_guard): New
* config/arm/arm-protos.h (arm_stack_protect_tls_canary_mem):
New
* config/arm/arm.c (TARGET_STACK_PROTECT_GUARD): Define
(arm_option_override_internal): Handle and put in error checks
for stack protector guard options.
(arm_option_reconfigure_globals): Likewise
(arm_stack_protect_tls_canary_mem): New
(arm_stack_protect_guard): New
* config/arm/arm.md (stack_protect_set): New
(stack_protect_set_tls): Likewise
(stack_protect_test): Likewise
(stack_protect_test_tls): Likewise
(reload_tp_hard): Likewise
* config/arm/arm.opt (-mstack-protector-guard): New
(-mstack-protector-guard-offset): New.
* doc/invoke.texi: Document new options

gcc/testsuite/ChangeLog:

* gcc.target/arm/stack-protector-7.c: New test.
* gcc.target/arm/stack-protector-8.c: New test.

Signed-off-by: Ard Biesheuvel 
---
 gcc/config/arm/arm-opts.h|  6 ++
 gcc/config/arm/arm-protos.h  |  2 +
 gcc/config/arm/arm.c | 55 +++
 gcc/config/arm/arm.md| 71 +++-
 gcc/config/arm/arm.opt   | 22 ++
 gcc/doc/invoke.texi  | 11 +++
 gcc/testsuite/gcc.target/arm/stack-protector-7.c | 10 +++
 gcc/testsuite/gcc.target/arm/stack-protector-8.c |  5 ++
 8 files changed, 180 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm-opts.h b/gcc/config/arm/arm-opts.h
index 5c4b62f404f7..581ba3c4fbbb 100644
--- a/gcc/config/arm/arm-opts.h
+++ b/gcc/config/arm/arm-opts.h
@@ -69,4 +69,10 @@ enum arm_tls_type {
   TLS_GNU,
   TLS_GNU2
 };
+
+/* Where to get the canary for the stack protector.  */
+enum stack_protector_guard {
+  SSP_TLSREG,  /* per-thread canary in TLS register */
+  SSP_GLOBAL   /* global canary */
+};
 #endif
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 9b1f61394ad7..d8d605920c97 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -195,6 +195,8 @@ extern void arm_split_atomic_op (enum rtx_code, rtx, rtx, 
rtx, rtx, rtx, rtx);
 extern rtx arm_load_tp (rtx);
 extern bool arm_coproc_builtin_available (enum unspecv);
 extern bool arm_coproc_ldc_stc_legitimate_address (rtx);
+extern rtx arm_stack_protect_tls_canary_mem (bool);
+
 
 #if defined TREE_CODE
 extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a5b403eb3e49..e5077348ce07 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -829,6 +829,9 @@ static const struct attribute_spec arm_attribute_table[] =
 
 #undef TARGET_MD_ASM_ADJUST
 #define TARGET_MD_ASM_ADJUST arm_md_asm_adjust
+
+#undef TARGET_STACK_PROTECT_GUARD
+#define TARGET_STACK_PROTECT_GUARD arm_stack_protect_guard
 
 /* Obstack for minipool constant handling.  */
 static struct obstack minipool_obstack;
@@ -3176,6 +3179,26 @@ arm_option_override_internal (struct gcc_options *opts,
   if (TARGET_THUMB2_P (opts->x_target_flags))
 opts->x_inline_asm_unified = true;
 
+  if (arm_stack_protector_guard == SSP_GLOBAL
+  && opts->x_arm_stack_protector_guard_offset_str)
+{
+  error ("incompatible options %'-mstack-protector-guard=global%' and"
+"%'-mstack-protector-guard-offset=%qs%'",
+arm_stack_protector_guard_offset_str);
+}
+
+  if (opts->x_arm_stack_protector_guard_offset_str)
+{
+  char *end;
+  const char *str = arm_stack_protector_guard_offset_str;
+  errno = 0;
+  long offs = strtol (arm_stack_protector_guard_offset_str, &end, 0);
+  if (!*str || *end || errno)
+   error ("%qs is not a valid offset in %qs", str,
+  "-mstack-protector-guard-offset=");
+  arm_stack_protector_guard_offset = offs;
+}
+
 #ifdef SUBTARGET_OVERRIDE_INTERNAL_OPTIONS
   SUBTARGET_OVERRIDE_INTERNAL_OPTIONS;
 #endif
@@ -3843,6 +3866,9 @@ arm_option_reconfigure_globals (void)
   else
target_thread_pointer = TP_SOFT;
 }
+
+  if (!TARGET_HARD_TP && 

[PATCH v5 0/1] implement TLS register based stack canary for ARM

2021-11-15 Thread Ard Biesheuvel via Gcc-patches
Bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102352

In the Linux kernel, user processes calling into the kernel are
essentially threads running in the same address space, of a program that
never terminates. This means that using a global variable for the stack
protector canary value is problematic on SMP systems, as we can never
change it unless we reboot the system. (Processes that sleep for any
reason will do so on a call into the kernel, which means that there will
always be live kernel stack frames carrying copies of the canary taken
when the function was entered.)

AArch64 implements -mstack-protector-guard=sysreg for this purpose, as
this permits the kernel to use different memory addresses for the stack
canary for each CPU, and context switch the chosen system register with
the rest of the process, allowing each process to use its own unique
value for the stack canary.

This patch implements something similar, but for the 32-bit ARM kernel,
which will start using the user space TLS register TPIDRURO to index
per-process metadata while running in the kernel. This means we can just
add an offset to TPIDRURO to obtain the address from which to load the
canary value.
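
As a rough model of the "base register plus offset" scheme described above, here is a small C sketch. It is not the code GCC emits and not the kernel's data layout: struct task_info, its members, and load_canary are names invented for illustration, and an ordinary pointer stands in for the value held in TPIDRURO.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-task metadata block; the layout is invented for this
   example and does not mirror any real kernel structure.  */
struct task_info {
  int cpu;
  unsigned long stack_canary;
};

/* Model of "thread pointer + offset": read the canary stored OFFSET bytes
   past the address TP, where TP plays the role of the TLS register.  */
unsigned long load_canary (const void *tp, size_t offset)
{
  unsigned long canary;
  assert (tp != NULL);
  memcpy (&canary, (const unsigned char *) tp + offset, sizeof canary);
  return canary;
}
```

Two tasks with different metadata blocks then see different canaries through the same code path, since only the base pointer changes on a context switch; the offset would be offsetof (struct task_info, stack_canary) in this model.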

Changes since v4:
- add a couple of test cases
- incorporate feedback received from Qing and Kyrylo

Changes since v3:
- force a reload of the TLS register before performing the stack
  protector check, so that we never rely on the stack for the address of
  the canary

Changes since v2:
- fix the template for stack_protect_test_tls so it correctly conveys
  the fact that it sets the Z flag

Cc: Keith Packard 
Cc: thomas.preudho...@celest.fr
Cc: adhemerval.zane...@linaro.org
Cc: Qing Zhao 
Cc: Richard Sandiford 
Cc: Kyrylo Tkachov 
Cc: gcc-patches@gcc.gnu.org

Ard Biesheuvel (1):
  [ARM] Add support for TLS register based stack protector canary access

 gcc/config/arm/arm-opts.h|  6 ++
 gcc/config/arm/arm-protos.h  |  2 +
 gcc/config/arm/arm.c | 55 +++
 gcc/config/arm/arm.md| 71 +++-
 gcc/config/arm/arm.opt   | 22 ++
 gcc/doc/invoke.texi  | 11 +++
 gcc/testsuite/gcc.target/arm/stack-protector-7.c | 10 +++
 gcc/testsuite/gcc.target/arm/stack-protector-8.c |  5 ++
 8 files changed, 180 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/stack-protector-7.c
 create mode 100644 gcc/testsuite/gcc.target/arm/stack-protector-8.c

-- 
2.30.2



[PATCH v2 3/3] gimple-fold: Use ranges to simplify strncat and snprintf

2021-11-15 Thread Siddhesh Poyarekar
Use ranges for lengths and object sizes in strncat and snprintf to
determine if they can be transformed into simpler operations.
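
To illustrate why the strncat transformation is safe, here is a small C sketch (written for this note, not part of the patch): whenever the bound is at least the length of the source string, strncat and strcat leave identical bytes in the destination. The helper name and the 64-byte scratch buffers are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Append SRC to a copy of DST once with strncat (using BOUND) and once
   with strcat, and report whether both results are identical.  */
int strncat_matches_strcat (const char *dst, const char *src, size_t bound)
{
  char a[64], b[64];
  assert (strlen (dst) + strlen (src) < sizeof a);
  strcpy (a, dst);
  strcpy (b, dst);
  strncat (a, src, bound);   /* copies at most BOUND bytes, then a NUL */
  strcat (b, src);           /* copies all of SRC, then a NUL */
  return strcmp (a, b) == 0;
}
```

The results match for any bound that is not lower than strlen (src) and differ once the bound is smaller, which is the case the fold has to leave alone.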

gcc/ChangeLog:

* gimple-fold.c (gimple_fold_builtin_strncat): Use ranges to
determine if it is safe to transform to strcat.
(gimple_fold_builtin_snprintf): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-stringops-2.c: Define size_t.
(safe1): Adjust.
(safe4): New test.
* gcc.dg/fold-stringops-3.c: New test.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/gimple-fold.c   | 102 
 gcc/testsuite/gcc.dg/fold-stringops-2.c |  16 +++-
 gcc/testsuite/gcc.dg/fold-stringops-3.c |  18 +
 3 files changed, 82 insertions(+), 54 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/fold-stringops-3.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index f3362287c0d..50b9ba8d558 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2485,72 +2485,73 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
   tree dst = gimple_call_arg (stmt, 0);
   tree src = gimple_call_arg (stmt, 1);
   tree len = gimple_call_arg (stmt, 2);
-
-  const char *p = c_getstr (src);
+  tree src_len = c_strlen (src, 1);
 
   /* If the requested length is zero, or the src parameter string
  length is zero, return the dst parameter.  */
-  if (integer_zerop (len) || (p && *p == '\0'))
+  if (integer_zerop (len) || (src_len && integer_zerop (src_len)))
 {
   replace_call_with_value (gsi, dst);
   return true;
 }
 
-  if (TREE_CODE (len) != INTEGER_CST || !p)
-return false;
-
-  unsigned srclen = strlen (p);
-
-  int cmpsrc = compare_tree_int (len, srclen);
-
   /* Return early if the requested len is less than the string length.
  Warnings will be issued elsewhere later.  */
-  if (cmpsrc < 0)
+  if (!src_len || known_lower (stmt, len, src_len, true))
 return false;
 
   unsigned HOST_WIDE_INT dstsize;
+  bool found_dstsize = compute_builtin_object_size (dst, 1, &dstsize);
 
-  bool nowarn = warning_suppressed_p (stmt, OPT_Wstringop_overflow_);
-
-  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize))
+  /* Warn on constant LEN.  */
+  if (TREE_CODE (len) == INTEGER_CST)
 {
-  int cmpdst = compare_tree_int (len, dstsize);
+  bool nowarn = warning_suppressed_p (stmt, OPT_Wstringop_overflow_);
 
-  if (cmpdst >= 0)
+  if (!nowarn && found_dstsize)
{
- tree fndecl = gimple_call_fndecl (stmt);
+ int cmpdst = compare_tree_int (len, dstsize);
+
+ if (cmpdst >= 0)
+   {
+ tree fndecl = gimple_call_fndecl (stmt);
+
+ /* Strncat copies (at most) LEN bytes and always appends
+the terminating NUL so the specified bound should never
+be equal to (or greater than) the size of the destination.
+If it is, the copy could overflow.  */
+ location_t loc = gimple_location (stmt);
+ nowarn = warning_at (loc, OPT_Wstringop_overflow_,
+  cmpdst == 0
+  ? G_("%qD specified bound %E equals "
+   "destination size")
+  : G_("%qD specified bound %E exceeds "
+   "destination size %wu"),
+  fndecl, len, dstsize);
+ if (nowarn)
+   suppress_warning (stmt, OPT_Wstringop_overflow_);
+   }
+   }
 
- /* Strncat copies (at most) LEN bytes and always appends
-the terminating NUL so the specified bound should never
-be equal to (or greater than) the size of the destination.
-If it is, the copy could overflow.  */
+  if (!nowarn && TREE_CODE (src_len) == INTEGER_CST
+ && tree_int_cst_compare (src_len, len) == 0)
+   {
+ tree fndecl = gimple_call_fndecl (stmt);
  location_t loc = gimple_location (stmt);
- nowarn = warning_at (loc, OPT_Wstringop_overflow_,
-  cmpdst == 0
-  ? G_("%qD specified bound %E equals "
-   "destination size")
-  : G_("%qD specified bound %E exceeds "
-   "destination size %wu"),
-  fndecl, len, dstsize);
- if (nowarn)
+
+ /* To avoid possible overflow the specified bound should also
+not be equal to the length of the source, even when the size
+of the destination is unknown (it's not an uncommon mistake
+to specify as the bound to strncpy the length of the source).  */
+ if (warning_at (loc, OPT_Wstringop_overflow_,
+ "%qD specified bound %E equals source length",
+ fndecl, len))
suppress_warning (stmt, OPT_Wstringop_overflow_);
  

[PATCH v2 2/3] gimple-fold: Use ranges to simplify _chk calls

2021-11-15 Thread Siddhesh Poyarekar
Instead of comparing LEN and SIZE only if they are constants, use their
ranges to decide if LEN will always be lower than or same as SIZE.

This change ends up putting the stringop-overflow warning line number
against the strcpy implementation, so adjust the warning check to be
line number agnostic.
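
The range comparison can be pictured with a simplified C model. This is only a sketch: the patch itself works on GCC trees and wide_int ranges, while the struct urange type and the name known_lower_model below are invented for illustration. LEN is provably lower than SIZE when the upper bound of LEN does not exceed the lower bound of SIZE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical closed interval [min, max] of unsigned values.  */
struct urange { unsigned long min, max; };

/* Return true if every value in LEN is known to be less than or equal to
   (or, if STRICT, strictly less than) every value in SIZE.  */
bool known_lower_model (struct urange len, struct urange size, bool strict)
{
  assert (len.min <= len.max && size.min <= size.max);
  return strict ? len.max < size.min : len.max <= size.min;
}
```

For example, a length in [0, 8] against a size in [8, 16] passes the non-strict check but fails the strict one, and a length in [0, 9] fails both.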

gcc/ChangeLog:

* gimple-fold.c (known_lower): New function.
(gimple_fold_builtin_strncat_chk,
gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk,
gimple_fold_builtin_snprintf_chk,
gimple_fold_builtin_sprintf_chk): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/Wobjsize-1.c: Make warning change line agnostic.
* gcc.dg/builtin-chk-fold.c: New test.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/gimple-fold.c   | 216 +---
 gcc/testsuite/gcc.dg/Wobjsize-1.c   |   5 +-
 gcc/testsuite/gcc.dg/fold-stringops-2.c |  49 ++
 3 files changed, 132 insertions(+), 138 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/fold-stringops-2.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 2e92efa7f61..f3362287c0d 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2031,6 +2031,28 @@ get_maxval_strlen (tree arg, strlen_range_kind rkind, 
tree *nonstr = NULL)
   return lendata.decl ? NULL_TREE : lendata.maxlen;
 }
 
+/* Return true if LEN is known to be less than or equal to (or if STRICT is
+   true, strictly less than) the lower bound of SIZE at compile time and false
+   otherwise.  */
+
+static bool
+known_lower (gimple *stmt, tree len, tree size, bool strict = false)
+{
+  if (len == NULL_TREE)
+return false;
+
+  wide_int size_range[2];
+  wide_int len_range[2];
+  if (get_range (len, stmt, len_range) && get_range (size, stmt, size_range))
+{
+  if (strict)
+   return wi::ltu_p (len_range[1], size_range[0]);
+  else
+   return wi::leu_p (len_range[1], size_range[0]);
+}
+
+  return false;
+}
 
 /* Fold function call to builtin strcpy with arguments DEST and SRC.
If LEN is not NULL, it represents the length of the string to be
@@ -2566,16 +2588,10 @@ gimple_fold_builtin_strncat_chk (gimple_stmt_iterator 
*gsi)
   return true;
 }
 
-  if (! tree_fits_uhwi_p (size))
-return false;
-
   if (! integer_all_onesp (size))
 {
   tree src_len = c_strlen (src, 1);
-  if (src_len
- && tree_fits_uhwi_p (src_len)
- && tree_fits_uhwi_p (len)
- && ! tree_int_cst_lt (len, src_len))
+  if (known_lower (stmt, src_len, len))
{
  /* If LEN >= strlen (SRC), optimize into __strcat_chk.  */
  fn = builtin_decl_explicit (BUILT_IN_STRCAT_CHK);
@@ -3024,39 +3040,24 @@ gimple_fold_builtin_memory_chk (gimple_stmt_iterator 
*gsi,
}
 }
 
-  if (! tree_fits_uhwi_p (size))
-return false;
-
   tree maxlen = get_maxval_strlen (len, SRK_INT_VALUE);
-  if (! integer_all_onesp (size))
+  if (! integer_all_onesp (size)
+  && !known_lower (stmt, len, size) && !known_lower (stmt, maxlen, size))
 {
-  if (! tree_fits_uhwi_p (len))
+  /* MAXLEN and LEN both cannot be proved to be less than SIZE, at
+least try to optimize (void) __mempcpy_chk () into
+(void) __memcpy_chk () */
+  if (fcode == BUILT_IN_MEMPCPY_CHK && ignore)
{
- /* If LEN is not constant, try MAXLEN too.
-For MAXLEN only allow optimizing into non-_ocs function
-if SIZE is >= MAXLEN, never convert to __ocs_fail ().  */
- if (maxlen == NULL_TREE || ! tree_fits_uhwi_p (maxlen))
-   {
- if (fcode == BUILT_IN_MEMPCPY_CHK && ignore)
-   {
- /* (void) __mempcpy_chk () can be optimized into
-(void) __memcpy_chk ().  */
- fn = builtin_decl_explicit (BUILT_IN_MEMCPY_CHK);
- if (!fn)
-   return false;
+ fn = builtin_decl_explicit (BUILT_IN_MEMCPY_CHK);
+ if (!fn)
+   return false;
 
- gimple *repl = gimple_build_call (fn, 4, dest, src, len, 
size);
- replace_call_with_call_and_fold (gsi, repl);
- return true;
-   }
- return false;
-   }
+ gimple *repl = gimple_build_call (fn, 4, dest, src, len, size);
+ replace_call_with_call_and_fold (gsi, repl);
+ return true;
}
-  else
-   maxlen = len;
-
-  if (tree_int_cst_lt (size, maxlen))
-   return false;
+  return false;
 }
 
   fn = NULL_TREE;
@@ -3136,61 +3137,48 @@ gimple_fold_builtin_stxcpy_chk (gimple_stmt_iterator 
*gsi,
   return true;
 }
 
-  if (! tree_fits_uhwi_p (size))
-return false;
-
   tree maxlen = get_maxval_strlen (src, SRK_STRLENMAX);
   if (! integer_all_onesp (size))
 {
   len = c_strlen (src, 1);
-  if (! len || ! tree_fits_uhwi_p (len))
+  if (!known_lower (stmt, len, size, 

[PATCH v2 1/3] gimple-fold: Transform stp*cpy_chk to str*cpy directly

2021-11-15 Thread Siddhesh Poyarekar
Avoid going through another folding cycle and use the ignore flag to
directly transform BUILT_IN_STPCPY_CHK to BUILT_IN_STRCPY when set, and
likewise BUILT_IN_STPNCPY_CHK to BUILT_IN_STRNCPY.

Dump the transformation in dump_file so that we can verify in tests that
the direct transformation actually happened.
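
The reason the ignore flag is enough can be sketched in C (an illustration for this note, not part of the patch): stpcpy and strcpy write the same bytes and differ only in their return value, so a call whose result is unused can use either. The helper name and buffer size are assumptions, and stpcpy is the POSIX.1-2008 function, hence the feature-test macro.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Copy SRC with both functions and report whether the destinations hold
   the same string and stpcpy returned a pointer to the terminating NUL.  */
int same_destination_bytes (const char *src)
{
  char via_stpcpy[64], via_strcpy[64];
  assert (strlen (src) < sizeof via_stpcpy);
  char *end = stpcpy (via_stpcpy, src);   /* points at the copied NUL */
  strcpy (via_strcpy, src);               /* returns its first argument */
  return end == via_stpcpy + strlen (src)
         && strcmp (via_stpcpy, via_strcpy) == 0;
}
```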

gcc/ChangeLog:

* gimple-fold.c (dump_transformation): New function.
(gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk): Use it.  Simplify to
BUILT_IN_STRNCPY if return value is not used.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-stringops.c: New test.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/gimple-fold.c   | 55 -
 gcc/testsuite/gcc.dg/fold-stringops-1.c | 23 +++
 2 files changed, 58 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/fold-stringops-1.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 6e25a7c05db..2e92efa7f61 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -3088,6 +3088,16 @@ gimple_fold_builtin_memory_chk (gimple_stmt_iterator 
*gsi,
   return true;
 }
 
+/* Print a message in the dump file recording transformation of FROM to TO.  */
+
+static void
+dump_transformation (gcall *from, gcall *to)
+{
+  if (dump_enabled_p ())
+dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, from, "simplified %T to %T\n",
+gimple_call_fn (from), gimple_call_fn (to));
+}
+
 /* Fold a call to the __st[rp]cpy_chk builtin.
DEST, SRC, and SIZE are the arguments to the call.
IGNORE is true if return value can be ignored.  FCODE is the BUILT_IN_*
@@ -3100,7 +3110,7 @@ gimple_fold_builtin_stxcpy_chk (gimple_stmt_iterator *gsi,
tree src, tree size,
enum built_in_function fcode)
 {
-  gimple *stmt = gsi_stmt (*gsi);
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
   location_t loc = gimple_location (stmt);
   bool ignore = gimple_call_lhs (stmt) == NULL_TREE;
   tree len, fn;
@@ -3184,12 +3194,13 @@ gimple_fold_builtin_stxcpy_chk (gimple_stmt_iterator 
*gsi,
 }
 
   /* If __builtin_st{r,p}cpy_chk is used, assume st{r,p}cpy is available.  */
-  fn = builtin_decl_explicit (fcode == BUILT_IN_STPCPY_CHK
+  fn = builtin_decl_explicit (fcode == BUILT_IN_STPCPY_CHK && !ignore
  ? BUILT_IN_STPCPY : BUILT_IN_STRCPY);
   if (!fn)
 return false;
 
-  gimple *repl = gimple_build_call (fn, 2, dest, src);
+  gcall *repl = gimple_build_call (fn, 2, dest, src);
+  dump_transformation (stmt, repl);
   replace_call_with_call_and_fold (gsi, repl);
   return true;
 }
@@ -3205,23 +3216,10 @@ gimple_fold_builtin_stxncpy_chk (gimple_stmt_iterator 
*gsi,
 tree len, tree size,
 enum built_in_function fcode)
 {
-  gimple *stmt = gsi_stmt (*gsi);
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
   bool ignore = gimple_call_lhs (stmt) == NULL_TREE;
   tree fn;
 
-  if (fcode == BUILT_IN_STPNCPY_CHK && ignore)
-{
-   /* If return value of __stpncpy_chk is ignored,
-  optimize into __strncpy_chk.  */
-   fn = builtin_decl_explicit (BUILT_IN_STRNCPY_CHK);
-   if (fn)
-{
-  gimple *repl = gimple_build_call (fn, 4, dest, src, len, size);
-  replace_call_with_call_and_fold (gsi, repl);
-  return true;
-}
-}
-
   if (! tree_fits_uhwi_p (size))
 return false;
 
@@ -3234,7 +3232,23 @@ gimple_fold_builtin_stxncpy_chk (gimple_stmt_iterator 
*gsi,
 For MAXLEN only allow optimizing into non-_ocs function
 if SIZE is >= MAXLEN, never convert to __ocs_fail ().  */
  if (maxlen == NULL_TREE || ! tree_fits_uhwi_p (maxlen))
-   return false;
+   {
+ if (fcode == BUILT_IN_STPNCPY_CHK && ignore)
+   {
+ /* If return value of __stpncpy_chk is ignored,
+optimize into __strncpy_chk.  */
+ fn = builtin_decl_explicit (BUILT_IN_STRNCPY_CHK);
+ if (fn)
+   {
+ gimple *repl = gimple_build_call (fn, 4, dest, src, len,
+   size);
+ replace_call_with_call_and_fold (gsi, repl);
+ return true;
+   }
+   }
+
+ return false;
+   }
}
   else
maxlen = len;
@@ -3244,12 +3258,13 @@ gimple_fold_builtin_stxncpy_chk (gimple_stmt_iterator 
*gsi,
 }
 
   /* If __builtin_st{r,p}ncpy_chk is used, assume st{r,p}ncpy is available.  */
-  fn = builtin_decl_explicit (fcode == BUILT_IN_STPNCPY_CHK
+  fn = builtin_decl_explicit (fcode == BUILT_IN_STPNCPY_CHK && !ignore
  ? BUILT_IN_STPNCPY : BUILT_IN_STRNCPY);
   if (!fn)
 return false;
 
-  gimple *repl = gimple_build_call (fn, 3, dest, src, len);
+  gcall *repl = gimple_build_call 

[PATCH v2 0/3] gimple-fold improvements

2021-11-15 Thread Siddhesh Poyarekar
This patchset improves folding in cases where input lengths
and/or destination sizes may not be constant but are range bound.

Tested on x86_64 with a full bootstrap build and verified that there are
no regressions resulting from this patchset.  I double-checked that the
run was current and I wasn't checking logs of a stale build.

I tested builds of bash and wpa_supplicant with this patchset.
wpa_supplicant ends up with 30 fewer __memcpy_chk calls, of which 14 are
completely optimized away and the rest transformed to __memcpy. 

In bash, 3 __memcpy_chk calls are optimized away completely in addition
to a couple of memmove and strcpy chk variants being transformed into
regular calls.

Changes from v1:

- Use dump_* functions instead of directly using dump_file
- Bring back warnings and reduce scope of changes to strncat to only
  using ranges to determine if call can be simplified to strcat.
- Renamed known_safe to known_lower

Siddhesh Poyarekar (3):
  gimple-fold: Transform stp*cpy_chk to str*cpy directly
  gimple-fold: Use ranges to simplify _chk calls
  gimple-fold: Use ranges to simplify strncat and snprintf

 gcc/gimple-fold.c   | 343 ++--
 gcc/testsuite/gcc.dg/Wobjsize-1.c   |   5 +-
 gcc/testsuite/gcc.dg/fold-stringops-1.c |  23 ++
 gcc/testsuite/gcc.dg/fold-stringops-2.c |  63 +
 gcc/testsuite/gcc.dg/fold-stringops-3.c |  18 ++
 5 files changed, 256 insertions(+), 196 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/fold-stringops-1.c
 create mode 100644 gcc/testsuite/gcc.dg/fold-stringops-2.c
 create mode 100644 gcc/testsuite/gcc.dg/fold-stringops-3.c

-- 
2.31.1



[PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-15 Thread Marek Polacek via Gcc-patches
On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
> but changing the name is a trivial operation. 

Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
changes.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."

More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574
https://trojansource.codes/

This is not a compiler bug.  However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional characters the preprocessor may encounter.

The default is =unpaired, which warns about improperly terminated
bidirectional characters; e.g. a LRE without its appertaining PDF.  The
level =any warns about any use of bidirectional characters.
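
To make the notion of "unpaired" concrete, here is a simplified C sketch; it is not the libcpp implementation, and the function and type names are invented. It scans a UTF-8 string for the embedding and override controls (U+202A, U+202B, U+202D, U+202E), which are closed by PDF (U+202C), and for the isolate controls (U+2066 to U+2068), which are closed by PDI (U+2069), and reports whether any of them is still open at the end. As an assumption of this model, a PDI also closes any embeddings opened inside its isolate, and nesting deeper than MAX_DEPTH is treated as suspicious.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 64

enum bidi_kind { BIDI_NONE, BIDI_EMBED, BIDI_ISOLATE, BIDI_PDF, BIDI_PDI };

/* Classify the UTF-8 sequence starting at P.  All controls handled here
   are encoded as E2 80 xx or E2 81 xx.  */
static enum bidi_kind classify (const unsigned char *p)
{
  if (p[0] != 0xE2)
    return BIDI_NONE;
  if (p[1] == 0x80)
    switch (p[2])
      {
      case 0xAA: case 0xAB: case 0xAD: case 0xAE:   /* LRE RLE LRO RLO */
        return BIDI_EMBED;
      case 0xAC:                                    /* PDF */
        return BIDI_PDF;
      default:
        return BIDI_NONE;
      }
  if (p[1] == 0x81)
    switch (p[2])
      {
      case 0xA6: case 0xA7: case 0xA8:              /* LRI RLI FSI */
        return BIDI_ISOLATE;
      case 0xA9:                                    /* PDI */
        return BIDI_PDI;
      default:
        return BIDI_NONE;
      }
  return BIDI_NONE;
}

/* Return true if TEXT ends with a bidi context still open.  */
bool has_unpaired_bidi (const char *text)
{
  enum bidi_kind stack[MAX_DEPTH];
  size_t depth = 0;
  assert (text != NULL);
  for (const unsigned char *p = (const unsigned char *) text; *p; )
    {
      enum bidi_kind k = classify (p);
      if (k == BIDI_NONE)
        {
          p++;
          continue;
        }
      p += 3;
      if (k == BIDI_EMBED || k == BIDI_ISOLATE)
        {
          if (depth == MAX_DEPTH)
            return true;
          stack[depth++] = k;
        }
      else if (k == BIDI_PDF)
        {
          /* PDF only closes an embedding or override on top.  */
          if (depth > 0 && stack[depth - 1] == BIDI_EMBED)
            depth--;
        }
      else
        {
          /* PDI closes the innermost isolate and everything inside it.  */
          size_t i = depth;
          while (i > 0 && stack[i - 1] != BIDI_ISOLATE)
            i--;
          if (i > 0)
            depth = i - 1;
        }
    }
  return depth != 0;
}
```

In this model a comment containing a lone RLO is flagged, while an LRE followed later by a PDF is not.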

This patch handles both UCNs and UTF-8 characters.  UCNs designating
bidi characters in identifiers are accepted since r204886.  Then r217144
enabled -fextended-identifiers by default.  Extended characters in C/C++
identifiers have been accepted since r275979.  However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.

We warn in different contexts: comments (both C and C++-style), string
literals, character constants, and identifiers.  As expected, UCNs are ignored
in comments and raw string literals.  The bidirectional characters can nest
so this patch handles that as well.

I have neither included nor tested this with Fortran (which also has
string literals and line comments).

Dave M. posted patches improving diagnostics involving Unicode characters.
This patch does not make use of this new infrastructure yet.

PR preprocessor/103026

gcc/c-family/ChangeLog:

* c.opt (Wbidi-chars, Wbidi-chars=): New option.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wbidi-chars.

libcpp/ChangeLog:

* include/cpplib.h (enum cpp_bidirectional_level): New.
(struct cpp_options): Add cpp_warn_bidirectional.
(enum cpp_warning_reason): Add CPP_W_BIDIRECTIONAL.
* init.c (cpp_create_reader): Set cpp_warn_bidirectional.
* lex.c (bidi): New namespace.
(get_bidi_utf8): New function.
(get_bidi_ucn): Likewise.
(maybe_warn_bidi_on_close): Likewise.
(maybe_warn_bidi_on_char): Likewise.
(_cpp_skip_block_comment): Implement warning about bidirectional
characters.
(skip_line_comment): Likewise.
(forms_identifier_p): Likewise.
(lex_identifier): Likewise.
(lex_string): Likewise.
(lex_raw_string): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/Wbidi-chars-1.c: New test.
* c-c++-common/Wbidi-chars-2.c: New test.
* c-c++-common/Wbidi-chars-3.c: New test.
* c-c++-common/Wbidi-chars-4.c: New test.
* c-c++-common/Wbidi-chars-5.c: New test.
* c-c++-common/Wbidi-chars-6.c: New test.
* c-c++-common/Wbidi-chars-7.c: New test.
* c-c++-common/Wbidi-chars-8.c: New test.
* c-c++-common/Wbidi-chars-9.c: New test.
* c-c++-common/Wbidi-chars-10.c: New test.
* c-c++-common/Wbidi-chars-11.c: New test.
* c-c++-common/Wbidi-chars-12.c: New test.
* c-c++-common/Wbidi-chars-13.c: New test.
* c-c++-common/Wbidi-chars-14.c: New test.
---
 gcc/c-family/c.opt  |  24 ++
 gcc/doc/invoke.texi |  20 +-
 gcc/testsuite/c-c++-common/Wbidi-chars-1.c  |  12 +
 gcc/testsuite/c-c++-common/Wbidi-chars-10.c |  27 ++
 gcc/testsuite/c-c++-common/Wbidi-chars-11.c |  13 +
 gcc/testsuite/c-c++-common/Wbidi-chars-12.c |  19 +
 gcc/testsuite/c-c++-common/Wbidi-chars-13.c |  17 +
 gcc/testsuite/c-c++-common/Wbidi-chars-14.c |  38 ++
 gcc/testsuite/c-c++-common/Wbidi-chars-2.c  |   9 +
 gcc/testsuite/c-c++-common/Wbidi-chars-3.c  |  11 +
 gcc/testsuite/c-c++-common/Wbidi-chars-4.c  | 166 
 gcc/testsuite/c-c++-common/Wbidi-chars-5.c  | 166 
 gcc/testsuite/c-c++-common/Wbidi-chars-6.c  | 155 
 gcc/testsuite/c-c++-common/Wbidi-chars-7.c  |   9 +
 gcc/testsuite/c-c++-common/Wbidi-chars-8.c  |  13 +
 gcc/testsuite/c-c++-common/Wbidi-chars-9.c  |  29 ++
 libcpp/include/cpplib.h |  18 +-
 libcpp/init.c   |   1 +
 libcpp/lex.c| 407 +++-
 19 files changed, 1139 

RE: [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction

2021-11-15 Thread Di Zhao OS via Gcc-patches
Attached is the updated patch. Fixed some errors in testcases.

> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, November 10, 2021 5:44 PM
> To: Di Zhao OS 
> Cc: gcc-patches@gcc.gnu.org; Andrew MacLeod 
> Subject: Re: [PATCH v2] tree-optimization/101186 - extend FRE with
> "equivalence map" for condition prediction
> 
> On Sun, Oct 24, 2021 at 9:03 PM Di Zhao OS
>  wrote:
> >
> > Hi,
> >
> > Attached is a new version of the patch, mainly for improving performance
> > and simplifying the code.
> 
> The patch doesn't apply anymore, can you update it please?
> 
> I see the new ssa-fre-101.c test already passing without the patch.

It was a mistake in test ssa-fre-101.c::g to define the variables as
unsigned integers, because that makes "a >= 0" always true. After fixing
the test case, fre1 with the patch can remove the unreachable call "foo ()",
while evrp on trunk does not.
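
The pitfall is easy to reproduce in isolation. The following C sketch is a made-up illustration, not the test file itself: with an unsigned operand the comparison against zero can never fail, so a branch guarded by it proves nothing about the optimization under test, whereas the signed version really depends on its input.

```c
#include <assert.h>
#include <limits.h>

/* Always 1: an unsigned value is never below zero, and compilers may
   warn that the comparison is always true.  */
int always_true (unsigned int a) { return a >= 0; }

/* Depends on the argument, so a pass has to reason about its value.  */
int can_be_false (int a) { return a >= 0; }

/* Small self-check documenting the difference.  */
void compare_conditions (void)
{
  assert (always_true (0u) && always_true (UINT_MAX));
  assert (can_be_false (0) && !can_be_false (-1));
}
```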

> Likewise ssa-fre-100.c and ssa-fre-102.c would PASS if you scan
> the pass dump after fre1 which is evrp so it seems that evrp already
> handles the equivalences (likely with the relation oracle) now?
> I'm sure there are second order effects when eliminating conditions
> in FRE but did you re-evaluate what made you improve VN to see
> if the cases are handled as expected now without this change?

In test case ssa-fre-102.c, the unreachable call "foo ()" can be removed by
evrp, but fre with the patch can additionally replace "g = x + b" with
"g = 99". (Again, I'm sorry that the regex to check this was wrong.)

Test cases that simulate the original problem I found have been moved into
gcc.dg/tree-ssa/ssa-pre-34.c. The unreachable calls to "foo ()" are
still removed by pre with the patch. (If compiled with -O3, the
"foo ()" calls in the test file can now be removed by thread2/threadfull2
and dom3 on trunk. This relies on jump threading across the loops, so
even with -O3, similar cases may not get optimized, for example if there
are too many statements to copy.)

Thanks,
Di Zhao

> 
> I will still look at and consider the change btw, but given the EVRP
> improvements I'm also considering to remove the predication
> support from VN altogether.  At least in the non-iterating mode
> it should be trivially easy to use ranger's relation oracle to simplify
> predicates.  For the iterating mode it might not be 100% effective
> since I'm not sure we can make it use the current SSA values and
> how it would behave with those eventually changing to worse.
> 
> Andrew, how would one ask the relation oracle to simplify a
> condition?  Do I have to do any bookkeeping to register
> predicates on edges for it?
> 
> Thanks,
> Richard.
> 
> > First, regarding the comments:
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Friday, October 1, 2021 9:00 PM
> > > To: Di Zhao OS 
> > > Cc: gcc-patches@gcc.gnu.org
> > > Subject: Re: [PATCH v2] tree-optimization/101186 - extend FRE with
> > > "equivalence map" for condition prediction
> > >
> > > On Thu, Sep 16, 2021 at 8:13 PM Di Zhao OS
> > >  wrote:
> > > >
> > > > Sorry about updating on this after so long. It took me much time to
> > > > work out a new plan and pass the tests.
> > > >
> > > > The new idea is to use one variable to represent a set of equal
> > > > variables at some basic-block. This variable is called an
> > > > "equivalence head" or "equiv-head" in the code. (There's no longer an
> > > > "equivalence map".)
> > > >
> > > > - Initially an SSA_NAME's "equivalence head" is its value number.
> > > >   Temporary equivalence heads are recorded as unary NOP_EXPR results
> > > >   in the vn_nary_op_t map. Besides, when inserting into vn_nary_op_t
> > > >   map, make the new result at front of the vn_pval list, so that when
> > > >   searching for a variable's equivalence head, the first result
> > > >   represents the largest equivalence set at current location.
> > > > - In vn_ssa_aux_t, maintain a list of references to valid_info->nary
> > > >   entry. For recorded equivalences, the reference is result->entry;
> > > >   for normal N-ary operations, the reference is operand->entry.
> > > > - When recording equivalences, if one side A is constant or has more
> > > >   refs, make it the new equivalence head of the other side B. Traverse
> > > >   B's ref-list; if a variable C's previous equiv-head is B, update to
> > > >   A. And re-insert B's n-ary operations by replacing B with A.
> > > > - When inserting and looking for the results of n-ary operations,
> > > >   insert and lookup by the operands' equiv-heads.
> > > > ...
> > > >
> > > > Thanks,
> > > > Di Zhao
> > > >
> > > > 
> > > > Extend FRE with temporary equivalences.
> > >
> > > Comments on the patch:
> > >
> > > +  /* nary_ref count.  */
> > > +  unsigned num_nary_ref;
> > > +
> > >
> > > > I think an unsigned short should be enough and that would nicely
> > > pack after value_id together with the bitfield (maybe change 

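The equivalence-head scheme described in the quoted message above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the patch's code: `Name`, `EquivHeads`, `record_equal`, `insert_nary` and `lookup_nary` are hypothetical names, SSA names are modelled as strings, and the per-basic-block scoping and the vn_nary_op_t/vn_pval machinery are left out entirely.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical model of one SSA name (or constant) and its current head.
struct Name {
  bool is_constant;
  unsigned refs;      // stand-in for the length of the name's ref-list
  std::string head;   // current equivalence head; initially the name itself
};

class EquivHeads {
  using Key = std::tuple<char, std::string, std::string>;
  std::map<std::string, Name> names_;
  std::map<Key, std::string> nary_;   // (op, head(x), head(y)) -> result

public:
  void add(const std::string& n, bool is_constant = false, unsigned refs = 0) {
    names_[n] = Name{is_constant, refs, n};
  }

  const std::string& head_of(const std::string& n) const {
    return names_.at(n).head;
  }

  // Record a == b: the side that is constant, or has more refs, becomes
  // the head of the other side's whole equivalence set.
  void record_equal(const std::string& a, const std::string& b) {
    const std::string ha = head_of(a), hb = head_of(b);
    if (ha == hb) return;
    const Name& na = names_.at(ha);
    const Name& nb = names_.at(hb);
    const bool a_wins =
        na.is_constant || (!nb.is_constant && na.refs >= nb.refs);
    const std::string winner = a_wins ? ha : hb;
    const std::string loser = a_wins ? hb : ha;

    // Every name whose head was the loser now points at the winner.
    for (auto& entry : names_)
      if (entry.second.head == loser) entry.second.head = winner;

    // Re-insert n-ary operations that mentioned the loser, using the winner.
    std::map<Key, std::string> rekeyed;
    for (const auto& e : nary_) {
      Key k = e.first;
      if (std::get<1>(k) == loser) std::get<1>(k) = winner;
      if (std::get<2>(k) == loser) std::get<2>(k) = winner;
      rekeyed.emplace(k, e.second);
    }
    nary_.swap(rekeyed);
  }

  // Insert and look up n-ary results by the operands' equivalence heads.
  void insert_nary(char op, const std::string& x, const std::string& y,
                   const std::string& result) {
    nary_[std::make_tuple(op, head_of(x), head_of(y))] = result;
  }

  const std::string* lookup_nary(char op, const std::string& x,
                                 const std::string& y) const {
    auto it = nary_.find(std::make_tuple(op, head_of(x), head_of(y)));
    return it == nary_.end() ? nullptr : &it->second;
  }
};
```

In this model, once `x` and `y` are recorded as equal, a result stored for `x + z` is also found when looking up `y + z`, which is the effect the equiv-head lookup is meant to achieve.
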
Re: [PATCH] c++: __builtin_bit_cast To C array target type [PR103140]

2021-11-15 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 15, 2021 at 12:12:22PM -0500, will wray via Gcc-patches wrote:
> One motivation for allowing builtin bit_cast to builtin array is that
> it enables direct bitwise constexpr comparisons via memcmp:
> 
> template <typename A, typename B>
> constexpr int bit_equal(A const& a, B const& b)
> {
>   static_assert( sizeof a == sizeof b,
>   "bit_equal(a,b) requires same sizeof" );
>   using bytes = unsigned char[sizeof(A)];
>   return __builtin_memcmp(
>  __builtin_bit_cast(bytes,a),
>  __builtin_bit_cast(bytes,b),
>  sizeof(A)) == 0;
> }

IMNSHO people shouldn't use this builtin directly, and we shouldn't
encourage such uses, the standard interface is std::bit_cast.

For the above, I don't see a reason to do it that way, you can
instead portably:
  struct bytes { unsigned char data[sizeof(A)]; };
  bytes ab = std::bit_cast<bytes>(a);
  bytes bb = std::bit_cast<bytes>(b);
  for (size_t i = 0; i < sizeof(A); ++i)
    if (ab.data[i] != bb.data[i])
      return false;
  return true;
- __builtin_memcmp isn't portable either and memcmp isn't constexpr.
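For reference, a self-contained version of the struct-wrapper approach might look like the sketch below. The helper name `as_bits` is an assumption made for this example: it forwards to std::bit_cast when the library provides it (C++20) and otherwise falls back to the compiler builtin, which is a GCC/Clang-specific convenience for building the example in older language modes and not something the message recommends.

```cpp
#include <bit>
#include <cstddef>

// Hypothetical helper: std::bit_cast where available, builtin otherwise.
template <typename To, typename From>
constexpr To as_bits(const From& from)
{
#if defined(__cpp_lib_bit_cast)
  return std::bit_cast<To>(from);
#else
  return __builtin_bit_cast(To, from);  // pre-C++20 fallback, GCC/Clang only
#endif
}

// Bytewise comparison of two objects of equal size, usable in constant
// expressions because it avoids memcmp.
template <typename A, typename B>
constexpr bool bit_equal(const A& a, const B& b)
{
  static_assert(sizeof(A) == sizeof(B),
                "bit_equal(a,b) requires same sizeof");
  // Wrapping the array in a struct gives bit_cast a returnable target type.
  struct bytes { unsigned char data[sizeof(A)]; };
  const bytes ab = as_bits<bytes>(a);
  const bytes bb = as_bits<bytes>(b);
  for (std::size_t i = 0; i < sizeof(A); ++i)
    if (ab.data[i] != bb.data[i])
      return false;
  return true;
}
```

Types with padding bytes are best avoided here, since the value of padding is not something a bytewise comparison can rely on.
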

If P1997 is in, it is easy to support it in std::bit_cast and easy to
explain what __builtin_bit_cast does for array types, but otherwise
it is quite unclear what it exactly does...

Jakub



Re: [PATCH] c++: __builtin_bit_cast To C array target type [PR103140]

2021-11-15 Thread will wray via Gcc-patches
Ping.

One motivation for allowing builtin bit_cast to builtin array is that
it enables direct bitwise constexpr comparisons via memcmp:

template <typename A, typename B>
constexpr int bit_equal(A const& a, B const& b)
{
  static_assert( sizeof a == sizeof b,
  "bit_equal(a,b) requires same sizeof" );
  using bytes = unsigned char[sizeof(A)];
  return __builtin_memcmp(
 __builtin_bit_cast(bytes,a),
 __builtin_bit_cast(bytes,b),
 sizeof(A)) == 0;
}


On Mon, Nov 8, 2021 at 3:03 PM Will Wray  wrote:
>
> This patch allows __builtin_bit_cast to materialize a C array as its To type.
>
> It was developed as part of an implementation of P1997, array copy-semantics,
> but is independent, so makes sense to submit, review and merge ahead of it.
>
> gcc/cp/ChangeLog:
>
> * constexpr.c (check_bit_cast_type): handle ARRAY_TYPE check,
> (cxx_eval_bit_cast): handle ARRAY_TYPE copy.
> * semantics.c (cp_build_bit_cast): warn only on unbounded/VLA.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/cpp2a/bit-cast2.C: update XFAIL tests.
> * g++.dg/cpp2a/bit-cast-to-array1.C: New test.
> ---
>  gcc/cp/constexpr.c  |  8 -
>  gcc/cp/semantics.c  |  7 ++---
>  gcc/testsuite/g++.dg/cpp2a/bit-cast-to-array1.C | 40 
> +
>  gcc/testsuite/g++.dg/cpp2a/bit-cast2.C  |  8 ++---
>  4 files changed, 53 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
> index 453007c686b..be1cdada6f8 100644
> --- a/gcc/cp/constexpr.c
> +++ b/gcc/cp/constexpr.c
> @@ -4124,6 +4124,11 @@ static bool
>  check_bit_cast_type (const constexpr_ctx *ctx, location_t loc, tree type,
>  tree orig_type)
>  {
> +  if (TREE_CODE (type) == ARRAY_TYPE)
> +  return check_bit_cast_type (ctx, loc,
> + TYPE_MAIN_VARIANT (TREE_TYPE (type)),
> + orig_type);
> +
>if (TREE_CODE (type) == UNION_TYPE)
>  {
>if (!ctx->quiet)
> @@ -4280,7 +4285,8 @@ cxx_eval_bit_cast (const constexpr_ctx *ctx, tree t, 
> bool *non_constant_p,
>tree r = NULL_TREE;
>if (can_native_interpret_type_p (TREE_TYPE (t)))
>  r = native_interpret_expr (TREE_TYPE (t), ptr, len);
> -  else if (TREE_CODE (TREE_TYPE (t)) == RECORD_TYPE)
> +  else if (TREE_CODE (TREE_TYPE (t)) == RECORD_TYPE
> +  || TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
>  {
>r = native_interpret_aggregate (TREE_TYPE (t), ptr, 0, len);
>if (r != NULL_TREE)
> diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
> index 2443d032749..b3126b12abc 100644
> --- a/gcc/cp/semantics.c
> +++ b/gcc/cp/semantics.c
> @@ -11562,13 +11562,10 @@ cp_build_bit_cast (location_t loc, tree type, tree 
> arg,
>  {
>if (!complete_type_or_maybe_complain (type, NULL_TREE, complain))
> return error_mark_node;
> -  if (TREE_CODE (type) == ARRAY_TYPE)
> +  if (TREE_CODE (type) == ARRAY_TYPE && !TYPE_DOMAIN (type))
> {
> - /* std::bit_cast for destination ARRAY_TYPE is not possible,
> -as functions may not return an array, so don't bother trying
> -to support this (and then deal with VLAs etc.).  */
>   error_at (loc, "%<__builtin_bit_cast%> destination type %qT "
> -"is an array type", type);
> +"is a VLA variable-length array type", type);
>   return error_mark_node;
> }
>if (!trivially_copyable_p (type))
> diff --git a/gcc/testsuite/g++.dg/cpp2a/bit-cast-to-array1.C 
> b/gcc/testsuite/g++.dg/cpp2a/bit-cast-to-array1.C
> new file mode 100644
> index 000..e6e50c06389
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/bit-cast-to-array1.C
> @@ -0,0 +1,40 @@
> +// { dg-do compile }
> +
> +class S { int s; };
> +S s();
> +class U { int a, b; };
> +U u();
> +
> +void
> +foo (int *q)
> +{
> +  __builtin_bit_cast (int [1], 0);
> +  __builtin_bit_cast (S [1], 0);
> +  __builtin_bit_cast (U [1], u);
> +}
> +
> +template 
> +void
> +bar (int *q)
> +{
> +  int intN[N] = {};
> +  int int2N[2*N] = {};
> +  __builtin_bit_cast (int [N], intN);
> +  __builtin_bit_cast (S [N], intN);
> +  __builtin_bit_cast (U [N], int2N);
> +}
> +
> +template 
> +void
> +baz (T1 ia, T2 sa, T3 ua)
> +{
> +  __builtin_bit_cast (T1, *ia);
> +  __builtin_bit_cast (T2, *sa);
> +  __builtin_bit_cast (T3, *ua);
> +}
> +
> +void
> +qux (S* sp, int *ip, U* up)
> +{
> +  baz  (ip, sp, up);
> +}
> diff --git a/gcc/testsuite/g++.dg/cpp2a/bit-cast2.C 
> b/gcc/testsuite/g++.dg/cpp2a/bit-cast2.C
> index 6bb1760e621..7f1836ee4e9 100644
> --- a/gcc/testsuite/g++.dg/cpp2a/bit-cast2.C
> +++ b/gcc/testsuite/g++.dg/cpp2a/bit-cast2.C
> @@ -14,7 +14,7 @@ foo (int *q)
>__builtin_bit_cast (int, s); // { dg-error "'__builtin_bit_cast' 
> source type 'S' is not trivially copyable" }
>__builtin_bit_cast (S, 0);  

RE: [Patch 1/8, Arm, AArch64, GCC] Refactor mbranch-protection option parsing and make it common to AArch32 and AArch64 backends. [Was RE: [Patch 2/7, Arm, GCC] Add option -mbranch-protection.]

2021-11-15 Thread Tejas Belagod via Gcc-patches
Ping for this series.

Thanks,
Tejas.

> -Original Message-
> From: Gcc-patches  bounces+belagod=gcc.gnu@gcc.gnu.org> On Behalf Of Tejas Belagod via
> Gcc-patches
> Sent: Thursday, October 28, 2021 12:41 PM
> To: Richard Earnshaw ; gcc-
> patc...@gcc.gnu.org
> Subject: [Patch 1/8, Arm, AArch64, GCC] Refactor mbranch-protection option
> parsing and make it common to AArch32 and AArch64 backends. [Was RE:
> [Patch 2/7, Arm, GCC] Add option -mbranch-protection.]
> 
> 
> 
> > -Original Message-
> > From: Richard Earnshaw 
> > Sent: Monday, October 11, 2021 1:58 PM
> > To: Tejas Belagod ; gcc-patches@gcc.gnu.org
> > Subject: Re: [Patch 2/7, Arm, GCC] Add option -mbranch-protection.
> >
> > On 08/10/2021 13:17, Tejas Belagod via Gcc-patches wrote:
> > > Hi,
> > >
> > > Add -mbranch-protection option and its associated parsing routines.
> > > This option enables the code-generation of pointer signing and
> > > authentication instructions in function prologues and epilogues.
> > >
> > > Tested on arm-none-eabi. OK for trunk?
> > >
> > > 2021-10-04  Tejas Belagod  
> > >
> > > gcc/ChangeLog:
> > >
> > >   * common/config/arm/arm-common.c
> > >(arm_print_hit_for_pacbti_option): New.
> > >(arm_progress_next_token): New.
> > >(arm_parse_pac_ret_clause): New routine for parsing the
> > >   pac-ret clause for -mbranch-protection.
> > >   (arm_parse_pacbti_option): New routine to parse all the options
> > >   to -mbranch-protection.
> > >   * config/arm/arm-protos.h (arm_parse_pacbti_option): Export.
> > > >   * config/arm/arm.c (arm_configure_build_target): Handle option
> > > >   to -mbranch-protection.
> > > >   * config/arm/arm.opt (mbranch-protection): New.
> > >   (arm_enable_pacbti): New.
> > >
> >
> > You're missing documentation for invoke.texi.
> >
> > Also, how does this differ from the existing option in aarch64?  Can
> > the code from that be adapted to be made common to both targets rather
> > than doing a new implementation?
> >
> > Finally, there are far too many manifest constants in this patch, they
> > need replacing with enums or #defines as appropriate if we cannot
> > share the aarch64 code.
> >
> 
> Thanks for the reviews.
> 
> This change refactors all the mbranch-protection option parsing code and
> types to make it common to both AArch32 and AArch64 backends.  This
> change also pulls in some supporting types from AArch64 to make it common
> (aarch_parse_opt_result).  The significant changes in this patch are the
> movement of all branch protection parsing routines from aarch64.c to aarch-
> common.c and supporting data types and static data structures.  This patch
> also pre-declares variables and types required in the aarch32 back end for moved
> variables for function sign scope and key to prepare for the impending series
> of patches that support parsing the feature mbranch-protection in the
> aarch32 back end.
> 
> 2021-10-25  Tejas Belagod  
> 
> gcc/ChangeLog:
> 
>   * common/config/aarch64/aarch64-common.c: Include aarch-
> common.h.
>   (all_architectures): Fix comment.
>   (aarch64_parse_extension): Rename return type, enum value
> names.
>   * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
> Rename
>   factored out aarch_ra_sign_scope and aarch_ra_sign_key variables.
>   Also rename corresponding enum values.
>   * config/aarch64/aarch64-opts.h (aarch64_function_type): Factor out
>   aarch64_function_type and move it to common code as
> aarch_function_type
>   in aarch-common.h.
>   * config/aarch64/aarch64-protos.h: Include common types header,
> move out
>   types aarch64_parse_opt_result and aarch64_key_type to aarch-
> common.h
>   * config/aarch64/aarch64.c: Move mbranch-protection parsing types
> and
>   functions out into aarch-common.h and aarch-common.c.  Fix up all
> the name
>   changes resulting from the move.
>   * config/aarch64/aarch64.md: Fix up aarch64_ra_sign_key type name
> change
>   and enum value.
>   * config/aarch64/aarch64.opt: Include aarch-common.h to import
> type move.
>   Fix up name changes from factoring out common code and data.
>   * config/arm/aarch-common-protos.h: Export factored out routines
> to both
>   backends.
>   * config/arm/aarch-common.c: Include newly factored out types.
> Move all
>   mbranch-protection code and data structures from aarch64.c.
>   * config/arm/aarch-common.h: New header that declares types
> shared between
>   aarch32 and aarch64 backends.
>   * config/arm/arm-protos.h: Declare types and variables that are
> made common
>   to aarch64 and aarch32 backends - aarch_ra_sign_key,
> aarch_ra_sign_scope and
>   aarch_enable_bti.
> 
> 
> Tested the following configurations. OK for trunk?
> 
> -mthumb/-march=armv8.1-m.main+pacbti/-mfloat-abi=soft
> -marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp
> mcmodel=small and tiny
> aarch64-none-linux-gnu native test and bootstrap
> 
> 

[PATCH 2/2] libstdc++: Use diagnose_as attribute to improve simd diagnostics

2021-11-15 Thread Matthias Kretz


Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h: Diagnose
'std::experimental::parallelism_v2::simd_abi' as 'simd_abi'.
On x86, diagnose _VecBuiltin<16>, _VecBuiltin<32>, and
_VecBltnBtmsk<64> as 'simd_abi::[SSE]', 'simd_abi::[AVX]', and
'simd_abi::AVX512' respectively.
(simd_abi::_Scalar): Diagnose as 'simd_abi::scalar'.
(simd_abi::_Fixed): Diagnose as 'simd_abi::fixed_size'.
(__odr_helper): Shorten implementation details (effectively
hiding them).
* include/experimental/bits/simd_detail.h: Diagnose
'std::experimental::parallelism_v2' as 'stdₓ'.
---
 libstdc++-v3/include/experimental/bits/simd.h | 37 +--
 .../include/experimental/bits/simd_detail.h   |  2 +-
 2 files changed, 11 insertions(+), 28 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 4fbad7d67b5..f581b46fbd8 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -83,13 +83,13 @@ using __m512d [[__gnu__::__vector_size__(64)]] = double;
 using __m512i [[__gnu__::__vector_size__(64)]] = long long;
 #endif
 
-namespace simd_abi {
+namespace simd_abi [[__gnu__::__diagnose_as__("simd_abi")]] {
 // simd_abi forward declarations {{{
 // implementation details:
-struct _Scalar;
+  struct [[__gnu__::__diagnose_as__("scalar")]] _Scalar;
 
 template 
-  struct _Fixed;
+  struct [[__gnu__::__diagnose_as__("fixed_size")]] _Fixed;
 
 // There are two major ABIs that appear on different architectures.
 // Both have non-boolean values packed into an N Byte register
@@ -108,28 +108,11 @@ template 
 template 
   struct _VecBltnBtmsk;
 
-template 
-  using _VecN = _VecBuiltin;
-
-template 
-  using _Sse = _VecBuiltin<_UsedBytes>;
-
-template 
-  using _Avx = _VecBuiltin<_UsedBytes>;
-
-template 
-  using _Avx512 = _VecBltnBtmsk<_UsedBytes>;
-
-template 
-  using _Neon = _VecBuiltin<_UsedBytes>;
-
-// implementation-defined:
-using __sse = _Sse<>;
-using __avx = _Avx<>;
-using __avx512 = _Avx512<>;
-using __neon = _Neon<>;
-using __neon128 = _Neon<16>;
-using __neon64 = _Neon<8>;
+#if defined __i386__ || defined __x86_64__
+using __sse [[__gnu__::__diagnose_as__("[SSE]")]] = _VecBuiltin<16>;
+using __avx [[__gnu__::__diagnose_as__("[AVX]")]] = _VecBuiltin<32>;
+using __avx512 [[__gnu__::__diagnose_as__("[AVX512]")]] = _VecBltnBtmsk<64>;
+#endif
 
 // standard:
 template 
@@ -367,7 +350,7 @@ namespace __detail
* users link TUs compiled with different flags. This is especially important
* for using simd in libraries.
*/
-  using __odr_helper
+  using __odr_helper [[__gnu__::__diagnose_as__("[ODR helper]")]]
 = conditional_t<__machine_flags() == 0, _OdrEnforcer,
 		_MachineFlagsTemplate<__machine_flags(), __floating_point_flags()>>;
 
@@ -692,7 +675,7 @@ template 
   __is_avx512_abi()
   {
 constexpr auto _Bytes = __abi_bytes_v<_Abi>;
-return _Bytes <= 64 && is_same_v, _Abi>;
+return _Bytes <= 64 && is_same_v, _Abi>;
   }
 
 // }}}
diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h b/libstdc++-v3/include/experimental/bits/simd_detail.h
index 198c925c133..437f1ddb278 100644
--- a/libstdc++-v3/include/experimental/bits/simd_detail.h
+++ b/libstdc++-v3/include/experimental/bits/simd_detail.h
@@ -37,7 +37,7 @@
   {\
 _GLIBCXX_BEGIN_NAMESPACE_VERSION   \
   namespace experimental { \
-  inline namespace parallelism_v2 {
+	inline namespace parallelism_v2 [[__gnu__::__diagnose_as__("std\u2093")]] {
 #define _GLIBCXX_SIMD_END_NAMESPACE\
   }\
   }\


[PATCH 1/2] libstdc++: Use diagnose_as attribute to improve string diagnostics

2021-11-15 Thread Matthias Kretz


This hides the basic_string template in all diagnostics, improving the
signal-to-noise ratio significantly. It also hides the std::__cxx11
namespace from users by presenting it as std.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR c++/89370
* include/bits/c++config: Diagnose std::__cxx11:: as std:: using
the diagnose_as attribute.
* include/bits/stringfwd.h: Add diagnose_as attribute to string,
wstring, u8string, u16string, and u32string.
* include/debug/string: Ditto.
* include/experimental/string: Ditto.
* include/std/string: Ditto.
---
 libstdc++-v3/include/bits/c++config  |  3 ++-
 libstdc++-v3/include/bits/stringfwd.h| 10 +-
 libstdc++-v3/include/debug/string| 10 +-
 libstdc++-v3/include/experimental/string | 10 +-
 libstdc++-v3/include/std/string  | 10 +-
 5 files changed, 22 insertions(+), 21 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index a6495809671..02d11afc1aa 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -318,7 +318,8 @@ namespace std
 #if _GLIBCXX_USE_CXX11_ABI
 namespace std
 {
-  inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { }
+  inline namespace __cxx11
+__attribute__((__abi_tag__ ("cxx11"), __diagnose_as__("std"))) { }
 }
 namespace __gnu_cxx
 {
diff --git a/libstdc++-v3/include/bits/stringfwd.h b/libstdc++-v3/include/bits/stringfwd.h
index bcfd350e505..3f653feae14 100644
--- a/libstdc++-v3/include/bits/stringfwd.h
+++ b/libstdc++-v3/include/bits/stringfwd.h
@@ -74,22 +74,22 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 _GLIBCXX_END_NAMESPACE_CXX11
 
   /// A string of @c char
-  typedef basic_string<char>    string;
+  typedef basic_string<char> string __attribute__((__diagnose_as__));
 
   /// A string of @c wchar_t
-  typedef basic_string<wchar_t> wstring;
+  typedef basic_string<wchar_t> wstring __attribute__((__diagnose_as__));
 
 #ifdef _GLIBCXX_USE_CHAR8_T
   /// A string of @c char8_t
-  typedef basic_string<char8_t> u8string;
+  typedef basic_string<char8_t> u8string __attribute__((__diagnose_as__));
 #endif
 
 #if __cplusplus >= 201103L
   /// A string of @c char16_t
-  typedef basic_string<char16_t> u16string;
+  typedef basic_string<char16_t> u16string __attribute__((__diagnose_as__));
 
   /// A string of @c char32_t
-  typedef basic_string<char32_t> u32string;
+  typedef basic_string<char32_t> u32string __attribute__((__diagnose_as__));
 #endif
 
   /** @}  */
diff --git a/libstdc++-v3/include/debug/string b/libstdc++-v3/include/debug/string
index a8389528001..d6299e5552f 100644
--- a/libstdc++-v3/include/debug/string
+++ b/libstdc++-v3/include/debug/string
@@ -1296,21 +1296,21 @@ namespace __gnu_debug
   return __res;
 }
 
-  typedef basic_string<char>    string;
+  typedef basic_string<char> string __attribute__((__diagnose_as__));
 
-  typedef basic_string<wchar_t> wstring;
+  typedef basic_string<wchar_t> wstring __attribute__((__diagnose_as__));
 
 #ifdef _GLIBCXX_USE_CHAR8_T
   /// A string of @c char8_t
-  typedef basic_string<char8_t> u8string;
+  typedef basic_string<char8_t> u8string __attribute__((__diagnose_as__));
 #endif
 
 #if __cplusplus >= 201103L
   /// A string of @c char16_t
-  typedef basic_string<char16_t> u16string;
+  typedef basic_string<char16_t> u16string __attribute__((__diagnose_as__));
 
   /// A string of @c char32_t
-  typedef basic_string<char32_t> u32string;
+  typedef basic_string<char32_t> u32string __attribute__((__diagnose_as__));
 #endif
 
   template
diff --git a/libstdc++-v3/include/experimental/string b/libstdc++-v3/include/experimental/string
index 4d92a7e39cc..91a9dd8b164 100644
--- a/libstdc++-v3/include/experimental/string
+++ b/libstdc++-v3/include/experimental/string
@@ -73,13 +73,13 @@ inline namespace fundamentals_v2
 
 // basic_string typedef names using polymorphic allocator in namespace
 // std::experimental::pmr
-typedef basic_string<char> string;
+typedef basic_string<char> string __attribute__((__diagnose_as__));
 #ifdef _GLIBCXX_USE_CHAR8_T
-typedef basic_string<char8_t> u8string;
+typedef basic_string<char8_t> u8string __attribute__((__diagnose_as__));
 #endif
-typedef basic_string<char16_t> u16string;
-typedef basic_string<char32_t> u32string;
-typedef basic_string<wchar_t> wstring;
+typedef basic_string<char16_t> u16string __attribute__((__diagnose_as__));
+typedef basic_string<char32_t> u32string __attribute__((__diagnose_as__));
+typedef basic_string<wchar_t> wstring __attribute__((__diagnose_as__));
 
   } // namespace pmr
 #endif
diff --git a/libstdc++-v3/include/std/string b/libstdc++-v3/include/std/string
index af840e887d5..03a3c68050f 100644
--- a/libstdc++-v3/include/std/string
+++ b/libstdc++-v3/include/std/string
@@ -62,13 +62,13 @@ 

[PATCH 0/2] Make use of the diagnose_as attribute to improve libstdc++ diagnostics

2021-11-15 Thread Matthias Kretz
After my two C++ patches for template diagnostics and the diagnose_as 
attribute are in, I'd like to make use of the attribute for std::*string and 
std::pmr::*string as well as for std::experimental::simd diagnostics.

Matthias Kretz (2):
  libstdc++: Use diagnose_as attribute to improve string diagnostics
  libstdc++: Use diagnose_as attribute to improve simd diagnostics

 libstdc++-v3/include/bits/c++config   |  3 +-
 libstdc++-v3/include/bits/stringfwd.h | 10 ++---
 libstdc++-v3/include/debug/string | 10 ++---
 libstdc++-v3/include/experimental/bits/simd.h | 37 +--
 .../include/experimental/bits/simd_detail.h   |  2 +-
 libstdc++-v3/include/experimental/string  | 10 ++---
 libstdc++-v3/include/std/string   | 10 ++---
 7 files changed, 33 insertions(+), 49 deletions(-)

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──



PING 2 [PATCH] restore ancient -Waddress for weak symbols [PR33925]

2021-11-15 Thread Martin Sebor via Gcc-patches

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582415.html

On 11/7/21 4:31 PM, Martin Sebor wrote:

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582415.html

On 10/23/21 5:06 PM, Martin Sebor wrote:

On 10/4/21 3:37 PM, Jason Merrill wrote:

On 10/4/21 14:42, Martin Sebor wrote:

While resolving the recent -Waddress enhancement request (PR 102103)
I came across a 2007 problem report about GCC 4 having
stopped warning for using the address of inline functions in
equality comparisons with null.  With inline functions being
commonplace in C++ this seems like an important use case for
the warning.

The change that resulted in suppressing the warning in these
cases was introduced inadvertently in a fix for PR 22252.

To restore the warning, the attached patch enhances
the decl_with_nonnull_addr_p() function to return true also for
weak symbols for which a definition has been provided.
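The kind of code this is about can be sketched as follows. The function names are made up for the illustration, and whether a particular GCC release actually warns here depends on the state of the patch, so treat the comments as a description of intent only.

```cpp
// An inline function: GCC typically emits it as a weak (COMDAT) symbol,
// which is why the weak-symbol handling matters for this case.
inline int answer() { return 42; }

// Because answer() is defined in this translation unit, its address can
// never be null, so the comparison below is always false.  This is the
// pattern the restored -Waddress warning is meant to point out.
bool answer_is_missing()
{
  return &answer == nullptr;
}
```

A weak declaration without a definition is different: its address may legitimately be null at run time, so a comparison against null is meaningful there and should not be diagnosed.
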


I think you probably want to merge this function with 
fold-const.c:maybe_nonzero_address, which already handles more cases.


maybe_nonzero_address() doesn't behave quite like
decl_with_nonnull_addr_p() expects and I'm reluctant to muck
around with the former too much since it's used for codegen,
while the latter just for warnings.  (There is even a case
where the functions don't behave the same, and would result
in different warnings between C and C++ without some extra
help.)

So in the attached revision I just have maybe_nonzero_address()
call decl_with_nonnull_addr_p() and then refine the failing
(or uncertain) cases separately, with some overlap between
them.

Since I worked on this someone complained that some instances
of the warning newly enhanced under PR102103 aren't suppressed
in code resulting from macro expansion.  Since it's trivial,
I include the fix for that report in this patch as well.

Tested on x86_64-linux.

Martin






PING [PATCH] fix up compute_objsize (including PR 103143)

2021-11-15 Thread Martin Sebor via Gcc-patches

Ping for the following cleanup patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583735.html

On 11/8/21 7:34 PM, Martin Sebor wrote:

The pointer-query code that implements compute_objsize() that's
in turn used by most middle end access warnings now has a few
warts in it and (at least) one bug.  With the exception of
the bug the warts aren't behind any user-visible bugs that
I know of but they do cause problems in new code I've been
implementing on top of it.  Besides fixing the one bug (just
a typo) the attached patch cleans up these latent issues:

1) It moves the bndrng member from the access_ref class to
    access_data.  As a FIXME in the code notes, the member never
    did belong in the former and only takes up space in the cache.

2) The compute_objsize_r() function is big, unwieldy, and tedious
    to step through because of all the if statements that are better
    coded as one switch statement.  This change factors out more
    of its code into smaller handler functions as has been suggested
    and done a few times before.

3) (2) exposed a few places where I fail to pass the current
    GIMPLE statement down to ranger.  This leads to worse quality
    range info, including possible false positives and negatives.
    I just spotted these problems in code review but I haven't
    taken the time to come up with test cases.  This change fixes
    these oversights as well.

4) The handling of PHI statements is also in one big, hard-to-
    follow function.  This change moves the handling of each PHI
    argument into its own handler which merges it into the previous
    argument.  This makes the code easier to work with and opens it
    to reuse also for MIN_EXPR and MAX_EXPR.  (This is primarily
    used to print informational notes after warnings.)

5) Finally, the patch factors code to dump each access_ref
    cached by the pointer_query cache out of pointer_query::dump
    and into access_ref::dump.  This helps with debugging.
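The idea in (4), folding each argument into a running result with one small handler, can be shown with a deliberately simplified model. Everything named here (`SizeRange`, `merge_ref`, `merge_all`) is hypothetical and only mimics the shape of the refactoring; the real access_ref merging tracks considerably more than a size range.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the size information tracked per pointer.
struct SizeRange {
  long min;
  long max;
};

// Per-argument handler: merge one more argument into the running result
// by keeping the widest range seen so far.
SizeRange merge_ref(SizeRange acc, SizeRange arg)
{
  return { std::min(acc.min, arg.min), std::max(acc.max, arg.max) };
}

// Driver shared by every multi-operand construct in this sketch (a PHI
// with N arguments, or the two operands of a MIN_EXPR/MAX_EXPR).
// Precondition: args is not empty.
SizeRange merge_all(const std::vector<SizeRange>& args)
{
  SizeRange acc = args.front();
  for (std::size_t i = 1; i < args.size(); ++i)
    acc = merge_ref(acc, args[i]);
  return acc;
}
```

Keeping the per-argument step separate from the loop is what makes it reusable: a two-operand expression simply calls the same handler once.
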

These changes should have no user-visible effect and other than
a regression test for the typo (PR 103143) come with no tests.
They've been tested on x86_64-linux.

Martin




PING 2 [PATCH 0/2] provide simple detection of indeterminate pointers

2021-11-15 Thread Martin Sebor via Gcc-patches

Pinging the two patches below:

-Wuse-after-free:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583044.html

and -Wdangling-pointer:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583045.html

On 11/8/21 3:41 PM, Martin Sebor wrote:

Ping for the two patches below:

-Wuse-after-free:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583044.html

and -Wdangling-pointer:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583045.html

On 11/1/21 4:15 PM, Martin Sebor wrote:

This two-patch series adds support for the detection of uses
of pointers invalidated as a result of the lifetime of
the objects they point to having ended: either explicitly,
after a call to a dynamic deallocation function, or implicitly,
by virtue of an object with automatic storage duration having
gone out of scope.

To minimize false positives the initial logic is very simple
(even simplistic): the code only checks uses in basic blocks
dominated by the invalidating calls (either calls to
deallocation functions or GCC's clobbers).
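A minimal pair of functions illustrates the distinction. The names are invented for this example, and the comments describe what the dominance rule stated above implies, not the verified output of any GCC release.

```cpp
#include <cstdlib>

// The use is in a block dominated by the call to free(): every path that
// reaches `return *p` has passed through free(p), so a dominance-based
// check can flag it.  This function is invalid and must not be called.
int use_dominated_by_free(int* p)
{
  std::free(p);
  return *p;
}

// The use is not dominated by the call to free(): when `release` is false
// the pointer is still valid at `return *p`.  Under the simple rule this
// use is not checked, even though it is invalid when `release` is true.
int use_not_dominated(int* p, bool release)
{
  if (release)
    std::free(p);
  return *p;
}
```

The second function is the kind of case a more thorough, predicate-aware checker would be needed for.
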

A more thorough checker is certainly possible and I'd say most
desirable but will require a more sophisticated implementation
and a better predicate analyzer than is available, and so will
need to wait for GCC 13.

Martin






Re: [ping] Use 'location_hash' for 'gcc/diagnostic-spec.h:nowarn_map'

2021-11-15 Thread Martin Sebor via Gcc-patches

On 11/15/21 8:01 AM, Thomas Schwinge wrote:

Hi!

Ping.


This change looks good to me.

Martin



Grüße
  Thomas


On 2021-11-09T15:18:44+0100, I wrote:

Hi!

On 2021-09-03T21:16:46+0200, I wrote:

On 2021-09-01T18:14:46-0600, Martin Sebor  wrote:

On 9/1/21 1:35 PM, Thomas Schwinge wrote:

On 2021-06-23T13:47:08-0600, Martin Sebor via Gcc-patches 
 wrote:

--- /dev/null
+++ b/gcc/diagnostic-spec.h



+typedef location_t key_type_t;
+typedef int_hash  xint_hash_t;



By the way, it seems we should probably also use a manifest constant
for Empty (probably UNKNOWN_LOCATION since we're reserving it).


Yes, that will be part of another patch here -- waiting for approval of
"Generalize 'gcc/input.h:struct location_hash'" posted elsewhere.


... which I have now pushed, so I may now propose the attached patch to
"Use 'location_hash' for 'gcc/diagnostic-spec.h:nowarn_map'".  OK to
push?


Grüße
  Thomas








Ping: [PATCH 5/5] Add Power10 XXSPLTIDP for SFmode/DFmode constants.

2021-11-15 Thread Michael Meissner via Gcc-patches
Ping patch.

| Date: Fri, 5 Nov 2021 00:11:20 -0400
| Subject: [PATCH 5/5] Add Power10 XXSPLTIDP for SFmode/DFmode constants.
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 4/5] Add Power10 XXSPLTIDP for vector constants

2021-11-15 Thread Michael Meissner via Gcc-patches
Ping patch.

| Date: Fri, 5 Nov 2021 00:10:18 -0400
| Subject: [PATCH 4/5] Add Power10 XXSPLTIDP for vector constants
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 3/5] Add Power10 XXSPLTIW

2021-11-15 Thread Michael Meissner via Gcc-patches
Ping patch.

| Date: Fri, 5 Nov 2021 00:09:07 -0400
| Subject: [PATCH 3/5] Add Power10 XXSPLTIW
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 2/5] Add Power10 XXSPLTI* and LXVKQ instructions (LXVKQ)

2021-11-15 Thread Michael Meissner via Gcc-patches
Ping patch:

| Date: Fri, 5 Nov 2021 00:07:05 -0400
| Subject: [PATCH 2/5] Add Power10 XXSPLTI* and LXVKQ instructions (LXVKQ)
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 1/5] Add XXSPLTI* and LXVKQ instructions (new data structure and function)

2021-11-15 Thread Michael Meissner via Gcc-patches
Ping patch.

| Date: Fri, 5 Nov 2021 00:04:40 -0400
| Subject: [PATCH 1/5] Add XXSPLTI* and LXVKQ instructions (new data structure 
and function)
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[COMMITTED] Drop tree overflow in irange setter.

2021-11-15 Thread Aldy Hernandez via Gcc-patches
Drop meaningless overflow that may creep into the IL.

Tested on x86-64 Linux.

gcc/ChangeLog:

PR tree-optimization/103207
* value-range.cc (irange::set): Drop overflow.

gcc/testsuite/ChangeLog:

* gcc.dg/pr103207.c: New test.
---
 gcc/testsuite/gcc.dg/pr103207.c | 15 +++
 gcc/value-range.cc  |  8 
 2 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr103207.c

diff --git a/gcc/testsuite/gcc.dg/pr103207.c b/gcc/testsuite/gcc.dg/pr103207.c
new file mode 100644
index 000..69c0f555f86
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103207.c
@@ -0,0 +1,15 @@
+// { dg-do compile }
+// { dg-options "-O2 --param case-values-threshold=1 -w" }
+
+int f (int i)
+{
+  switch (i) {
+  case 2147483647:
+return 1;
+  case 9223372036854775807L:
+return 2;
+  case (2147483647*4)%4:
+return 4;
+  }
+  return 0;
+}
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index caef2498959..82509fa55a7 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -270,6 +270,14 @@ irange::irange_set_anti_range (tree min, tree max)
 void
 irange::set (tree min, tree max, value_range_kind kind)
 {
+  if (kind != VR_UNDEFINED)
+{
+  if (TREE_OVERFLOW_P (min))
+   min = drop_tree_overflow (min);
+  if (TREE_OVERFLOW_P (max))
+   max = drop_tree_overflow (max);
+}
+
   if (!legacy_mode_p ())
 {
   if (kind == VR_RANGE)
-- 
2.31.1



Re: Enable ipa-sra for functions with fnspec attribute

2021-11-15 Thread Martin Jambor
Hi,

On Sat, Nov 13 2021, Jan Hubicka via Gcc-patches wrote:
> Hi,
> this patch enables some ipa-sra on Fortran by allowing signature changes on
> functions with the "fn spec" attribute when ipa-modref is enabled.  This is
> possible since ipa-modref knows how to preserve things we trace in fnspec,
> and the fnspecs generated by the Fortran frontend are quite simple and can
> be analysed automatically now.  To be sure, I will also add code that merges
> fnspec to parameters.
>
> This unfortunately hits a bug in ipa-param-manipulation when we remove a
> parameter that specifies the size of a variable-length parameter.  For this
> reason I added a hack that prevents signature changes on such functions and
> will handle it incrementally.
>
> I tried creating a C testcase but it is blocked by another problem: we punt
> ipa-sra on the access attribute.  This is an optimization regression we ought
> to fix, so I filed
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103223.
>
> As a followup I will add code classifying the type attributes (we have just
> a few) and get stats on the access attribute.
>
> Martin, can you please check that the code detecting signature changes is
> correct

I think both ways of detecting it in the patch are correct, in the sense
that if the clone in question does not modify parameters but is a clone
of another clone which does, the methods would still consider it a
changing clone (to do otherwise they would need to check
prev_clone_index and not base_index).  But I don't think the difference
matters here.  It might if we start updating the strings of the
attributes where position is important.

> ...and can't be done more easily?

in the case of ipa_param_body_adjustments you could introduce another
flag of the class and set it whenever any of the elements in local
vector kept in common_initialization() turned out not to be true.
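
A minimal sketch of that suggestion, in plain C rather than the real class: `struct body_adjustments`, its `params_changed` member and `common_init` are hypothetical names, and the `kept` array only stands in for the local vector mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the adjustments class.  */
struct body_adjustments
{
  /* Set if at least one original parameter was not kept as is.  */
  bool params_changed;
};

/* KEPT[i] is true if original parameter I survives unchanged.
   Record in ADJ whether any of the N parameters did not.  */
void
common_init (struct body_adjustments *adj, const bool *kept, size_t n)
{
  adj->params_changed = false;
  for (size_t i = 0; i < n; i++)
    if (!kept[i])
      adj->params_changed = true;
}
```

Later code could then test the flag instead of re-deriving the answer from the adjustment vector.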

Martin


>
> Bootstrapped/regtested x86_64-linux, committed.
> Honza
>
> gcc/ChangeLog:
>
>   * ipa-fnsummary.c (compute_fn_summary): Do not give up on signature
>   changes on "fn spec" attribute; give up on variadic types.
>   * ipa-param-manipulation.c: Include attribs.h.
>   (build_adjusted_function_type): New parameter ARG_MODIFIED; if it is
>   true remove "fn spec" attribute.
>   (ipa_param_adjustments::build_new_function_type): Update.
>   (ipa_param_body_adjustments::modify_formal_parameters): update.
>   * ipa-sra.c: Include attribs.h.
>   (ipa_sra_preliminary_function_checks): Do not check for TYPE_ATTRIBUTES.
>
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index 2cfa9a6d0e9..94a80d3ec90 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -3135,10 +3135,38 @@ compute_fn_summary (struct cgraph_node *node, bool early)
> else
>info->inlinable = tree_inlinable_function_p (node->decl);
>  
> -   /* Type attributes can use parameter indices to describe them.  */
> -   if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl))
> -/* Likewise for #pragma omp declare simd functions or functions
> -   with simd attribute.  */
> +   bool no_signature = false;
> +   /* Type attributes can use parameter indices to describe them.
> +   Special case fn spec since we can safely preserve them in
> +   modref summaries.  */
> +   for (tree list = TYPE_ATTRIBUTES (TREE_TYPE (node->decl));
> + list && !no_signature; list = TREE_CHAIN (list))
> +  if (!flag_ipa_modref
> +  || !is_attribute_p ("fn spec", get_attribute_name (list)))
> +{
> +  if (dump_file)
> + {
> +   fprintf (dump_file, "No signature change:"
> +" function type has unhandled attribute %s.\n",
> +IDENTIFIER_POINTER (get_attribute_name (list)));
> + }
> +  no_signature = true;
> +}
> +   for (tree parm = DECL_ARGUMENTS (node->decl);
> + parm && !no_signature; parm = DECL_CHAIN (parm))
> +  if (variably_modified_type_p (TREE_TYPE (parm), node->decl))
> +{
> +  if (dump_file)
> + {
> +   fprintf (dump_file, "No signature change:"
> +" has parameter with variably modified type.\n");
> + }
> +  no_signature = true;
> +}
> +
> +   /* Likewise for #pragma omp declare simd functions or functions
> +   with simd attribute.  */
> +   if (no_signature
>  || lookup_attribute ("omp declare simd",
>   DECL_ATTRIBUTES (node->decl)))
>node->can_change_signature = false;
> diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
> index ae3149718ca..20f41dd5363 100644
> --- a/gcc/ipa-param-manipulation.c
> +++ b/gcc/ipa-param-manipulation.c
> @@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "symtab-clones.h"
>  #include "tree-phinodes.h"
>  #include "cfgexpand.h"
> +#include 

Re: Enable more type attributes for signature changes

2021-11-15 Thread Martin Sebor via Gcc-patches

On 11/13/21 7:44 AM, Jan Hubicka wrote:

Hi,
this patch whitelists attributes that are safe for signature changes and
also makes the access attribute be dropped if the function signature is
changed.  We could do better by updating the attribute, but doing so seems
to be a bit snowballing since with LTO the warnings produced seem a bit
confused.  We would also like to output the original name of the function
instead of mangledname.constprop or so.  I looked into what attributes
are dropped in bootstrap and it does not look too bad.

Bootstrapped/regtested x86_64-linux, will commit it shortly.


I've always intended the access attribute to eventually benefit
optimization so please feel free (and encouraged :) to use it
for that purpose.

Martin



Honza

gcc/ChangeLog:

* ipa-fnsummary.c (compute_fn_summary): Use type_attribute_allowed_p.
* ipa-param-manipulation.c (ipa_param_adjustments::type_attribute_allowed_p):
New member function.
(drop_type_attribute_if_params_changed_p): New function.
(build_adjusted_function_type): Use it.
* ipa-param-manipulation.h: Add type_attribute_allowed_p.

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 94a80d3ec90..7e9201a554a 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -3141,8 +3141,8 @@ compute_fn_summary (struct cgraph_node *node, bool early)
  modref summaries.  */
 for (tree list = TYPE_ATTRIBUTES (TREE_TYPE (node->decl));
list && !no_signature; list = TREE_CHAIN (list))
-if (!flag_ipa_modref
-|| !is_attribute_p ("fn spec", get_attribute_name (list)))
+   if (!ipa_param_adjustments::type_attribute_allowed_p
+   (get_attribute_name (list)))
   {
 if (dump_file)
{
diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 991db0d9b1b..29268fa5a58 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -279,6 +279,32 @@ fill_vector_of_new_param_types (vec<tree> *new_types, vec<tree> *otypes,
  }
  }
  
+/* Return false if given attribute should prevent type adjustments.  */

+
+bool
+ipa_param_adjustments::type_attribute_allowed_p (tree name)
+{
+  if ((is_attribute_p ("fn spec", name) && flag_ipa_modref)
+  || is_attribute_p ("access", name)
+  || is_attribute_p ("returns_nonnull", name)
+  || is_attribute_p ("assume_aligned", name)
+  || is_attribute_p ("nocf_check", name)
+  || is_attribute_p ("warn_unused_result", name))
+return true;
+  return false;
+}
+
+/* Return true if attribute should be dropped if parameter changed.  */
+
+static bool
+drop_type_attribute_if_params_changed_p (tree name)
+{
+  if (is_attribute_p ("fn spec", name)
+  || is_attribute_p ("access", name))
+return true;
+  return false;
+}
+
  /* Build and return a function type just like ORIG_TYPE but with parameter
 types given in NEW_PARAM_TYPES - which can be NULL if, but only if,
 ORIG_TYPE itself has NULL TREE_ARG_TYPEs.  If METHOD2FUNC is true, also 
make
@@ -337,16 +363,19 @@ build_adjusted_function_type (tree orig_type, vec<tree> *new_param_types,
if (skip_return)
TREE_TYPE (new_type) = void_type_node;
  }
-  /* We only support one fn spec attribute on type.  Be sure to remove it.
- Once we support multiple attributes we will need to be able to unshare
- the list.  */
if (args_modified && TYPE_ATTRIBUTES (new_type))
  {
-  gcc_checking_assert
- (!TREE_CHAIN (TYPE_ATTRIBUTES (new_type))
-  && (is_attribute_p ("fn spec",
- get_attribute_name (TYPE_ATTRIBUTES (new_type)))));
+  tree t = TYPE_ATTRIBUTES (new_type);
+  tree *last = &TYPE_ATTRIBUTES (new_type);
TYPE_ATTRIBUTES (new_type) = NULL;
+  for (;t; t = TREE_CHAIN (t))
+   if (!drop_type_attribute_if_params_changed_p
+   (get_attribute_name (t)))
+ {
+   *last = copy_node (t);
+   TREE_CHAIN (*last) = NULL;
+   last = &TREE_CHAIN (*last);
+ }
  }
  
return new_type;

diff --git a/gcc/ipa-param-manipulation.h b/gcc/ipa-param-manipulation.h
index 9440cbfc56c..5adf8a22356 100644
--- a/gcc/ipa-param-manipulation.h
+++ b/gcc/ipa-param-manipulation.h
@@ -254,6 +254,7 @@ public:
/* If true, make the function not return any value.  */
bool m_skip_return;
  
+  static bool type_attribute_allowed_p (tree);

  private:
ipa_param_adjustments () {}
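
For readers less familiar with the idiom used in build_adjusted_function_type above, the following is a minimal C sketch of rebuilding a singly linked attribute list through a pointer to the last link while skipping the attributes that must be dropped.  It is only a model: `struct attr`, `drop_if_params_changed` and `filter_attrs` are hypothetical names, and attributes are plain strings rather than GCC trees.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one entry of an attribute list.  */
struct attr
{
  const char *name;
  struct attr *next;
};

/* Return true if the attribute should be dropped once the
   parameters have changed (mirrors the two names in the patch).  */
bool
drop_if_params_changed (const char *name)
{
  return strcmp (name, "fn spec") == 0 || strcmp (name, "access") == 0;
}

/* Return a newly allocated copy of LIST without the attributes
   that have to be dropped.  LIST itself is left untouched, so it
   can stay shared with other types.  */
struct attr *
filter_attrs (const struct attr *list)
{
  struct attr *head = NULL;
  /* LAST always points at the link that the next copy goes into.  */
  struct attr **last = &head;

  for (const struct attr *t = list; t; t = t->next)
    if (!drop_if_params_changed (t->name))
      {
	struct attr *copy = malloc (sizeof *copy);
	if (!copy)
	  abort ();
	*copy = *t;
	copy->next = NULL;
	*last = copy;
	last = &copy->next;
      }
  return head;
}
```

Copying each kept node instead of unlinking in place is what avoids modifying a list that other types may still point to.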
  





[ping] Use 'location_hash' for 'gcc/diagnostic-spec.h:nowarn_map'

2021-11-15 Thread Thomas Schwinge
Hi!

Ping.


Grüße
 Thomas


On 2021-11-09T15:18:44+0100, I wrote:
> Hi!
>
> On 2021-09-03T21:16:46+0200, I wrote:
>> On 2021-09-01T18:14:46-0600, Martin Sebor  wrote:
>>> On 9/1/21 1:35 PM, Thomas Schwinge wrote:
 On 2021-06-23T13:47:08-0600, Martin Sebor via Gcc-patches 
  wrote:
> --- /dev/null
> +++ b/gcc/diagnostic-spec.h

> +typedef location_t key_type_t;
> +typedef int_hash <key_type_t, 0, UINT_MAX> xint_hash_t;
>>
>>> By the way, it seems we should probably also use a manifest constant
>>> for Empty (probably UNKNOWN_LOCATION since we're reserving it).
>>
>> Yes, that will be part of another patch here -- waiting for approval of
>> "Generalize 'gcc/input.h:struct location_hash'" posted elsewhere.
>
> ... which I have now pushed, so I may now propose the attached patch to
> "Use 'location_hash' for 'gcc/diagnostic-spec.h:nowarn_map'".  OK to
> push?
>
>
> Grüße
>  Thomas


>From d346292fc95f1990abbc9f6a4a8eb89be0f0e88d Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 31 Aug 2021 23:35:15 +0200
Subject: [PATCH] Use 'location_hash' for 'gcc/diagnostic-spec.h:nowarn_map'

Instead of hard-coded '0'/'UINT_MAX', we now use the 'RESERVED_LOCATION_P'
values 'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' as spare values for
'Empty'/'Deleted', and generally simplify the code.

	gcc/
	* diagnostic-spec.h (typedef xint_hash_t)
	(typedef xint_hash_map_t): Replace with...
	(typedef nowarn_map_t): ... this.
	(nowarn_map): Adjust.
	* diagnostic-spec.c (nowarn_map, suppress_warning_at): Likewise.
---
 gcc/diagnostic-spec.c | 4 ++--
 gcc/diagnostic-spec.h | 9 ++---
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/gcc/diagnostic-spec.c b/gcc/diagnostic-spec.c
index 85ffb725c02..d1e563d19ba 100644
--- a/gcc/diagnostic-spec.c
+++ b/gcc/diagnostic-spec.c
@@ -107,7 +107,7 @@ nowarn_spec_t::nowarn_spec_t (opt_code opt)
 
 /* A mapping from a 'location_t' to the warning spec set for it.  */
 
-GTY(()) xint_hash_map_t *nowarn_map;
+GTY(()) nowarn_map_t *nowarn_map;
 
 /* Return the no-warning disposition for location LOC and option OPT
or for all/any otions by default.  */
@@ -163,7 +163,7 @@ suppress_warning_at (location_t loc, opt_code opt /* = all_warnings */,
 return false;
 
   if (!nowarn_map)
-nowarn_map = xint_hash_map_t::create_ggc (32);
+nowarn_map = nowarn_map_t::create_ggc (32);
 
   nowarn_map->put (loc, optspec);
   return true;
diff --git a/gcc/diagnostic-spec.h b/gcc/diagnostic-spec.h
index e54e9e3ddbe..368b75f3254 100644
--- a/gcc/diagnostic-spec.h
+++ b/gcc/diagnostic-spec.h
@@ -130,14 +130,9 @@ operator!= (const nowarn_spec_t &lhs, const nowarn_spec_t &rhs)
   return !(lhs == rhs);
 }
 
-/* Per PR103157 "'gengtype': 'typedef' causing infinite-recursion code to be
-   generated", don't use
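-   typedef int_hash<location_t, 0, UINT_MAX> xint_hash_t;
-   here.  */
-struct xint_hash_t : int_hash<location_t, 0, UINT_MAX> {};
-typedef hash_map<xint_hash_t, nowarn_spec_t> xint_hash_map_t;
+typedef hash_map<location_hash, nowarn_spec_t> nowarn_map_t;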
 
 /* A mapping from a 'location_t' to the warning spec set for it.  */
-extern GTY(()) xint_hash_map_t *nowarn_map;
+extern GTY(()) nowarn_map_t *nowarn_map;
 
 #endif // DIAGNOSTIC_SPEC_H_INCLUDED
-- 
2.33.0
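
To illustrate the idea behind reusing reserved locations as the 'Empty'/'Deleted' markers, here is a small open-addressing map in C.  This is a sketch under stated assumptions, not GCC's hash_map: `loc_t`, `LOC_EMPTY`, `LOC_DELETED`, `struct nowarn_map` and the `map_*` functions are hypothetical, and the values 0 and 1 are chosen to mirror 'UNKNOWN_LOCATION' and 'BUILTINS_LOCATION'.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int loc_t;

/* Reserved keys: never valid as real entries, so they can mark
   slot state (assumed to mirror the two reserved locations).  */
#define LOC_EMPTY   ((loc_t) 0)
#define LOC_DELETED ((loc_t) 1)
#define MAP_SLOTS   32u

/* A zero-initialized map is empty, because LOC_EMPTY is 0.  */
struct nowarn_map
{
  loc_t keys[MAP_SLOTS];
  unsigned vals[MAP_SLOTS];
};

bool
loc_reserved_p (loc_t loc)
{
  return loc <= LOC_DELETED;
}

/* Insert or update LOC; fail for reserved keys or a full table.  */
bool
map_put (struct nowarn_map *m, loc_t loc, unsigned val)
{
  if (loc_reserved_p (loc))
    return false;
  size_t free_slot = MAP_SLOTS;
  for (unsigned i = 0; i < MAP_SLOTS; i++)
    {
      size_t s = (loc + i) % MAP_SLOTS;
      if (m->keys[s] == loc)
	{
	  m->vals[s] = val;
	  return true;
	}
      bool empty = m->keys[s] == LOC_EMPTY;
      if ((empty || m->keys[s] == LOC_DELETED) && free_slot == MAP_SLOTS)
	free_slot = s;		/* Remember the first reusable slot.  */
      if (empty)
	break;			/* LOC cannot be further along.  */
    }
  if (free_slot == MAP_SLOTS)
    return false;
  m->keys[free_slot] = loc;
  m->vals[free_slot] = val;
  return true;
}

/* Return a pointer to the value stored for LOC, or NULL.  */
const unsigned *
map_get (const struct nowarn_map *m, loc_t loc)
{
  if (loc_reserved_p (loc))
    return NULL;
  for (unsigned i = 0; i < MAP_SLOTS; i++)
    {
      size_t s = (loc + i) % MAP_SLOTS;
      if (m->keys[s] == loc)
	return &m->vals[s];
      if (m->keys[s] == LOC_EMPTY)
	return NULL;
    }
  return NULL;
}

/* Remove LOC, leaving a 'Deleted' marker so probing still works.  */
bool
map_remove (struct nowarn_map *m, loc_t loc)
{
  if (loc_reserved_p (loc))
    return false;
  for (unsigned i = 0; i < MAP_SLOTS; i++)
    {
      size_t s = (loc + i) % MAP_SLOTS;
      if (m->keys[s] == loc)
	{
	  m->keys[s] = LOC_DELETED;
	  return true;
	}
      if (m->keys[s] == LOC_EMPTY)
	return false;
    }
  return false;
}
```

Since the reserved keys are rejected on insertion, no separate per-slot state is needed.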



[PATCH] tree-optimization/102880 - make PHI-OPT recognize more CFGs

2021-11-15 Thread Richard Biener via Gcc-patches
This allows extra edges into the middle BB for the PHI-OPT
transforms using replace_phi_edge_with_variable that do not
end up moving stmts from that middle BB.  This avoids regressing
gcc.dg/tree-ssa/ssa-hoist-4.c with the actual fix for PR102880
where CFG cleanup has the choice to remove two forwarders and
picks "the wrong" leading to

   if (a > b) /
   /\/
  /  
 /|
  # PHI 

rather than

   if (a > b)  |
   /\  |
 \ |
 /\|
  # PHI 

but it's relatively straight-forward to support extra edges
into the middle-BB in paths ending in replace_phi_edge_with_variable
and that do not require moving stmts.  That's because we really
only want to remove the edge from the condition to the middle BB.
Of course actually doing that means updating dominators in non-trivial
ways which is why I kept the original code for the single edge
case and simply defer to CFG cleanup by adjusting the condition for
the complicated case.

The testcase needs to be a GIMPLE one since it's quite unreliable
to produce the desired CFG.
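
As a rough source-level picture of what the GIMPLE testcase in this patch exercises, here is a C sketch.  `foo_branchy` mimics the control flow of the testcase (the label `middle` plays the role of the middle BB with its extra incoming edge), and `foo_max` is roughly what the function would compute once the condition is replaced by a MAX; both names are made up for illustration, and the pass of course works on the CFG, not on C source.

```c
/* Control flow as in the testcase: the middle block is reached
   both from the FLAG test and from the A > B test.  */
int
foo_branchy (int a, int b, int flag)
{
  int res;

  if (flag != 0)
    goto middle;
  if (a > b)
    {
      res = a;
      goto join;
    }
middle:
  res = b;
join:
  return res;
}

/* The same function with the A > B diamond folded into a MAX.
   The extra edge from the FLAG test still selects B.  */
int
foo_max (int a, int b, int flag)
{
  int max = a > b ? a : b;
  return flag != 0 ? b : max;
}
```

Both functions return the same value for every input, which is why only the edge from the condition to the middle block needs to go away.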

Bootstrapped and tested on x86_64-unknown-linux-gnu.

I'm going to push it tomorrow unless hearing complaints and will
followup with the actual regression fix which was already posted
on Friday.

Richard.

2021-11-15  Richard Biener  

PR tree-optimization/102880
* tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Push
single_pred (bb1) condition to places that really need it.
(match_simplify_replacement): Likewise.
(value_replacement): Likewise.
(replace_phi_edge_with_variable): Deal with extra edges
into the middle BB.

* gcc.dg/tree-ssa/phi-opt-26.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-26.c | 31 ++
 gcc/tree-ssa-phiopt.c  | 71 +-
 2 files changed, 72 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-26.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-26.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-26.c
new file mode 100644
index 000..21aa66e38b8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-26.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fgimple -fdump-tree-phiopt1" } */
+
+int __GIMPLE (ssa,startwith("phiopt"))
+foo (int a, int b, int flag)
+{
+  int res;
+
+  __BB(2):
+  if (flag_2(D) != 0)
+goto __BB6;
+  else
+goto __BB4;
+
+  __BB(4):
+  if (a_3(D) > b_4(D))
+goto __BB7;
+  else
+goto __BB6;
+
+  __BB(6):
+  goto __BB7;
+
+  __BB(7):
+  res_1 = __PHI (__BB4: a_3(D), __BB6: b_4(D));
+  return res_1;
+}
+
+/* We should be able to detect MAX despite the extra edge into
+   the middle BB.  */
+/* { dg-final { scan-tree-dump "MAX" "phiopt1" } } */
diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
index 173ac835ca6..6b22f6bedd4 100644
--- a/gcc/tree-ssa-phiopt.c
+++ b/gcc/tree-ssa-phiopt.c
@@ -220,7 +220,6 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
 
   /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
   if (EDGE_COUNT (bb1->succs) == 0
-  || bb2 == NULL
  || EDGE_COUNT (bb2->succs) == 0)
 continue;
 
@@ -276,14 +275,14 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
  || (e1->flags & EDGE_FALLTHRU) == 0)
 continue;
 
-  /* Also make sure that bb1 only have one predecessor and that it
-is bb.  */
-  if (!single_pred_p (bb1)
-  || single_pred (bb1) != bb)
-   continue;
-
   if (do_store_elim)
{
+ /* Also make sure that bb1 only have one predecessor and that it
+is bb.  */
+ if (!single_pred_p (bb1)
+ || single_pred (bb1) != bb)
+   continue;
+
  /* bb1 is the middle block, bb2 the join block, bb the split block,
 e1 the fallthrough edge from bb1 to bb2.  We can't do the
 optimization if the join block has more than two predecessors.  */
@@ -328,10 +327,11 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
 node.  */
  gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
 
- gphi *newphi = factor_out_conditional_conversion (e1, e2, phi,
-   arg0, arg1,
-   cond_stmt);
- if (newphi != NULL)
+ gphi *newphi;
+ if (single_pred_p (bb1)
+ && (newphi = factor_out_conditional_conversion (e1, e2, phi,
+ arg0, arg1,
+ cond_stmt)))
{
  phi = newphi;
  /* factor_out_conditional_conversion may create a new PHI in
@@ -350,12 +350,14 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool 
do_hoist_loads, bool early_p)
 

Re: [ping^4] Make sure that we get unique test names if several DejaGnu directives refer to the same line [PR102735]

2021-11-15 Thread Thomas Schwinge
Hi!

..., and here is another ping.


Grüße
 Thomas


On 2021-11-08T11:45:12+0100, I wrote:
> Hi!
>
> Ping, once more.
>
>
> Grüße
>  Thomas
>
>
> On 2021-10-14T12:12:41+0200, I wrote:
>> Hi!
>>
>> Ping, again.
>>
>> Commit log updated for 
>> "privatization-1-compute.c results in both XFAIL and PASS".
>>
>>
>> Grüße
>>  Thomas
>>
>>
>> On 2021-09-30T08:42:25+0200, I wrote:
>>> Hi!
>>>
>>> Ping.
>>>
>>> On 2021-09-22T13:03:46+0200, I wrote:
 On 2021-09-19T11:35:00-0600, Jeff Law via Gcc-patches 
  wrote:
> A couple of goacc tests do not have unique names.

 Thanks for fixing this up, and sorry, largely my "fault", I suppose.  ;-|

> This causes problems
> for the test comparison script when one of the test passes and the other
> fails -- in this scenario the test comparison script claims there is a
> regression.

 So I understand correctly that this is a problem not just for actual
 mixed PASS vs. FAIL (which we'd like you to report anyway!) that appear
 for the same line, but also for mixed PASS vs. XFAIL?  (Because, the
 latter appears to be what you're addressing with your commit here.)

> This slipped through for a while because I had turned off x86_64 testing
> (others test it regularly and I was revamping the tester's hardware
> requirements).  Now that I've acquired more x86_64 resources and turned
> on native x86 testing again, it's been flagged.

 (I don't follow that argument -- these test cases should be all generic?
 Anyway, not important, I guess.)

> This patch just adds a numeric suffix to the TODO string to disambiguate
> them.

 So, instead of doing this manually (always error-prone!), like you've...

> Committed to the trunk,

> commit f75b237254f32d5be32c9d9610983b777abea633
> Author: Jeff Law 
> Date:   Sun Sep 19 13:31:32 2021 -0400
>
> [committed] Make test names unique for a couple of goacc tests

> --- a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
> +++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
> @@ -39,9 +39,9 @@ contains
>!$acc atomic write ! ... to force 'TREE_ADDRESSABLE'.
>y = a
>  !$acc end parallel
> -! { dg-note {variable 'i' in 'private' clause potentially has 
> improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } 
> l_compute$c_compute }
> -! { dg-note {variable 'j' in 'private' clause potentially has 
> improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } 
> l_compute$c_compute }
> -! { dg-note {variable 'a' in 'private' clause potentially has 
> improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } 
> l_compute$c_compute }
> +! { dg-note {variable 'i' in 'private' clause potentially has 
> improper OpenACC privatization level: 'parm_decl'} "TODO2" { xfail *-*-* 
> } l_compute$c_compute }
> +! { dg-note {variable 'j' in 'private' clause potentially has 
> improper OpenACC privatization level: 'parm_decl'} "TODO3" { xfail *-*-* 
> } l_compute$c_compute }
> +! { dg-note {variable 'a' in 'private' clause potentially has 
> improper OpenACC privatization level: 'parm_decl'} "TODO4" { xfail *-*-* 
> } l_compute$c_compute }

 ... etc. (also similarly in a handful of earlier commits, if I remember
 correctly), why don't we do that programmatically, like in the attached
 "Make sure that we get unique test names if several DejaGnu directives
 refer to the same line", once and for all?  OK to push after proper
 testing?
>>>
>>> Attached again, for easy reference.
>>>
>>> I figure it may help if I showed an example of how this changes things;
>>> for the test case cited above (word-diff):
>>>
>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>> 40+} (test for warnings, line 39)
>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>> 41+} (test for warnings, line 22)
>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>> 42+} (test for warnings, line 39)
>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>> 43+} (test for warnings, line 22)
>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>> 44+} (test for warnings, line 39)
>>> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 
>>> 45+} (test for warnings, line 22)
>>> XFAIL: gfortran.dg/goacc/privatization-1-compute.f90   -O  TODO2 {+at 
>>> line 50+} (test for warnings, line 29)
>>> XFAIL: gfortran.dg/goacc/privatization-1-compute.f90   -O  TODO3 {+at 
>>> line 51+} (test for warnings, line 29)
>>> XFAIL: gfortran.dg/goacc/privatization-1-compute.f90   -O  TODO4 {+at 
>>> line 52+} (test for warnings, line 29)
>>> 

Re: [PATCH 2/6] Add returns_zero_on_success/failure attributes

2021-11-15 Thread Peter Zijlstra
On Mon, Nov 15, 2021 at 12:33:16PM +0530, Prathamesh Kulkarni wrote:
> On Sun, 14 Nov 2021 at 02:07, David Malcolm via Gcc-patches

> > +/* Handle "returns_zero_on_failure" and "returns_zero_on_success" 
> > attributes;
> > +   arguments as in struct attribute_spec.handler.  */
> > +
> > +static tree
> > +handle_returns_zero_on_attributes (tree *node, tree name, tree, int,
> > +  bool *no_add_attrs)
> > +{
> > +  if (!INTEGRAL_TYPE_P (TREE_TYPE (*node)))
> > +{
> > +  error ("%qE attribute on a function not returning an integral type",
> > +name);
> > +  *no_add_attrs = true;
> > +}
> > +  return NULL_TREE;
> Hi David,
> Just curious if a warning should be emitted if the function is marked
> with the attribute but its return value isn't actually 0 ?
> 
> There are other constants like -1 or 1 that are often used to indicate
> error, so maybe tweak the attribute to
> take the integer as an argument ?
> Sth like returns_int_on_success(cst) / returns_int_on_failure(cst) ?
> 
> Also, would it make sense to extend it for pointers too for returning
> NULL on success / failure ?

Please also consider that in Linux we use the 'last' page for error code
returns. That is, a function returning a pointer could return '(void
*)-EFAULT' also see linux/err.h
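
For those not familiar with that convention, here is a user-space C sketch of it.  The helper names `err_ptr`, `ptr_err` and `is_err` are lower-case stand-ins chosen for this example (the kernel's own helpers live in linux/err.h), and `MAX_ERRNO` is assumed to be 4095 here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed size of the reserved range at the top of the address
   space: the last MAX_ERRNO values encode negative error codes.  */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer.  */
void *
err_ptr (long error)
{
  return (void *) (intptr_t) error;
}

/* Recover the errno value from an error pointer.  */
long
ptr_err (const void *ptr)
{
  return (long) (intptr_t) ptr;
}

/* True if PTR lies in the reserved error range.  */
bool
is_err (const void *ptr)
{
  return (uintptr_t) ptr >= (uintptr_t) -MAX_ERRNO;
}
```

With this scheme a failing function returns neither NULL nor a single fixed constant, so an attribute that only names one failure value would not describe it.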


Re: [PATCH] rs6000: Fix a handful of 32-bit built-in function problems in the new support

2021-11-15 Thread Bill Schmidt via Gcc-patches
Hi Segher,

On 11/14/21 9:29 AM, Segher Boessenkool wrote:
> Hi!
>
> On Sun, Nov 14, 2021 at 08:17:41AM -0600, Bill Schmidt wrote:
>> On 11/11/21 10:50 AM, Bill Schmidt wrote:
>>> On 11/11/21 7:11 AM, Segher Boessenkool wrote:
 void f(long x) { __builtin_set_texasr(x); }

 built with -m32 -mpowerpc64 gives (in the expand dump):

 void f (long int x)
 {
   long long unsigned int _1;

 ;;   basic block 2, loop depth 0
 ;;pred:   ENTRY
   _1 = (long long unsigned int) x_2(D);
   __builtin_set_texasr (_1); [tail call]
   return;
 ;;succ:   EXIT

 }

 The builtins have a "long long" argument in the existing code, in this
 configuration.  And this is not the same as "long" here.
>>> Hm, strange.  I'll have to go back and revisit this.  Something subtle 
>>> going on.
>>>
>> So, we have one of the more bizarre API situations here that I've ever seen.
>>
>> We have three 64-bit HTM registers:  TEXASR, TFHAR, and TFIAR.  We also have
>> the 32-bit TEXASRU, which is the upper half of TEXASR.  The documented
>> interfaces for reading and modifying these registers are:
>>
>>   unsigned long __builtin_get_texasr (void);
>>   unsigned long __builtin_get_texasru (void);
>>   unsigned long __builtin_get_tfhar (void);
>>   unsigned long __builtin_get_tfiar (void);
>>
>>   void __builtin_set_texasr (unsigned long);
>>   void __builtin_set_texasru (unsigned long);
>>   void __builtin_set_tfhar (unsigned long);
>>   void __builtin_set_tfiar (unsigned long);
>>
>> In reality, these interfaces are defined this way for pure 32-bit and pure
>> 64-bit, but for -m32 -mpowerpc64 we have some grotesque hackery that
>> overrides the expected interfaces to be:
>>
>>   unsigned long long __builtin_get_texasr (void);
>>   unsigned long long __builtin_get_texasru (void);
>>   unsigned long long __builtin_get_tfhar (void);
>>   unsigned long long __builtin_get_tfiar (void);
>>
>>   void __builtin_set_texasr (unsigned long long);
>>   void __builtin_set_texasru (unsigned long long);
>>   void __builtin_set_tfhar (unsigned long long);
>>   void __builtin_set_tfiar (unsigned long long);
> Yes.  Everything in -m32 -mpowerpc64 should follow the 32-bit ABI.  If
> you consider these builtins part of the ABI (are they documented there?)
> then this is simply a bug.
>
>> An undocumented conditional API is a really, really bad idea, given that it
>> forces users of this interface for general code to #ifdef on the -m32
>> -mpowerpc64 condition.  Not to mention treating 32-bit registers the same as
>> 64-bit ones, and only modifying half the register on a 32-bit system.  (Is
>> HTM even supported on a 32-bit system?)
> There are no pure 32 bit CPUs that implement HTM, to my knowledge.  But
> of course HTM works fine with SF=0 (that is the reason TEXASRU exists!
> Compare to TB and TBU).
>
>> It would have likely been better to have one consistent interface, using
>> int for TEXASRU and long long for the others, even though that requires
>> dealing with two registers for the 32-bit case; but that's all water under
>> the bridge.  We have what we have.
> "long" for the others, actually.
>
> TFHAR and TFIAR hold code addresses.  TEXASR gets only the low 32 bits
> of the register read, that is why TEXASRU exists :-)

Yes, right - That makes sense, and is what is currently documented.
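
To make the TEXASR/TEXASRU relationship concrete, here is a small C sketch using plain integers.  It does not call any of the builtins and says nothing about their prototypes; `texasr_low`, `texasr_upper` and `texasr_join` are hypothetical helpers that only model how a 64-bit register value is seen as two 32-bit halves by 32-bit code.

```c
#include <stdint.h>

/* What a 32-bit read of the full register would yield: the low
   32 bits of the 64-bit value.  */
uint32_t
texasr_low (uint64_t texasr)
{
  return (uint32_t) texasr;
}

/* What a read of the "upper" alias would yield: the high 32 bits.  */
uint32_t
texasr_upper (uint64_t texasr)
{
  return (uint32_t) (texasr >> 32);
}

/* Reassemble the 64-bit value from the two 32-bit reads.  */
uint64_t
texasr_join (uint32_t upper, uint32_t low)
{
  return ((uint64_t) upper << 32) | low;
}
```

This is the same pattern as TB and TBU mentioned above: two 32-bit accesses together cover the 64-bit register.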

>
>> If I sound irritated, it's because, just for this case, I'll have to add a
>> bunch of extra machinery to track up to two prototypes for each builtin
>> function, and perform conditional initialization when it applies.  The one
>> good thing is that these already have a builtin attribute "htmspr" that I
>> can key off of to do the extra processing.
> Another option might be to finally fix this.  There still are shipping
> CPUs that support HTM ;-)
>
> And essentially no one uses -m32 -mpowerpc64 on Linux or AIX.  On Linux
> because ucontext_t and jmp_buf do not deal with the high half of the
> registers, and iiuc on AIX the kernel doesn't deal with it in context
> switches even.  Darwin does use it, but afaik no one runs Darwin on a
> CPU with HTM.

Agreed, this seems like an odd use case from the beginning.

>
>> And somebody ought to fix the misleading documentation...
> Yes.
>
> Do you want to fix this mess?  I will take a patch using "long" for
> all these registers and builtins (just like we have for essentially all
> other SPRs!)

Sure!  In fact, that's what my existing patch does.

Thanks -- I will use the time that frees up to take a look at how to get
the overloaded builtin name the user wrote to show up on error messages,
instead of translating it to the specific builtin name.  Then I'll
repost the remaining pieces of the testsuite patch and the 32-bit patch
with all outstanding issues resolved.

Thanks again for all the help.

Bill

>
>
> Segher


Re: [PATCH 4/5] if-conv: Apply VN to hoisted conversions

2021-11-15 Thread Richard Biener via Gcc-patches
On Mon, Nov 15, 2021 at 3:00 PM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > On Fri, Nov 12, 2021 at 7:05 PM Richard Sandiford via Gcc-patches
> >  wrote:
> >>
> >> This patch is a prerequisite for a later one.  At the moment,
> >> if-conversion converts predicated POINTER_PLUS_EXPRs into
> >> non-wrapping forms, which for:
> >>
> >> … = base + offset
> >>
> >> becomes:
> >>
> >> tmp = (unsigned long) base
> >> … = tmp + offset
> >>
> >> It then hoists these conversions out of the loop where possible.
> >>
> >> However, because “base” is a valid gimple operand, there can be
> >> multiple POINTER_PLUS_EXPRs with the same base, which can in turn
> >> lead to multiple instances of the same conversion.  The later VN pass
> >> is (and I think needs to be) restricted to the new if-converted code,
> >> whereas here we're deliberately inserting the conversions before the
> >> .LOOP_VECTORIZED condition:
> >>
> >> /* If we versioned loop then make sure to insert invariant
> >>stmts before the .LOOP_VECTORIZED check since the vectorizer
> >>will re-use that for things like runtime alias versioning
> >>whose condition can end up using those invariants.  */
> >>
> >> We can therefore enter the vectoriser with redundant conversions.
> >>
> >> The easiest fix seemed to be to defer the hoisting until after VN.
> >> This catches other hoisting opportunities too.
> >>
> >> Hoisting the code from the (artificial) loop in pr99102.c means
> >> that it's no longer worth vectorising.  The patch forces vectorisation
> >> instead of relying on the cost model.
> >>
> >> The patch also reverts pr87007-4.c and pr87007-5.c back to their
> >> original forms, undoing changes in 783dc66f9ccb0019c3dad.
> >> The code at the time the tests were added was:
> >>
> >> testl   %edi, %edi
> >> je  .L10
> >> vxorps  %xmm1, %xmm1, %xmm1
> >> vsqrtsd d3(%rip), %xmm1, %xmm0
> >> vsqrtsd d2(%rip), %xmm1, %xmm1
> >> ...
> >> .L10:
> >> ret
> >>
> >> with the operations being hoisted, and the vxorps was specifically
> >> wanted (compared to the previous code).  This patch restores the code
> >> to that form, with the hoisted operations and the vxorps.
> >>
> >> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >>
> >> Richard
> >>
> >>
> >> gcc/
> >> * tree-if-conv.c: Include tree-eh.h.
> >> (predicate_statements): Remove pe argument.  Don't hoist
> >> statements here.
> >> (combine_blocks): Remove pe argument.
> >> (ifcvt_can_hoist, ifcvt_can_hoist_further): New functions.
> >> (ifcvt_hoist_invariants): Likewise.
> >> (tree_if_conversion): Update call to combine_blocks.  Call
> >> ifcvt_hoist_invariants after VN.
> >>
> >> gcc/testsuite/
> >> * gcc.dg/vect/pr99102.c: Add -fno-vect-cost-model.
> >>
> >> Revert:
> >>
> >> 2020-09-09  Richard Biener  
> >>
> >> * gcc.target/i386/pr87007-4.c: Adjust.
> >> * gcc.target/i386/pr87007-5.c: Likewise.
> >> ---
> >>  gcc/testsuite/gcc.dg/vect/pr99102.c   |   2 +-
> >>  gcc/testsuite/gcc.target/i386/pr87007-4.c |   2 +-
> >>  gcc/testsuite/gcc.target/i386/pr87007-5.c |   2 +-
> >>  gcc/tree-if-conv.c| 122 --
> >>  4 files changed, 114 insertions(+), 14 deletions(-)
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c 
> >> b/gcc/testsuite/gcc.dg/vect/pr99102.c
> >> index 6c1a13f0783..0d030d15c86 100644
> >> --- a/gcc/testsuite/gcc.dg/vect/pr99102.c
> >> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
> >> @@ -1,4 +1,4 @@
> >> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> >> +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model 
> >> -fdump-tree-vect-details" } */
> >>  /* { dg-additional-options "-msve-vector-bits=256" { target 
> >> aarch64_sve256_hw } } */
> >>  long a[44];
> >>  short d, e = -7;
> >> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-4.c 
> >> b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> >> index 9c4b8005af3..e91bdcbac44 100644
> >> --- a/gcc/testsuite/gcc.target/i386/pr87007-4.c
> >> +++ b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> >> @@ -15,4 +15,4 @@ foo (int n, int k)
> >>d1 = ceil (d3);
> >>  }
> >>
> >> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
> >> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } */
> >> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-5.c 
> >> b/gcc/testsuite/gcc.target/i386/pr87007-5.c
> >> index e4d956a5d7f..20d13cf650b 100644
> >> --- a/gcc/testsuite/gcc.target/i386/pr87007-5.c
> >> +++ b/gcc/testsuite/gcc.target/i386/pr87007-5.c
> >> @@ -15,4 +15,4 @@ foo (int n, int k)
> >>d1 = sqrt (d3);
> >>  }
> >>
> >> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
> >> +/* { dg-final { scan-assembler-times 

Re: [PATCH 4/5] if-conv: Apply VN to hoisted conversions

2021-11-15 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> On Fri, Nov 12, 2021 at 7:05 PM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> This patch is a prerequisite for a later one.  At the moment,
>> if-conversion converts predicated POINTER_PLUS_EXPRs into
>> non-wrapping forms, which for:
>>
>> … = base + offset
>>
>> becomes:
>>
>> tmp = (unsigned long) base
>> … = tmp + offset
>>
>> It then hoists these conversions out of the loop where possible.
>>
>> However, because “base” is a valid gimple operand, there can be
>> multiple POINTER_PLUS_EXPRs with the same base, which can in turn
>> lead to multiple instances of the same conversion.  The later VN pass
>> is (and I think needs to be) restricted to the new if-converted code,
>> whereas here we're deliberately inserting the conversions before the
>> .LOOP_VECTORIZED condition:
>>
>> /* If we versioned loop then make sure to insert invariant
>>stmts before the .LOOP_VECTORIZED check since the vectorizer
>>will re-use that for things like runtime alias versioning
>>whose condition can end up using those invariants.  */
>>
>> We can therefore enter the vectoriser with redundant conversions.
>>
>> The easiest fix seemed to be to defer the hoisting until after VN.
>> This catches other hoisting opportunities too.
>>
>> Hoisting the code from the (artificial) loop in pr99102.c means
>> that it's no longer worth vectorising.  The patch forces vectorisation
>> instead of relying on the cost model.
>>
>> The patch also reverts pr87007-4.c and pr87007-5.c back to their
>> original forms, undoing changes in 783dc66f9ccb0019c3dad.
>> The code at the time the tests were added was:
>>
>> testl   %edi, %edi
>> je  .L10
>> vxorps  %xmm1, %xmm1, %xmm1
>> vsqrtsd d3(%rip), %xmm1, %xmm0
>> vsqrtsd d2(%rip), %xmm1, %xmm1
>> ...
>> .L10:
>> ret
>>
>> with the operations being hoisted, and the vxorps was specifically
>> wanted (compared to the previous code).  This patch restores the code
>> to that form, with the hoisted operations and the vxorps.
>>
>> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Richard
>>
>>
>> gcc/
>> * tree-if-conv.c: Include tree-eh.h.
>> (predicate_statements): Remove pe argument.  Don't hoist
>> statements here.
>> (combine_blocks): Remove pe argument.
>> (ifcvt_can_hoist, ifcvt_can_hoist_further): New functions.
>> (ifcvt_hoist_invariants): Likewise.
>> (tree_if_conversion): Update call to combine_blocks.  Call
>> ifcvt_hoist_invariants after VN.
>>
>> gcc/testsuite/
>> * gcc.dg/vect/pr99102.c: Add -fno-vect-cost-model.
>>
>> Revert:
>>
>> 2020-09-09  Richard Biener  
>>
>> * gcc.target/i386/pr87007-4.c: Adjust.
>> * gcc.target/i386/pr87007-5.c: Likewise.
>> ---
>>  gcc/testsuite/gcc.dg/vect/pr99102.c   |   2 +-
>>  gcc/testsuite/gcc.target/i386/pr87007-4.c |   2 +-
>>  gcc/testsuite/gcc.target/i386/pr87007-5.c |   2 +-
>>  gcc/tree-if-conv.c| 122 --
>>  4 files changed, 114 insertions(+), 14 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c 
>> b/gcc/testsuite/gcc.dg/vect/pr99102.c
>> index 6c1a13f0783..0d030d15c86 100644
>> --- a/gcc/testsuite/gcc.dg/vect/pr99102.c
>> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
>> @@ -1,4 +1,4 @@
>> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
>> +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model 
>> -fdump-tree-vect-details" } */
>>  /* { dg-additional-options "-msve-vector-bits=256" { target 
>> aarch64_sve256_hw } } */
>>  long a[44];
>>  short d, e = -7;
>> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-4.c 
>> b/gcc/testsuite/gcc.target/i386/pr87007-4.c
>> index 9c4b8005af3..e91bdcbac44 100644
>> --- a/gcc/testsuite/gcc.target/i386/pr87007-4.c
>> +++ b/gcc/testsuite/gcc.target/i386/pr87007-4.c
>> @@ -15,4 +15,4 @@ foo (int n, int k)
>>d1 = ceil (d3);
>>  }
>>
>> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
>> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } */
>> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-5.c 
>> b/gcc/testsuite/gcc.target/i386/pr87007-5.c
>> index e4d956a5d7f..20d13cf650b 100644
>> --- a/gcc/testsuite/gcc.target/i386/pr87007-5.c
>> +++ b/gcc/testsuite/gcc.target/i386/pr87007-5.c
>> @@ -15,4 +15,4 @@ foo (int n, int k)
>>d1 = sqrt (d3);
>>  }
>>
>> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
>> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } */
>> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
>> index e88ddc9f788..0ad557a2f4d 100644
>> --- a/gcc/tree-if-conv.c
>> +++ b/gcc/tree-if-conv.c
>> @@ -121,6 +121,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "tree-cfgcleanup.h"
>>  #include 

[PATCH] x86_64: Avoid rorx rotation instructions with -Os

2021-11-15 Thread Roger Sayle

This patch teaches the i386 backend to avoid using BMI2's rorx
instructions when optimizing for size.  The benefits are shown
with the following example:

unsigned int ror1(unsigned int x) { return (x >> 1) | (x << 31); }
unsigned int ror2(unsigned int x) { return (x >> 2) | (x << 30); }
unsigned int rol2(unsigned int x) { return (x >> 30) | (x << 2); }
unsigned int rol1(unsigned int x) { return (x >> 31) | (x << 1); }

which currently with -Os -march=cascadelake generates:

ror1:   rorx    $1, %edi, %eax      // 6 bytes
        ret
ror2:   rorx    $2, %edi, %eax      // 6 bytes
        ret
rol2:   rorx    $30, %edi, %eax     // 6 bytes
        ret
rol1:   rorx    $31, %edi, %eax     // 6 bytes
        ret

but with this patch now generates:

ror1:   movl    %edi, %eax          // 2 bytes
        rorl    %eax                // 2 bytes
        ret
ror2:   movl    %edi, %eax          // 2 bytes
        rorl    $2, %eax            // 3 bytes
        ret
rol2:   movl    %edi, %eax          // 2 bytes
        roll    $2, %eax            // 3 bytes
        ret
rol1:   movl    %edi, %eax          // 2 bytes
        roll    %eax                // 2 bytes
        ret
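
As a side note on why rol2 and rol1 appear as rorx $30 and rorx $31 in the
first listing: BMI2 only provides a rotate-right form, so a left rotate by n
is expressed as a right rotate by 32 - n.  A minimal C sketch of that
identity follows.  The helper names rotr32 and rotl32 are illustrative only
(they are not part of the patch) and they assume 0 < n < 32.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by n bits, using the same shift/or idiom as ror1/ror2.
   Only valid for 0 < n < 32 (a shift by 32 would be undefined).  */
static uint32_t
rotr32 (uint32_t x, unsigned n)
{
  return (x >> n) | (x << (32 - n));
}

/* Rotate left by n bits, using the same idiom as rol1/rol2.
   Only valid for 0 < n < 32.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}
```

With these helpers, rotl32 (x, 2) and rotr32 (x, 30) give the same result for
any x, which is why the size-optimized code can use the shorter roll $2 form.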

I've confirmed that this patch is a win on the CSiBE benchmark, even
though rotations are rare; for example, libmspack/test/md5.o shrinks
from 5824 bytes to 5632 bytes.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2021-11-15  Roger Sayle  

gcc/ChangeLog
* config/i386/i386.md (*bmi2_rorx_1): Make conditional
on !optimize_function_for_size_p.
(*3_1): Add preferred_for_size attribute.
(define_splits): Conditionalize on !optimize_function_for_size_p.
(*bmi2_rorxsi3_1_zext): Likewise.
(*si2_1_zext): Add preferred_for_size attribute.
(define_splits): Conditionalize on !optimize_function_for_size_p.

Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6eb9de8..7394906 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12775,7 +12775,7 @@
(rotatert:SWI48
  (match_operand:SWI48 1 "nonimmediate_operand" "rm")
  (match_operand:QI 2 "" "")))]
-  "TARGET_BMI2"
+  "TARGET_BMI2 && !optimize_function_for_size_p (cfun)"
   "rorx\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "rotatex")
(set_attr "mode" "")])
@@ -12803,6 +12803,10 @@
 }
   [(set_attr "isa" "*,bmi2")
(set_attr "type" "rotate,rotatex")
+   (set (attr "preferred_for_size")
+ (cond [(eq_attr "alternative" "0")
+ (symbol_ref "true")]
+  (symbol_ref "false")))
(set (attr "length_immediate")
  (if_then_else
(and (eq_attr "type" "rotate")
@@ -12819,7 +12823,7 @@
(rotate:SWI48 (match_operand:SWI48 1 "nonimmediate_operand")
  (match_operand:QI 2 "const_int_operand")))
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_BMI2 && reload_completed"
+  "TARGET_BMI2 && reload_completed && !optimize_function_for_size_p (cfun)"
   [(set (match_dup 0)
(rotatert:SWI48 (match_dup 1) (match_dup 2)))]
 {
@@ -12833,7 +12837,7 @@
(rotatert:SWI48 (match_operand:SWI48 1 "nonimmediate_operand")
(match_operand:QI 2 "const_int_operand")))
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_BMI2 && reload_completed"
+  "TARGET_BMI2 && reload_completed && !optimize_function_for_size_p (cfun)"
   [(set (match_dup 0)
(rotatert:SWI48 (match_dup 1) (match_dup 2)))])
 
@@ -12842,7 +12846,7 @@
(zero_extend:DI
  (rotatert:SI (match_operand:SI 1 "nonimmediate_operand" "rm")
   (match_operand:QI 2 "const_0_to_31_operand" "I"]
-  "TARGET_64BIT && TARGET_BMI2"
+  "TARGET_64BIT && TARGET_BMI2 && !optimize_function_for_size_p (cfun)"
   "rorx\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "rotatex")
(set_attr "mode" "SI")])
@@ -12870,6 +12874,10 @@
 }
   [(set_attr "isa" "*,bmi2")
(set_attr "type" "rotate,rotatex")
+   (set (attr "preferred_for_size")
+ (cond [(eq_attr "alternative" "0")
+ (symbol_ref "true")]
+  (symbol_ref "false")))
(set (attr "length_immediate")
  (if_then_else
(and (eq_attr "type" "rotate")
@@ -12887,7 +12895,8 @@
  (rotate:SI (match_operand:SI 1 "nonimmediate_operand")
 (match_operand:QI 2 "const_int_operand"
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT && TARGET_BMI2 && reload_completed"
+  "TARGET_64BIT && TARGET_BMI2 && reload_completed
+   && !optimize_function_for_size_p (cfun)"
   [(set (match_dup 0)
(zero_extend:DI (rotatert:SI (match_dup 1) (match_dup 2]
 {
@@ -12902,7 +12911,8 @@
  (rotatert:SI (match_operand:SI 1 "nonimmediate_operand")
   (match_operand:QI 2 "const_int_operand"
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_64BIT && TARGET_BMI2 && 

Re: [PATCH] x86: Add gcc.target/i386/pr103205-2.c

2021-11-15 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 15, 2021 at 05:40:01AM -0800, H.J. Lu via Gcc-patches wrote:
>   PR target/103205
>   * gcc.target/i386/pr103205-2.c: New test.

Ok, thanks.

Jakub



[committed] testsuite: Add testcase for already fixed PR [PR100469]

2021-11-15 Thread Jakub Jelinek via Gcc-patches
Hi!

This bug, introduced in r11-7448-gff92ede8d269375f800e1b347a48f4698874b4a3,
has already been fixed by r12-1354-g2d2ed777b23ab6503027039e0adbfe1162f52b2f,
i.e. the fix for PR100852.

Regtested on x86_64-linux -m32/-m64, committed to trunk (so far) as obvious.

2021-11-15  Jakub Jelinek  

PR debug/100469
* g++.dg/opt/pr100469.C: New test.

--- gcc/testsuite/g++.dg/opt/pr100469.C.jj  2021-11-15 14:44:46.796604198 +0100
+++ gcc/testsuite/g++.dg/opt/pr100469.C 2021-11-15 14:44:38.662715051 +0100
@@ -0,0 +1,32 @@
+// PR debug/100469
+// { dg-do compile }
+// { dg-options "-O2 -fcompare-debug -fno-tree-dse -fno-tree-forwprop -fno-tree-tail-merge --param=sccvn-max-alias-queries-per-access=0" }
+
+struct S
+{
+  long m;
+  S (const S &s)
+  {
+    m = s.m;
+  }
+  S (long l)
+  {
+    m = l;
+  }
+  bool operatorX (const S &s)
+  {
+    return m >= s.m;
+  }
+};
+
+static inline S
+bar (S a, S b)
+{
+  return a.operatorX (b) ? a : b;
+}
+
+S
+foo (S s)
+{
+  return bar (s, (S) 0);
+}

Jakub



Re: [PATCH][GCC] aarch64: Add LS64 extension and intrinsics

2021-11-15 Thread Richard Sandiford via Gcc-patches
Przemyslaw Wirkus  writes:
> Hi,
>
> This patch is adding support for LS64 (Armv8.7-A Load/Store 64 Byte extension)
> which is part of Armv8.7-A architecture. Changes include missing plumbing for
> TARGET_LS64, LS64 data structure and intrinsics defined in ACLE [0]. Machine
> description of intrinsics is using new V8DI mode added in a separate patch.
> __ARM_FEATURE_LS64 is defined if the Armv8.7-A LS64 instructions for atomic
> 64-byte access to device memory are supported.
>
> New compiler internal type is added wrapping ACLE struct data512_t [0]:
>
> typedef struct {
>   uint64_t val[8];
> } __arm_data512_t;
>
> Please note that command line support for this feature was already added [1].
>
>   [0] 
> https://github.com/ARM-software/acle/blob/main/main/acle.rst#load-store-64-byte-intrinsics
>   [1] commit e159c0aa10e50c292a534535c73f38d22b6129a8 (AArch64: Add 
> command-line
>   support for Armv8.7-a)
>
> For below C code see example snippets of generated code:
>
> #include <arm_acle.h>
>
> void
> func(const void * addr, data512_t *data) {
>   *data = __arm_ld64b (addr);
> }
>
> func:
>   ld64b   x8, [x0]
>   stp x8, x9, [x1]
>   sub sp, sp, #64
>   stp x10, x11, [x1, 16]
>   stp x12, x13, [x1, 32]
>   stp x14, x15, [x1, 48]
>   add sp, sp, 64
>   ret
> ~~~
>
> #include <arm_acle.h>
>
> uint64_t
> func(void *addr, data512_t value) {
> return  __arm_st64bv (addr, value);
> }
>
> func:
>   ldp x8, x9, [x1]
>   ldp x10, x11, [x1, 16]
>   ldp x12, x13, [x1, 32]
>   ldp x14, x15, [x1, 48]
>   st64bv  x1, x8, [x0]
>   mov x0, x1
>   ret
>
> ~~~
>
> uint64_t
> ls64_store_v0(const data512_t *input, void *addr)
> {
> uint64_t status;
> __asm__ volatile ("st64bv0 %0, %2, [%1]"
>   : "=r" (status), "=r" (addr)
>   : "r" (*input)
>   : "memory");
> return status;
> }
>
> ls64_store_v0:
>   ldp x8, x9, [x0]
>   ldp x10, x11, [x0, 16]
>   ldp x12, x13, [x0, 32]
>   ldp x14, x15, [x0, 48]
>   st64bv0 x0, x8, [x1]
>   ret
>
> Regtested on aarch64-elf cross and no issues.
>
> OK for master?
>
> gcc/ChangeLog:
>
> 2021-11-11  Przemyslaw Wirkus  
>
>   * config/aarch64/aarch64-builtins.c (enum aarch64_builtins):
>   Define AARCH64_LS64_BUILTIN_LD64B, AARCH64_LS64_BUILTIN_ST64B,
>   AARCH64_LS64_BUILTIN_ST64BV, AARCH64_LS64_BUILTIN_ST64BV0.
>   (aarch64_init_ls64_builtin_decl): Helper function.
>   (aarch64_init_ls64_builtins): Helper function.
>   (aarch64_init_ls64_builtins_types): Helper function.
>   (aarch64_general_init_builtins): Init LS64 intrinsics for
>   TARGET_LS64.
>   (aarch64_expand_builtin_ls64): LS64 intrinsics expander.
>   (aarch64_general_expand_builtin): Handle aarch64_expand_builtin_ls64.
>   (ls64_builtins_data): New helper struct.
>   (v8di_UP): New define.
>   * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
>   __ARM_FEATURE_LS64.
>   * config/aarch64/aarch64.h (AARCH64_ISA_LS64): New define.
>   (AARCH64_ISA_V8_7): New define.
>   (TARGET_LS64): New define.
>   * config/aarch64/aarch64.md: Add UNSPEC_LD64B, UNSPEC_ST64B,
>   UNSPEC_ST64BV and UNSPEC_ST64BV0.
>   (ld64b): New define_insn.
>   (st64b): New define_insn.
>   (st64bv): New define_insn.
>   (st64bv0): New define_insn.
>   * config/aarch64/arm_acle.h (target):
>   (data512_t): New type derived from __arm_data512_t.
>   (__arm_data512_t): New internal type.
>   (__arm_ld64b): New intrinsic.
>   (__arm_st64b): New intrinsic.
>   (__arm_st64bv): New intrinsic.
>   (__arm_st64bv0): New intrinsic.
>   * config/arm/types.md: Add new type ls64.
>
> gcc/testsuite/ChangeLog:
>
> 2021-11-11  Przemyslaw Wirkus  
>
>   * gcc.target/aarch64/acle/ls64_asm.c: New test.
>   * gcc.target/aarch64/acle/ls64_ld64b-2.c: New test.
>   * gcc.target/aarch64/acle/ls64_ld64b.c: New test.
>   * gcc.target/aarch64/acle/ls64_st64b.c: New test.
>   * gcc.target/aarch64/acle/ls64_st64bv-2.c: New test.
>   * gcc.target/aarch64/acle/ls64_st64bv.c: New test.
>   * gcc.target/aarch64/acle/ls64_st64bv0-2.c: New test.
>   * gcc.target/aarch64/acle/ls64_st64bv0.c: New test.
>   * gcc.target/aarch64/pragma_cpp_predefs_2.c: Add checks
>   for __ARM_FEATURE_LS64.
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index 
> 5053bf0f8fd6638bf84a6df06c0987a0216b69e7..d4a82eec3b26bfd1cb976d0870d60ee7d10b689a
>  100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -49,6 +49,7 @@
>  #include "gimple-fold.h"
>  
>  #define v8qi_UP  E_V8QImode
> +#define v8di_UP  E_V8DImode
>  #define v4hi_UP  E_V4HImode
>  #define v4hf_UP  E_V4HFmode
>  #define v2si_UP  E_V2SImode
> @@ -615,6 +616,11 @@ enum aarch64_builtins
>   

[PATCH] x86: Add gcc.target/i386/pr103205-2.c

2021-11-15 Thread H.J. Lu via Gcc-patches
PR target/103205
* gcc.target/i386/pr103205-2.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr103205-2.c | 46 ++
 1 file changed, 46 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103205-2.c

diff --git a/gcc/testsuite/gcc.target/i386/pr103205-2.c b/gcc/testsuite/gcc.target/i386/pr103205-2.c
new file mode 100644
index 000..705081e51d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103205-2.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune-ctrl=^himode_math" } */
+
+extern short foo;
+extern unsigned short bar;
+
+int
+foo1 (void)
+{
+  return __sync_fetch_and_and (&foo, ~1) & 1;
+}
+
+int
+foo2 (void)
+{
+  return __sync_fetch_and_or (&foo, 1) & 1;
+}
+
+int
+foo3 (void)
+{
+  return __sync_fetch_and_xor (&foo, 1) & 1;
+}
+
+unsigned short
+bar1 (void)
+{
+  return __sync_fetch_and_and (&bar, ~1) & 1;
+}
+
+unsigned short
+bar2 (void)
+{
+  return __sync_fetch_and_or (&bar, 1) & 1;
+}
+
+unsigned short
+bar3 (void)
+{
+  return __sync_fetch_and_xor (&bar, 1) & 1;
+}
+
+/* { dg-final { scan-assembler-times "lock;?\[ \t\]*btrw" 2 } } */
+/* { dg-final { scan-assembler-times "lock;?\[ \t\]*btsw" 2 } } */
+/* { dg-final { scan-assembler-times "lock;?\[ \t\]*btcw" 2 } } */
+/* { dg-final { scan-assembler-not "cmpxchgw" } } */
-- 
2.33.1



[PATCH] Check optab before transforming atomic bit test and operations

2021-11-15 Thread H.J. Lu via Gcc-patches
Check optab before transforming equivalent, but slightly different cases
of atomic bit test and operations to their canonical forms.
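
For context, the source-level idiom that this transformation targets is a
fetch-and-op whose result is masked with the same single bit.  The sketch
below only illustrates the semantics of that idiom.  The helper names are
made up for illustration, and it assumes a compiler that provides the GCC
__sync builtins.  The new pr103184-1.c test uses char objects like these and
expects cmpxchgb, which suggests that no bit-test optab is available for that
mode on x86.

```c
#include <assert.h>

/* Atomically set bit 0 of *p; return nonzero if it was already set.  */
static int
test_and_set_bit0 (unsigned char *p)
{
  return __sync_fetch_and_or (p, 1) & 1;
}

/* Atomically clear bit 0 of *p; return nonzero if it was set before.  */
static int
test_and_reset_bit0 (unsigned char *p)
{
  return __sync_fetch_and_and (p, ~1) & 1;
}

/* Atomically flip bit 0 of *p; return nonzero if it was set before.  */
static int
test_and_complement_bit0 (unsigned char *p)
{
  return __sync_fetch_and_xor (p, 1) & 1;
}
```

Each helper returns the old state of the bit, so the operation and the test
happen in one atomic step.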

gcc/

PR middle-end/103184
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab
before transforming equivalent, but slightly different cases to
their canonical forms.

gcc/testsuite/

PR middle-end/103184
* gcc.dg/pr103184-1.c: New test.
* gcc.dg/pr103184-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/pr103184-1.c | 43 +++
 gcc/testsuite/gcc.dg/pr103184-2.c | 12 +
 gcc/tree-ssa-ccp.c| 34 +---
 3 files changed, 74 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103184-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr103184-2.c

diff --git a/gcc/testsuite/gcc.dg/pr103184-1.c b/gcc/testsuite/gcc.dg/pr103184-1.c
new file mode 100644
index 000..e567f95f63f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103184-1.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+extern char foo;
+extern unsigned char bar;
+
+int
+foo1 (void)
+{
+  return __sync_fetch_and_and (&foo, ~1) & 1;
+}
+
+int
+foo2 (void)
+{
+  return __sync_fetch_and_or (&foo, 1) & 1;
+}
+
+int
+foo3 (void)
+{
+  return __sync_fetch_and_xor (&foo, 1) & 1;
+}
+
+unsigned short
+bar1 (void)
+{
+  return __sync_fetch_and_and (&bar, ~1) & 1;
+}
+
+unsigned short
+bar2 (void)
+{
+  return __sync_fetch_and_or (&bar, 1) & 1;
+}
+
+unsigned short
+bar3 (void)
+{
+  return __sync_fetch_and_xor (&bar, 1) & 1;
+}
+
+/* { dg-final { scan-assembler-times "lock;?\[ \t\]*cmpxchgb" 6 { target { x86_64-*-* i?86-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/pr103184-2.c b/gcc/testsuite/gcc.dg/pr103184-2.c
new file mode 100644
index 000..499761fdbfd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103184-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include <stdatomic.h>
+
+int
+tbit0 (_Atomic int* a, int n)
+{
+#define BIT (0x1 << n)
+  return atomic_fetch_or (a, BIT) & BIT;
+#undef BIT
+}
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 0f79e9f05bd..fec68b5fc73 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -3366,6 +3366,21 @@ optimize_atomic_bit_test_and (gimple_stmt_iterator *gsip,
   || !gimple_vdef (call))
 return;
 
+  switch (fn)
+{
+case IFN_ATOMIC_BIT_TEST_AND_SET:
+  optab = atomic_bit_test_and_set_optab;
+  break;
+case IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT:
+  optab = atomic_bit_test_and_complement_optab;
+  break;
+case IFN_ATOMIC_BIT_TEST_AND_RESET:
+  optab = atomic_bit_test_and_reset_optab;
+  break;
+default:
+  return;
+}
+
   tree bit = nullptr;
 
   mask = gimple_call_arg (call, 1);
@@ -3384,6 +3399,10 @@ optimize_atomic_bit_test_and (gimple_stmt_iterator *gsip,
   if (lhs != use_rhs)
return;
 
+  if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)))
+ == CODE_FOR_nothing)
+   return;
+
   gimple *g;
   gimple_stmt_iterator gsi;
   tree var;
@@ -3628,21 +3647,6 @@ optimize_atomic_bit_test_and (gimple_stmt_iterator *gsip,
}
 }
 
-  switch (fn)
-{
-case IFN_ATOMIC_BIT_TEST_AND_SET:
-  optab = atomic_bit_test_and_set_optab;
-  break;
-case IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT:
-  optab = atomic_bit_test_and_complement_optab;
-  break;
-case IFN_ATOMIC_BIT_TEST_AND_RESET:
-  optab = atomic_bit_test_and_reset_optab;
-  break;
-default:
-  return;
-}
-
   if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs))) == CODE_FOR_nothing)
 return;
 
-- 
2.33.1



Re: GCC 12.0.0 Status Report (2021-11-15), Stage 3 in effect NOW

2021-11-15 Thread H.J. Lu via Gcc-patches
On Mon, Nov 15, 2021 at 4:05 AM Richard Biener via Gcc-patches
 wrote:
>
>
> Status
> ======
>
> The GCC development branch now is open for general bugfixing (Stage 3).
>
> Take the quality data below with a big grain of salt - most of the
> new P3 classified bugs will become P1 or P2 (generally every
> regression against GCC 11 is to be considered P1 if it concerns
> primary or secondary platforms).
>
>
> Quality Data
> ============
>
> Priority          #   Change from last report
> --------        ---   -----------------------
> P1               34   +  19
> P2              306   +  24
> P3              237   +  44
> P4              207   +   5
> P5               25
> --------        ---   -----------------------
> Total P1-P3     577   +  87
> Total           809   +  92
>
>
> Previous Report
> ===============
>
> https://gcc.gnu.org/pipermail/gcc/2021-October/237464.html

Hi,

I'd like to add an option to disable copy relocation:

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583022.html

Thanks.

-- 
H.J.


[PATCH] ivopts: Improve code generated for very simple loops.

2021-11-15 Thread Roger Sayle

This patch tidies up the code that GCC generates for simple loops,
by selecting/generating a simpler loop bound expression in ivopts.
The original motivation came from looking at the following loop (from
gcc.target/i386/pr90178.c)

int *find_ptr (int* mem, int sz, int val)
{
  for (int i = 0; i < sz; i++)
    if (mem[i] == val)
      return &mem[i];
  return 0;
}

which GCC currently compiles to:

find_ptr:
        movq    %rdi, %rax
        testl   %esi, %esi
        jle     .L4
        leal    -1(%rsi), %ecx
        leaq    4(%rdi,%rcx,4), %rcx
        jmp     .L3
.L7:    addq    $4, %rax
        cmpq    %rcx, %rax
        je      .L4
.L3:    cmpl    %edx, (%rax)
        jne     .L7
        ret
.L4:    xorl    %eax, %eax
        ret

Notice the relatively complex leal/leaq instructions, which result
from ivopts using the following expression for the loop bound:
inv_expr 2: ((unsigned long) ((unsigned int) sz_8(D) + 4294967295)
* 4 + (unsigned long) mem_9(D)) + 4

which results from NITERS being (unsigned int) sz_8(D) + 4294967295,
i.e. (sz - 1), and the logic in cand_value_at determining the bound
as BASE + NITERS*STEP at the start of the final iteration and as
BASE + NITERS*STEP + STEP at the end of the final iteration.

Ideally, we'd like the middle-end optimizers to simplify
BASE + NITERS*STEP + STEP as BASE + (NITERS+1)*STEP, especially
when NITERS already has the form BOUND-1, but with type conversions
and possible overflow to worry about, the above "inv_expr 2" is the
best that can be done by fold (without additional context information).

This patch improves ivopts' cand_value_at by passing it the data
structure that describes how NITERS was derived, instead of just the
tree expression for NITERS.  This allows us to peek under the surface
to check that NITERS+1 doesn't overflow, and in this patch to use the
SSA_NAME already holding the required value.

In the motivating loop above, inv_expr 2 now becomes:
(unsigned long) sz_8(D) * 4 + (unsigned long) mem_9(D)

And as a result, on x86_64 we now generate:

find_ptr:
        movq    %rdi, %rax
        testl   %esi, %esi
        jle     .L4
        movslq  %esi, %rsi
        leaq    (%rdi,%rsi,4), %rcx
        jmp     .L3
.L7:    addq    $4, %rax
        cmpq    %rcx, %rax
        je      .L4
.L3:    cmpl    %edx, (%rax)
        jne     .L7
        ret
.L4:    xorl    %eax, %eax
        ret
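
The relationship between the old and new forms of inv_expr 2 can be checked
with a small C sketch.  The function names bound_old and bound_new are
illustrative only (they do not appear in the patch), and 64-bit arithmetic
stands in for unsigned long on x86_64.  The two forms agree whenever sz > 0,
which the loop guard (testl/jle) ensures here, but they differ for sz == 0
because NITERS wraps around.  That is presumably why fold alone cannot do the
simplification without the extra context.

```c
#include <assert.h>
#include <stdint.h>

/* Old bound: BASE + NITERS*STEP + STEP with NITERS = (unsigned int) sz - 1,
   mirroring the original inv_expr 2 (STEP is 4, the size of an int).  */
static uint64_t
bound_old (uint64_t base, int sz)
{
  uint32_t niters = (uint32_t) sz + 4294967295u;  /* sz - 1 modulo 2^32 */
  return (uint64_t) niters * 4 + base + 4;
}

/* New bound: BASE + sz*STEP, mirroring the simplified inv_expr 2.  */
static uint64_t
bound_new (uint64_t base, int sz)
{
  return (uint64_t) sz * 4 + base;
}
```

For sz == 0 the old form yields base + 2^34 rather than base, so the
simplification is only valid once the number of iterations is known to be
nonzero.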


This improvement required one minor tweak to GCC's testsuite for
gcc.dg/wrapped-binop-simplify.c, where we again generate better
code, and therefore no longer find as many optimization opportunities
in later passes (vrp2).

Previously:

void v1 (unsigned long *in, unsigned long *out, unsigned int n)
{
  int i;
  for (i = 0; i < n; i++) {
    out[i] = in[i];
  }
}

on x86_64 generated:
v1:     testl   %edx, %edx
        je      .L1
        movl    %edx, %edx
        xorl    %eax, %eax
.L3:    movq    (%rdi,%rax,8), %rcx
        movq    %rcx, (%rsi,%rax,8)
        addq    $1, %rax
        cmpq    %rax, %rdx
        jne     .L3
.L1:    ret

and now instead generates:
v1:     testl   %edx, %edx
        je      .L1
        movl    %edx, %edx
        xorl    %eax, %eax
        leaq    0(,%rdx,8), %rcx
.L3:    movq    (%rdi,%rax), %rdx
        movq    %rdx, (%rsi,%rax)
        addq    $8, %rax
        cmpq    %rax, %rcx
        jne     .L3
.L1:    ret


This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap
and make -k check with no new failures.  Ok for mainline?


2021-11-15  Roger Sayle  

gcc/ChangeLog
* tree-ssa-loop-ivopts.c (cand_value_at): Take a class
tree_niter_desc* argument instead of just a tree for NITER.
If we require the iv candidate value at the end of the final
loop iteration, try using the original loop bound as the
NITER for sufficiently simple loops.
(may_eliminate_iv): Update (only) call to cand_value_at.

gcc/testsuite
* gcc.dg/wrapped-binop-simplify.c: Update expected test result.


Thanks in advance,
Roger
--

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 4a498ab..cc81196 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -5034,28 +5034,48 @@ determine_group_iv_cost_address (struct ivopts_data *data,
   return !sum_cost.infinite_cost_p ();
 }
 
-/* Computes value of candidate CAND at position AT in iteration NITER, and
-   stores it to VAL.  */
+/* Computes value of candidate CAND at position AT in iteration DESC->NITER,
+   and stores it to VAL.  */
 
 static void
-cand_value_at (class loop *loop, struct iv_cand *cand, gimple *at, tree niter,
-  aff_tree *val)
+cand_value_at (class loop *loop, struct iv_cand *cand, gimple *at,
+  class tree_niter_desc *desc, aff_tree *val)
 {
   aff_tree step, delta, nit;
   struct iv *iv = cand->iv;
   tree type = TREE_TYPE (iv->base);
+  tree niter = desc->niter;
+  bool after_adjust = stmt_after_increment (loop, cand, at);
   tree 

Re: [PATCH] openmp: Add support for thread_limit clause on target

2021-11-15 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 15, 2021 at 02:00:42PM +0100, Tobias Burnus wrote:
> Hi,
> 
> On 15.11.21 13:05, Jakub Jelinek wrote:
> > OpenMP 5.1 says that thread_limit clause can also appear on target,
> > and similarly to teams should affect the thread-limit-var ICV.
> > On combined target teams, the clause goes to both.
> 
> This patch does this also for Fortran.
> 
> OK, once the post-bootstrap testing has finished successfully?

Ok, thanks.

> gcc/fortran/ChangeLog:
> 
>   * openmp.c (OMP_TARGET_CLAUSES): Add thread_limit.
>   * trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to
>   teams.
> 
> libgomp/ChangeLog:
> 
>   * testsuite/libgomp.fortran/thread-limit-1.f90: New test.

Jakub



Re: [PATCH] openmp: Add support for thread_limit clause on target

2021-11-15 Thread Tobias Burnus

Hi,

On 15.11.21 13:05, Jakub Jelinek wrote:

OpenMP 5.1 says that thread_limit clause can also appear on target,
and similarly to teams should affect the thread-limit-var ICV.
On combined target teams, the clause goes to both.


This patch does this also for Fortran.

OK, once the post-bootstrap testing has finished successfully?

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; limited liability company (GmbH); Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: Munich; Commercial register:
Munich, HRB 106955
Fortran: openmp: Add support for thread_limit clause on target

gcc/fortran/ChangeLog:

	* openmp.c (OMP_TARGET_CLAUSES): Add thread_limit.
	* trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to
	teams.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/thread-limit-1.f90: New test.

 gcc/fortran/openmp.c   |  3 +-
 gcc/fortran/trans-openmp.c |  2 ++
 .../testsuite/libgomp.fortran/thread-limit-1.f90   | 41 ++
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 2893ab2befb..d120be81467 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -3563,7 +3563,8 @@ cleanup:
   (omp_mask (OMP_CLAUSE_DEVICE) | OMP_CLAUSE_MAP | OMP_CLAUSE_IF	\
| OMP_CLAUSE_DEPEND | OMP_CLAUSE_NOWAIT | OMP_CLAUSE_PRIVATE		\
| OMP_CLAUSE_FIRSTPRIVATE | OMP_CLAUSE_DEFAULTMAP			\
-   | OMP_CLAUSE_IS_DEVICE_PTR | OMP_CLAUSE_IN_REDUCTION)
+   | OMP_CLAUSE_IS_DEVICE_PTR | OMP_CLAUSE_IN_REDUCTION			\
+   | OMP_CLAUSE_THREAD_LIMIT)
 #define OMP_TARGET_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_DEVICE) | OMP_CLAUSE_MAP | OMP_CLAUSE_IF	\
| OMP_CLAUSE_USE_DEVICE_PTR | OMP_CLAUSE_USE_DEVICE_ADDR)
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index b86c7cf9833..5b3c310ba59 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -5870,6 +5870,8 @@ gfc_split_omp_clauses (gfc_code *code,
 	= code->ext.omp_clauses->lists[OMP_LIST_IS_DEVICE_PTR];
 	  clausesa[GFC_OMP_SPLIT_TARGET].device
 	= code->ext.omp_clauses->device;
+	  clausesa[GFC_OMP_SPLIT_TARGET].thread_limit
+	= code->ext.omp_clauses->thread_limit;
 	  for (int i = 0; i < OMP_DEFAULTMAP_CAT_NUM; i++)
 	clausesa[GFC_OMP_SPLIT_TARGET].defaultmap[i]
 	  = code->ext.omp_clauses->defaultmap[i];
diff --git a/libgomp/testsuite/libgomp.fortran/thread-limit-1.f90 b/libgomp/testsuite/libgomp.fortran/thread-limit-1.f90
new file mode 100644
index 000..bca69fbb466
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/thread-limit-1.f90
@@ -0,0 +1,41 @@
+! { dg-additional-options "-fdump-tree-original" }
+
+! { dg-final { scan-tree-dump-times "#pragma omp teams thread_limit\\(9\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "#pragma omp target thread_limit\\(9\\)" 1 "original" } }
+
+! { dg-final { scan-tree-dump-times "#pragma omp target nowait thread_limit\\(4\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "#pragma omp parallel num_threads\\(1\\)" 1 "original" } }
+
+! { dg-final { scan-tree-dump-times "#pragma omp target thread_limit\\(6\\)" 1 "original" } }
+
+
+module m
+  use omp_lib
+  implicit none
+contains
+
+subroutine uncalled()
+!$omp target teams thread_limit (9)
+!$omp end target teams
+end
+
+subroutine foo ()
+  block
+!$omp target parallel nowait thread_limit (4) num_threads (1)
+if (omp_get_thread_limit () > 4) &
+  stop 1
+!$omp end target parallel
+  end block
+  !$omp taskwait
+end
+end module
+
+program main
+  use m
+  implicit none
+  !$omp target thread_limit (6)
+if (omp_get_thread_limit () > 6) &
+  stop 2
+  !$omp end target
+  call foo ()
+end


[PATCH] libffi: Update LOCAL_PATCHES

2021-11-15 Thread H.J. Lu via Gcc-patches
Add

commit a91f844ef449d0dd1cf2e0e47b0ade0d8a6304e1
Author: Rainer Orth 
Date:   Mon Nov 15 10:24:27 2021 +0100

libffi: Use #define instead of .macro in  src/x86/win64.S [PR102874]

to LOCAL_PATCHES.

* LOCAL_PATCHES: Add commit a91f844ef44.
---
 libffi/LOCAL_PATCHES | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libffi/LOCAL_PATCHES b/libffi/LOCAL_PATCHES
index f9e74660950..63151905dcf 100644
--- a/libffi/LOCAL_PATCHES
+++ b/libffi/LOCAL_PATCHES
@@ -1,3 +1,4 @@
 5be7b66998127286fada45e4f23bd8a2056d553e
 4824ed41ba7cd63e60fd9f8769a58b79935a90d1
 90205f67e465ae7dfcf733c2b2b177ca7ff68da0
+a91f844ef449d0dd1cf2e0e47b0ade0d8a6304e1
-- 
2.33.1



Re: [PATCH 4/5] if-conv: Apply VN to hoisted conversions

2021-11-15 Thread Richard Biener via Gcc-patches
On Fri, Nov 12, 2021 at 7:05 PM Richard Sandiford via Gcc-patches
 wrote:
>
> This patch is a prerequisite for a later one.  At the moment,
> if-conversion converts predicated POINTER_PLUS_EXPRs into
> non-wrapping forms, which for:
>
> … = base + offset
>
> becomes:
>
> tmp = (unsigned long) base
> … = tmp + offset
>
> It then hoists these conversions out of the loop where possible.
>
> However, because “base” is a valid gimple operand, there can be
> multiple POINTER_PLUS_EXPRs with the same base, which can in turn
> lead to multiple instances of the same conversion.  The later VN pass
> is (and I think needs to be) restricted to the new if-converted code,
> whereas here we're deliberately inserting the conversions before the
> .LOOP_VECTORIZED condition:
>
> /* If we versioned loop then make sure to insert invariant
>stmts before the .LOOP_VECTORIZED check since the vectorizer
>will re-use that for things like runtime alias versioning
>whose condition can end up using those invariants.  */
>
> We can therefore enter the vectoriser with redundant conversions.
>
> The easiest fix seemed to be to defer the hoisting until after VN.
> This catches other hoisting opportunities too.
>
> Hoisting the code from the (artificial) loop in pr99102.c means
> that it's no longer worth vectorising.  The patch forces vectorisation
> instead of relying on the cost model.
>
> The patch also reverts pr87007-4.c and pr87007-5.c back to their
> original forms, undoing changes in 783dc66f9ccb0019c3dad.
> The code at the time the tests were added was:
>
> testl   %edi, %edi
> je  .L10
> vxorps  %xmm1, %xmm1, %xmm1
> vsqrtsd d3(%rip), %xmm1, %xmm0
> vsqrtsd d2(%rip), %xmm1, %xmm1
> ...
> .L10:
> ret
>
> with the operations being hoisted, and the vxorps was specifically
> wanted (compared to the previous code).  This patch restores the code
> to that form, with the hoisted operations and the vxorps.
>
> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> Richard
>
>
> gcc/
> * tree-if-conv.c: Include tree-eh.h.
> (predicate_statements): Remove pe argument.  Don't hoist
> statements here.
> (combine_blocks): Remove pe argument.
> (ifcvt_can_hoist, ifcvt_can_hoist_further): New functions.
> (ifcvt_hoist_invariants): Likewise.
> (tree_if_conversion): Update call to combine_blocks.  Call
> ifcvt_hoist_invariants after VN.
>
> gcc/testsuite/
> * gcc.dg/vect/pr99102.c: Add -fno-vect-cost-model.
>
> Revert:
>
> 2020-09-09  Richard Biener  
>
> * gcc.target/i386/pr87007-4.c: Adjust.
> * gcc.target/i386/pr87007-5.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/vect/pr99102.c   |   2 +-
>  gcc/testsuite/gcc.target/i386/pr87007-4.c |   2 +-
>  gcc/testsuite/gcc.target/i386/pr87007-5.c |   2 +-
>  gcc/tree-if-conv.c| 122 --
>  4 files changed, 114 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c b/gcc/testsuite/gcc.dg/vect/pr99102.c
> index 6c1a13f0783..0d030d15c86 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr99102.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
> @@ -1,4 +1,4 @@
> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
>  /* { dg-additional-options "-msve-vector-bits=256" { target aarch64_sve256_hw } } */
>  long a[44];
>  short d, e = -7;
> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-4.c b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> index 9c4b8005af3..e91bdcbac44 100644
> --- a/gcc/testsuite/gcc.target/i386/pr87007-4.c
> +++ b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> @@ -15,4 +15,4 @@ foo (int n, int k)
>d1 = ceil (d3);
>  }
>
> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-5.c b/gcc/testsuite/gcc.target/i386/pr87007-5.c
> index e4d956a5d7f..20d13cf650b 100644
> --- a/gcc/testsuite/gcc.target/i386/pr87007-5.c
> +++ b/gcc/testsuite/gcc.target/i386/pr87007-5.c
> @@ -15,4 +15,4 @@ foo (int n, int k)
>d1 = sqrt (d3);
>  }
>
> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } */
> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } */
> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> index e88ddc9f788..0ad557a2f4d 100644
> --- a/gcc/tree-if-conv.c
> +++ b/gcc/tree-if-conv.c
> @@ -121,6 +121,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-cfgcleanup.h"
>  #include "tree-ssa-dse.h"
>  #include "tree-vectorizer.h"
> +#include "tree-eh.h"
>
>  /* Only handle PHIs with no more arguments unless we are asked to by
> simd pragma.  */
> @@ 

Re: [PATCH 3/5] vect: Support gather loads with SLP

2021-11-15 Thread Richard Biener via Gcc-patches
On Fri, Nov 12, 2021 at 7:01 PM Richard Sandiford via Gcc-patches
 wrote:
>
> This patch adds SLP support for IFN_GATHER_LOAD.  Like the SLP
> support for IFN_MASK_LOAD, it works by treating only some of the
> arguments as child nodes.  Unlike IFN_MASK_LOAD, it requires the
> other arguments (base, scale, and extension type) to be the same
> for all calls in the group.  It does not require/expect the loads
> to be in a group (which probably wouldn't make sense for gathers).
>
> I was worried about the possible alias effect of moving gathers
> around to be part of the same SLP group.  The patch therefore
> makes vect_analyze_data_ref_dependence treat gathers and scatters
> as a top-level concern, punting if the accesses aren't completely
> independent and if the user hasn't told us that a particular
> VF is safe.  I think in practice we already punted in the same
> circumstances; the idea is just to make it more explicit.
>
> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Btw, I filed PR102467 for this a while ago, you might want to mention
that in the ChangeLog.

OK.

Thanks,
Richard.

> Richard
>
>
> gcc/
> * doc/sourcebuild.texi (vect_gather_load_ifn): Document.
> * tree-vect-data-refs.c (vect_analyze_data_ref_dependence):
> Commonize safelen handling.  Punt for anything involving
> gathers and scatters unless safelen says otherwise.
> * tree-vect-slp.c (arg1_map): New variable.
> (vect_get_operand_map): Handle IFN_GATHER_LOAD.
> (vect_build_slp_tree_1): Likewise.
> (vect_build_slp_tree_2): Likewise.
> (compatible_calls_p): If vect_get_operand_map returns nonnull,
> check that any skipped arguments are equal.
> (vect_slp_analyze_node_operations_1): Tighten reduction check.
> * tree-vect-stmts.c (check_load_store_for_partial_vectors): Take
> an ncopies argument.
> (vect_get_gather_scatter_ops): Take slp_node and ncopies arguments.
> Handle SLP nodes.
> (vectorizable_store, vectorizable_load): Adjust accordingly.
>
> gcc/testsuite/
> * lib/target-supports.exp
> (check_effective_target_vect_gather_load_ifn): New target test.
> * gcc.dg/vect/vect-gather-1.c: New test.
> * gcc.dg/vect/vect-gather-2.c: Likewise.
> * gcc.target/aarch64/sve/gather_load_11.c: Likewise.
> ---
>  gcc/doc/sourcebuild.texi  |  4 ++
>  gcc/testsuite/gcc.dg/vect/vect-gather-1.c | 60 +
>  gcc/testsuite/gcc.dg/vect/vect-gather-2.c | 36 +++
>  .../gcc.target/aarch64/sve/gather_load_11.c   | 49 ++
>  gcc/testsuite/lib/target-supports.exp |  6 ++
>  gcc/tree-vect-data-refs.c | 64 +--
>  gcc/tree-vect-slp.c   | 29 +++--
>  gcc/tree-vect-stmts.c | 26 
>  8 files changed, 223 insertions(+), 51 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-gather-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-gather-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/gather_load_11.c
>
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 40b1e0d8167..702cd0c53e4 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1639,6 +1639,10 @@ Target supports vector masked loads.
>  @item vect_masked_store
>  Target supports vector masked stores.
>
> +@item vect_gather_load_ifn
> +Target supports vector gather loads using internal functions
> +(rather than via built-in functions or emulation).
> +
>  @item vect_scatter_store
>  Target supports vector scatter stores.
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-gather-1.c b/gcc/testsuite/gcc.dg/vect/vect-gather-1.c
> new file mode 100644
> index 000..4cee73fc775
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-gather-1.c
> @@ -0,0 +1,60 @@
> +#include "tree-vect.h"
> +
> +#define N 16
> +
> +void __attribute__((noipa))
> +f (int *restrict y, int *restrict x, int *restrict indices)
> +{
> +  for (int i = 0; i < N; ++i)
> +{
> +  y[i * 2] = x[indices[i * 2]] + 1;
> +  y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
> +}
> +}
> +
> +int y[N * 2];
> +int x[N * 2] = {
> +  72704, 52152, 51301, 96681,
> +  57937, 60490, 34504, 60944,
> +  42225, 28333, 88336, 74300,
> +  29250, 20484, 38852, 91536,
> +  86917, 63941, 31590, 21998,
> +  22419, 26974, 28668, 13968,
> +  3451, 20247, 44089, 85521,
> +  22871, 87362, 50555, 85939
> +};
> +int indices[N * 2] = {
> +  15, 16, 9, 19,
> +  7, 22, 19, 1,
> +  22, 13, 15, 30,
> +  5, 12, 11, 11,
> +  10, 25, 5, 20,
> +  22, 24, 24, 28,
> +  30, 19, 6, 4,
> +  7, 12, 8, 21
> +};
> +int expected[N * 2] = {
> +  91537, 86919, 28334, 22000,
> +  60945, 28670, 21999, 52154,
> +  28669, 20486, 91537, 50557,
> +  60491, 29252, 74301, 74302,
> +  88337, 20249, 60491, 22421,
> +  28669, 3453, 3452, 22873,
> +  50556, 22000, 

[COMMITTED] path solver: Default to global range if nothing found.

2021-11-15 Thread Aldy Hernandez via Gcc-patches
This has been a long time coming, but we weren't able to make the
change because of some unrelated regressions.
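
As a rough illustration of the behaviour change, here is a minimal,
self-contained sketch.  It is not GCC code: ToyRange, ToyQuery,
path_cache, and global_range are hypothetical names, and a range is
modelled as a plain [lo, hi] pair.  The point is only the fallback
order: a name with no path-specific range now gets whatever is known
about it globally, instead of being treated as varying outright.

```cpp
#include <climits>
#include <map>
#include <string>

// Hypothetical stand-in for a value range: the closed interval [lo, hi].
struct ToyRange
{
  long lo, hi;
};

// "Varying" means nothing is known: the full range of the type.
const ToyRange varying = { LONG_MIN, LONG_MAX };

struct ToyQuery
{
  std::map<std::string, ToyRange> path_cache;  // ranges known along the path
  std::map<std::string, ToyRange> global;      // ranges known everywhere

  // Global lookup: the recorded range if there is one, else varying.
  ToyRange
  global_range (const std::string &name) const
  {
    auto it = global.find (name);
    return it != global.end () ? it->second : varying;
  }

  // Path lookup: prefer the path-specific range, and fall back to the
  // global range (rather than straight to varying) if nothing is found.
  ToyRange
  range_of (const std::string &name) const
  {
    auto it = path_cache.find (name);
    if (it != path_cache.end ())
      return it->second;
    return global_range (name);
  }
};
```

In this toy model a name with only a global entry reports that entry,
and a name with no entry at all still comes back as varying.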

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::internal_range_of_expr):
Default to global range if nothing found.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/pr31146-2.C: Add -fno-thread-jumps.
---
 gcc/gimple-range-path.cc  | 2 +-
 gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index 9957ac9b6c7..f6e31999293 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -212,7 +212,7 @@ path_range_query::internal_range_of_expr (irange &r, tree name, gimple *stmt)
   return true;
 }
 
-  r.set_varying (TREE_TYPE (name));
+  r = gimple_range_global (name);
   return true;
 }
 
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C b/gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C
index 9fb5dc1b60c..fc268578f69 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fcheck-new -fno-tree-vrp -fdump-tree-forwprop1" } */
+/* { dg-options "-O -fcheck-new -fno-tree-vrp -fdump-tree-forwprop1 -fno-thread-jumps" } */
 
 #include <new>
 
-- 
2.31.1



[COMMITTED] Fix PHI ordering problems in the path solver.

2021-11-15 Thread Aldy Hernandez via Gcc-patches
After auditing the PHI range calculations, I'm not convinced we've
caught all the corner cases.  They haven't shown up in the wild (yet),
but better safe than sorry.

We shouldn't write anything to the cache or trigger additional
lookups while calculating a PHI, as this may cause ordering problems.
We should resolve the PHI with either the cache as it stands, or by
asking for ranges on entry to the path.  I've documented this.

There was one dubious case where we called fold_range in
ssa_range_in_phi, which mostly by luck wasn't triggering lookups,
because fold_range solves a PHI by calling range_on_edge, which is set
to pick up global ranges by default in path_range_query.  This is
fragile, so I've rewritten the call to explicitly use cached or global
ranges.

Also, the cache should be avoided in ssa_range_in_phi when the arg is
defined in the PHI's block, as not doing so could create an ordering
problem.  We have a similar check when calculating relations in PHIs.
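
To make the ordering hazard concrete, here is a minimal, self-contained
sketch.  It is not the path solver's code: ToyPhi, Cache,
eval_phis_sequential, and eval_phis_parallel are hypothetical names,
and the "cache" maps a name to a single value rather than to a range.
Two PHIs that swap values are enough to show why writing to the cache
while PHIs are still being calculated gives a different answer from
reading the cache as it stood on entry.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model: each PHI copies the value of the argument on
// the incoming edge into its result.
struct ToyPhi
{
  std::string result;  // name defined by the PHI
  std::string arg;     // argument on the taken edge
};

using Cache = std::map<std::string, int>;

// Problematic: each result is written to the cache immediately, so a
// later PHI can read a value that an earlier PHI in the same block has
// just overwritten.
Cache
eval_phis_sequential (const std::vector<ToyPhi> &phis, Cache cache)
{
  for (const ToyPhi &phi : phis)
    cache[phi.result] = cache.at (phi.arg);
  return cache;
}

// Order-independent: every PHI reads the cache as it stood on entry to
// the block, and results only become visible after all PHIs are done.
Cache
eval_phis_parallel (const std::vector<ToyPhi> &phis, Cache cache)
{
  const Cache on_entry = cache;
  for (const ToyPhi &phi : phis)
    cache[phi.result] = on_entry.at (phi.arg);
  return cache;
}
```

With a = 1 and b = 2 on entry, and the PHIs a = PHI <b>, b = PHI <a>,
the sequential version ends with a = 2, b = 2, whereas the parallel
version ends with the intended swap, a = 2, b = 1.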

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::internal_range_of_expr):
Remove useless code.
(path_range_query::ssa_defined_in_bb): New.
(path_range_query::ssa_range_in_phi): Avoid fold_range call that
could trigger additional lookups.
Do not use the cache for ARGs defined in this block.
(path_range_query::compute_ranges_in_block): Use ssa_defined_in_bb.
(path_range_query::maybe_register_phi_relation): Same.
(path_range_query::range_of_stmt): Adjust comment.
* gimple-range-path.h (ssa_defined_in_bb): New.
---
 gcc/gimple-range-path.cc | 61 +++-
 gcc/gimple-range-path.h  |  1 +
 2 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index f6e31999293..4aa666d2c8b 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -202,8 +202,8 @@ path_range_query::internal_range_of_expr (irange &r, tree name, gimple *stmt)
   return true;
 }
 
-  basic_block bb = stmt ? gimple_bb (stmt) : exit_bb ();
-  if (stmt && range_defined_in_block (r, name, bb))
+  if (stmt
+  && range_defined_in_block (r, name, gimple_bb (stmt)))
 {
   if (TREE_CODE (name) == SSA_NAME)
r.intersect (gimple_range_global (name));
@@ -250,36 +250,62 @@ path_range_query::set_path (const vec<basic_block> &path)
   bitmap_clear (m_has_cache_entry);
 }
 
+bool
+path_range_query::ssa_defined_in_bb (tree name, basic_block bb)
+{
+  return (TREE_CODE (name) == SSA_NAME
+ && SSA_NAME_DEF_STMT (name)
+ && gimple_bb (SSA_NAME_DEF_STMT (name)) == bb);
+}
+
 // Return the range of the result of PHI in R.
+//
+// Since PHIs are calculated in parallel at the beginning of the
+// block, we must be careful to never save anything to the cache here.
+// It is the caller's responsibility to adjust the cache.  Also,
+// calculating the PHI's range must not trigger additional lookups.
 
 void
 path_range_query::ssa_range_in_phi (irange &r, gphi *phi)
 {
   tree name = gimple_phi_result (phi);
   basic_block bb = gimple_bb (phi);
+  unsigned nargs = gimple_phi_num_args (phi);
 
   if (at_entry ())
 {
   if (m_resolve && m_ranger->range_of_expr (r, name, phi))
return;
 
-  // Try fold just in case we can resolve simple things like PHI <5(99), 6(88)>.
-  if (!fold_range (r, phi, this))
-   r.set_varying (TREE_TYPE (name));
-
+  // Try to fold the phi exclusively with global or cached values.
+  // This will get things like PHI <5(99), 6(88)>.  We do this by
+  // calling range_of_expr with no context.
+  int_range_max arg_range;
+  r.set_undefined ();
+  for (size_t i = 0; i < nargs; ++i)
+   {
+ tree arg = gimple_phi_arg_def (phi, i);
+ if (range_of_expr (arg_range, arg, /*stmt=*/NULL))
+   r.union_ (arg_range);
+ else
+   {
+ r.set_varying (TREE_TYPE (name));
+ return;
+   }
+   }
   return;
 }
 
   basic_block prev = prev_bb ();
   edge e_in = find_edge (prev, bb);
-  unsigned nargs = gimple_phi_num_args (phi);
 
   for (size_t i = 0; i < nargs; ++i)
 if (e_in == gimple_phi_arg_edge (phi, i))
   {
tree arg = gimple_phi_arg_def (phi, i);
-
-   if (!get_cache (r, arg))
+   // Avoid using the cache for ARGs defined in this block, as
+   // that could create an ordering problem.
+   if (ssa_defined_in_bb (arg, bb) || !get_cache (r, arg))
  {
if (m_resolve)
  {
@@ -393,10 +419,7 @@ path_range_query::compute_ranges_in_block (basic_block bb)
   EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi)
 {
   tree name = ssa_name (i);
-  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
-  basic_block def_bb = gimple_bb (def_stmt);
-
-  if (def_bb == bb)
+  if (ssa_defined_in_bb (name, bb))
clear_cache (name);
 }
 
@@ -705,8 +728,8 @@ 

[PATCH] tree-optimization/103237 - avoid vectorizing unhandled double reductions

2021-11-15 Thread Richard Biener via Gcc-patches
Double reductions which have multiple LC PHIs in the inner loop
are not handled correctly during transformation since those PHIs
are not properly classified as reduction.  The following disables
vectorizing them.
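
For readers unfamiliar with the term, here is a small sketch of what a
double reduction looks like at the source level.  The function names
sum_2d and xor_toggle are hypothetical.  The second function is only
modelled on the shape of the testcase below; whether a given loop
really ends up with more than one loop-closed PHI in the inner loop
depends on what earlier passes do, so treat it as an assumption rather
than a guaranteed reproducer.

```cpp
// A plain double reduction: the accumulator `sum` is carried around
// both loops.  The inner loop reduces one row, and the outer loop
// carries the partial result on to the next row.
int
sum_2d (const int *a, int rows, int cols)
{
  int sum = 0;
  for (int i = 0; i < rows; ++i)
    for (int j = 0; j < cols; ++j)
      sum += a[i * cols + j];
  return sum;
}

// A shape closer to the testcase: the reduction value `acc` is also
// copied out through a second location on every inner iteration, so
// more than one value may be live on exit from the inner loop.
unsigned
xor_toggle (unsigned start, unsigned &last_seen)
{
  unsigned acc = start;
  for (int i = 0; i != 4; ++i)
    for (int j = 0; j <= 3; ++j)
      {
        acc ^= 1;
        last_seen = acc;
      }
  return acc;
}
```

Sixteen toggles of the low bit leave acc at its starting value, which
is the same property the testcase relies on when it checks the final
value after the loops.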

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-11-15  Richard Biener  

PR tree-optimization/103237
* tree-vect-loop.c (vect_is_simple_reduction): Fail for
double reductions with multiple inner loop LC PHI nodes.

* gcc.dg/torture/pr103237.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr103237.c | 24 
 gcc/tree-vect-loop.c| 11 +++
 2 files changed, 35 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr103237.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr103237.c b/gcc/testsuite/gcc.dg/torture/pr103237.c
new file mode 100644
index 000..f2399f9586e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr103237.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-vectorize" } */
+
+int g1;
+unsigned int g2 = -1U;
+static void __attribute__((noipa))
+func_1()
+{
+  int *l_1 = &g1;
+  for (int g3a = 0; g3a != 4; g3a++)
+for (int l_2 = 0; l_2 <= 3; l_2++)
+  {
+unsigned int *l_3 = &g2;
+*l_1 = *l_3 ^= 1;
+  }
+}
+int
+main()
+{
+  func_1();
+  if (g1 != -1)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 1cd5dbcb6f7..73efdb96bad 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -3567,6 +3567,17 @@ vect_is_simple_reduction (loop_vec_info loop_info, stmt_vec_info phi_info,
   return def_stmt_info;
 }
 
+  /* When the inner loop of a double reduction ends up with more than
+ one loop-closed PHI we have failed to classify alternate such
+ PHIs as double reduction, leading to wrong code.  See PR103237.  */
+  if (inner_loop_of_double_reduc && lcphis.length () != 1)
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"unhandle double reduction\n");
+  return NULL;
+}
+
   /* If this isn't a nested cycle or if the nested cycle reduction value
  is used ouside of the inner loop we cannot handle uses of the reduction
  value.  */
-- 
2.31.1

