date:20220207

Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth

2022-02-07 Thread Richard Biener via Gcc-patches

On Mon, 7 Feb 2022, Joseph Myers wrote:

> On Fri, 4 Feb 2022, Richard Biener via Gcc-patches wrote:
> 
> > This adds explicit variable birth CLOBBERs in an attempt to fix
> > PR90348 and duplicates.  The birth / death CLOBBER pairs are
> > used to compute liveness and conflicts for stack variable
> > coalescing where the lack of an explicit birth but instead
> > use of first mention causes misrepresentation of variable life
> > ranges when optimization moves the first mention upwards the
> > original birth point at the variables bind expression start.
> 
> I'm not clear on exactly where you consider a variable to be born, but 
> note that non-VLA C variables have a lifetime that starts at the beginning 
> of their block, not at their declaration: it's valid to jump backward from 
> after the declaration to before it and then access the variable via a 
> pointer, or jump forward again over the declaration and access it by name.  
> (Most programs of course don't do that sort of thing.)

Oh, interesting.  So the following is valid

int foo(int always_true_at_runtime)
{
  {
int *p;
if (always_true_at_runtime)
  goto after;
lab:
return *p;
after:
int i = 0;
p = 
if (always_true_at_runtime)
  goto lab;
  }
  return 0;
}

For the implementation I considered starting lifetime at a DECL_EXPR
if it exists and otherwise at the start of the BIND_EXPR.  Note the
complication is that control flow has to reach the lifetime-start
clobber before the first access.  But that might also save us here,
since for the example above 'i' would be live via the backedge
from goto lab.

That said, the existing complication is that when the gimplifier
visits the BIND_EXPR it has no way to know whether there will be
a following DECL_EXPR or not (and the gimplifier notes there are
cases a DECL_EXPR variable is not in a BIND_EXPR).  My plan is to
compute this during one of the body walks the gimplifier performs
before gimplifying.

Another complication is that at least the C frontend + gimplifier
combination for

  switch (i)
  {
  case 1:
int i;
break;
  }

will end up having the BIND_EXPR mentioning 'i' containing the
case label, so just placing the birth at the BIND will make it
unreachable as

  i = BIRTH;
case_label_for_1:
  ...

conveniently at least the C frontend has a DECL_EXPR for 'i'
which avoids this and I did not find a way (yet) in the gimplifier
to rectify this when gimplifying switches.

So there's still work to do but I think that starting the lifetime
at a DECL_EXPR if it exists is the way to go?

Richard.

[PATCH] tree-optimization/104373 - early uninit diagnostic on unreachable code

2022-02-07 Thread Richard Biener via Gcc-patches

The following improves early uninit diagnostics by computing edge
reachability using our value-numbering framework in its cheapest
mode and ignoring unreachable blocks when looking
for uninitialized uses.  To not ICE with -fdump-tree-all the
early uninit pass needs a dumpfile since VN tries to dump statistics.

For gimple-match.c at -O0 -g this causes a 2% increase in compile-time.

In theory all early diagnostic passes could benefit from a VN run but
that would require more refactoring that's not appropriate at this stage.
This patch addresses a GCC 12 diagnostic regression and also happens to
fix one XFAIL in gcc.dg/uninit-pr20644-O0.c

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?

Thanks,
Richard.

2022-02-04  Richard Biener  

PR tree-optimization/104373
* tree-ssa-sccvn.h (do_rpo_vn): New export exposing the
walk kind.
* tree-ssa-sccvn.cc (do_rpo_vn): Export, get the default
walk kind as argument.
(run_rpo_vn): Adjust.
(pass_fre::execute): Likewise.
* tree-ssa-uninit.cc (warn_uninitialized_vars): Skip
blocks not reachable.
(execute_late_warn_uninitialized): Mark all edges as
executable.
(execute_early_warn_uninitialized): Use VN to compute
executable edges.
(pass_data_early_warn_uninitialized): Enable a dump file,
change dump name to warn_uninit.

* g++.dg/warn/Wuninitialized-32.C: New testcase.
* gcc.dg/uninit-pr20644-O0.c: Remove XFAIL.
---
 gcc/testsuite/g++.dg/warn/Wuninitialized-32.C | 14 +++
 gcc/testsuite/gcc.dg/uninit-pr20644-O0.c  |  2 +-
 gcc/tree-ssa-sccvn.cc | 18 -
 gcc/tree-ssa-sccvn.h  |  1 +
 gcc/tree-ssa-uninit.cc| 39 ---
 5 files changed, 58 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wuninitialized-32.C

diff --git a/gcc/testsuite/g++.dg/warn/Wuninitialized-32.C 
b/gcc/testsuite/g++.dg/warn/Wuninitialized-32.C
new file mode 100644
index 000..8b02b5c6adb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wuninitialized-32.C
@@ -0,0 +1,14 @@
+// { dg-do compile }
+// { dg-additional-options "-Wall" }
+
+void* operator new[](unsigned long, void* __p);
+
+struct allocator
+{
+  ~allocator();
+};
+
+void *foo (void *p)
+{
+  return p ? new(p) allocator[1] : new allocator[1]; // { dg-bogus 
"uninitialized" }
+}
diff --git a/gcc/testsuite/gcc.dg/uninit-pr20644-O0.c 
b/gcc/testsuite/gcc.dg/uninit-pr20644-O0.c
index 8ae697abcdf..a335d8cb4eb 100644
--- a/gcc/testsuite/gcc.dg/uninit-pr20644-O0.c
+++ b/gcc/testsuite/gcc.dg/uninit-pr20644-O0.c
@@ -7,7 +7,7 @@ int foo ()
   int j;
 
   if (1 == i)
-return j; /* { dg-bogus "uninitialized" "uninitialized" { xfail *-*-* } } 
*/
+return j; /* { dg-bogus "uninitialized" "uninitialized" } */
 
   return 0;
 }
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index a03f0aae924..eb17549c185 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -7034,15 +7034,14 @@ eliminate_with_rpo_vn (bitmap inserted_exprs)
   return walker.eliminate_cleanup ();
 }
 
-static unsigned
+unsigned
 do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
-  bool iterate, bool eliminate);
+  bool iterate, bool eliminate, vn_lookup_kind kind);
 
 void
 run_rpo_vn (vn_lookup_kind kind)
 {
-  default_vn_walk_kind = kind;
-  do_rpo_vn (cfun, NULL, NULL, true, false);
+  do_rpo_vn (cfun, NULL, NULL, true, false, kind);
 
   /* ???  Prune requirement of these.  */
   constant_to_value_id = new hash_table (23);
@@ -7740,11 +7739,12 @@ do_unwind (unwind_state *to, rpo_elim )
executed and iterate.  If ELIMINATE is true then perform
elimination, otherwise leave that to the caller.  */
 
-static unsigned
+unsigned
 do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
-  bool iterate, bool eliminate)
+  bool iterate, bool eliminate, vn_lookup_kind kind)
 {
   unsigned todo = 0;
+  default_vn_walk_kind = kind;
 
   /* We currently do not support region-based iteration when
  elimination is requested.  */
@@ -8164,8 +8164,7 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
 unsigned
 do_rpo_vn (function *fn, edge entry, bitmap exit_bbs)
 {
-  default_vn_walk_kind = VN_WALKREWRITE;
-  unsigned todo = do_rpo_vn (fn, entry, exit_bbs, false, true);
+  unsigned todo = do_rpo_vn (fn, entry, exit_bbs, false, true, VN_WALKREWRITE);
   free_rpo_vn ();
   return todo;
 }
@@ -8221,8 +8220,7 @@ pass_fre::execute (function *fun)
   if (iterate_p)
 loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  default_vn_walk_kind = VN_WALKREWRITE;
-  todo = do_rpo_vn (fun, NULL, NULL, iterate_p, true);
+  todo = do_rpo_vn (fun, NULL, NULL, iterate_p, true, VN_WALKREWRITE);
   free_rpo_vn ();
 
   if (iterate_p)
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index f161b80417b..aeed07039b7 100644
--- a/gcc/tree-ssa-sccvn.h
+++

[PATCH] doc: invoke: RISC-V: Clean up the -mstrict-align wording

2022-02-07 Thread Palmer Dabbelt

The polarity of do/do not was reversed for this option when compored to
the rest of them.  This seems to have been copied from PowerPC, when the
polarity of the arguments in the docs was reversed (presumably to match
the default), but appears to have never made sense on RISC-V.

gcc/ChangeLog

* doc/invoke.texi (RISC-V -mstrict-align): Re-word the do/do not
language.

Signed-off-by: Palmer Dabbelt 
---
 gcc/doc/invoke.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0ebe538ccdc..5e8af05e359 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -27702,7 +27702,7 @@ integer load/stores only.
 @item -mstrict-align
 @itemx -mno-strict-align
 @opindex mstrict-align
-Do not or do generate unaligned memory accesses.  The default is set depending
+Do or do not generate unaligned memory accesses.  The default is set depending
 on whether the processor we are optimizing for supports fast unaligned access
 or not.
 
-- 
2.34.1

[PATCH] doc: invoke: RISC-V: Clean up the -memit-attribute wording

2022-02-07 Thread Palmer Dabbelt

The previous wording makes it sound like "do not emit" is a
clarification of "emit", which is not accurate.  This cleans up the
wording to match (most) of the other arguments.

gcc/ChangeLog

* doc/invoke.texi (RISC-V -memit-attribute): Re-word the do/do
not language.

Signed-off-by: Palmer Dabbelt 
---
 gcc/doc/invoke.texi | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8bd5293fce7..0ebe538ccdc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -26294,12 +26294,11 @@ assembler and the linker alone without help from the 
compiler.
 @itemx -mno-mcount-ra-address
 @opindex mmcount-ra-address
 @opindex mno-mcount-ra-address
-Emit (do not emit) code that allows @code{_mcount} to modify the
-calling function's return address.  When enabled, this option extends
-the usual @code{_mcount} interface with a new @var{ra-address}
-parameter, which has type @code{intptr_t *} and is passed in register
-@code{$12}.  @code{_mcount} can then modify the return address by
-doing both of the following:
+Do or do not emit code that allows @code{_mcount} to modify the calling
+function's return address.  When enabled, this option extends the usual
+@code{_mcount} interface with a new @var{ra-address} parameter, which has type
+@code{intptr_t *} and is passed in register @code{$12}.  @code{_mcount} can
+then modify the return address by doing both of the following:
 @itemize
 @item
 Returning the new address in register @code{$31}.
-- 
2.34.1

Re: [PATCH] rs6000: Add support for vmsumcud and vec_msumc

2022-02-07 Thread Bill Schmidt via Gcc-patches

Hi!

On 2/7/22 5:05 PM, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Feb 07, 2022 at 04:20:24PM -0600, Bill Schmidt wrote:
>> I observed recently that a couple of Power10 instructions and built-in 
>> functions
>> were somehow not implemented.  This patch adds one of them (vmsumcud).  
>> Although
>> this isn't normally stage-4 material, this is really simple and carries no
>> discernible risk, so I hope it can be considered.
> But what is the advantage?  That will be very tiny as well, afaics?
>
> Ah, this implements a builtin as well.  But that builtin is not in the
> PVIPR, so no one yet uses it most likely?

It's in the yet unpublished version of PVIPR that adds ISA 3.1 support,
currently awaiting public review.  It should have been implemented with
the rest of the ISA 3.1 built-ins.  (There are two more that were missed
as well, which I haven't yet addressed.)

>> gcc/
>>  * config/rs6000/rs6000-builtins.def (VMSUMCUD): New.
>>  * config/rs6000/rs6000-overload.def (VEC_MSUMC): New.
>>  * config/rs6000/vsx.md (UNSPEC_VMSUMCUD): New constant.
>>  (vmsumcud): New define_insn.
>>
>> gcc/testsuite/
>>  * gcc.target/powerpc/vec-msumc.c: New test.
>> +;; vmsumcud
>> +(define_insn "vmsumcud"
>> +[(set (match_operand:V1TI 0 "register_operand" "+v")
>> +  (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
>> +(match_operand:V2DI 2 "register_operand" "v")
>> +(match_operand:V1TI 3 "register_operand" "v")]
>> +   UNSPEC_VMSUMCUD))]
>> +  "TARGET_POWER10"
>> +  "vmsumcud %0,%1,%2,%3"
>> +  [(set_attr "type" "vecsimple")]
>> +)
> This can be properly described in RTL instead of using an unspec.  This
> is much preferable.  I would say compare to maddhd[u], but those insns
> aren't implemented either (maddld is though).

Is it?  Note that vmsumcud produces the carry out of the final
result, not the result itself.  I couldn't immediately see how
to express this in RTL.

The full operation multiplies the corresponding lanes of each
doubleword of arguments 1 and 2, adds them together with the
128-bit value in argument 3, and produces the carry out of the
result as a 128-bit value in the result.  I think I'd need to
have a 256-bit mode to express this properly in RTL, right?

Thanks,
Bill

>
>
> Segher

[PATCH 5/4] tree-optimization/104288 - An alternative approach

2022-02-07 Thread Andrew MacLeod via Gcc-patches


On 2/7/22 09:29, Andrew MacLeod wrote:

I have a proposal for PR 104288.

ALL patches (in sequence) bootstrap on their own and each cause no 
regressions.


I've been continuing to work with this towards the GCC13 solution for 
statement side effects.  And There is another possibility we could 
pursue which is less intrusive


I adapted the portions of patch 2/4 which process nonnull, but changes 
it from being in the nonnull class to being in the cache.


THe main difference is it continues to use a single bit, just changing 
all uses of it to *always* mean its non-null on exit to the block.


Range-on-entry is changed to only check dominator blocks, and 
range_of_expr doesn't check the bit at all.. in both ranger and the cache.


When we are doing the block walk and process side effects, the on-entry 
*cache* range for the block is adjusted to be non-null instead.   The 
statements which precede this stmt have already been processed, and all 
remaining statements in this block will now pick up this new non-value 
from the on-entry cache.  This should be perfectly safe.


The final tweak is when range_of_expr is looking the def block, it 
normally does not have an on entry cache value.  (the def is in the 
block, so there is no entry cache value).  It now checks to see if we 
have set one, which can only happen when we have been doing the 
statement walk and set the on-entry cache with  a non-null value.  This 
allows us to pick up the non-null range in the def block too... (once we 
have passed a statement which set nonnull of course).


THis patch provides much less change to trunk, and is probably a better 
approach as its closer to what is going to happen in GCC13.


Im still working with it, so the patch isnt fully refined yet... but it 
bootstraps and passes all test with no regressions.  And passes the new 
tests too.   I figured I would mention this before you look to hard at 
the other patchset.    the GCC11 patch doesn't change.


Let me know if you prefer this I think I do :-)  less churn, and 
closer to end state.


Andrew





commit da6185993cf85f38b08e4c2801ca8e5fe914e40d
Author: Andrew MacLeod 
Date:   Mon Feb 7 15:52:16 2022 -0500

Change nonnull to mean nonnull at exit.

* gimple-range-cache.cc (non_null_ref::set_nonnull): New.
(ranger_cache::range_of_def): Don't check non-null.
(ranger_cache::entry_range): Don't check non-null.
(ranger_cache::try_update_to_always_nonnull): New.
(ranger_cache::infer_nonnull_call_attributes): New.
(non_null_loadstore): New.
(ranger_cache::block_apply_nonnull): New.
* ranger-cache.h (class non_null_ref): Update prototypes.
(class ranger_cache): Update prototypes.
* gimple-range-path.cc (path_range_query::range_defined_in_block): Do
not search dominators.
(path_range_query::adjust_for_non_null_uses): Ditto.
* gimple-range.cc (gimple_ranger::range_of_expr): Check on-entry for
def overrides.  Do not check nonnull.
(gimple_ranger::range_on_entry): Check dominators for nonnull.
(gimple_ranger::range_on_exit): Check for nonnull.
* gimple-range.cc (gimple_ranger::register_side_effects): New.
* gimple-range.h (gimple_ranger::register_side_effects): New.
* tree-vrp.cc (rvrp_folder::fold_stmt): Call register_side_effects.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 16c881b13e1..c6aefb98b42 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -29,6 +29,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-pretty-print.h"
 #include "gimple-range.h"
 #include "tree-cfg.h"
+#include "target.h"
+#include "attribs.h"
+#include "gimple-iterator.h"
+#include "gimple-walk.h"
 
 #define DEBUG_RANGE_CACHE (dump_file	\
 			   && (param_ranger_debug & RANGER_DEBUG_CACHE))
@@ -50,6 +54,22 @@ non_null_ref::~non_null_ref ()
   m_nn.release ();
 }
 
+// This routine will update NAME in BB to be nonnull if it is not already.
+// return TRUE if the update happens.,
+
+bool
+non_null_ref::set_nonnull (basic_block bb, tree name)
+{
+  gcc_checking_assert (gimple_range_ssa_p (name)
+		   && POINTER_TYPE_P (TREE_TYPE (name)));
+  // Only process when its not already set.
+  if (non_null_deref_p  (name, bb, false))
+return false;
+  // From this point forward, we'll be NON-NULL for sure if we see one.
+  bitmap_set_bit (m_nn[SSA_NAME_VERSION (name)], bb->index);
+  return true;
+}
+
 // Return true if NAME has a non-null dereference in block bb.  If this is the
 // first query for NAME, calculate the summary first.
 // If SEARCH_DOM is true, the search the dominator tree as well.
@@ -1014,9 +1034,6 @@ ranger_cache::range_of_def (irange , tree name, basic_block bb)
   else
 	r = gimple_range_global (name);
 }
-
-  if (bb)
-m_non_null.adjust_range (r, name,

[PATCH v1] RISC-V: Add support for inlining subword atomic operations

2022-02-07 Thread Patrick O'Neill

RISC-V has no support for subword atomic operations; code currently
generates libatomic library calls.

This patch changes the default behavior to inline subword atomic calls 
(using the same logic as the existing library call).
Behavior can be specified using the -minline-atomics and
-mno-inline-atomics command line flags.

gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm.
This will need to stay for backwards compatibility and the
-mno-inline-atomics flag.

2022-02-07 Patrick O'Neill 

PR target/104338
* riscv.opt: Add command-line flag.
* sync.md (atomic_fetch_): logic for 
expanding subword atomic operations.
* sync.md (subword_atomic_fetch_strong_): LR/SC
block for performing atomic operation
* atomic.c: Add reference to duplicate logic.
* inline-atomics-1.c: New test.
* inline-atomics-2.c: Likewise.
* inline-atomics-3.c: Likewise.
* inline-atomics-4.c: Likewise.
* inline-atomics-5.c: Likewise.
* inline-atomics-6.c: Likewise.
* inline-atomics-7.c: Likewise.
* inline-atomics-8.c: Likewise.
* inline-atomics-9.c: Likewise.

Signed-off-by: Patrick O'Neill 
---
There may be further concerns about the memory consistency of these 
operations, but this patch focuses on simply moving the logic inline.
Those concerns can be addressed in a future patch.
---
 gcc/config/riscv/riscv.opt|   4 +
 gcc/config/riscv/sync.md  |  96 +++
 .../gcc.target/riscv/inline-atomics-1.c   |  11 +
 .../gcc.target/riscv/inline-atomics-2.c   |  12 +
 .../gcc.target/riscv/inline-atomics-3.c   | 569 ++
 .../gcc.target/riscv/inline-atomics-4.c   | 566 +
 .../gcc.target/riscv/inline-atomics-5.c   |  13 +
 .../gcc.target/riscv/inline-atomics-6.c   |  12 +
 .../gcc.target/riscv/inline-atomics-7.c   |  12 +
 .../gcc.target/riscv/inline-atomics-8.c   |  17 +
 .../gcc.target/riscv/inline-atomics-9.c   |  17 +
 libgcc/config/riscv/atomic.c  |   2 +
 12 files changed, 1331 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-9.c

diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index e294e223151..fb702317233 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -211,3 +211,7 @@ Enum(isa_spec_class) String(20191213) 
Value(ISA_SPEC_CLASS_20191213)
 misa-spec=
 Target RejectNegative Joined Enum(isa_spec_class) Var(riscv_isa_spec) 
Init(TARGET_DEFAULT_ISA_SPEC)
 Set the version of RISC-V ISA spec.
+
+minline-atomics
+Target Bool Var(ALWAYS_INLINE_SUBWORD_ATOMIC) Init(-1)
+Always inline subword atomic operations.
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 747a799e237..e19b4157d3c 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -92,6 +92,102 @@
   "%F3amo.%A3 %0,%z2,%1"
   [(set (attr "length") (const_int 8))])
 
+(define_expand "atomic_fetch_"
+  [(set (match_operand:SHORT 0 "register_operand" "=") ;; old 
value at mem
+   (match_operand:SHORT 1 "memory_operand" "+A"));; mem 
location
+   (set (match_dup 1)
+   (unspec_volatile:SHORT
+ [(any_atomic:SHORT (match_dup 1)
+(match_operand:SHORT 2 "reg_or_0_operand" "rJ")) ;; value 
for op
+  (match_operand:SI 3 "const_int_operand")]  ;; model
+UNSPEC_SYNC_OLD_OP))]
+  "TARGET_ATOMIC && ALWAYS_INLINE_SUBWORD_ATOMIC"
+{
+  /* We have no QImode/HImode atomics, so form a mask, then use
+ subword_atomic_fetch_strong_ to implement a LR/SC version of the
+ operation. */
+
+  /* Logic duplicated in gcc/libgcc/config/riscv/atomic.c for use when inlining
+ is disabled */
+
+  rtx old = gen_reg_rtx (SImode);
+  rtx mem = operands[1];
+  rtx value = operands[2];
+  rtx mask = gen_reg_rtx (SImode);
+  rtx notmask = gen_reg_rtx (SImode);
+
+  rtx addr = force_reg (Pmode, XEXP (mem, 0));
+
+  rtx aligned_addr = gen_reg_rtx (Pmode);
+  emit_move_insn (aligned_addr,  gen_rtx_AND (Pmode, addr,
+ gen_int_mode (-4, Pmode)));
+
+  rtx aligned_mem = change_address (mem, SImode, aligned_addr);
+
+  rtx shift = gen_reg_rtx (SImode);
+  emit_move_insn (shift, gen_rtx_AND (SImode, gen_lowpart (SImode, addr),
+

Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth

2022-02-07 Thread Joseph Myers

On Fri, 4 Feb 2022, Richard Biener via Gcc-patches wrote:

> This adds explicit variable birth CLOBBERs in an attempt to fix
> PR90348 and duplicates.  The birth / death CLOBBER pairs are
> used to compute liveness and conflicts for stack variable
> coalescing where the lack of an explicit birth but instead
> use of first mention causes misrepresentation of variable life
> ranges when optimization moves the first mention upwards the
> original birth point at the variables bind expression start.

I'm not clear on exactly where you consider a variable to be born, but 
note that non-VLA C variables have a lifetime that starts at the beginning 
of their block, not at their declaration: it's valid to jump backward from 
after the declaration to before it and then access the variable via a 
pointer, or jump forward again over the declaration and access it by name.  
(Most programs of course don't do that sort of thing.)

-- 
Joseph S. Myers
jos...@codesourcery.com

[committed] analyzer: fix ICE on realloc of non-heap [PR104417]

2022-02-07 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-7087-g0c04ac0e15830b.

gcc/analyzer/ChangeLog:
PR analyzer/104417
* sm-taint.cc (tainted_allocation_size::tainted_allocation_size):
Remove overzealous assertion.
(tainted_allocation_size::emit): Likewise.
(region_model::check_dynamic_size_for_taint): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/104417
* gcc.dg/analyzer/pr104417.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/sm-taint.cc | 9 -
 gcc/testsuite/gcc.dg/analyzer/pr104417.c | 7 +++
 2 files changed, 7 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr104417.c

diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc
index 573a9124286..57a0ac684ab 100644
--- a/gcc/analyzer/sm-taint.cc
+++ b/gcc/analyzer/sm-taint.cc
@@ -496,9 +496,6 @@ public:
   : taint_diagnostic (sm, arg, has_bounds),
 m_mem_space (mem_space)
   {
-gcc_assert (mem_space == MEMSPACE_STACK
-   || mem_space == MEMSPACE_HEAP
-   || mem_space == MEMSPACE_UNKNOWN);
   }
 
   const char *get_kind () const FINAL OVERRIDE
@@ -511,9 +508,6 @@ public:
 diagnostic_metadata m;
 /* "CWE-789: Memory Allocation with Excessive Size Value".  */
 m.add_cwe (789);
-gcc_assert (m_mem_space == MEMSPACE_STACK
-   || m_mem_space == MEMSPACE_HEAP
-   || m_mem_space == MEMSPACE_UNKNOWN);
 // TODO: make use of m_mem_space
 if (m_arg)
   switch (m_has_bounds)
@@ -1055,9 +1049,6 @@ region_model::check_dynamic_size_for_taint (enum 
memory_space mem_space,
const svalue *size_in_bytes,
region_model_context *ctxt) const
 {
-  gcc_assert (mem_space == MEMSPACE_STACK
- || mem_space == MEMSPACE_HEAP
- || mem_space == MEMSPACE_UNKNOWN);
   gcc_assert (size_in_bytes);
   gcc_assert (ctxt);
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr104417.c 
b/gcc/testsuite/gcc.dg/analyzer/pr104417.c
new file mode 100644
index 000..41ea5dbd0c7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr104417.c
@@ -0,0 +1,7 @@
+char eparr[1];
+
+void *
+warn_realloc_extern_ptrarr_offset (int i, int n)
+{
+  return __builtin_realloc (eparr + i, n); /* { dg-warning 
"'__builtin_realloc' called on unallocated object 'eparr'" } */
+}
-- 
2.26.3

[committed] analyzer: fixes to memcpy [PR103872]

2022-02-07 Thread David Malcolm via Gcc-patches

PR analyzer/103872 reports a failure of gcc.dg/analyzer/pr103526.c on
riscv64-unknown-elf-gcc.  The issue is that I wrote the test on x86_64
where a memcpy in the test is optimized to a write to a read/write pair,
whereas due to alignment differences the analyzer can see it as a
memcpy call, revealing problems with the analyzer's implementation
of memcpy.

This patch reimplements region_model::impl_call_memcpy in terms of a
get_store_value followed by a set_value, fixing the issue.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-7086-g9d2c0fad59745b.

gcc/analyzer/ChangeLog:
PR analyzer/103872
* region-model-impl-calls.cc (region_model::impl_call_memcpy):
Reimplement in terms of a get_store_value followed by a set_value.

gcc/testsuite/ChangeLog:
PR analyzer/103872
* gcc.dg/analyzer/memcpy-1.c: Add alternate versions of test cases
in which the calls to memcpy are hidden from the optimizer.  Add
further test cases.
* gcc.dg/analyzer/taint-size-1.c: Add test coverage for memcpy
with tainted size.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-impl-calls.cc  |  28 ++---
 gcc/testsuite/gcc.dg/analyzer/memcpy-1.c | 125 +++
 gcc/testsuite/gcc.dg/analyzer/taint-size-1.c |   9 ++
 3 files changed, 148 insertions(+), 14 deletions(-)

diff --git a/gcc/analyzer/region-model-impl-calls.cc 
b/gcc/analyzer/region-model-impl-calls.cc
index e9592975332..95d9921c61d 100644
--- a/gcc/analyzer/region-model-impl-calls.cc
+++ b/gcc/analyzer/region-model-impl-calls.cc
@@ -491,29 +491,29 @@ region_model::impl_call_malloc (const call_details )
 }
 
 /* Handle the on_call_pre part of "memcpy" and "__builtin_memcpy".  */
+// TODO: complain about overlapping src and dest.
 
 void
 region_model::impl_call_memcpy (const call_details )
 {
-  const svalue *dest_sval = cd.get_arg_svalue (0);
+  const svalue *dest_ptr_sval = cd.get_arg_svalue (0);
+  const svalue *src_ptr_sval = cd.get_arg_svalue (1);
   const svalue *num_bytes_sval = cd.get_arg_svalue (2);
 
-  const region *dest_reg = deref_rvalue (dest_sval, cd.get_arg_tree (0),
+  const region *dest_reg = deref_rvalue (dest_ptr_sval, cd.get_arg_tree (0),
 cd.get_ctxt ());
+  const region *src_reg = deref_rvalue (src_ptr_sval, cd.get_arg_tree (1),
+   cd.get_ctxt ());
 
-  cd.maybe_set_lhs (dest_sval);
-
-  if (tree num_bytes = num_bytes_sval->maybe_get_constant ())
-{
-  /* "memcpy" of zero size is a no-op.  */
-  if (zerop (num_bytes))
-   return;
-}
-
-  check_region_for_write (dest_reg, cd.get_ctxt ());
+  cd.maybe_set_lhs (dest_ptr_sval);
 
-  /* Otherwise, mark region's contents as unknown.  */
-  mark_region_as_unknown (dest_reg, cd.get_uncertainty ());
+  const region *sized_src_reg
+= m_mgr->get_sized_region (src_reg, NULL_TREE, num_bytes_sval);
+  const region *sized_dest_reg
+= m_mgr->get_sized_region (dest_reg, NULL_TREE, num_bytes_sval);
+  const svalue *src_contents_sval
+= get_store_value (sized_src_reg, cd.get_ctxt ());
+  set_value (sized_dest_reg, src_contents_sval, cd.get_ctxt ());
 }
 
 /* Handle the on_call_pre part of "memset" and "__builtin_memset".  */
diff --git a/gcc/testsuite/gcc.dg/analyzer/memcpy-1.c 
b/gcc/testsuite/gcc.dg/analyzer/memcpy-1.c
index f120eac19b3..a9368d3307d 100644
--- a/gcc/testsuite/gcc.dg/analyzer/memcpy-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/memcpy-1.c
@@ -1,6 +1,16 @@
 #include 
 #include "analyzer-decls.h"
 
+/* Function for thwarting expansion of memcpy by optimizer.  */
+
+typedef void * (*memcpy_t) (void *dst, const void *src, size_t n);
+  
+static memcpy_t __attribute__((noinline))
+get_memcpy (void)
+{
+  return memcpy;
+}
+
 void *test_1 (void *dst, void *src, size_t n)
 {
   void *result = memcpy (dst, src, n);
@@ -15,6 +25,14 @@ void *test_1a (void *dst, void *src, size_t n)
   return result;
 }
 
+void *test_1b (void *dst, void *src, size_t n)
+{
+  memcpy_t fn = get_memcpy ();
+  void *result = fn (dst, src, n);
+  __analyzer_eval (result == dst); /* { dg-warning "TRUE" } */
+  return result;
+}
+
 void test_2 (int i)
 {
   int j;
@@ -29,6 +47,14 @@ void test_2a (int i)
   __analyzer_eval (i == j);  /* { dg-warning "TRUE" } */
 }
 
+void test_2b (int i)
+{
+  int j;
+  memcpy_t fn = get_memcpy ();
+  fn (, , sizeof (int));
+  __analyzer_eval (i == j); /* { dg-warning "TRUE" } */
+}
+
 void test_3 (void *src, size_t n)
 {
   char buf[40], other[40];
@@ -41,3 +67,102 @@ void test_3 (void *src, size_t n)
   __analyzer_eval (buf[0] == 'a');/* { dg-warning "UNKNOWN" } */
   __analyzer_eval (other[0] == 'b');  /* { dg-warning "TRUE" } */
 }
+
+void test_3b (void *src, size_t n)
+{
+  char buf[40], other[40];
+  memcpy_t fn = get_memcpy ();
+  buf[0] = 'a';
+  other[0] = 'b';
+  __analyzer_eval (buf[0] == 'a');/* { dg-warning "TRUE" } */
+  __analyzer_eval (other[0] ==

Re: [PATCH] rs6000: Add support for vmsumcud and vec_msumc

2022-02-07 Thread Segher Boessenkool

Hi!

On Mon, Feb 07, 2022 at 04:20:24PM -0600, Bill Schmidt wrote:
> I observed recently that a couple of Power10 instructions and built-in 
> functions
> were somehow not implemented.  This patch adds one of them (vmsumcud).  
> Although
> this isn't normally stage-4 material, this is really simple and carries no
> discernible risk, so I hope it can be considered.

But what is the advantage?  That will be very tiny as well, afaics?

Ah, this implements a builtin as well.  But that builtin is not in the
PVIPR, so no one yet uses it most likely?

> gcc/
>   * config/rs6000/rs6000-builtins.def (VMSUMCUD): New.
>   * config/rs6000/rs6000-overload.def (VEC_MSUMC): New.
>   * config/rs6000/vsx.md (UNSPEC_VMSUMCUD): New constant.
>   (vmsumcud): New define_insn.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/vec-msumc.c: New test.

> +;; vmsumcud
> +(define_insn "vmsumcud"
> +[(set (match_operand:V1TI 0 "register_operand" "+v")
> +  (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
> +(match_operand:V2DI 2 "register_operand" "v")
> + (match_operand:V1TI 3 "register_operand" "v")]
> +UNSPEC_VMSUMCUD))]
> +  "TARGET_POWER10"
> +  "vmsumcud %0,%1,%2,%3"
> +  [(set_attr "type" "vecsimple")]
> +)

This can be properly described in RTL instead of using an unspec.  This
is much preferable.  I would say compare to maddhd[u], but those insns
aren't implemented either (maddld is though).


Segher

[PATCH] rs6000: Add support for vmsumcud and vec_msumc

2022-02-07 Thread Bill Schmidt via Gcc-patches

Hi!

I observed recently that a couple of Power10 instructions and built-in functions
were somehow not implemented.  This patch adds one of them (vmsumcud).  Although
this isn't normally stage-4 material, this is really simple and carries no
discernible risk, so I hope it can be considered.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill


2022-02-07  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (VMSUMCUD): New.
* config/rs6000/rs6000-overload.def (VEC_MSUMC): New.
* config/rs6000/vsx.md (UNSPEC_VMSUMCUD): New constant.
(vmsumcud): New define_insn.

gcc/testsuite/
* gcc.target/powerpc/vec-msumc.c: New test.
---
 gcc/config/rs6000/rs6000-builtins.def|  3 ++
 gcc/config/rs6000/rs6000-overload.def|  4 ++
 gcc/config/rs6000/vsx.md | 13 +++
 gcc/testsuite/gcc.target/powerpc/vec-msumc.c | 39 
 4 files changed, 59 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-msumc.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index d0ea54d77e4..846c0bafd45 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -3497,6 +3497,9 @@
   const signed int __builtin_altivec_vstrihr_p (vss);
 VSTRIHR_P vstrir_p_v8hi {}
 
+  const vuq __builtin_vsx_vmsumcud (vull, vull, vuq);
+VMSUMCUD vmsumcud {}
+
   const signed int __builtin_vsx_xvtlsbb_all_ones (vsc);
 XVTLSBB_ONES xvtlsbbo {}
 
diff --git a/gcc/config/rs6000/rs6000-overload.def 
b/gcc/config/rs6000/rs6000-overload.def
index 5e38d597722..44e2945aaa0 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -2456,6 +2456,10 @@
   vuq __builtin_vec_msum (vull, vull, vuq);
 VMSUMUDM  VMSUMUDM_U
 
+[VEC_MSUMC, vec_msumc, __builtin_vec_msumc]
+  vuq __builtin_vec_msumc (vull, vull, vuq);
+VMSUMCUD
+
 [VEC_MSUMS, vec_msums, __builtin_vec_msums]
   vui __builtin_vec_msums (vus, vus, vui);
 VMSUMUHS
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 88053f11e29..e4904102526 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -372,6 +372,7 @@ (define_c_enum "unspec"
UNSPEC_REPLACE_UN
UNSPEC_VDIVES
UNSPEC_VDIVEU
+   UNSPEC_VMSUMCUD
UNSPEC_XXEVAL
UNSPEC_XXSPLTIW
UNSPEC_XXSPLTIDP
@@ -6615,3 +6616,15 @@ (define_split
   emit_move_insn (operands[0], tmp4);
   DONE;
 })
+
+;; vmsumcud
+(define_insn "vmsumcud"
+[(set (match_operand:V1TI 0 "register_operand" "+v")
+  (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
+(match_operand:V2DI 2 "register_operand" "v")
+   (match_operand:V1TI 3 "register_operand" "v")]
+  UNSPEC_VMSUMCUD))]
+  "TARGET_POWER10"
+  "vmsumcud %0,%1,%2,%3"
+  [(set_attr "type" "vecsimple")]
+)
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-msumc.c 
b/gcc/testsuite/gcc.target/powerpc/vec-msumc.c
new file mode 100644
index 000..524a2225c6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-msumc.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+#include 
+
+#define DEBUG 0
+
+#if DEBUG
+#include 
+#endif
+
+extern void abort (void);
+
+int
+main ()
+{
+  vector unsigned long long arg1, arg2;
+  vector unsigned __int128 arg3, result, expected;
+  unsigned __int128 c = (unsigned __int128) (-1); /* 2^128 - 1 */
+
+  arg1 = (vector unsigned long long) { 111ULL, 300ULL };
+  arg2 = (vector unsigned long long) { 700ULL, 222ULL };
+  arg3 = (vector unsigned __int128) { c };
+  expected = (vector unsigned __int128) { 1 };
+
+  result = vec_msumc (arg1, arg2, arg3);
+  if (result[0] != expected[0])
+{
+#if DEBUG
+  printf ("ERROR, expected %d, result %d\n",
+ (unsigned int) expected[0],
+ (unsigned int) result[0]);
+#else
+  abort ();
+#endif
+}
+
+  return 0;
+}
-- 
2.27.0

Re: [PATCH] RISC-V: Enable TARGET_SUPPORTS_WIDE_INT

2022-02-07 Thread Vineet Gupta





On 2/7/22 10:58, Palmer Dabbelt wrote:

On Mon, 07 Feb 2022 09:41:10 PST (-0800), Vineet Gupta wrote:



On 2/7/22 01:28, Philipp Tomsich wrote:

Vineet,

On Mon, 7 Feb 2022 at 07:06, Vineet Gupta  wrote:

This is at par with other major arches such as aarch64, i386, s390 ...

No testsuite regressions: same numbers w/ w/o

Putting that check seems a good idea, but I haven't seen any cases
related to this get through anyway.
Do you have seen any instances where the backend got this wrong? If
so, please share, so we can run a fuller regression and see any
performance impact.


No, there were no failures which this fixes. Seems like other arches did
this back in 2015.
When aarch64 did similar change, commit 2ca5b4303bd5, a directed generic
test pr68129_1.c was added which doesn't fail before/after.


The only offending MD pattern we had was for for constant 0, which 
IIUC should be a const_int now (and has been for some time) so 
shouldn't even have been matching anything.  I was worried about the 
fcvt-based moves on rv32, but my trivial test indicates those still 
work fine


   double foo(void)
   {
   return 0;
   }
      foo:
   fcvt.d.w    fa0,x0
   ret

so I'm assuming they're coming in through const_int as well. Probably 
worth a full rv32 testsuite run, but as far as I can tell we were 
essentially TARGET_SUPPORTS_WIDE_INT clean already so this should be 
pretty safe.


Ok I'll go off and run the rv32 suite just to be safe.



Unfortunately the patch isn't trivially applying on trunk: it's 
targeting the wrong files and is showing some whitespace issues 
(though those may have been a result of me attempting to clean stuff 
up).  I assuming that means that the tests weren't run on trunk, though.


I tested both gcc 11 and trunk. Both were clean. My bad that I posted 
the patch off of internal gcc 11 tree.




I put a cleaned up version over here 
 
in case that helps anyone.  I haven't run the regressions, but otherwise


Reviewed-by: Palmer Dabbelt 

LMK if you want me to run the test suite. 


I'd be great if you can verify. I'll go off and setup a rc32 test setup 
as well.


IIUC we're still a bit away from the GCC 12 branch, and given this 
doesn't fix any manifestable bugs it should be held for 13.


Sure thing.

Thx,
-Vineet

Re: [Patch, fortran] PR37336 (Finalization) - [F03] Finish derived-type finalization

2022-02-07 Thread Harald Anlauf via Gcc-patches


Hi Paul,

thanks for attacking this.

I haven't looked at the actual patch, only tried to check the new
testcases with other compilers.

Am 03.02.22 um 18:14 schrieb Paul Richard Thomas via Fortran:

I have tried to interpret F2018 7.5.6.2 and 7.5.6.3 as well as possible.
This is not always straightforward and has involved a lot of head
scratching! I have used the Intel compiler as a litmus test for the
outcomes. This was largely motivated by the observation that, in the user
survey conducted by Steve Lionel, gfortran and ifort are often used
together . Therefore, quite aside from wishing to comply with the standard
as far as possible, it is more than reasonable that the two compilers
comply. On application of this patch, only exception to this is the
treatment of finalization of arrays of extended types, where the Intel
takes "If the entity is of extended type and the parent type is
finalizable, the parent component is finalized" such that the parent
component is finalized one element at a time, whereas gfortran finalises
the parent components as an array. I strongly suspect that, from reading
7.5.6.2 paragraphs 2 and 3 closely, that ifort has it right. However, this
is another issue to come back to in the future.


Could you specify which version of Intel you tried?

Testcase finalize_38.f90 fails for me with ifort 2021.5.0 with:

131

This test also fails with crayftn 11 & 12 and nagfor 7.0,
but in a different place.

(Also finalize_45.f90 fails with that version with something that
looks like memory corruption, but that might be just a compiler bug.)

Thanks,
Harald

Re: [PATCH] libstdc++: Decouple HAVE_FCNTL_H from HAVE_DIRENT_H check

2022-02-07 Thread Jonathan Wakely via Gcc-patches

On Mon, 7 Feb 2022 at 21:01, Jonathan Wakely  wrote:

>
>
> On Mon, 7 Feb 2022 at 20:12, Dimitar Dimitrov  wrote:
>
>> On PRU target with newlib, we have the following combination in config.h:
>>   /* #undef HAVE_DIRENT_H */
>>   #define HAVE_FCNTL_H 1
>>   #define HAVE_UNLINKAT 1
>>
>> In newlib, targets which do not define dirent.h, get a build error when
>> including :
>>
>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/include/sys/dirent.h;hb=HEAD
>>
>> While fs_dir.cc correctly checks for HAVE_FCNTL_H, dir-common.h doesn't,
>> and instead uses HAVE_DIRENT_H. This results in unlinkat() function call
>> in fs_dir.cc without the needed  include in dir-common.h. Thus
>> a build failure:
>>   .../gcc/libstdc++-v3/src/c++17/fs_dir.cc:151:11: error: ‘::unlinkat’
>> has not been declared; did you mean ‘unlink’?
>>
>> Fix by encapsulating  include with the correct check.
>>
>
> But there's no point doing anything in that file if we don't have
> , the whole thing is unusable. There's no point making the
> members using unlinkat compile if you can't ever construct the type.
>
> So I think we want a different fix.
>


Maybe something like:

--- a/libstdc++-v3/src/filesystem/dir-common.h
+++ b/libstdc++-v3/src/filesystem/dir-common.h
@@ -70,6 +70,8 @@ struct DIR { };
inline DIR* opendir(const char*) { return nullptr; }
inline dirent* readdir(DIR*) { return nullptr; }
inline int closedir(DIR*) { return -1; }
+#undef _GLIBCXX_HAVE_DIRFD
+#undef _GLIBCXX_HAVE_UNLINKAT
#endif
} // namespace __gnu_posix

Or adding wrappers for those in namespace posix.

Re: [PATCH] libstdc++: Decouple HAVE_FCNTL_H from HAVE_DIRENT_H check

2022-02-07 Thread Jonathan Wakely via Gcc-patches

On Mon, 7 Feb 2022 at 20:12, Dimitar Dimitrov  wrote:

> On PRU target with newlib, we have the following combination in config.h:
>   /* #undef HAVE_DIRENT_H */
>   #define HAVE_FCNTL_H 1
>   #define HAVE_UNLINKAT 1
>
> In newlib, targets which do not define dirent.h, get a build error when
> including :
>
> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/include/sys/dirent.h;hb=HEAD
>
> While fs_dir.cc correctly checks for HAVE_FCNTL_H, dir-common.h doesn't,
> and instead uses HAVE_DIRENT_H. This results in unlinkat() function call
> in fs_dir.cc without the needed  include in dir-common.h. Thus
> a build failure:
>   .../gcc/libstdc++-v3/src/c++17/fs_dir.cc:151:11: error: ‘::unlinkat’ has
> not been declared; did you mean ‘unlink’?
>
> Fix by encapsulating  include with the correct check.
>

But there's no point doing anything in that file if we don't have
, the whole thing is unusable. There's no point making the
members using unlinkat compile if you can't ever construct the type.

So I think we want a different fix.


> Regtested x86_64-pc-linux-gnu and no new failures detected.
>
> Fixes commit r12-7062-gebf61754647689 (libstdc++: Fix
> filesystem::remove_all races).
>
> libstdc++-v3/ChangeLog:
>
> * src/filesystem/dir-common.h (_GLIBCXX_HAVE_FCNTL_H): Move the
> check outside the HAVE_DIRENT_H check.
>
> Signed-off-by: Dimitar Dimitrov 
> ---
>  libstdc++-v3/src/filesystem/dir-common.h | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/libstdc++-v3/src/filesystem/dir-common.h
> b/libstdc++-v3/src/filesystem/dir-common.h
> index 0b7665a3f70..cfce4fae9a4 100644
> --- a/libstdc++-v3/src/filesystem/dir-common.h
> +++ b/libstdc++-v3/src/filesystem/dir-common.h
> @@ -35,10 +35,11 @@
>  #  include 
>  # endif
>  # include  // opendir, readdir, fdopendir, dirfd
> -# ifdef _GLIBCXX_HAVE_FCNTL_H
> -#  include   // open, openat, fcntl, AT_FDCWD, O_NOFOLLOW etc.
> -#  include  // close, unlinkat
> -# endif
> +#endif
> +
> +#ifdef _GLIBCXX_HAVE_FCNTL_H
> +# include   // open, openat, fcntl, AT_FDCWD, O_NOFOLLOW etc.
> +# include  // close, unlinkat
>  #endif
>
>  namespace std _GLIBCXX_VISIBILITY(default)
> --
> 2.34.1
>
>

Re: [PATCH] c++: satisfaction value of type const bool [PR104410]

2022-02-07 Thread Jason Merrill via Gcc-patches


On 2/7/22 09:57, Patrick Palka wrote:

Here constant evaluation of the atomic constraint use_func_v with T=int
sensibly yields an INTEGER_CST of type const bool, but the assert in
satisfaction_value expects unqualified bool.  Let's just relax the
assert to accept cv-qualified bool more generally.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and 11?


OK for both.


PR c++/104410

gcc/cp/ChangeLog:

* constraint.cc (satisfaction_value): Relax assert to accept
cv-qualified bool.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-pr104410.C: New test.
---
  gcc/cp/constraint.cc   | 3 ++-
  gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C | 6 ++
  2 files changed, 8 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index cc4df4216ef..06ebbd1d19e 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2818,7 +2818,8 @@ satisfaction_value (tree t)
  return t;
  
gcc_assert (TREE_CODE (t) == INTEGER_CST

- && same_type_p (TREE_TYPE (t), boolean_type_node));
+ && same_type_ignoring_top_level_qualifiers_p (TREE_TYPE (t),
+   boolean_type_node));
if (integer_zerop (t))
  return boolean_false_node;
else
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C
new file mode 100644
index 000..dac08e10a0f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C
@@ -0,0 +1,6 @@
+// PR c++/104410
+// { dg-do compile { target c++20 } }
+
+template constexpr bool use_func_v{};
+template void f() requires use_func_v || true { }
+template void f();

Re: [PATCH] c++: deducing only from noexcept-spec [PR80951]

2022-02-07 Thread Jason Merrill via Gcc-patches


On 2/7/22 09:57, Patrick Palka wrote:

The fail-fast predicate uses_deducible_template_parms used by
unify_one_argument is neglecting to look inside the noexcept-spec of a
function type, and this causes deduction to fail for the below testcase
since the noexcept-spec is the only part of the type which contains a
deducible template parameter.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


PR c++/80951

gcc/cp/ChangeLog:

* pt.cc (uses_deducible_template_parms): Consider the
noexcept-spec of a function type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/noexcept-type25.C: New test.
---
  gcc/cp/pt.cc |  5 +
  gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C | 13 +
  2 files changed, 18 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index c7af4712d8b..97d247f6684 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -22398,6 +22398,11 @@ uses_deducible_template_parms (tree type)
for (; parm; parm = TREE_CHAIN (parm))
if (uses_deducible_template_parms (TREE_VALUE (parm)))
  return true;
+  if (flag_noexcept_type
+ && TYPE_RAISES_EXCEPTIONS (type)
+ && TREE_PURPOSE (TYPE_RAISES_EXCEPTIONS (type))
+ && deducible_expression (TREE_PURPOSE (TYPE_RAISES_EXCEPTIONS 
(type
+   return true;
  }
  
return false;

diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C 
b/gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C
new file mode 100644
index 000..ef5c8265541
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C
@@ -0,0 +1,13 @@
+// PR c++/80951
+// { dg-do compile { target c++17 } }
+
+void f() noexcept;
+void g();
+
+template
+constexpr bool h(void (*)() noexcept(E)) {
+  return E;
+}
+
+static_assert(h(f));
+static_assert(!h(g));

Re: [PATCH] c++: lambda in pack expansion using outer pack in constraint [PR103706]

2022-02-07 Thread Jason Merrill via Gcc-patches


On 2/7/22 12:34, Patrick Palka wrote:

On Thu, 3 Feb 2022, Jason Merrill wrote:


On 2/3/22 15:55, Patrick Palka wrote:

On Thu, Feb 3, 2022 at 3:20 PM Jason Merrill  wrote:


On 2/3/22 12:04, Patrick Palka wrote:

On Wed, 2 Feb 2022, Jason Merrill wrote:


On 2/2/22 12:09, Patrick Palka wrote:

The satisfaction cache needs to look through ARGUMENT_PACK_SELECT
template arguments before calling iterative_hash_template_arg and
template_args_equal, which would otherwise crash.


Maybe we should handle ARGUMENT_PACK_SELECT in
iterative_hash_template_arg and
template_args_equal, rather than their callers?


Like so?  I can't think of a good reason not to handle
ARGUMENT_PACK_SELECT
ther, but FWIW it looks like we had such handling in the past, until it
was removed in r7-2082.


Ah, right.  That came after my fix for 67164,

https://gcc.gnu.org/pipermail/gcc-patches/2016-March/443660.html

with that PR, we were seeing ARGUMENT_PACK_SELECT unexpectedly because
of excessive sharing of template argument lists.  Why are we seeing it
in satisfaction?


When instantiating the lambda inside the pack expansion, we call
tsubst_lambda_expr with args={ARGUMENT_PACK_SELECT}, wherein we add
these args to the lambda's LAMBDA_EXPR_REGEN_INFO for use later during
satisfaction.


Yes, that seems like the problem: We're remembering a transient artifact in
LAMBDA_EXPR_REGEN_INFO.  Presumably if the pack is longer than 1, the lambdas
for all elements of the expansion will have the same ARGUMENT_PACK_SELECT in
their LAMBDA_EXPR_REGEN_INFO, so they will all think they're in the final
element.


Aha, I think I get it now.  So ARGUMENT_PACK_SELECT is for reusing the
same template argument vector across different elements of a pack
expansion.



I guess tsubst_lambda_expr needs to replace an 'args' with
ARGUMENT_PACK_SELECT in it with a copy.  Or we could do away with
ARGUMENT_PACK_SELECT entirely in favor of tsubst_pack_expansion using separate
TREE_VECs for each element substitution.


Since ARGUMENT_PACK_SELECT seems like an important optimization I
think the other more localized approach would be more appropriate for
now.  How does the following look?


OK, thanks.


-- >8 --

Subject: [PATCH] c++: lambda in pack expansion using pack in constraint
  [PR103706]

Here when expanding the pack expansion pattern containing a constrained
lambda, the template argument for each Ts is an ARGUMENT_PACK_SELECT,
which we store inside the lambda's LAMBDA_EXPR_REGEN_INFO.  Then during
satisfaction of the lambda's constraint C the satisfaction cache
uses this argument as part of the key to the corresponding sat_entry, but
iterative_hash_template_arg and template_args_equal deliberately don't
handle ARGUMENT_PACK_SELECT.

Since it's wrong to preserve ARGUMENT_PACK_SELECT inside a hash table
due to its instability (as documented in iterative_hash_template_arg),
this patch helps make sure the satisfaction cache doesn't see such trees
by resolving ARGUMENT_PACK_SELECT arguments before adding them to
LAMBDA_EXPR_REGEN_INFO.

PR c++/103706

gcc/cp/ChangeLog:

* pt.cc (preserve_args): New function.
(tsubst_lambda_expr): Use it when setting LAMBDA_EXPR_REGEN_INFO.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-lambda19.C: New test.
---
  gcc/cp/pt.cc  | 41 ++-
  .../g++.dg/cpp2a/concepts-lambda19.C  | 11 +
  2 files changed, 50 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index b58067d50e9..07dd544aa52 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -3743,6 +3743,42 @@ argument_pack_select_arg (tree t)
return arg;
  }
  
+/* Return a modification of ARGS (if necessary) that's suitable for

+   preserving inside a hash table.  In particular, this replaces each
+   ARGUMENT_PACK_SELECT with its underlying argument.  ARGS is copied
+   (upon modifying it) iff COW_P.  */
+
+static tree
+preserve_args (tree args, bool cow_p = true)
+{
+  if (args == NULL_TREE)
+return NULL_TREE;
+
+  for (int i = 0; i < TREE_VEC_LENGTH (args); ++i)
+{
+  tree t = TREE_VEC_ELT (args, i);
+  tree r;
+  if (!t)
+   r = NULL_TREE;
+  else if (TREE_CODE (t) == ARGUMENT_PACK_SELECT)
+   r = argument_pack_select_arg (t);
+  else if (TREE_CODE (t) == TREE_VEC)
+   r = preserve_args (t, cow_p);
+  else
+   r = t;
+  if (r != t)
+   {
+ if (cow_p)
+   {
+ args = copy_template_args (args);
+ cow_p = false;
+   }
+ TREE_VEC_ELT (args, i) = r;
+   }
+}
+
+  return args;
+}
  
  /* True iff FN is a function representing a built-in variadic parameter

 pack.  */
@@ -19511,10 +19547,11 @@ tsubst_lambda_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
LAMBDA_EXPR_MUTABLE_P (r) = LAMBDA_EXPR_MUTABLE_P (t);
if (tree ti = LAMBDA_EXPR_REGEN_INFO (t))

[PATCH] libstdc++: Decouple HAVE_FCNTL_H from HAVE_DIRENT_H check

2022-02-07 Thread Dimitar Dimitrov

On PRU target with newlib, we have the following combination in config.h:
  /* #undef HAVE_DIRENT_H */
  #define HAVE_FCNTL_H 1
  #define HAVE_UNLINKAT 1

In newlib, targets which do not define dirent.h, get a build error when
including :
  
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/include/sys/dirent.h;hb=HEAD

While fs_dir.cc correctly checks for HAVE_FCNTL_H, dir-common.h doesn't,
and instead uses HAVE_DIRENT_H. This results in unlinkat() function call
in fs_dir.cc without the needed  include in dir-common.h. Thus
a build failure:
  .../gcc/libstdc++-v3/src/c++17/fs_dir.cc:151:11: error: ‘::unlinkat’ has not 
been declared; did you mean ‘unlink’?

Fix by encapsulating  include with the correct check.

Regtested x86_64-pc-linux-gnu and no new failures detected.

Fixes commit r12-7062-gebf61754647689 (libstdc++: Fix filesystem::remove_all 
races).

libstdc++-v3/ChangeLog:

* src/filesystem/dir-common.h (_GLIBCXX_HAVE_FCNTL_H): Move the
check outside the HAVE_DIRENT_H check.

Signed-off-by: Dimitar Dimitrov 
---
 libstdc++-v3/src/filesystem/dir-common.h | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/src/filesystem/dir-common.h 
b/libstdc++-v3/src/filesystem/dir-common.h
index 0b7665a3f70..cfce4fae9a4 100644
--- a/libstdc++-v3/src/filesystem/dir-common.h
+++ b/libstdc++-v3/src/filesystem/dir-common.h
@@ -35,10 +35,11 @@
 #  include 
 # endif
 # include  // opendir, readdir, fdopendir, dirfd
-# ifdef _GLIBCXX_HAVE_FCNTL_H
-#  include   // open, openat, fcntl, AT_FDCWD, O_NOFOLLOW etc.
-#  include  // close, unlinkat
-# endif
+#endif
+
+#ifdef _GLIBCXX_HAVE_FCNTL_H
+# include   // open, openat, fcntl, AT_FDCWD, O_NOFOLLOW etc.
+# include  // close, unlinkat
 #endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
-- 
2.34.1

Re: [PATCH v2] doc: RISC-V: Document the `-misa-spec=' option

2022-02-07 Thread Palmer Dabbelt


On Sat, 05 Feb 2022 00:14:26 PST (-0800), gcc-patches@gcc.gnu.org wrote:

Thanks, LGTM :)

On Sat, Feb 5, 2022 at 7:56 AM Maciej W. Rozycki  wrote:


We have recently updated the default for the `-misa-spec=' option, yet
we still have not documented it nor its `--with-isa-spec=' counterpart
in the GCC manuals.  Fix that.

gcc/
* doc/install.texi (Configuration): Document `--with-isa-spec='
RISC-V option.
* doc/invoke.texi (Option Summary): List `-misa-spec=' RISC-V
option.
(RISC-V Options): Document it.
---
> Thanks.  I have a version of this floating around somewhere.  I probably 
forgot
> to post it and it's generally inferior to yours, so this LGTM.

 Thank you for your review.

> The only thing I'd point out is that this specifically controls the version of
> the Unprivileged (formally called user) specification, as there are many 
RISC-V
> specifications and it can be a bit ambiguous what folks mean when they just 
say
> "specification".  Not sure exactly what the right wording is there, maybe
> "version of the RISC-V Unprivileged (formerly user-level) ISA specification"?

 Good point.  I have updated the text to your suggested wording, which is
also what I would use if I were to propose it (modulo capitalisation).  I
will commit the change as included here shortly then unless I hear an
objection.


I wasn't sure about the capitalization either, but that's how it is on 
the website.  It's in a regular-looking sentence, not a title where I'd 
expect the extra capital, so I guess the whole spec name is a proper 
noun?


I don't really care either way, just chiming in as I don't see this on 
trunk.  The original message sounded to me like you intended on 
committing it, but just in case I misunderstood and you're waiting for 
me


Reviewed-by: Palmer Dabbelt 
Acked-by: Palmer Dabbelt 

Thanks!



  Maciej

Changes from v1:

- Clarify it is the Unprivileged (formerly User-Level) ISA specification
  the options concerned refer to.

- Fix a typo `-misa-spec' vs `-misa-spec=' in ChangeLog.
---
 gcc/doc/install.texi |   14 ++
 gcc/doc/invoke.texi  |   17 +
 2 files changed, 31 insertions(+)

gcc-riscv-misa-doc.diff
Index: gcc/gcc/doc/install.texi
===
--- gcc.orig/gcc/doc/install.texi
+++ gcc/gcc/doc/install.texi
@@ -1599,6 +1599,20 @@ On certain targets this option sets the
 size as a power of two in bytes.  On AArch64 @var{size} is required to be 
either
 12 (4KB) or 16 (64KB).

+@item --with-isa-spec=@var{ISA-spec-string}
+On RISC-V targets specify the default version of the RISC-V Unprivileged
+(formerly User-Level) ISA specification to produce code conforming to.
+The possibilities for @var{ISA-spec-string} are:
+@table @code
+@item 2.2
+Produce code conforming to version 2.2.
+@item 20190608
+Produce code conforming to version 20190608.
+@item 20191213
+Produce code conforming to version 20191213.
+@end table
+In the absence of this configuration option the default version is 20191213.
+
 @item --enable-__cxa_atexit
 Define if you want to use __cxa_atexit, rather than atexit, to
 register C++ destructors for local statics and global objects.
Index: gcc/gcc/doc/invoke.texi
===
--- gcc.orig/gcc/doc/invoke.texi
+++ gcc/gcc/doc/invoke.texi
@@ -1184,6 +1184,7 @@ See RS/6000 and PowerPC Options.
 -mabi=@var{ABI-string} @gol
 -mfdiv  -mno-fdiv @gol
 -mdiv  -mno-div @gol
+-misa-spec=@var{ISA-spec-string} @gol
 -march=@var{ISA-string} @gol
 -mtune=@var{processor-string} @gol
 -mpreferred-stack-boundary=@var{num} @gol
@@ -27632,6 +27633,22 @@ Do or don't use hardware instructions fo
 M extension.  The default is to use them if the specified architecture has
 these instructions.

+@item -misa-spec=@var{ISA-spec-string}
+@opindex misa-spec
+Specify the version of the RISC-V Unprivileged (formerly User-Level)
+ISA specification to produce code conforming to.  The possibilities
+for @var{ISA-spec-string} are:
+@table @code
+@item 2.2
+Produce code conforming to version 2.2.
+@item 20190608
+Produce code conforming to version 20190608.
+@item 20191213
+Produce code conforming to version 20191213.
+@end table
+The default is @option{-misa-spec=20191213} unless GCC has been configured
+with @option{--with-isa-spec=} specifying a different default version.
+
 @item -march=@var{ISA-string}
 @opindex march
 Generate code for given RISC-V ISA (e.g.@: @samp{rv64im}).  ISA strings must be

Re: [PATCH] RISC-V: Enable TARGET_SUPPORTS_WIDE_INT

2022-02-07 Thread Palmer Dabbelt


On Mon, 07 Feb 2022 09:41:10 PST (-0800), Vineet Gupta wrote:



On 2/7/22 01:28, Philipp Tomsich wrote:

Vineet,

On Mon, 7 Feb 2022 at 07:06, Vineet Gupta  wrote:

This is at par with other major arches such as aarch64, i386, s390 ...

No testsuite regressions: same numbers w/ w/o

Putting that check seems a good idea, but I haven't seen any cases
related to this get through anyway.
Do you have seen any instances where the backend got this wrong? If
so, please share, so we can run a fuller regression and see any
performance impact.


No, there were no failures which this fixes. Seems like other arches did
this back in 2015.
When aarch64 did similar change, commit 2ca5b4303bd5, a directed generic
test pr68129_1.c was added which doesn't fail before/after.


The only offending MD pattern we had was for for constant 0, which IIUC 
should be a const_int now (and has been for some time) so shouldn't even 
have been matching anything.  I was worried about the fcvt-based moves 
on rv32, but my trivial test indicates those still work fine


   double foo(void)
   {
   return 0;
   }
   
   foo:

   fcvt.d.wfa0,x0
   ret

so I'm assuming they're coming in through const_int as well.  Probably 
worth a full rv32 testsuite run, but as far as I can tell we were 
essentially TARGET_SUPPORTS_WIDE_INT clean already so this should be 
pretty safe.


Unfortunately the patch isn't trivially applying on trunk: it's 
targeting the wrong files and is showing some whitespace issues (though 
those may have been a result of me attempting to clean stuff up).  I 
assuming that means that the tests weren't run on trunk, though.


I put a cleaned up version over here 
 
in case that helps anyone.  I haven't run the regressions, but otherwise


Reviewed-by: Palmer Dabbelt 

LMK if you want me to run the test suite.  IIUC we're still a bit away 
from the GCC 12 branch, and given this doesn't fix any manifestable bugs 
it should be held for 13.


Thanks!

Re: [PATCH] RISC-V: Enable TARGET_SUPPORTS_WIDE_INT

2022-02-07 Thread Vineet Gupta





On 2/7/22 01:28, Philipp Tomsich wrote:

Vineet,

On Mon, 7 Feb 2022 at 07:06, Vineet Gupta  wrote:

This is at par with other major arches such as aarch64, i386, s390 ...

No testsuite regressions: same numbers w/ w/o

Putting that check seems a good idea, but I haven't seen any cases
related to this get through anyway.
Do you have seen any instances where the backend got this wrong? If
so, please share, so we can run a fuller regression and see any
performance impact.


No, there were no failures which this fixes. Seems like other arches did 
this back in 2015.
When aarch64 did similar change, commit 2ca5b4303bd5, a directed generic 
test pr68129_1.c was added which doesn't fail before/after.


-Vineet

Re: [PATCH] c++: lambda in pack expansion using outer pack in constraint [PR103706]

2022-02-07 Thread Patrick Palka via Gcc-patches

On Thu, 3 Feb 2022, Jason Merrill wrote:

> On 2/3/22 15:55, Patrick Palka wrote:
> > On Thu, Feb 3, 2022 at 3:20 PM Jason Merrill  wrote:
> > > 
> > > On 2/3/22 12:04, Patrick Palka wrote:
> > > > On Wed, 2 Feb 2022, Jason Merrill wrote:
> > > > 
> > > > > On 2/2/22 12:09, Patrick Palka wrote:
> > > > > > The satisfaction cache needs to look through ARGUMENT_PACK_SELECT
> > > > > > template arguments before calling iterative_hash_template_arg and
> > > > > > template_args_equal, which would otherwise crash.
> > > > > 
> > > > > Maybe we should handle ARGUMENT_PACK_SELECT in
> > > > > iterative_hash_template_arg and
> > > > > template_args_equal, rather than their callers?
> > > > 
> > > > Like so?  I can't think of a good reason not to handle
> > > > ARGUMENT_PACK_SELECT
> > > > ther, but FWIW it looks like we had such handling in the past, until it
> > > > was removed in r7-2082.
> > > 
> > > Ah, right.  That came after my fix for 67164,
> > > 
> > > https://gcc.gnu.org/pipermail/gcc-patches/2016-March/443660.html
> > > 
> > > with that PR, we were seeing ARGUMENT_PACK_SELECT unexpectedly because
> > > of excessive sharing of template argument lists.  Why are we seeing it
> > > in satisfaction?
> > 
> > When instantiating the lambda inside the pack expansion, we call
> > tsubst_lambda_expr with args={ARGUMENT_PACK_SELECT}, wherein we add
> > these args to the lambda's LAMBDA_EXPR_REGEN_INFO for use later during
> > satisfaction.
> 
> Yes, that seems like the problem: We're remembering a transient artifact in
> LAMBDA_EXPR_REGEN_INFO.  Presumably if the pack is longer than 1, the lambdas
> for all elements of the expansion will have the same ARGUMENT_PACK_SELECT in
> their LAMBDA_EXPR_REGEN_INFO, so they will all think they're in the final
> element.

Aha, I think I get it now.  So ARGUMENT_PACK_SELECT is for reusing the
same template argument vector across different elements of a pack
expansion.

> 
> I guess tsubst_lambda_expr needs to replace an 'args' with
> ARGUMENT_PACK_SELECT in it with a copy.  Or we could do away with
> ARGUMENT_PACK_SELECT entirely in favor of tsubst_pack_expansion using separate
> TREE_VECs for each element substitution.

Since ARGUMENT_PACK_SELECT seems like an important optimization I
think the other more localized approach would be more appropriate for
now.  How does the following look?

-- >8 --

Subject: [PATCH] c++: lambda in pack expansion using pack in constraint
 [PR103706]

Here when expanding the pack expansion pattern containing a constrained
lambda, the template argument for each Ts is an ARGUMENT_PACK_SELECT,
which we store inside the lambda's LAMBDA_EXPR_REGEN_INFO.  Then during
satisfaction of the lambda's constraint C the satisfaction cache
uses this argument as part of the key to the corresponding sat_entry, but
iterative_hash_template_arg and template_args_equal deliberately don't
handle ARGUMENT_PACK_SELECT.

Since it's wrong to preserve ARGUMENT_PACK_SELECT inside a hash table
due to its instability (as documented in iterative_hash_template_arg),
this patch helps make sure the satisfaction cache doesn't see such trees
by resolving ARGUMENT_PACK_SELECT arguments before adding them to
LAMBDA_EXPR_REGEN_INFO.

PR c++/103706

gcc/cp/ChangeLog:

* pt.cc (preserve_args): New function.
(tsubst_lambda_expr): Use it when setting LAMBDA_EXPR_REGEN_INFO.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-lambda19.C: New test.
---
 gcc/cp/pt.cc  | 41 ++-
 .../g++.dg/cpp2a/concepts-lambda19.C  | 11 +
 2 files changed, 50 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda19.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index b58067d50e9..07dd544aa52 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -3743,6 +3743,42 @@ argument_pack_select_arg (tree t)
   return arg;
 }
 
+/* Return a modification of ARGS (if necessary) that's suitable for
+   preserving inside a hash table.  In particular, this replaces each
+   ARGUMENT_PACK_SELECT with its underlying argument.  ARGS is copied
+   (upon modifying it) iff COW_P.  */
+
+static tree
+preserve_args (tree args, bool cow_p = true)
+{
+  if (args == NULL_TREE)
+return NULL_TREE;
+
+  for (int i = 0; i < TREE_VEC_LENGTH (args); ++i)
+{
+  tree t = TREE_VEC_ELT (args, i);
+  tree r;
+  if (!t)
+   r = NULL_TREE;
+  else if (TREE_CODE (t) == ARGUMENT_PACK_SELECT)
+   r = argument_pack_select_arg (t);
+  else if (TREE_CODE (t) == TREE_VEC)
+   r = preserve_args (t, cow_p);
+  else
+   r = t;
+  if (r != t)
+   {
+ if (cow_p)
+   {
+ args = copy_template_args (args);
+ cow_p = false;
+   }
+ TREE_VEC_ELT (args, i) = r;
+   }
+}
+
+  return args;
+}
 
 /* True iff FN is a function representing a built-in variadic parameter
pack.  */
@@ -19511,10 +19547,11 @@ tsubst_lambda_expr (tree

Re: [PATCH] RISC-V/testsuite: Run target testing over all the usual optimization levels

2022-02-07 Thread Kito Cheng via Gcc-patches

Hi Maciej:

Thanks for doing this, OK to trunk.

On Tue, Feb 1, 2022 at 7:04 AM Maciej W. Rozycki  wrote:
>
> Use `gcc-dg-runtest' test driver rather than `dg-runtest' to run the
> RISC-V testsuite as several targets already do.  Adjust test options
> across individual test cases accordingly where required.
>
> As some tests want to be run at `-Og', add a suitable optimization
> variant via ADDITIONAL_TORTURE_OPTIONS, and include the moderately
> recent `-Oz' variant as well.
>
> * testsuite/gcc.target/riscv/riscv.exp: Use `gcc-dg-runtest'
> rather than `dg-runtest'.  Add `-Og -g' and `-Oz' variants via
> ADDITIONAL_TORTURE_OPTIONS.
> * testsuite/gcc.target/riscv/arch-1.c: Adjust test options
> accordingly.
> * testsuite/gcc.target/riscv/arch-10.c: Likewise.
> * testsuite/gcc.target/riscv/arch-11.c: Likewise.
> * testsuite/gcc.target/riscv/arch-12.c: Likewise.
> * testsuite/gcc.target/riscv/arch-2.c: Likewise.
> * testsuite/gcc.target/riscv/arch-3.c: Likewise.
> * testsuite/gcc.target/riscv/arch-4.c: Likewise.
> * testsuite/gcc.target/riscv/arch-5.c: Likewise.
> * testsuite/gcc.target/riscv/arch-6.c: Likewise.
> * testsuite/gcc.target/riscv/arch-7.c: Likewise.
> * testsuite/gcc.target/riscv/arch-8.c: Likewise.
> * testsuite/gcc.target/riscv/arch-9.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-1.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-10.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-11.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-12.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-13.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-14.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-15.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-16.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-17.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-2.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-3.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-4.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-5.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-7.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-8.c: Likewise.
> * testsuite/gcc.target/riscv/attribute-9.c: Likewise.
> * testsuite/gcc.target/riscv/interrupt-1.c: Likewise.
> * testsuite/gcc.target/riscv/interrupt-2.c: Likewise.
> * testsuite/gcc.target/riscv/interrupt-3.c: Likewise.
> * testsuite/gcc.target/riscv/interrupt-4.c: Likewise.
> * testsuite/gcc.target/riscv/interrupt-conflict-mode.c: Likewise.
> * testsuite/gcc.target/riscv/interrupt-debug.c: Likewise.
> * testsuite/gcc.target/riscv/interrupt-mmode.c: Likewise.
> * testsuite/gcc.target/riscv/interrupt-smode.c: Likewise.
> * testsuite/gcc.target/riscv/interrupt-umode.c: Likewise.
> * testsuite/gcc.target/riscv/li.c: Likewise.
> * testsuite/gcc.target/riscv/load-immediate.c: Likewise.
> * testsuite/gcc.target/riscv/losum-overflow.c: Likewise.
> * testsuite/gcc.target/riscv/mcpu-6.c: Likewise.
> * testsuite/gcc.target/riscv/mcpu-7.c: Likewise.
> * testsuite/gcc.target/riscv/pr102957.c: Likewise.
> * testsuite/gcc.target/riscv/pr103302.c: Likewise.
> * testsuite/gcc.target/riscv/pr104140.c: Likewise.
> * testsuite/gcc.target/riscv/pr84660.c: Likewise.
> * testsuite/gcc.target/riscv/pr93202.c: Likewise.
> * testsuite/gcc.target/riscv/pr93304.c: Likewise.
> * testsuite/gcc.target/riscv/pr95252.c: Likewise.
> * testsuite/gcc.target/riscv/pr95683.c: Likewise.
> * testsuite/gcc.target/riscv/pr98777.c: Likewise.
> * testsuite/gcc.target/riscv/pr99702.c: Likewise.
> * testsuite/gcc.target/riscv/predef-1.c: Likewise.
> * testsuite/gcc.target/riscv/predef-10.c: Likewise.
> * testsuite/gcc.target/riscv/predef-11.c: Likewise.
> * testsuite/gcc.target/riscv/predef-12.c: Likewise.
> * testsuite/gcc.target/riscv/predef-13.c: Likewise.
> * testsuite/gcc.target/riscv/predef-14.c: Likewise.
> * testsuite/gcc.target/riscv/predef-15.c: Likewise.
> * testsuite/gcc.target/riscv/predef-16.c: Likewise.
> * testsuite/gcc.target/riscv/predef-2.c: Likewise.
> * testsuite/gcc.target/riscv/predef-3.c: Likewise.
> * testsuite/gcc.target/riscv/predef-4.c: Likewise.
> * testsuite/gcc.target/riscv/predef-5.c: Likewise.
> * testsuite/gcc.target/riscv/predef-6.c: Likewise.
> * testsuite/gcc.target/riscv/predef-7.c: Likewise.
> * testsuite/gcc.target/riscv/predef-8.c: Likewise.
> * testsuite/gcc.target/riscv/promote-type-for-libcall.c: Likewise.
> *

Re: [Patch]middle-end: updating the reg use in exit block for -fzero-call-used-regs [PR100775]

2022-02-07 Thread Qing Zhao via Gcc-patches

Ping.

thanks.

Qing

> On Jan 28, 2022, at 12:03 PM, Qing Zhao via Gcc-patches 
>  wrote:
> 
> Hi,
> 
> PR 100775 ( ICE: in df_exit_block_bitmap_verify, at df-scan.c:4164 with 
> -mthumb -fzero-call-used-regs=used)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100775
> 
> Although the ICE only happens on arm, but this is a bug in the middle end. 
> So, I think this bug has higher priority, 
> Need to be included into gcc12, and also need to be back ported to gcc11. 
> 
> In the pass_zero_call_used_regs, when updating dataflow info after adding
> the register zeroing sequence in the epilogue of the function, we should
> call "df_update_exit_block_uses" to update the register use information in
> the exit block to include all the registers that have been zeroed.
> 
> The change has been bootstrapped and reg-tested on both x86 and aarch64 (with 
> -enable-checking=yes,rtl,df). 
> Since I cannot find an arm machine,  no bootstrap and reg-tested on arm yet.
> 
> For the arm failure, I just tested it with the cross build and it has no 
> issue withe the fix.
> 
> (One question here:
> Previously, I though “df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun))” and a 
> later “df_analyze()” should rescan 
> the changed exit block of the function, and update all the df info 
> automatically, it apparently not the case, the register
> use info at exit block is not automatically updated, we have to add an 
> explicitly call to “df_update_exit_block_uses”.
> I checked the pass_thread_prologue_and_epilogue, looks like it also 
> explicitly calls “df_update_entry_exit_and_calls” 
> to update the register use info.
> Shall the “df_set_bb_dirty” + “df_analyze” automatically update the reg use 
> info of the dirty block?).
> 
> Let me know whether there is any issue with the fix?
> 
> Thanks
> 
> Qing
> 
> ===
> 
> From e1cca5659c85e7c536f5016a2c75c615e65dba75 Mon Sep 17 00:00:00 2001
> From: Qing Zhao 
> Date: Fri, 28 Jan 2022 16:29:51 +
> Subject: [PATCH] middle-end: updating the reg use in exit block for
> -fzero-call-used-regs [PR100775]
> 
> In the pass_zero_call_used_regs, when updating dataflow info after adding
> the register zeroing sequence in the epilogue of the function, we should
> call "df_update_exit_block_uses" to update the register use information in
> the exit block to include all the registers that have been zeroed.
> 
> 2022-01-27  Qing Zhao  
> 
> gcc/ChangeLog:
> 
>   * function.cc (gen_call_used_regs_seq): Call
>   df_update_exit_block_uses when updating df.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/pr100775.c: New test.
> ---
> gcc/function.cc | 1 +
> gcc/testsuite/gcc.target/arm/pr100775.c | 8 
> 2 files changed, 9 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/arm/pr100775.c
> 
> diff --git a/gcc/function.cc b/gcc/function.cc
> index e1d2565f8d92..c8a77c9a6246 100644
> --- a/gcc/function.cc
> +++ b/gcc/function.cc
> @@ -5942,6 +5942,7 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int 
> zero_regs_type)
>   /* Update the data flow information.  */
>   crtl->must_be_zero_on_return |= zeroed_hardregs;
>   df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun));
> +  df_update_exit_block_uses ();
> }
> }
> 
> diff --git a/gcc/testsuite/gcc.target/arm/pr100775.c 
> b/gcc/testsuite/gcc.target/arm/pr100775.c
> new file mode 100644
> index ..dd2255a95492
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr100775.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mthumb -fzero-call-used-regs=used" } */
> +
> +int
> +foo (int x)
> +{
> +  return x;
> +}
> -- 
> 2.27.0
> 
> 
> 
> 
>

Re: [PATCH 7/8] rs6000: vec_neg built-ins wrongly require POWER8

2022-02-07 Thread Bill Schmidt via Gcc-patches

Hi Segher,

Thanks for all the reviews for this series!  I'd like to gently ping the last 
two patches.

BR,
Bill

On 1/28/22 11:50 AM, Bill Schmidt via Gcc-patches wrote:
> As the subject states.  Fixing this is accomplished by moving the built-ins
> to the correct stanzas, [altivec] and [vsx].
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
> Is this okay for trunk?
>
> Thanks,
> Bill
>
>
> 2022-01-27  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-builtin.def (NEG_V16QI): Move to [altivec]
>   stanza.
>   (NEG_V4SF): Likewise.
>   (NEG_V4SI): Likewise.
>   (NEG_V8HI): Likewise.
>   (NEG_V2DF): Move to [vsx] stanza.
>   (NEG_V2DI): Likewise.
> ---
>  gcc/config/rs6000/rs6000-builtins.def | 36 +--
>  1 file changed, 18 insertions(+), 18 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index 2bb997a5279..c8f0cf332eb 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -410,6 +410,18 @@
>const vss __builtin_altivec_nabs_v8hi (vss);
>  NABS_V8HI nabsv8hi2 {}
>
> +  const vsc __builtin_altivec_neg_v16qi (vsc);
> +NEG_V16QI negv16qi2 {}
> +
> +  const vf __builtin_altivec_neg_v4sf (vf);
> +NEG_V4SF negv4sf2 {}
> +
> +  const vsi __builtin_altivec_neg_v4si (vsi);
> +NEG_V4SI negv4si2 {}
> +
> +  const vss __builtin_altivec_neg_v8hi (vss);
> +NEG_V8HI negv8hi2 {}
> +
>void __builtin_altivec_stvebx (vsc, signed long, void *);
>  STVEBX altivec_stvebx {stvec}
>
> @@ -1175,6 +1187,12 @@
>const vsll __builtin_altivec_nabs_v2di (vsll);
>  NABS_V2DI nabsv2di2 {}
>
> +  const vd __builtin_altivec_neg_v2df (vd);
> +NEG_V2DF negv2df2 {}
> +
> +  const vsll __builtin_altivec_neg_v2di (vsll);
> +NEG_V2DI negv2di2 {}
> +
>void __builtin_altivec_stvx_v2df (vd, signed long, void *);
>  STVX_V2DF altivec_stvx_v2df {stvec}
>
> @@ -2118,24 +2136,6 @@
>const vus __builtin_altivec_nand_v8hi_uns (vus, vus);
>  NAND_V8HI_UNS nandv8hi3 {}
>
> -  const vsc __builtin_altivec_neg_v16qi (vsc);
> -NEG_V16QI negv16qi2 {}
> -
> -  const vd __builtin_altivec_neg_v2df (vd);
> -NEG_V2DF negv2df2 {}
> -
> -  const vsll __builtin_altivec_neg_v2di (vsll);
> -NEG_V2DI negv2di2 {}
> -
> -  const vf __builtin_altivec_neg_v4sf (vf);
> -NEG_V4SF negv4sf2 {}
> -
> -  const vsi __builtin_altivec_neg_v4si (vsi);
> -NEG_V4SI negv4si2 {}
> -
> -  const vss __builtin_altivec_neg_v8hi (vss);
> -NEG_V8HI negv8hi2 {}
> -
>const vsc __builtin_altivec_orc_v16qi (vsc, vsc);
>  ORC_V16QI orcv16qi3 {}
>

Re: [PATCH] testsuite: Fix up testsuite/gcc.c-torture/execute/builtins/lib/chk.c for powerpc [PR104380]

2022-02-07 Thread David Edelsohn via Gcc-patches

On Mon, Feb 7, 2022 at 8:20 AM Jakub Jelinek  wrote:
>
> On Fri, Feb 04, 2022 at 12:00:57PM -0500, David Edelsohn via Gcc-patches 
> wrote:
> > > The following testcase FAILs when configured with
> > > --with-long-double-format=ieee .  Only happens in the -std=c* modes, not 
> > > the
> > > GNU modes; while the glibc headers have __asm redirects of
> > > vsnprintf and __vsnprinf_chk to __vsnprintfieee128 and
> > > __vsnprintf_chkieee128, the vsnprintf fortification extern inline 
> > > gnu_inline
> > > always_inline wrapper calls __builtin_vsnprintf_chk and we actually emit
> > > a call to __vsnprinf_chk (i.e. with IBM extended long double) instead of
> > > __vsnprintf_chkieee128.
> > >
> > > rs6000_mangle_decl_assembler_name already had cases for *printf and 
> > > *scanf,
> > > so this just adds another case for *printf_chk.  *scanf_chk doesn't exist.
> > > __ prefixing isn't done because *printf_chk already starts with __.
> > >
> > > Bootstrapped/regtested on powerpc64le-linux, ok for trunk?
> >
> > Okay.
>
> Unfortunately, while I've tested the testcase also with -mabi=ieeelongdouble
> by hand, the full bootstrap/regtest was on GCCFarm where glibc is too old
> to test with --with-long-double-format=ieee.
> I've done full bootstrap/regtest with that option during the weekend and
> the patch regressed:
> FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O1
> FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O2
> FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none
> FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects
> FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O3 
> -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions
> FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O3 -g
> FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -Og -g
> FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -Os
> FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O1
> FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O2
> FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none
> FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects
> FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O3 
> -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions
> FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O3 -g
> FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -Og -g
> FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -Os
> FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O1
> FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O2
> FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none
> FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects
> FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O3 
> -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions
> FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O3 -g
> FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -Og -g
> FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -Os
> FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O1
> FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O2
> FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O2 -flto 
> -fno-use-linker-plugin -flto-partition=none
> FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O2 -flto 
> -fuse-linker-plugin -fno-fat-lto-objects
> FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O3 
> -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions
> FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O3 -g
> FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -Og -g
> FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -Os
>
> The problem is that the execute/builtins/ testsuite wants to override some
> of the library functions and with the change we (correctly) call
> __*printf_chkieee128 and so lib/chk.c is no longer called but the glibc
> APIs are.
>
> The following patch fixes it.
>
> Tested on powerpc64le-linux, ok for trunk?

Okay.

Thanks, David

>
> 2022-02-07  Jakub Jelinek  
>
> PR target/104380
> * gcc.c-torture/execute/builtins/lib/chk.c (__sprintf_chkieee128,
> __vsprintf_chkieee128, __snprintf_chkieee128,
> __vsnprintf_chkieee128): New aliases to non-ieee128 suffixed functions
> for powerpc -mabi=ieeelongdouble.
>
> ---

[PATCH] c++: deducing only from noexcept-spec [PR80951]

2022-02-07 Thread Patrick Palka via Gcc-patches

The fail-fast predicate uses_deducible_template_parms used by
unify_one_argument is neglecting to look inside the noexcept-spec of a
function type, and this causes deduction to fail for the below testcase
since the noexcept-spec is the only part of the type which contains a
deducible template parameter.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/80951

gcc/cp/ChangeLog:

* pt.cc (uses_deducible_template_parms): Consider the
noexcept-spec of a function type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/noexcept-type25.C: New test.
---
 gcc/cp/pt.cc |  5 +
 gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C | 13 +
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index c7af4712d8b..97d247f6684 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -22398,6 +22398,11 @@ uses_deducible_template_parms (tree type)
   for (; parm; parm = TREE_CHAIN (parm))
if (uses_deducible_template_parms (TREE_VALUE (parm)))
  return true;
+  if (flag_noexcept_type
+ && TYPE_RAISES_EXCEPTIONS (type)
+ && TREE_PURPOSE (TYPE_RAISES_EXCEPTIONS (type))
+ && deducible_expression (TREE_PURPOSE (TYPE_RAISES_EXCEPTIONS 
(type
+   return true;
 }
 
   return false;
diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C 
b/gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C
new file mode 100644
index 000..ef5c8265541
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type25.C
@@ -0,0 +1,13 @@
+// PR c++/80951
+// { dg-do compile { target c++17 } }
+
+void f() noexcept;
+void g();
+
+template
+constexpr bool h(void (*)() noexcept(E)) {
+  return E;
+}
+
+static_assert(h(f));
+static_assert(!h(g));
-- 
2.35.1.46.g38062e73e0

[PATCH] c++: satisfaction value of type const bool [PR104410]

2022-02-07 Thread Patrick Palka via Gcc-patches

Here constant evaluation of the atomic constraint use_func_v with T=int
sensibly yields an INTEGER_CST of type const bool, but the assert in
satisfaction_value expects unqualified bool.  Let's just relax the
assert to accept cv-qualified bool more generally.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and 11?

PR c++/104410

gcc/cp/ChangeLog:

* constraint.cc (satisfaction_value): Relax assert to accept
cv-qualified bool.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-pr104410.C: New test.
---
 gcc/cp/constraint.cc   | 3 ++-
 gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C | 6 ++
 2 files changed, 8 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index cc4df4216ef..06ebbd1d19e 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2818,7 +2818,8 @@ satisfaction_value (tree t)
 return t;
 
   gcc_assert (TREE_CODE (t) == INTEGER_CST
- && same_type_p (TREE_TYPE (t), boolean_type_node));
+ && same_type_ignoring_top_level_qualifiers_p (TREE_TYPE (t),
+   boolean_type_node));
   if (integer_zerop (t))
 return boolean_false_node;
   else
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C
new file mode 100644
index 000..dac08e10a0f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-pr104410.C
@@ -0,0 +1,6 @@
+// PR c++/104410
+// { dg-do compile { target c++20 } }
+
+template constexpr bool use_func_v{};
+template void f() requires use_func_v || true { }
+template void f();
-- 
2.35.1.46.g38062e73e0

[PATCH 4/4] [GCC11] tree-optimization/104288 - range on entry should check dominators for non-null.

2022-02-07 Thread Andrew MacLeod via Gcc-patches


The patches resolves the issue for GCC11 in a much simpler way.

By default, ranger and EVRP are running in hybrid mode. This means if 
ranger misses something, evrp will pick up the slack. This enables us to 
change the 2 places which check for non-null to ignore potentially 
incorrect block-wide information and only query dominator blocks for 
on-entry ranges.  This allows ranger to be conservative, and EVRP will 
pick up the intra-block changes.


Bootstraps with no regressions.  OK for gcc 11 branch?


From b807249b310153349a2593203eba44c456b6208e Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Mon, 31 Jan 2022 11:37:16 -0500
Subject: [PATCH] Range on entry should only check dominators for non-null.

Range-on-entry checks should no check the state of non-null within the current
block.   If dominators are present, use the dominator.

	PR tree-optimization/104288
	gcc/
	* gimple-range-cache.cc (ssa_range_in_bb): Only use non-null from the
	dominator entry ranges.
	* gimple-range.cc (gimple_ranger::range_of_expr): Ditto.
	gcc/testsuite/
	* gcc.dg/pr104288.c: New.
---
 gcc/gimple-range-cache.cc   | 19 ---
 gcc/gimple-range.cc | 16 ++--
 gcc/testsuite/gcc.dg/pr104288.c | 23 +++
 3 files changed, 45 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr104288.c

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 6ddeca3766d..b05b804d513 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -905,14 +905,19 @@ ranger_cache::ssa_range_in_bb (irange , tree name, basic_block bb)
   // Try to pick up any known global value as a best guess for now.
   if (!m_globals.get_global_range (r, name))
 	r = gimple_range_global (name);
+  // Check for non-null on entry if dominators are available by
+  // resetting def_bb to the block we want to search.
+  if (dom_info_available_p (CDI_DOMINATORS))
+	{
+	  basic_block dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb);
+	  // Check if pointers have any non-null dereferences.  Non-call
+	  // exceptions mean we could throw in the middle of the block, so just
+	  // punt for now on those.
+	  if (dom_bb && r.varying_p () && !cfun->can_throw_non_call_exceptions
+	  && m_non_null.non_null_deref_p (name, dom_bb))
+	r = range_nonzero (TREE_TYPE (name));
+	}
 }
-
-  // Check if pointers have any non-null dereferences.  Non-call
-  // exceptions mean we could throw in the middle of the block, so just
-  // punt for now on those.
-  if (r.varying_p () && m_non_null.non_null_deref_p (name, bb) &&
-  !cfun->can_throw_non_call_exceptions)
-r = range_nonzero (TREE_TYPE (name));
 }
 
 // Return a static range for NAME on entry to basic block BB in R.  If
diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index f861459ed96..42c637458ad 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -979,10 +979,14 @@ gimple_ranger::range_of_expr (irange , tree expr, gimple *stmt)
 
   // If name is defined in this block, try to get an range from S.
   if (def_stmt && gimple_bb (def_stmt) == bb)
-range_of_stmt (r, def_stmt, expr);
-  else
-// Otherwise OP comes from outside this block, use range on entry.
-range_on_entry (r, bb, expr);
+return range_of_stmt (r, def_stmt, expr);
+
+  // Otherwise OP comes from outside this block, use range on entry.
+  range_on_entry (r, bb, expr);
+  // Check for non-null in the predecessor if dominators are available.
+  if (!dom_info_available_p (CDI_DOMINATORS))
+return true;
+  basic_block dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb);
 
   // No range yet, see if there is a dereference in the block.
   // We don't care if it's between the def and a use within a block
@@ -992,8 +996,8 @@ gimple_ranger::range_of_expr (irange , tree expr, gimple *stmt)
   // in which case we may need to walk from S back to the def/top of block
   // to make sure the deref happens between S and there before claiming
   // there is a deref.   Punt for now.
-  if (!cfun->can_throw_non_call_exceptions && r.varying_p () &&
-  m_cache.m_non_null.non_null_deref_p (expr, bb))
+  if (dom_bb && !cfun->can_throw_non_call_exceptions && r.varying_p ()
+  && m_cache.m_non_null.non_null_deref_p (expr, dom_bb))
 r = range_nonzero (TREE_TYPE (expr));
 
   return true;
diff --git a/gcc/testsuite/gcc.dg/pr104288.c b/gcc/testsuite/gcc.dg/pr104288.c
new file mode 100644
index 000..95eb196f9e4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr104288.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp -fdelete-null-pointer-checks" } */
+/* { dg-skip-if "" { keeps_null_pointer_checks } } */
+
+void keep(int result) __attribute__((noipa));
+void keep(int result)
+{
+if (result)
+__builtin_exit(0);
+}
+
+void foo (void *p) __attribute__((nonnull(1)));
+
+void bar (void *p)
+{
+  keep (p == 0);
+  foo (p);
+  if (!p)
+__builtin_abort

[PATCH 3/4] tree-optimization/104288 - Update non-null interface to provide better precision.

2022-02-07 Thread Andrew MacLeod via Gcc-patches

This changes the non-null interface to provide more granularity and 
facilitate more precise queries.


Previously, we could only ask if NAME was nonnull anywhere in the block, 
and it applied to the entire block.  now we can ask:
  bool always_nonnull_p (tree name, basic_block bb);    // all uses are 
nonnull in the block
  bool contains_nonnull_p (tree name, basic_block bb); // there is a 
nonnull use somewhere in the block.


Ranger was using this query for on-entry and range_of_expr within 
block.  Range-on-entry was simply wrong.. it should have been asking if 
non-null was true in the dominator of this block.   And this PR 
demonstrate the issue with range_of_expr.


This patch changes all the places which checked the state of non-null.  
An Audit of every use now:


ranger_cache::exit_range - Applies non-null if contains_nonnull (bb) is true
ranger_cache::range_of_expr -  Applies non-null only if always_nonnull (bb).
ranger_cache::fill_block_cache - marks a block to have on-entry 
calculated if predecessor contains_nonnull.
ranger_cache::range_from_dom - Applies non-null if contains_nonnull() is 
true in a visited dominator block.


gimple_ranger::range_of_expr - Applies non-null only if always_nonnull 
in the block.
gimple_ranger::range_on_entry - Checks if any dominator block 
contains_nonnull.
gimple_range::range_on_exit - Calls one of the above 2 routines, then 
checks if contains_nonull in this block


gimple_range_path has 2 uses in range_defined_in_block() and  and 
adjust_for_non_null_uses.  Aldy audited both, and as the values are used 
at the end of the path only, so both are correct in using 
contains_nonull.  So there should be no impact to threading from this.


This patch allows us to properly optimize:

void h(int result) __attribute__((noipa));
void h(int result)
{
    if (result)
    __builtin_exit(0);
}

void foo (void *p) __attribute__((nonnull(1)));

void bar (void *p)
{
  h (p == 0);
  foo (p);
  if (!p)
    __builtin_abort ();
}

to

  _1 = p_3(D) == 0B;
  _2 = (int) _1;
  h (_2);
  foo (p_3(D));
  return;


From a risk perspective, I don't think this patch can make us more 
incorrect than we were before.


Oh, and compilation speed. All the patches combined with checking a 
little less than 1% regression overall in EVRP/VRP.  The audit exposed 
that the threader was invoking an unnecessary DOM lookup version of the 
query, and as a result, we see a 1.3% reduction in the time spent in the 
threader.  the overall compile time impact turned out to be a slight 
improvement of 0.02% overall in 380 GCC files (according to valgrind)    
so its basically all good.


Bootstraps with no regressions.  OK for trunk?

Andrew
From ba55e50356632210db368c5f7a84b3e608d21b60 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Fri, 4 Feb 2022 16:50:31 -0500
Subject: [PATCH 3/3] Update non-null interface to provide better precision.

This changes the non-null interface to provide more granularity and faciltate
more precise queries using the new 2 bit system.  This enables the new
register_side_effects code to more precisely track the non-null property.

	PR tree-optimization/104288
	gcc/
	* gimple-range-cache.cc (non_null_ref::non_null_deref_p): Delete.
	(non_null_ref::always_nonnull_p): New.
	(non_null_ref::contains_nonnull_p): New.
	(non_null_ref::is_nonnull_dominated_p): New.
	(non_null_ref::adjust_range): Delete.
	(non_null_ref::maybe_adjust_range: New.
	(ranger_cache::range_of_def): Remove call to adjust_range.
	(ranger_cache::entry_range): Remove call to adjust_range.
	(ranger_cache::exit_range): Use contains_nonnull_p.
	(ranger_cache::range_of_expr): Use always_nonnull_p.
	(ranger_cache::fill_block_cache): Call contains_nonnull_p.
	(ranger_cache::range_from_dom): Call contains_nonnull_p.
	* gimple-range-cache.h (class non_null_ref): Change prototypes.
	* gimple-range-path.cc (path_range_query::range_defined_in_block): Use
	contains_nonnull_p.
	(path_range_query::adjust_for_non_null_uses): Use contains_nonnull_p.
	* gimple-range.cc (gimple_ranger::range_of_expr): Use always_nonnull_p.
	(gimple_ranger::range_on_entry): Use is_nonnull_dominated_p.
	(gimple_ranger::range_on_exit): Use contains_nonnull_p.

	testsuite/gcc/
	* gcc.dg/pr104288.c: New.
---
 gcc/gimple-range-cache.cc   | 89 +++--
 gcc/gimple-range-cache.h|  7 +--
 gcc/gimple-range-path.cc|  7 +--
 gcc/gimple-range.cc | 16 --
 gcc/testsuite/gcc.dg/pr104288.c | 23 +
 5 files changed, 94 insertions(+), 48 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr104288.c

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index d14dc1284f2..72751194c25 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -92,64 +92,79 @@ non_null_ref::set_bb_state (tree name, basic_block bb, unsigned long val)
   bitmap_set_aligned_chunk (m_nn[v], bb->index, 2, val);
 }
 
-// Return true if NAME has a non-null dereference in block bb.  If

[PATCH 2/4] tree-optimization/104288 - Register side effects during ranger vrp domwalk.

2022-02-07 Thread Andrew MacLeod via Gcc-patches

This patch adds the ability to register side effects within a block to 
ranger. This is currently only for tracking non-null values.


the rvrp_folder performs a DOM walk and then attempts to simplify and 
the fold each stmt during the walk.  This patch adds a call to 
register_side_effects() to record any effects the stmt might have before 
returning.


This checks if the non-null property is set for any ssa-names on the 
statement, and if they are, moves the state to only non-null for the 
block...  This allows us to adjust the property within the block as we 
do the DOM walk.  All further queries within the block will then come 
back as non-null.   ALl other queries will be from outside the block, so 
they will see the same results anyway.


Unfortunately, none of the non-null routines in gimple.cc appear to work 
"in bulk", but rather, on a specific tree-ssa operand request at a 
time.  For this patch, I need to scan the entire statement looking for 
any operand that is nonnull (or the performance impact is awful).


I took the code in the various infer_nonnull routines, and packed it all 
together into the two routines block_apply_nonnull && 
infer_nonnull_call_attributes,  which basically perform the same 
functionality as  infer_nonnull_range_by_dereference and 
infer_nonnull_range_by_attribute, only on all the operands at once.  I 
think its right, but you may want to have a look.   I intend to leverage 
this code in GCC13  for a more generalized range_after_stmt mechanism 
that will do away with the immediate-use visits of the current non-null


This patch also has zero effect on generated code as none of the 
nonnull_deref_p() queries make use of the new granularity available 
yet.  It does set the table for the next patch. Performance impact of 
this is also marginal, about a 1% hit to EVRP/VRP.


Bootstraps with no regressions.  OK for trunk?

Andrew



From c4a1e28b5b1dd8f22e7d87f34961596a816c7ad7 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 2 Feb 2022 14:44:57 -0500
Subject: [PATCH 2/3] Register side effects during ranger block walk.

This patch adds the ability to register statement side effects within a block
for ranger.  This is currently only for tracking non-null values.

	* gimple-range-cache.cc (non_null_ref::try_update_to_always_nonnull):
	New.
	(non_null_ref::infer_nonnull_call_attributes): New.
	(non_null_loadstore): New.
	(non_null_ref::block_apply_nonnull): New.
	* gimple-range-cache.h (class non_null_ref): New prototypes.
	* gimple-range.cc (gimple_ranger::register_side_effects): New.
	* gimple-range.h (gimple_ranger::register_side_effects): New.
	* tree-vrp.cc (rvrp_folder::fold_stmt): Call register_side_effects.
---
 gcc/gimple-range-cache.cc | 106 ++
 gcc/gimple-range-cache.h  |   3 ++
 gcc/gimple-range.cc   |  11 
 gcc/gimple-range.h|   1 +
 gcc/tree-vrp.cc   |   8 +--
 5 files changed, 126 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 52094f0bc6a..d14dc1284f2 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -29,6 +29,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-pretty-print.h"
 #include "gimple-range.h"
 #include "tree-cfg.h"
+#include "target.h"
+#include "attribs.h"
+#include "gimple-iterator.h"
+#include "gimple-walk.h"
 
 #define DEBUG_RANGE_CACHE (dump_file	\
 			   && (param_ranger_debug & RANGER_DEBUG_CACHE))
@@ -197,6 +201,108 @@ non_null_ref::process_name (tree name)
 }
 }
 
+// This routine will update NAME on stmt S to always_nonnull state, if it
+// is in contains_nonull state.  This is used during a block walk to move
+// NAME to known nonnull.
+
+void
+non_null_ref::try_update_to_always_nonnull (gimple *s, tree name)
+{
+  if (gimple_range_ssa_p (name) && POINTER_TYPE_P (TREE_TYPE (name)))
+{
+  unsigned long state = get_bb_state (name, gimple_bb (s));
+  // Only process when both conditions exist in the BB.
+  if (state != NN_BOTH)
+	return;
+  // From this point forward, we'll be NON-NULL for sure if we see one.
+  set_bb_state (name, gimple_bb (s), NN_NON_ZERO);
+}
+}
+
+// Adapted from infer_nonnull_range_by_attribute for calls to process
+// all operands which have the nonnull attribute set in stmt.
+
+void
+non_null_ref::infer_nonnull_call_attributes (gcall *stmt)
+{
+  if (gimple_call_internal_p (stmt))
+return;
+
+  tree fntype = gimple_call_fntype (stmt);
+  tree attrs = TYPE_ATTRIBUTES (fntype);
+  for (; attrs; attrs = TREE_CHAIN (attrs))
+{
+  attrs = lookup_attribute ("nonnull", attrs);
+
+  /* If "nonnull" wasn't specified, we know nothing about
+	 the argument.  */
+  if (attrs == NULL_TREE)
+	return;
+
+  /* Check if "nonnull" applies to all the arguments.  */
+  if (TREE_VALUE (attrs) == NULL_TREE)
+	{
+	  for (unsigned int i = 0; i < gimple_call_num_args (stmt); i++)
+	{
+	  tree op =

[PATCH 1/4] tree-optimization/104288 - Change non-null tracking from one bit to a pair.

2022-02-07 Thread Andrew MacLeod via Gcc-patches


This patch changes the non-null tracking in ranger to use 2 bits in order
to distinguish between blocks which contain uses of NAME in:
  1) non-null generating statements,  (1 bit)
  2) statements which are not non-null generating (the other bit),  
THis is the new bit.

  3) and the existence of both.  (both bits)

It utilizes the aligned_chunk sparse bitmap code to track these states 
for each name, and has zero impact on functionality otherwise as the 
nonnull_deref_p routine still keys off a non-null access anywhere in the 
block.   This ensure we add the new capabilities without affected code 
generation or anything else.


There is a get/set routines, but internally i usually defer directly to 
the get/set_aligned_chunk routine for efficiency.  The overall impact is 
under .5% of EVRP/VRP compile time.  Most of the additional time is due 
to now setting a bit for all uses, even if the non-null property isnt set.


Bootstraps with no regressions.  OK for trunk?

Andrew
From 6cd863984a6bf49dc5d6ff00a510003679842e74 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 2 Feb 2022 14:43:29 -0500
Subject: [PATCH 1/3] Change non-null tracking from one bit to a pair.

This patch changes the non-null tracking in ranger to use 2 bits in order
to distinguish between blocks which contain uses of NAME in:
  1) non-null generating statements,
  2) statements which are not non-null,
  3) and the existance of both.

	* gimple-range-cache.cc (non_null_ref::get_bb_state): New.
	(non_null_ref::set_bb_state): New.
	(non_null_ref::non_null_deref_p): Use new 2 bit values.
	(non_null_ref::process_name): Use new 2 bit values.
	* gimple-range-cache.h (class non_null_ref): New prototypes.
---
 gcc/gimple-range-cache.cc | 82 ---
 gcc/gimple-range-cache.h  |  2 +
 2 files changed, 62 insertions(+), 22 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 16c881b13e1..52094f0bc6a 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -33,6 +33,13 @@ along with GCC; see the file COPYING3.  If not see
 #define DEBUG_RANGE_CACHE (dump_file	\
 			   && (param_ranger_debug & RANGER_DEBUG_CACHE))
 
+// 2 bits represent whether a non_zero or varying use were seen in a block.
+
+#define NN_NONE		0
+#define NN_VARYING	1
+#define NN_NON_ZERO	2
+#define NN_BOTH		(NN_VARYING | NN_NON_ZERO)
+
 // During contructor, allocate the vector of ssa_names.
 
 non_null_ref::non_null_ref ()
@@ -50,16 +57,26 @@ non_null_ref::~non_null_ref ()
   m_nn.release ();
 }
 
-// Return true if NAME has a non-null dereference in block bb.  If this is the
-// first query for NAME, calculate the summary first.
-// If SEARCH_DOM is true, the search the dominator tree as well.
+// Return the non-null state of NAME in block BB.
 
-bool
-non_null_ref::non_null_deref_p (tree name, basic_block bb, bool search_dom)
+unsigned long
+non_null_ref::get_bb_state (tree name, basic_block bb)
 {
-  if (!POINTER_TYPE_P (TREE_TYPE (name)))
-return false;
+  unsigned v = SSA_NAME_VERSION (name);
+  if (v >= m_nn.length ())
+m_nn.safe_grow_cleared (num_ssa_names + 1);
+
+  if (!m_nn[v])
+process_name (name);
+
+  return bitmap_get_aligned_chunk (m_nn[v], bb->index, 2);
+}
 
+// Set the non-null state of NAME in block BB to VAL.
+
+void
+non_null_ref::set_bb_state (tree name, basic_block bb, unsigned long val)
+{
   unsigned v = SSA_NAME_VERSION (name);
   if (v >= m_nn.length ())
 m_nn.safe_grow_cleared (num_ssa_names + 1);
@@ -67,7 +84,22 @@ non_null_ref::non_null_deref_p (tree name, basic_block bb, bool search_dom)
   if (!m_nn[v])
 process_name (name);
 
-  if (bitmap_bit_p (m_nn[v], bb->index))
+  gcc_checking_assert (val <= NN_BOTH);
+  bitmap_set_aligned_chunk (m_nn[v], bb->index, 2, val);
+}
+
+// Return true if NAME has a non-null dereference in block bb.  If this is the
+// first query for NAME, calculate the summary first.
+// If SEARCH_DOM is true, the search the dominator tree as well.
+
+bool
+non_null_ref::non_null_deref_p (tree name, basic_block bb, bool search_dom)
+{
+  unsigned v = SSA_NAME_VERSION (name);
+  if (!POINTER_TYPE_P (TREE_TYPE (name)))
+return false;
+
+  if (get_bb_state (name, bb) >= NN_NON_ZERO)
 return true;
 
   // See if any dominator has set non-zero.
@@ -81,7 +113,7 @@ non_null_ref::non_null_deref_p (tree name, basic_block bb, bool search_dom)
   for ( ;
 	bb && bb != def_dom;
 	bb = get_immediate_dominator (CDI_DOMINATORS, bb))
-	if (bitmap_bit_p (m_nn[v], bb->index))
+	if (bitmap_get_aligned_chunk (m_nn[v], bb->index, 2) >= NN_NON_ZERO)
 	  return true;
 }
   return false;
@@ -116,9 +148,8 @@ non_null_ref::adjust_range (irange , tree name, basic_block bb,
   return false;
 }
 
-// Allocate an populate the bitmap for NAME.  An ON bit for a block
-// index indicates there is a non-null reference in that block.  In
-// order to populate the bitmap, a quick run of all the immediate uses
+// Allocate and

[PATCH 0/4] tree-optimization/104288 - Add more granularity to non-null tracking

2022-02-07 Thread Andrew MacLeod via Gcc-patches


I have a proposal for PR 104288.

This PR highlights the shortcomings of assuming that when the non-null 
property is seen within a block, that it applies to the entire block.


This is a 3 patch set, but is less dramatic than it sounds.  I just 
wanted to split it into small bits for safety and readability. The 
changes boil down to making the same queries as before, only with more 
precision.


patch set 1 changes the nonnull tracking system from a single bit to a 
pair of bits.  This uses the aligned_chunk code previously added to the 
sparse bitmaps, and the overall impact is minimal. This allows us to 
trace whether a block is non-null throughout the entire block, or just 
part of the block.


patch set 2 added a callback into ranger to register side effects of a 
statement after ranger_vrp has folded a statement during its domwalk.  
This simply checks if infer_nonnull_range is true on the statement just 
processed, and if it is, lets the non-null class know which then changes 
the state in that block to always non-null. There is no impact to the 
on-demand clients.


patch set 3 then added a new interface to non-null in which we can ask 
more refined question about the state of nonnull in a block.  All 
queries are then audited such that
  1) All queries within a block require that the nonnull property be 
100% guaranteed to be nonull in the block, or we dont adjust the range.
  2) Range on entry queries check if any dominator block has the 
non-null property at all.
This will ensure we don't make the assumption that non-null somewhere in 
the block always applies to the entire block.


The 3rd set also adds a more complicated test to the test suite to 
ensure we cannot remove a use early in the block even if we find it is 
non-null later in the same block, but then we will remove it later in 
the block.


the 4th patch is the GCC11 version.. much simpler, it only ever checks 
if non-null is true in a dominator, and doesn't check within the current 
block.  This works fine as EVRP runs in hybrid mode by default and evrp 
will pick up the inter-block granularity.


More details are provided with each patch. I think the risk is low since 
I do not believe we can do anything worse than we did before :-).  We 
can discuss alternatives if you think its too much at this stage.


ALL patches (in sequence) bootstrap on their own and each cause no 
regressions.


Andrew

Re: [PATCH] testsuite: Fix up testsuite/gcc.c-torture/execute/builtins/lib/chk.c for powerpc [PR104380]

2022-02-07 Thread Jakub Jelinek via Gcc-patches

On Fri, Feb 04, 2022 at 12:00:57PM -0500, David Edelsohn via Gcc-patches wrote:
> > The following testcase FAILs when configured with
> > --with-long-double-format=ieee .  Only happens in the -std=c* modes, not the
> > GNU modes; while the glibc headers have __asm redirects of
> > vsnprintf and __vsnprinf_chk to __vsnprintfieee128 and
> > __vsnprintf_chkieee128, the vsnprintf fortification extern inline gnu_inline
> > always_inline wrapper calls __builtin_vsnprintf_chk and we actually emit
> > a call to __vsnprinf_chk (i.e. with IBM extended long double) instead of
> > __vsnprintf_chkieee128.
> >
> > rs6000_mangle_decl_assembler_name already had cases for *printf and *scanf,
> > so this just adds another case for *printf_chk.  *scanf_chk doesn't exist.
> > __ prefixing isn't done because *printf_chk already starts with __.
> >
> > Bootstrapped/regtested on powerpc64le-linux, ok for trunk?
> 
> Okay.

Unfortunately, while I've tested the testcase also with -mabi=ieeelongdouble
by hand, the full bootstrap/regtest was on GCCFarm where glibc is too old
to test with --with-long-double-format=ieee.
I've done full bootstrap/regtest with that option during the weekend and
the patch regressed:
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O1 
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O2 
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O2 -flto 
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions 
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -Og -g 
FAIL: gcc.c-torture/execute/builtins/snprintf-chk.c execution,  -Os 
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O1 
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O2 
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O2 -flto 
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions 
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -Og -g 
FAIL: gcc.c-torture/execute/builtins/sprintf-chk.c execution,  -Os 
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O1 
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O2 
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O2 -flto 
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions 
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -Og -g 
FAIL: gcc.c-torture/execute/builtins/vsnprintf-chk.c execution,  -Os 
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O1 
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O2 
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O2 -flto 
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions 
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -Og -g 
FAIL: gcc.c-torture/execute/builtins/vsprintf-chk.c execution,  -Os 

The problem is that the execute/builtins/ testsuite wants to override some
of the library functions and with the change we (correctly) call
__*printf_chkieee128 and so lib/chk.c is no longer called but the glibc
APIs are.

The following patch fixes it.

Tested on powerpc64le-linux, ok for trunk?

2022-02-07  Jakub Jelinek  

PR target/104380
* gcc.c-torture/execute/builtins/lib/chk.c (__sprintf_chkieee128,
__vsprintf_chkieee128, __snprintf_chkieee128,
__vsnprintf_chkieee128): New aliases to non-ieee128 suffixed functions
for powerpc -mabi=ieeelongdouble.

--- gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c.jj   2022-01-05 
20:30:08.852805055 +0100
+++ gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c  2022-02-07 
13:10:51.474447998 +0100
@@ -517,3 +517,14 @@ vsnprintf (char *str,

Re: [patch gcc13] middle-end/70090: Dynamic sizes for -fsanitize=object-size

2022-02-07 Thread Jakub Jelinek via Gcc-patches

On Mon, Feb 07, 2022 at 05:31:58PM +0530, Siddhesh Poyarekar wrote:
> Use __builtin_dynamic_object_size to get object sizes for ubsan.
> 
> gcc/ChangeLog:
> 
>   middle-end/70090
>   * ubsan.cc (ubsan_expand_objsize_ifn): Allow non-constant SIZE.
>   (instrument_object_size): Get dynamic object size expression.
> 
> gcc/testsuite/ChangeLog:
> 
>   middle-end/70090
>   * gcc.dg/ubsan/object-size-dyn.c: New test.
> 
> Signed-off-by: Siddhesh Poyarekar 
> ---
> Proposing for gcc13 since I reckoned this is not feasible for stage 4.

Ok for stage1.

Jakub

[patch gcc13] middle-end/70090: Dynamic sizes for -fsanitize=object-size

2022-02-07 Thread Siddhesh Poyarekar

Use __builtin_dynamic_object_size to get object sizes for ubsan.

gcc/ChangeLog:

middle-end/70090
* ubsan.cc (ubsan_expand_objsize_ifn): Allow non-constant SIZE.
(instrument_object_size): Get dynamic object size expression.

gcc/testsuite/ChangeLog:

middle-end/70090
* gcc.dg/ubsan/object-size-dyn.c: New test.

Signed-off-by: Siddhesh Poyarekar 
---
Proposing for gcc13 since I reckoned this is not feasible for stage 4.

Tested with:
- ubsan bootstrap config on x86_64
- bootstrap build and test on x86_64
- non-bootstrap build and test with i686

 gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c | 45 
 gcc/ubsan.cc | 13 +++---
 2 files changed, 52 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c

diff --git a/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c 
b/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c
new file mode 100644
index 000..0159f5b9820
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
+/* { dg-options "-fsanitize=undefined" } */
+#include 
+
+int
+__attribute__ ((noinline))
+dyn (int size, int i)
+{
+  __builtin_printf ("dyn\n");
+  fflush (stdout);
+  int *alloc = __builtin_calloc (size, sizeof (int));
+  int ret = alloc[i];
+  __builtin_free (alloc);
+  return ret;
+}
+
+int
+__attribute__ ((noinline))
+off (int size, int i, int ret)
+{
+  char *mem = __builtin_alloca (size);
+  mem += size - 1;
+
+  return (int) mem[i] & ret;
+}
+
+int
+main (void)
+{
+  int ret = dyn (2, 2);
+
+  ret |= off (4, 4, 0);
+
+  return ret;
+}
+
+/* { dg-output "load of address \[^\n\r]* with insufficient space for an 
object of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*note: pointer points here\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*\\^\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*load of address \[^\n\r]* with insufficient space for 
an object of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*note: pointer points here\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*\\^" } */
diff --git a/gcc/ubsan.cc b/gcc/ubsan.cc
index 5641d3cc3be..11dad4f1095 100644
--- a/gcc/ubsan.cc
+++ b/gcc/ubsan.cc
@@ -942,8 +942,8 @@ ubsan_expand_objsize_ifn (gimple_stmt_iterator *gsi)
   gimple *g;
 
   /* See if we can discard the check.  */
-  if (TREE_CODE (size) != INTEGER_CST
-  || integer_all_onesp (size))
+  if (TREE_CODE (size) == INTEGER_CST
+  && integer_all_onesp (size))
 /* Yes, __builtin_object_size couldn't determine the
object size.  */;
   else if (TREE_CODE (offset) == INTEGER_CST
@@ -2160,14 +2160,14 @@ instrument_object_size (gimple_stmt_iterator *gsi, tree 
t, bool is_lhs)
   if (decl_p)
 base_addr = build1 (ADDR_EXPR,
build_pointer_type (TREE_TYPE (base)), base);
-  if (compute_builtin_object_size (base_addr, 0, ))
+  if (compute_builtin_object_size (base_addr, OST_DYNAMIC, ))
 ;
   else if (optimize)
 {
   if (LOCATION_LOCUS (loc) == UNKNOWN_LOCATION)
loc = input_location;
-  /* Generate __builtin_object_size call.  */
-  sizet = builtin_decl_explicit (BUILT_IN_OBJECT_SIZE);
+  /* Generate __builtin_dynamic_object_size call.  */
+  sizet = builtin_decl_explicit (BUILT_IN_DYNAMIC_OBJECT_SIZE);
   sizet = build_call_expr_loc (loc, sizet, 2, base_addr,
   integer_zero_node);
   sizet = force_gimple_operand_gsi (gsi, sizet, false, NULL_TREE, true,
@@ -2219,7 +2219,8 @@ instrument_object_size (gimple_stmt_iterator *gsi, tree 
t, bool is_lhs)
}
 }
 
-  if (bos_stmt && gimple_call_builtin_p (bos_stmt, BUILT_IN_OBJECT_SIZE))
+  if (bos_stmt
+  && gimple_call_builtin_p (bos_stmt, BUILT_IN_DYNAMIC_OBJECT_SIZE))
 ubsan_create_edge (bos_stmt);
 
   /* We have to emit the check.  */
-- 
2.34.1

Re: [PATCH] Check always_inline flag in s390_can_inline_p [PR104327]

2022-02-07 Thread Andreas Krebbel via Gcc-patches

On 2/7/22 09:11, Jakub Jelinek wrote:
...
> 1) formatting, = should be at the start of next line rather than end of the
>line
> 2) all_masks, always_inline_safe_masks and caller_required_masks aren't
>ever modified, perhaps make them const?
> 3) I wonder if there is any advantage to have all_masks with all the masks
>enumerated, compared to
>const HOST_WIDE_INT all_masks
>  = (caller_required_masks | must_match_masks | always_inline_safe_masks
>   | MASK_DEBUG_ARG | MASK_PACKED_STACK | MASK_ZVECTOR);
>i.e. when you add a new mask, instead of listing it in all_masks
>and one or more of the other vars you'd just stick it either in one
>or more of those vars or in all_masks.

I've just committed the patch with these changes. Thanks Jakub!

Andreas


diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 5c2a830f9f0..c6cfe41ad7b 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -16091,6 +16091,23 @@ s390_valid_target_attribute_p (tree fndecl,
 static bool
 s390_can_inline_p (tree caller, tree callee)
 {
+  /* Flags which if present in the callee are required in the caller as well.  
*/
+  const unsigned HOST_WIDE_INT caller_required_masks = MASK_OPT_HTM;
+
+  /* Flags which affect the ABI and in general prevent inlining.  */
+  unsigned HOST_WIDE_INT must_match_masks
+= (MASK_64BIT | MASK_ZARCH | MASK_HARD_DFP | MASK_SOFT_FLOAT
+   | MASK_LONG_DOUBLE_128 | MASK_OPT_VX);
+
+  /* Flags which we in general want to prevent inlining but accept for
+ always_inline.  */
+  const unsigned HOST_WIDE_INT always_inline_safe_masks
+= MASK_MVCLE | MASK_BACKCHAIN | MASK_SMALL_EXEC;
+
+  const HOST_WIDE_INT all_masks
+ = (caller_required_masks | must_match_masks | always_inline_safe_masks
+   | MASK_DEBUG_ARG | MASK_PACKED_STACK | MASK_ZVECTOR);
+
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);

@@ -16103,16 +16120,18 @@ s390_can_inline_p (tree caller, tree callee)

   struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
   struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
-  bool ret = true;

-  if ((caller_opts->x_target_flags & ~(MASK_SOFT_FLOAT | MASK_HARD_DFP))
-  != (callee_opts->x_target_flags & ~(MASK_SOFT_FLOAT | MASK_HARD_DFP)))
-ret = false;
+  /* If one of these triggers make sure to add proper handling of your
+ new flag to this hook.  */
+  gcc_assert (!(caller_opts->x_target_flags & ~all_masks));
+  gcc_assert (!(callee_opts->x_target_flags & ~all_masks));

-  /* Don't inline functions to be compiled for a more recent arch into a
- function for an older arch.  */
-  else if (caller_opts->x_s390_arch < callee_opts->x_s390_arch)
-ret = false;
+  bool always_inline
+= (DECL_DISREGARD_INLINE_LIMITS (callee)
+   && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)));
+
+  if (!always_inline)
+must_match_masks |= always_inline_safe_masks;

   /* Inlining a hard float function into a soft float function is only
  allowed if the hard float function doesn't actually make use of
@@ -16120,16 +16139,27 @@ s390_can_inline_p (tree caller, tree callee)

  We are called from FEs for multi-versioning call optimization, so
  beware of ipa_fn_summaries not available.  */
-  else if (((TARGET_SOFT_FLOAT_P (caller_opts->x_target_flags)
-&& !TARGET_SOFT_FLOAT_P (callee_opts->x_target_flags))
-   || (!TARGET_HARD_DFP_P (caller_opts->x_target_flags)
-   && TARGET_HARD_DFP_P (callee_opts->x_target_flags)))
-  && (! ipa_fn_summaries
-  || ipa_fn_summaries->get
-  (cgraph_node::get (callee))->fp_expressions))
-ret = false;
+  if (always_inline && ipa_fn_summaries
+  && !ipa_fn_summaries->get(cgraph_node::get (callee))->fp_expressions)
+must_match_masks &= ~(MASK_HARD_DFP | MASK_SOFT_FLOAT);

-  return ret;
+  if ((caller_opts->x_target_flags & must_match_masks)
+  != (callee_opts->x_target_flags & must_match_masks))
+return false;
+
+  if (~(caller_opts->x_target_flags & caller_required_masks)
+  & (callee_opts->x_target_flags & caller_required_masks))
+return false;
+
+  /* Don't inline functions to be compiled for a more recent arch into a
+ function for an older arch.  */
+  if (caller_opts->x_s390_arch < callee_opts->x_s390_arch)
+return false;
+
+  if (!always_inline && caller_opts->x_s390_tune != callee_opts->x_s390_tune)
+return false;
+
+  return true;
 }
 #endif

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr104327.c
b/gcc/testsuite/gcc.c-torture/compile/pr104327.c
new file mode 100644
index 000..d54e5d58cc4
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr104327.c
@@ -0,0 +1,15 @@
+/* PR target/104327 */
+
+void foo (int *);
+
+static inline __attribute__((always_inline)) void
+bar (int *x)
+{
+  foo (x);
+}
+
+__attribute__((cold,

[PATCH] middle-end/104402 - split out _Complex compares from COND_EXPRs

2022-02-07 Thread Richard Biener via Gcc-patches

This makes sure we always have a _Complex compare split to a
different stmt for the compare operand in a COND_EXPR on GIMPLE.
Complex lowering doesn't handle this and the change is something
we want for all kind of compares at some point.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk
sofar.

2022-02-07  Richard Biener  

PR middle-end/104402
* gimple-expr.cc (is_gimple_condexpr): _Complex typed
compares are not valid.
* tree-cfg.cc (verify_gimple_assign_ternary): For COND_EXPR
check is_gimple_condexpr.

* gcc.dg/torture/pr104402.c: New testcase.
---
 gcc/gimple-expr.cc  | 20 +---
 gcc/testsuite/gcc.dg/torture/pr104402.c |  8 
 gcc/tree-cfg.cc |  9 +
 3 files changed, 26 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr104402.c

diff --git a/gcc/gimple-expr.cc b/gcc/gimple-expr.cc
index 05b12747499..f9a650b5daf 100644
--- a/gcc/gimple-expr.cc
+++ b/gcc/gimple-expr.cc
@@ -602,12 +602,16 @@ is_gimple_lvalue (tree t)
 /* Helper for is_gimple_condexpr and is_gimple_condexpr_for_cond.  */
 
 static bool
-is_gimple_condexpr_1 (tree t, bool allow_traps)
+is_gimple_condexpr_1 (tree t, bool allow_traps, bool allow_cplx)
 {
-  return (is_gimple_val (t) || (COMPARISON_CLASS_P (t)
-   && (allow_traps || !tree_could_throw_p (t))
-   && is_gimple_val (TREE_OPERAND (t, 0))
-   && is_gimple_val (TREE_OPERAND (t, 1;
+  tree op0;
+  return (is_gimple_val (t)
+ || (COMPARISON_CLASS_P (t)
+ && (allow_traps || !tree_could_throw_p (t))
+ && ((op0 = TREE_OPERAND (t, 0)), true)
+ && (allow_cplx || TREE_CODE (TREE_TYPE (op0)) != COMPLEX_TYPE)
+ && is_gimple_val (op0)
+ && is_gimple_val (TREE_OPERAND (t, 1;
 }
 
 /* Return true if T is a GIMPLE condition.  */
@@ -615,7 +619,9 @@ is_gimple_condexpr_1 (tree t, bool allow_traps)
 bool
 is_gimple_condexpr (tree t)
 {
-  return is_gimple_condexpr_1 (t, true);
+  /* Always split out _Complex type compares since complex lowering
+ doesn't handle this case.  */
+  return is_gimple_condexpr_1 (t, true, false);
 }
 
 /* Like is_gimple_condexpr, but does not allow T to trap.  */
@@ -623,7 +629,7 @@ is_gimple_condexpr (tree t)
 bool
 is_gimple_condexpr_for_cond (tree t)
 {
-  return is_gimple_condexpr_1 (t, false);
+  return is_gimple_condexpr_1 (t, false, true);
 }
 
 /* Return true if T is a gimple address.  */
diff --git a/gcc/testsuite/gcc.dg/torture/pr104402.c 
b/gcc/testsuite/gcc.dg/torture/pr104402.c
new file mode 100644
index 000..1cb0370e9b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr104402.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+
+_Complex int a;
+char b;
+void c() {
+  if (b != 2 + (long)(a != 0 ^ 0))
+__builtin_abort();
+}
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 260a7fb97c6..e321d929fd0 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -4289,10 +4289,11 @@ verify_gimple_assign_ternary (gassign *stmt)
   /* Fallthrough.  */
 case COND_EXPR:
   if (!is_gimple_val (rhs1)
- && verify_gimple_comparison (TREE_TYPE (rhs1),
-  TREE_OPERAND (rhs1, 0),
-  TREE_OPERAND (rhs1, 1),
-  TREE_CODE (rhs1)))
+ && (!is_gimple_condexpr (rhs1)
+ || verify_gimple_comparison (TREE_TYPE (rhs1),
+  TREE_OPERAND (rhs1, 0),
+  TREE_OPERAND (rhs1, 1),
+  TREE_CODE (rhs1
return true;
   if (!useless_type_conversion_p (lhs_type, rhs2_type)
  || !useless_type_conversion_p (lhs_type, rhs3_type))
-- 
2.34.1

Re: [PATCH v3] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]

2022-02-07 Thread Kewen.Lin via Gcc-patches

Hi Segher,

Thanks for the comments!

on 2022/2/7 下午3:53, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Jan 05, 2022 at 03:34:41PM +0800, Kewen.Lin wrote:
>> This patch is to fix the inconsistent behaviors for non-LTO mode
>> and LTO mode.  As Martin pointed out, currently the function
>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>> NULL, but it's unexpected, we should use the command line options
>> from target_option_default_node as default.
> 
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -25379,55 +25379,87 @@ rs6000_update_ipa_fn_target_info (unsigned int 
>> , const gimple *stmt)
>>  static bool
>>  rs6000_can_inline_p (tree caller, tree callee)
>>  {
>> -  bool ret = false;
>>tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
>>tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
>>  
>> -  /* If the callee has no option attributes, then it is ok to inline.  */
>> +  /* If the caller/callee has option attributes, then use them.
>> + Otherwise, use the command line options.  */
>>if (!callee_tree)
>> -ret = true;
>> -
>> -  else
>> -{
>> -  HOST_WIDE_INT caller_isa;
>> -  struct cl_target_option *callee_opts = TREE_TARGET_OPTION 
>> (callee_tree);
>> -  HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
>> -  HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
>> +callee_tree = target_option_default_node;
>> +  if (!caller_tree)
>> +caller_tree = target_option_default_node;
>> +
>> +  struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
>> +  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
>> +  HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags;
>> +  HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
>> +
>> +  bool always_inline
>> += DECL_DISREGARD_INLINE_LIMITS (callee)
>> +  && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee));
>> +
>> +  /* Some features can be tolerated for always inlines.  */
>> +  unsigned HOST_WIDE_INT always_inline_safe_mask
>> +/* Fusion option masks.  */
>> += OPTION_MASK_P8_FUSION | OPTION_MASK_P10_FUSION
>> +  | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION
>> +  | OPTION_MASK_P10_FUSION_LD_CMPI | OPTION_MASK_P10_FUSION_2LOGICAL
>> +  | OPTION_MASK_P10_FUSION_LOGADD | OPTION_MASK_P10_FUSION_ADDLOG
>> +  | OPTION_MASK_P10_FUSION_2ADD
>> +  /* Like fusion, some option masks which are just for optimization.  */
>> +  | OPTION_MASK_SAVE_TOC_INDIRECT | OPTION_MASK_PCREL_OPT;
> 
> Why is this?  I would expect only two or three options to be *not* safe!
> 
> You have a strange mix of internal options (the FUSION* things) and some
> other options that do not change the semantics at all.  But this is true
> for almost *all* options we have!  What would not allow a callee that is
> allowed to use some newer ISA's insns into a caller that does not allow
> them?  Both when it ends up using those insns and when it does not, the
> end result is valid.  If there is something internal to GCC that causes
> ICEs we need to do something about *that*!
> 

The reason why I used these options is that all of them are mainly for
performance purpose, they are not to control if some insns are available
to generate.  Sorry that if they look strangely mixed.

IMHO it's not "true for almost *all* options we have".  For the
options at the beginning of rs6000 options manual page.

  -mpowerpc-gpopt  [guarding bifs]
  -mpowerpc-gfxopt [guarding bifs]
  -mpowerpc64
  -mmfcrf
  -mpopcntb  [guarding bifs]
  -mpopcntd  [guarding bifs]
  -mfprnd
  -mcmpb  [guarding bifs]
  -mhard-dfp  [guarding bifs]
  ...

Imagining that if the callee has some built-in functions or some inline
assembly which requires the related hardware feature provided, I think we
shouldn't claim they are safe even with always_inline.  For example, for
built-in functions, there will be some error messages without related
features supported.

>> +  /* Some features are totally safe for inlining (or always inlines),
>> + let's exclude them from the following checkings.  */
>> +  HOST_WIDE_INT safe_mask = always_inline ? always_inline_safe_mask : 0;
>> +  cgraph_node *callee_node = cgraph_node::get (callee);
>> +  if (ipa_fn_summaries && ipa_fn_summaries->get (callee_node) != NULL)
>> +{
>> +  unsigned int info = ipa_fn_summaries->get (callee_node)->target_info;
>> +  if ((info & RS6000_FN_TARGET_INFO_HTM) == 0)
>> +safe_mask |= OPTION_MASK_HTM;
>> +}
> 
> always_inline means *always*.  If the compiler cannot do this it should
> not, but then error; in all other cases it should do it.
> 

Maybe there are some user cases we want to tolerate?  Like this PR?

> This patch is pretty much equal to not allowing any inlining if the
> caller and callee have not identical compilation options.  So sure, it
> causes us to not have any unwanted inlining, but it does

Re: [PATCH] RISC-V: Enable TARGET_SUPPORTS_WIDE_INT

2022-02-07 Thread Philipp Tomsich

Vineet,

On Mon, 7 Feb 2022 at 07:06, Vineet Gupta  wrote:
>
> This is at par with other major arches such as aarch64, i386, s390 ...
>
> No testsuite regressions: same numbers w/ w/o

Putting that check seems a good idea, but I haven't seen any cases
related to this get through anyway.
Do you have seen any instances where the backend got this wrong? If
so, please share, so we can run a fuller regression and see any
performance impact.

Thanks,
Philipp.

> |   === gcc Summary ===
> |
> |# of expected passes   113392
> |# of unexpected failures   27
> |# of unexpected successes  3
> |# of expected failures 605
> |# of unsupported tests 2523
> |
> |   === g++ Summary ===
> |
> |# of expected passes   172997
> |# of unexpected failures   26
> |# of expected failures 706
> |# of unsupported tests 9566
>
> Signed-off-by: Vineet Gupta 
> ---
>  gcc/config/riscv/predicates.md | 2 +-
>  gcc/config/riscv/riscv.c   | 6 ++
>  gcc/config/riscv/riscv.h   | 2 ++
>  3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index 3da6fd4c0491..cf902229954b 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -52,7 +52,7 @@
> (match_test "INTVAL (op) + 1 != 0")))
>
>  (define_predicate "const_0_operand"
> -  (and (match_code "const_int,const_wide_int,const_double,const_vector")
> +  (and (match_code "const_int,const_wide_int,const_vector")
> (match_test "op == CONST0_RTX (GET_MODE (op))")))
>
>  (define_predicate "reg_or_0_operand"
> diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> index c830cd8f4ad1..d2f2d9e0276f 100644
> --- a/gcc/config/riscv/riscv.c
> +++ b/gcc/config/riscv/riscv.c
> @@ -1774,6 +1774,12 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
> outer_code, int opno ATTRIBUTE_UN
>  case SYMBOL_REF:
>  case LABEL_REF:
>  case CONST_DOUBLE:
> +  /* With TARGET_SUPPORTS_WIDE_INT const int can't be in CONST_DOUBLE
> + rtl object. Weird recheck due to switch-case fall through above.  */
> +  if (GET_CODE (x) == CONST_DOUBLE)
> +gcc_assert (GET_MODE (x) != VOIDmode);
> +  /* Fall through.  */
> +
>  case CONST:
>if ((cost = riscv_const_insns (x)) > 0)
> {
> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index ff6729aedac2..91cfc82b4aa4 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -997,4 +997,6 @@ extern void riscv_remove_unneeded_save_restore_calls 
> (void);
>
>  #define HARD_REGNO_RENAME_OK(FROM, TO) riscv_hard_regno_rename_ok (FROM, TO)
>
> +#define TARGET_SUPPORTS_WIDE_INT 1
> +
>  #endif /* ! GCC_RISCV_H */
> --
> 2.32.0
>

[AArch64] Question about the condition of calls_alloca in aarch64_layout_frame

2022-02-07 Thread Dan Li via Gcc-patches


There is the following code in aarch64_layout_frame:

else if (crtl->outgoing_args_size.is_constant (_outgoing_args_size)
 && frame.saved_regs_size.is_constant (_saved_regs_size)
 && const_outgoing_args_size + const_saved_regs_size < 512
 && (!saves_below_hard_fp_p || const_outgoing_args_size == 0)
// 1)
 && !(cfun->calls_alloca 
  && frame.hard_fp_offset.is_constant (_fp_offset)
  && const_fp_offset < max_push_offset))
  {
/* Frame with small outgoing arguments:

   sub sp, sp, frame_size
   stp reg1, reg2, [sp, outgoing_args_size]
   stp reg3, reg4, [sp, outgoing_args_size + 16]  */
frame.initial_adjust = frame.frame_size;
frame.callee_offset = const_outgoing_args_size;
  }
..
else if (frame.hard_fp_offset.is_constant (_fp_offset)
 && const_fp_offset < max_push_offset)
  {
// 2)
/* Frame with large outgoing arguments or SVE saves, but with
   a small local area:

   stp reg1, reg2, [sp, -hard_fp_offset]!
   stp reg3, reg4, [sp, 16]
   [sub sp, sp, below_hard_fp_saved_regs_size]
   [save SVE registers relative to SP]
   sub sp, sp, outgoing_args_size  */


As described in 2), "Frame with large outgoing arguments or SVE saves,
but with a small local area".

But due to the condition at 1), the following code (small outgoing with
a small local area) also uses the insns in 2), which is slightly different
from the description:

//aarch64-linux-gnu-gcc   main.c -O0 -S main.s -g -w -fomit-frame-pointer

#include 
#include 
#define REP9(X) X,X,X,X,X,X,X,X,X

int alloc_size;

int main(void)
{
outgoing (REP9(1));
char * y = alloca(alloc_size);
return 0;
}


Could the condition in 1) be removed, or am I missing something?

Thanks,
Dan.

Re: [PATCH v2] rs6000: Disable MMA if no VSX support [PR103627]

2022-02-07 Thread Segher Boessenkool

On Mon, Feb 07, 2022 at 01:54:00PM +0800, Kewen.Lin wrote:
> on 2022/1/28 上午1:17, Segher Boessenkool wrote:
> > On Thu, Jan 27, 2022 at 07:21:33PM +0800, Kewen.Lin wrote:
> >>PR target/103627
> >>* config/rs6000/rs6000.cc (rs6000_option_override_internal): Disable
> >>MMA if !TARGET_VSX.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>PR target/103627
> >>* gcc.target/powerpc/pr103627-1.c: New test.
> >>* gcc.target/powerpc/pr103627-2.c: New test.
> > 
> > Okay for trunk.  Thanks!
> 
> Thanks Segher!  Commit it as r12-7078 and its successor as r12-7079.
> 
> Since it's related to MMA support, I guess we want to backport it?

Probably, yes.  (Not because it is related to MMA, but more because the
ICE will also exist on older compilers).

> If so, is it ok to backport to GCC 10 and 11 after one week or so?

Okay.  Thanks!


Segher

Re: [PATCH] Check always_inline flag in s390_can_inline_p [PR104327]

2022-02-07 Thread Jakub Jelinek via Gcc-patches

On Mon, Feb 07, 2022 at 08:09:47AM +0100, Andreas Krebbel via Gcc-patches wrote:
> MASK_MVCLE is set for -Os but not for other optimization levels. In
> general it should not make much sense to inline across calls where the
> flag is different but we have to allow it for always_inline.
> 
> The patch also rearranges the hook implementation a bit based on the
> recommendations from Jakub und Martin in the PR.
> 
> Bootstrapped and regression tested on s390x with various arch flags.
> Will commit after giving a few days for comments.
> 
> gcc/ChangeLog:
> 
>   PR target/104327
>   * config/s390/s390.cc (s390_can_inline_p): Accept a few more flags
>   if always_inline is set. Don't inline when tune differs without
>   always_inline.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/104327
>   * gcc.c-torture/compile/pr104327.c: New test.

Looks mostly good to me, except:

> ---
>  gcc/config/s390/s390.cc   | 66 ++-
>  .../gcc.c-torture/compile/pr104327.c  | 15 +
>  2 files changed, 64 insertions(+), 17 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr104327.c
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index 5c2a830f9f0..bbf2dd8dfb4 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -16091,6 +16091,25 @@ s390_valid_target_attribute_p (tree fndecl,
>  static bool
>  s390_can_inline_p (tree caller, tree callee)
>  {
> +  unsigned HOST_WIDE_INT all_masks =
> +(MASK_64BIT | MASK_BACKCHAIN | MASK_DEBUG_ARG | MASK_ZARCH
> + | MASK_HARD_DFP | MASK_SOFT_FLOAT
> + | MASK_OPT_HTM | MASK_LONG_DOUBLE_128 | MASK_MVCLE | MASK_PACKED_STACK
> + | MASK_SMALL_EXEC | MASK_OPT_VX | MASK_ZVECTOR);
> +
> +  /* Flags which if present in the callee are required in the caller as 
> well.  */
> +  unsigned HOST_WIDE_INT caller_required_masks = MASK_OPT_HTM;
> +
> +  /* Flags which affect the ABI and in general prevent inlining.  */
> +  unsigned HOST_WIDE_INT must_match_masks =
> +(MASK_64BIT | MASK_ZARCH | MASK_HARD_DFP | MASK_SOFT_FLOAT
> + | MASK_LONG_DOUBLE_128 | MASK_OPT_VX);
> +
> +  /* Flags which we in general want to prevent inlining but accept for
> + always_inline.  */
> +  unsigned HOST_WIDE_INT always_inline_safe_masks =
> +MASK_MVCLE | MASK_BACKCHAIN | MASK_SMALL_EXEC;

1) formatting, = should be at the start of next line rather than end of the
   line
2) all_masks, always_inline_safe_masks and caller_required_masks aren't
   ever modified, perhaps make them const?
3) I wonder if there is any advantage to have all_masks with all the masks
   enumerated, compared to
   const HOST_WIDE_INT all_masks
 = (caller_required_masks | must_match_masks | always_inline_safe_masks
| MASK_DEBUG_ARG | MASK_PACKED_STACK | MASK_ZVECTOR);
   i.e. when you add a new mask, instead of listing it in all_masks
   and one or more of the other vars you'd just stick it either in one
   or more of those vars or in all_masks.

Jakub

45 matches

Mail list logo