Re: [PATCH] RISC-V: Default to tuning for the thead-c906

2022-10-04 Thread Andrew Pinski via Gcc-patches
On Tue, Oct 4, 2022 at 8:55 PM Palmer Dabbelt  wrote:
>
> The C906 is by far the most widely available RISC-V processor, so let's
> default to tuning for it.
>
> gcc/ChangeLog
>
> * config/riscv/riscv.h (RISCV_TUNE_STRING_DEFAULT): Change to
> thead-c906.
> * doc/invoke.texi (RISC-V -mtune): Change the default to
> thead-c906.
>
> ---

I am OK with this, as --with-tune and --with-arch still work as ways of
changing the default.

Thanks,
Andrew

>
> This has come up a handful of times, most recently during the Cauldron.
> It seems like a grey area to me: we're changing the behavior of some
> command-line arguments (ie, everything that doesn't specify -mtune), but
> we sort of change that anyway as the tuning parameters change between
> releases.
>
> I'm not really seeing much of a precedent from the other ports.  It
> looks like aarch64 sort of changed the default in 02fdbd5beb0
> ("[AArch64] [-mtune cleanup 2/5] Tune for Cortex-A53 by default.") but I
> think at that point -mtune=generic and -mtune=cortex-a53 were equivalent
> so I'm not sure that counts.  I can't quite sort out if the default x86
> tuning has ever changed, but the tuning parameters have changed.  I
> don't see any way around having the tuning parameters change as they're
> pretty tightly coupled to the GCC internals, but changing to a different
> tuning target is a bit bigger of a change.
>
> We also have a bit of a special case here: -mtune is in theory only a
> performance issue, but this change will emit a lot more misaligned
> accesses and we've seen those trigger bugs in the trap handlers before.
> Those bugs are elsewhere so it's sort of not a GCC problem, but I'm sure
> there's still users out there with broken firmware and this may cause
> visible fallout.  We can just tell those users their systems were always
> broken, but that's never a fun way to do things.
>
> I figured the easiest way to talk about this would be to just send the
> patch, but I definitely don't plan on committing it without some
> discussion.
> ---
>  gcc/config/riscv/riscv.h | 2 +-
>  gcc/doc/invoke.texi  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index 363113c6511..1d9379fa5ee 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -40,7 +40,7 @@ along with GCC; see the file COPYING3.  If not see
>  #endif
>
>  #ifndef RISCV_TUNE_STRING_DEFAULT
> -#define RISCV_TUNE_STRING_DEFAULT "rocket"
> +#define RISCV_TUNE_STRING_DEFAULT "thead-c906"
>  #endif
>
>  extern const char *riscv_expand_arch (int argc, const char **argv);
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e0c2c57c9b2..2a9ea3455f6 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -28529,7 +28529,7 @@ particular CPU name.  Permissible values for this 
> option are: @samp{rocket},
>  @samp{thead-c906}, @samp{size}, and all valid options for @option{-mcpu=}.
>
>  When @option{-mtune=} is not specified, use the setting from @option{-mcpu},
> -the default is @samp{rocket} if both are not specified.
> +the default is @samp{thead-c906} if both are not specified.
>
>  The @samp{size} choice is not intended for use by end-users.  This is used
>  when @option{-Os} is specified.  It overrides the instruction cost info
> --
> 2.34.1
>


[PATCH] RISC-V: Default to tuning for the thead-c906

2022-10-04 Thread Palmer Dabbelt
The C906 is by far the most widely available RISC-V processor, so let's
default to tuning for it.

gcc/ChangeLog

* config/riscv/riscv.h (RISCV_TUNE_STRING_DEFAULT): Change to
thead-c906.
* doc/invoke.texi (RISC-V -mtune): Change the default to
thead-c906.

---

This has come up a handful of times, most recently during the Cauldron.
It seems like a grey area to me: we're changing the behavior of some
command-line arguments (ie, everything that doesn't specify -mtune), but
we sort of change that anyway as the tuning parameters change between
releases.

I'm not really seeing much of a precedent from the other ports.  It
looks like aarch64 sort of changed the default in 02fdbd5beb0
("[AArch64] [-mtune cleanup 2/5] Tune for Cortex-A53 by default.") but I
think at that point -mtune=generic and -mtune=cortex-a53 were equivalent
so I'm not sure that counts.  I can't quite sort out if the default x86
tuning has ever changed, but the tuning parameters have changed.  I
don't see any way around having the tuning parameters change as they're
pretty tightly coupled to the GCC internals, but changing to a different
tuning target is a bit bigger of a change.

We also have a bit of a special case here: -mtune is in theory only a
performance issue, but this change will emit a lot more misaligned
accesses and we've seen those trigger bugs in the trap handlers before.
Those bugs are elsewhere so it's sort of not a GCC problem, but I'm sure
there's still users out there with broken firmware and this may cause
visible fallout.  We can just tell those users their systems were always
broken, but that's never a fun way to do things.
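
To make the misaligned-access concern concrete, here is a minimal sketch in
plain ISO C. The helper names load_u32 and store_u32 are hypothetical and not
part of the patch. The assumption, taken from the paragraph above rather than
from the tuning tables themselves, is that the thead-c906 tuning treats
misaligned accesses as cheap, so the compiler may expand these memcpy calls
into a single word load or store instead of byte-by-byte accesses. On a core
that traps on misaligned accesses, that single instruction then goes through
the firmware's trap handler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a native-endian 32-bit value from P, which need not be 4-byte
   aligned.  memcpy keeps this well-defined in ISO C; whether it becomes
   four byte loads or one (possibly misaligned) word load is a tuning
   decision made by the compiler.  */
uint32_t
load_u32 (const unsigned char *p)
{
  uint32_t v;
  assert (p != NULL);
  memcpy (&v, p, sizeof v);
  return v;
}

/* Write V to P with the same alignment caveat as load_u32.  */
void
store_u32 (unsigned char *p, uint32_t v)
{
  assert (p != NULL);
  memcpy (p, &v, sizeof v);
}
```

The source-level behavior is identical under either tuning. Only the emitted
instruction sequence differs, which is why the fallout would show up only on
systems whose misaligned-access emulation is buggy.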

I figured the easiest way to talk about this would be to just send the
patch, but I definitely don't plan on committing it without some
discussion.
---
 gcc/config/riscv/riscv.h | 2 +-
 gcc/doc/invoke.texi  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 363113c6511..1d9379fa5ee 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -40,7 +40,7 @@ along with GCC; see the file COPYING3.  If not see
 #endif
 
 #ifndef RISCV_TUNE_STRING_DEFAULT
-#define RISCV_TUNE_STRING_DEFAULT "rocket"
+#define RISCV_TUNE_STRING_DEFAULT "thead-c906"
 #endif
 
 extern const char *riscv_expand_arch (int argc, const char **argv);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e0c2c57c9b2..2a9ea3455f6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -28529,7 +28529,7 @@ particular CPU name.  Permissible values for this 
option are: @samp{rocket},
 @samp{thead-c906}, @samp{size}, and all valid options for @option{-mcpu=}.
 
 When @option{-mtune=} is not specified, use the setting from @option{-mcpu},
-the default is @samp{rocket} if both are not specified.
+the default is @samp{thead-c906} if both are not specified.
 
 The @samp{size} choice is not intended for use by end-users.  This is used
 when @option{-Os} is specified.  It overrides the instruction cost info
-- 
2.34.1



Re: [PATCH v3] RISC-V: remove deprecate pic code model macro

2022-10-04 Thread Vineet Gupta

On 10/4/22 19:24, Kito Cheng wrote:

Committed, and added ChangeLog, remember to add that next time:)


Oops sorry, I will.

Thx,
-Vineet


Re: [PATCH v3] RISC-V: remove deprecate pic code model macro

2022-10-04 Thread Kito Cheng via Gcc-patches
Committed, and added ChangeLog, remember to add that next time :)

On Sat, Sep 24, 2022 at 2:08 AM Vineet Gupta  wrote:
>
> On 9/2/22 14:05, Vineet Gupta wrote:
> > Came across this deprecated symbol when looking around for
> > -mexplicit-relocs handling in code
> >
> > Signed-off-by: Vineet Gupta 
>
> No rush but looks like this got lost in the bigger thread about
> LOAD_ADDRESS_MACRO.
>
> Thx,
> -Vineet
>
> > ---
> >   gcc/config/riscv/riscv-c.cc   | 5 -
> >   gcc/testsuite/gcc.target/riscv/predef-1.c | 3 ---
> >   gcc/testsuite/gcc.target/riscv/predef-2.c | 3 ---
> >   gcc/testsuite/gcc.target/riscv/predef-3.c | 3 ---
> >   gcc/testsuite/gcc.target/riscv/predef-4.c | 3 ---
> >   gcc/testsuite/gcc.target/riscv/predef-5.c | 3 ---
> >   gcc/testsuite/gcc.target/riscv/predef-6.c | 3 ---
> >   gcc/testsuite/gcc.target/riscv/predef-7.c | 3 ---
> >   gcc/testsuite/gcc.target/riscv/predef-8.c | 3 ---
> >   9 files changed, 29 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
> > index eb7ef09297e9..8d55ad598a9c 100644
> > --- a/gcc/config/riscv/riscv-c.cc
> > +++ b/gcc/config/riscv/riscv-c.cc
> > @@ -93,11 +93,6 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
> > break;
> >
> >   case CM_PIC:
> > -  /* __riscv_cmodel_pic is deprecated, and will removed in next GCC 
> > release.
> > -  see https://github.com/riscv/riscv-c-api-doc/pull/11  */
> > -  builtin_define ("__riscv_cmodel_pic");
> > -  /* FALLTHROUGH. */
> > -
> >   case CM_MEDANY:
> > builtin_define ("__riscv_cmodel_medany");
> > break;
> > diff --git a/gcc/testsuite/gcc.target/riscv/predef-1.c 
> > b/gcc/testsuite/gcc.target/riscv/predef-1.c
> > index 2e57ce6b3954..9dddc1849635 100644
> > --- a/gcc/testsuite/gcc.target/riscv/predef-1.c
> > +++ b/gcc/testsuite/gcc.target/riscv/predef-1.c
> > @@ -57,9 +57,6 @@ int main () {
> >   #endif
> >   #if defined(__riscv_cmodel_medany)
> >   #error "__riscv_cmodel_medlow"
> > -#endif
> > -#if defined(__riscv_cmodel_pic)
> > -#error "__riscv_cmodel_medlow"
> >   #endif
> >
> > return 0;
> > diff --git a/gcc/testsuite/gcc.target/riscv/predef-2.c 
> > b/gcc/testsuite/gcc.target/riscv/predef-2.c
> > index c85b3c9fd32a..755fe4ef7d8a 100644
> > --- a/gcc/testsuite/gcc.target/riscv/predef-2.c
> > +++ b/gcc/testsuite/gcc.target/riscv/predef-2.c
> > @@ -57,9 +57,6 @@ int main () {
> >   #endif
> >   #if !defined(__riscv_cmodel_medany)
> >   #error "__riscv_cmodel_medlow"
> > -#endif
> > -#if defined(__riscv_cmodel_pic)
> > -#error "__riscv_cmodel_medlow"
> >   #endif
> >
> > return 0;
> > diff --git a/gcc/testsuite/gcc.target/riscv/predef-3.c 
> > b/gcc/testsuite/gcc.target/riscv/predef-3.c
> > index 82a89d415809..513645351c09 100644
> > --- a/gcc/testsuite/gcc.target/riscv/predef-3.c
> > +++ b/gcc/testsuite/gcc.target/riscv/predef-3.c
> > @@ -57,9 +57,6 @@ int main () {
> >   #endif
> >   #if !defined(__riscv_cmodel_medany)
> >   #error "__riscv_cmodel_medany"
> > -#endif
> > -#if !defined(__riscv_cmodel_pic)
> > -#error "__riscv_cmodel_pic"
> >   #endif
> >
> > return 0;
> > diff --git a/gcc/testsuite/gcc.target/riscv/predef-4.c 
> > b/gcc/testsuite/gcc.target/riscv/predef-4.c
> > index 5868d39eb67a..76b6feec6b6f 100644
> > --- a/gcc/testsuite/gcc.target/riscv/predef-4.c
> > +++ b/gcc/testsuite/gcc.target/riscv/predef-4.c
> > @@ -57,9 +57,6 @@ int main () {
> >   #endif
> >   #if defined(__riscv_cmodel_medany)
> >   #error "__riscv_cmodel_medlow"
> > -#endif
> > -#if defined(__riscv_cmodel_pic)
> > -#error "__riscv_cmodel_medlow"
> >   #endif
> >
> > return 0;
> > diff --git a/gcc/testsuite/gcc.target/riscv/predef-5.c 
> > b/gcc/testsuite/gcc.target/riscv/predef-5.c
> > index 4b2bd3835061..54a51508afbd 100644
> > --- a/gcc/testsuite/gcc.target/riscv/predef-5.c
> > +++ b/gcc/testsuite/gcc.target/riscv/predef-5.c
> > @@ -57,9 +57,6 @@ int main () {
> >   #endif
> >   #if !defined(__riscv_cmodel_medany)
> >   #error "__riscv_cmodel_medlow"
> > -#endif
> > -#if defined(__riscv_cmodel_pic)
> > -#error "__riscv_cmodel_medlow"
> >   #endif
> >
> > return 0;
> > diff --git a/gcc/testsuite/gcc.target/riscv/predef-6.c 
> > b/gcc/testsuite/gcc.target/riscv/predef-6.c
> > index 8e5ea366bd5e..f61709f7bf32 100644
> > --- a/gcc/testsuite/gcc.target/riscv/predef-6.c
> > +++ b/gcc/testsuite/gcc.target/riscv/predef-6.c
> > @@ -57,9 +57,6 @@ int main () {
> >   #endif
> >   #if !defined(__riscv_cmodel_medany)
> >   #error "__riscv_cmodel_medany"
> > -#endif
> > -#if !defined(__riscv_cmodel_pic)
> > -#error "__riscv_cmodel_medpic"
> >   #endif
> >
> > return 0;
> > diff --git a/gcc/testsuite/gcc.target/riscv/predef-7.c 
> > b/gcc/testsuite/gcc.target/riscv/predef-7.c
> > index 0bde299aef1a..41217554c4db 100644
> > --- a/gcc/testsuite/gcc.target/riscv/predef-7.c
> > +++ b/gcc/testsuite/gcc.target/riscv/predef-7.c
> > @@ -57,9 +57,6 @@ int main () {
> >   #endif
> >   #if defined(__riscv_cmodel_medany)
> >   #error 

Re: Adding a new thread model to GCC

2022-10-04 Thread Xi Ruoyao via Gcc-patches
On Tue, 2022-10-04 at 21:45 +0800, LIU Hao wrote:
> 在 2022-10-04 21:13, Xi Ruoyao 写道:
> > 
> > In GCC development we usually include the configure regeneration in the
> > patch because the scripts are also version controlled.
> > 
> 
> There is a reason for not doing that: Generated contents can't be reviewed.
> 
> In mingw-w64 we do the opposite: The person who commits a patch is 
> responsible for update configure, 
> Makefile.in, etc. The patch itself doesn't include generated contents.

The reviewer can simply skip the changes in configure.  But including
the configure allows the potential testers to test the change without
autoconf-2.69 installed.

Maybe we can make a compromise: put the line "configure: Regenerate." in
the ChangeLog, but do not actually include the change.  Now if the
committer forgot to regenerate it, the git hook will reject the push
immediately.

(Just my 2 cents.)
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[committed] analyzer: move region_model_manager decl to its own header

2022-10-04 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-3076-g0167154cdd02c9.

gcc/analyzer/ChangeLog:
* region-model.h: Include "analyzer/region-model-manager.h"
(class region_model_manager): Move decl to...
* region-model-manager.h: ...this new file.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-manager.h | 312 
 gcc/analyzer/region-model.h | 289 +-
 2 files changed, 313 insertions(+), 288 deletions(-)
 create mode 100644 gcc/analyzer/region-model-manager.h

diff --git a/gcc/analyzer/region-model-manager.h 
b/gcc/analyzer/region-model-manager.h
new file mode 100644
index 000..0057326b78f
--- /dev/null
+++ b/gcc/analyzer/region-model-manager.h
@@ -0,0 +1,312 @@
+/* Consolidation of svalues and regions.
+   Copyright (C) 2020-2022 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_ANALYZER_REGION_MODEL_MANAGER_H
+#define GCC_ANALYZER_REGION_MODEL_MANAGER_H
+
+namespace ana {
+
+/* A class responsible for owning and consolidating region and svalue
+   instances.
+   region and svalue instances are immutable as far as clients are
+   concerned, so they are provided as "const" ptrs.  */
+
+class region_model_manager
+{
+public:
+  region_model_manager (logger *logger = NULL);
+  ~region_model_manager ();
+
+  /* call_string consolidation.  */
+  const call_string &get_empty_call_string () const
+  {
+return m_empty_call_string;
+  }
+
+  /* svalue consolidation.  */
+  const svalue *get_or_create_constant_svalue (tree cst_expr);
+  const svalue *get_or_create_int_cst (tree type, poly_int64);
+  const svalue *get_or_create_unknown_svalue (tree type);
+  const svalue *get_or_create_setjmp_svalue (const setjmp_record ,
+tree type);
+  const svalue *get_or_create_poisoned_svalue (enum poison_kind kind,
+  tree type);
+  const svalue *get_or_create_initial_value (const region *reg);
+  const svalue *get_ptr_svalue (tree ptr_type, const region *pointee);
+  const svalue *get_or_create_unaryop (tree type, enum tree_code op,
+  const svalue *arg);
+  const svalue *get_or_create_cast (tree type, const svalue *arg);
+  const svalue *get_or_create_binop (tree type,
+enum tree_code op,
+const svalue *arg0, const svalue *arg1);
+  const svalue *get_or_create_sub_svalue (tree type,
+ const svalue *parent_svalue,
+ const region *subregion);
+  const svalue *get_or_create_repeated_svalue (tree type,
+  const svalue *outer_size,
+  const svalue *inner_svalue);
+  const svalue *get_or_create_bits_within (tree type,
+  const bit_range ,
+  const svalue *inner_svalue);
+  const svalue *get_or_create_unmergeable (const svalue *arg);
+  const svalue *get_or_create_widening_svalue (tree type,
+  const function_point ,
+  const svalue *base_svalue,
+  const svalue *iter_svalue);
+  const svalue *get_or_create_compound_svalue (tree type,
+  const binding_map );
+  const svalue *get_or_create_conjured_svalue (tree type, const gimple *stmt,
+  const region *id_reg,
+  const conjured_purge );
+  const svalue *
+  get_or_create_asm_output_svalue (tree type,
+  const gasm *asm_stmt,
+  unsigned output_idx,
+  const vec );
+  const svalue *
+  get_or_create_const_fn_result_svalue (tree type,
+   tree fndecl,
+   const vec );
+
+  const svalue *maybe_get_char_from_string_cst (tree string_cst,
+   tree byte_offset_cst);
+
+  /* 

[committed] analyzer: revamp side-effects of call summaries [PR107072]

2022-10-04 Thread David Malcolm via Gcc-patches
With -fanalyzer-call-summaries the analyzer can attempt to summarize
the effects of some function calls at their call site, rather than
simulate the call directly, which can avoid big slowdowns during
analysis.

Previously, this summarization was extremely simplistic: no attempt
was made to update sm-state, and region_model::update_for_call_summary
would simply set the return value of the function to UNKNOWN, and assume
the function had no side effects.

This patch implements less simplistic summarizations: it tracks each
possible return enode from the called function, and attempts to generate
a successor enode from the callsite for each that have compatible
conditions, mapping state changes in the summary to state changes
at the callsite.  It also implements the beginnings of heuristics for
generating user-facing descriptions of a summary e.g.
  "when 'foo' returns NULL"
versus:
  "when 'foo' returns a heap-allocated buffer"

This still has some bugs, but much more accurately tracks the effects
of a call, and so is an improvement; it should only have an effect
when -fanalyzer-call-summaries is enabled.

As before, -fanalyzer-call-summaries is disabled by default in
analyzer.opt (but enabled by default in the test suite).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-3077-gbfca9505f6fce6.

gcc/ChangeLog:
PR analyzer/107072
* Makefile.in (ANALYZER_OBJS): Add analyzer/call-summary.o.

gcc/analyzer/ChangeLog:
PR analyzer/107072
* analyzer-logging.h: Include "diagnostic-core.h".
* analyzer.h: Include "function.h".
(class call_summary): New forward decl.
(class call_summary_replay): New forward decl.
(struct per_function_data): New forward decl.
(struct interesting_t): New forward decl.
(custom_edge_info::update_state): New vfunc.
* call-info.cc (custom_edge_info::update_state): New.
* call-summary.cc: New file.
* call-summary.h: New file.
* constraint-manager.cc: Include "analyzer/call-summary.h".
(class replay_fact_visitor): New.
(constraint_manager::replay_call_summary): New.
* constraint-manager.h (constraint_manager::replay_call_summary):
New.
* engine.cc: Include "analyzer/call-summary.h".
(exploded_node::on_stmt): Handle call summaries.
(class call_summary_edge_info): New.
(exploded_node::replay_call_summaries): New.
(exploded_node::replay_call_summary): New.
(per_function_data::~per_function_data): New.
(per_function_data::add_call_summary): Move here from header and
reimplement.
(exploded_graph::process_node): Call update_state rather than
update_model when handling bifurcation.
(viz_callgraph_node::dump_dot): Use a regular label rather
than an HTML table; add summaries to dump.
* exploded-graph.h: Include "alloc-pool.h", "fibonacci_heap.h",
"supergraph.h", "sbitmap.h", "shortest-paths.h", "analyzer/sm.h",
"analyzer/program-state.h", and "analyzer/diagnostic-manager.h".
(exploded_node::replay_call_summaries): New decl.
(exploded_node::replay_call_summary): New decl.
(per_function_data::~per_function_data): New decl.
(per_function_data::add_call_summary): Move implementation from
header.
(per_function_data::m_summaries): Update type of element.
* known-function-manager.h: Include "analyzer/analyzer-logging.h".
* program-point.h: Include "pretty-print.h" and
"analyzer/call-string.h".
* program-state.cc: Include "analyzer/call-summary.h".
(sm_state_map::replay_call_summary): New.
(program_state::replay_call_summary): New.
* program-state.h (sm_state_map::replay_call_summary): New decl.
(program_state::replay_call_summary): New decl.
* region-model-manager.cc
(region_model_manager::get_or_create_asm_output_svalue): New
overload.
* region-model-manager.h
(region_model_manager::get_or_create_asm_output_svalue): New
overload decl.
* region-model.cc: Include "analyzer/call-summary.h".
(region_model::maybe_update_for_edge): Remove call to
region_model::update_for_call_summary on
SUPEREDGE_INTRAPROCEDURAL_CALL.
(region_model::update_for_call_summary): Delete.
(region_model::replay_call_summary): New.
* region-model.h (region_model::replay_call_summary): New decl.
(region_model::update_for_call_summary): Delete decl.
* store.cc: Include "analyzer/call-summary.h".
(store::replay_call_summary): New.
(store::replay_call_summary_cluster): New.
* store.h: Include "tristate.h".
(is_a_helper ::test): New.
(store::replay_call_summary): New decl.
(store::replay_call_summary_cluster): New decl.
* supergraph.cc 

[committed] analyzer: fold -(-(VAL)) to VAL

2022-10-04 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-3075-g7f42f7adfa69fe.

gcc/analyzer/ChangeLog:
* region-model-manager.cc
(region_model_manager::maybe_fold_unaryop): Fold -(-(VAL)) to VAL.
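
The fold can be illustrated with a toy expression tree. This is only a
sketch: struct toy_expr and toy_fold_negate are hypothetical stand-ins, not
the analyzer's svalue classes, and the type check is reduced to a single
"is integral" flag where the real code compares the types and tests
INTEGRAL_TYPE_P.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature expression node; not GCC's svalue hierarchy.  */
enum toy_op { TOY_LEAF, TOY_NEGATE };

struct toy_expr
{
  enum toy_op op;
  int is_integral;              /* nonzero for an integer-typed node */
  const struct toy_expr *arg;   /* operand of TOY_NEGATE, else NULL */
};

/* Try to simplify -(ARG).  If ARG is itself an integer-typed negation,
   return its operand, so that -(-(VAL)) becomes VAL.  Return NULL when
   no fold applies and the caller has to build a new negation node.  */
const struct toy_expr *
toy_fold_negate (int result_is_integral, const struct toy_expr *arg)
{
  assert (arg != NULL);
  if (arg->op == TOY_NEGATE
      && result_is_integral
      && arg->is_integral)
    return arg->arg;
  return NULL;
}
```

Returning the existing operand rather than allocating a new node fits the
consolidation approach, where equal values are meant to share one instance.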

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-manager.cc | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/gcc/analyzer/region-model-manager.cc 
b/gcc/analyzer/region-model-manager.cc
index ed5b9c75910..1956cfc3e8d 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -432,6 +432,17 @@ region_model_manager::maybe_fold_unaryop (tree type, enum 
tree_code op,
}
   }
   break;
+case NEGATE_EXPR:
+  {
+   /* -(-(VAL)) is VAL, for integer types.  */
+   if (const unaryop_svalue *unaryop = arg->dyn_cast_unaryop_svalue ())
+ if (unaryop->get_op () == NEGATE_EXPR
+ && type == unaryop->get_type ()
+ && type
+ && INTEGRAL_TYPE_P (type))
+   return unaryop->get_arg ();
+  }
+  break;
 }
 
   /* Constants.  */
-- 
2.26.3



[committed] analyzer: widening_svalues take a function_point rather than a program_point

2022-10-04 Thread David Malcolm via Gcc-patches
Enabling work towards better call summarization.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-3074-ge6fe02d8322093.

gcc/analyzer/ChangeLog:
* region-model-manager.cc
(region_model_manager::get_or_create_widening_svalue): Use a
function_point rather than a program_point.
* region-model.cc (selftest::test_widening_constraints): Likewise.
* region-model.h
(region_model_manager::get_or_create_widening_svalue): Likewise.
(model_merger::get_function_point): New.
* svalue.cc (svalue::can_merge_p): Use a function_point rather
than a program_point.
(svalue::can_merge_p): Likewise.
* svalue.h (widening_svalue::key_t): Likewise.
(widening_svalue::widening_svalue): Likewise.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-manager.cc | 9 +
 gcc/analyzer/region-model.cc | 2 +-
 gcc/analyzer/region-model.h  | 6 +-
 gcc/analyzer/svalue.cc   | 4 ++--
 gcc/analyzer/svalue.h| 8 
 5 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/gcc/analyzer/region-model-manager.cc 
b/gcc/analyzer/region-model-manager.cc
index cbda77f3d9c..ed5b9c75910 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -1143,10 +1143,11 @@ region_model_manager::get_or_create_unmergeable (const 
svalue *arg)
and ITER_SVAL at POINT, creating it if necessary.  */
 
 const svalue *
-region_model_manager::get_or_create_widening_svalue (tree type,
-const program_point ,
-const svalue *base_sval,
-const svalue *iter_sval)
+region_model_manager::
+get_or_create_widening_svalue (tree type,
+  const function_point ,
+  const svalue *base_sval,
+  const svalue *iter_sval)
 {
   gcc_assert (base_sval->get_kind () != SK_WIDENING);
   gcc_assert (iter_sval->get_kind () != SK_WIDENING);
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 22c52872c3e..e92bba2b438 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -7956,7 +7956,7 @@ static void
 test_widening_constraints ()
 {
   region_model_manager mgr;
-  program_point point (program_point::origin (mgr));
+  function_point point (program_point::origin (mgr).get_function_point ());
   tree int_0 = build_int_cst (integer_type_node, 0);
   tree int_m1 = build_int_cst (integer_type_node, -1);
   tree int_1 = build_int_cst (integer_type_node, 1);
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index e86720a645c..baac7ba4a12 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -278,7 +278,7 @@ public:
   const svalue *inner_svalue);
   const svalue *get_or_create_unmergeable (const svalue *arg);
   const svalue *get_or_create_widening_svalue (tree type,
-  const program_point ,
+  const function_point ,
   const svalue *base_svalue,
   const svalue *iter_svalue);
   const svalue *get_or_create_compound_svalue (tree type,
@@ -1282,6 +1282,10 @@ struct model_merger
   }
 
   bool mergeable_svalue_p (const svalue *) const;
+  const function_point &get_function_point () const
+  {
+return m_point.get_function_point ();
+  }
 
   const region_model *m_model_a;
   const region_model *m_model_b;
diff --git a/gcc/analyzer/svalue.cc b/gcc/analyzer/svalue.cc
index f5a5f1c9697..a37c152bb04 100644
--- a/gcc/analyzer/svalue.cc
+++ b/gcc/analyzer/svalue.cc
@@ -207,7 +207,7 @@ svalue::can_merge_p (const svalue *other,
   if (maybe_get_constant () && other->maybe_get_constant ())
 {
   return mgr->get_or_create_widening_svalue (other->get_type (),
-merger->m_point,
+merger->get_function_point (),
 other, this);
 }
 
@@ -220,7 +220,7 @@ svalue::can_merge_p (const svalue *other,
&& binop_sval->get_arg1 ()->get_kind () == SK_CONSTANT
&& other->get_kind () != SK_WIDENING)
   return mgr->get_or_create_widening_svalue (other->get_type (),
-merger->m_point,
+merger->get_function_point (),
 other, this);
 
   /* Merge: (Widen(existing_val, V), existing_val) -> Widen (existing_val, V)
diff --git a/gcc/analyzer/svalue.h b/gcc/analyzer/svalue.h
index f4cab0d4134..9393d6ec213 100644
--- a/gcc/analyzer/svalue.h
+++ 

[PATCH v2] c-family: ICE with [[gnu::nocf_check]] [PR106937]

2022-10-04 Thread Marek Polacek via Gcc-patches
On Fri, Sep 30, 2022 at 09:12:24AM -0400, Jason Merrill wrote:
> On 9/29/22 18:49, Marek Polacek wrote:
> > When getting the name of an attribute, we ought to use
> > get_attribute_name, which handles both [[ ]] and __attribute__(())
> > forms.  Failure to do so may result in an ICE, like here.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> How do we print the attributes with this patch?  Don't we also want to print
> the namespace, and use [[]] in the output?

Good point, however: while the testcase indeed has an attribute
in the [[]] form in the typedef, here we're printing its "aka":

warning: initialization of 'FuncPointerWithNoCfCheck' {aka 'void 
(__attribute__((nocf_check)) *)(void)'} from incompatible pointer type 
'FuncPointer' {aka 'void (*)(void)'}

c-pretty-print.cc doesn't seem to know how to print an [[]] attribute.
I could do that, but then we'd print

  aka 'void ([[nocf_check]] *)(void)'

in the above, but that's invalid syntax!  pp_c_attributes_display appears
to be called for * and & only where you can't use an [[]] attribute.  So
perhaps we want to keep printing the GNU form here?
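
On the ICE itself, a rough model of why the accessor matters. This is a
sketch under an assumption: that an attribute written as [[ns::name]] stores
a namespace/name pair where the __attribute__((name)) form stores a bare
identifier. struct toy_node and toy_get_attribute_name are hypothetical and
only mimic the shape of the real tree accessors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the "purpose" slot of an attribute node.  */
enum toy_kind { TOY_IDENT, TOY_PAIR };

struct toy_node
{
  enum toy_kind kind;
  const char *ident;            /* valid when kind == TOY_IDENT */
  const struct toy_node *ns;    /* valid when kind == TOY_PAIR */
  const struct toy_node *name;  /* valid when kind == TOY_PAIR */
};

/* Return the attribute name for either spelling.  Code that treats
   PURPOSE as an identifier without this step misreads the TOY_PAIR
   case, which is the kind of mistake that ends in an ICE.  */
const char *
toy_get_attribute_name (const struct toy_node *purpose)
{
  assert (purpose != NULL);
  if (purpose->kind == TOY_PAIR)
    purpose = purpose->name;
  assert (purpose->kind == TOY_IDENT);
  return purpose->ident;
}
```

With this model both spellings of nocf_check yield the same name, so the
lookup and the printing code need no special cases.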

I noticed that pp_c_attributes has never been used, so we can just remove it.

I've also adjusted the test not to use "-w".

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
When getting the name of an attribute, we ought to use
get_attribute_name, which handles both [[ ]] and __attribute__(())
forms.  Failure to do so may result in an ICE, like here.

pp_c_attributes has been unused since its introduction in r56273.

PR c++/106937

gcc/c-family/ChangeLog:

* c-pretty-print.cc (pp_c_attributes): Remove.
(pp_c_attributes_display): Use get_attribute_name.
* c-pretty-print.h (pp_c_attributes): Remove.

gcc/testsuite/ChangeLog:

* gcc.dg/fcf-protection-1.c: New test.
---
 gcc/c-family/c-pretty-print.cc  | 30 +++--
 gcc/c-family/c-pretty-print.h   |  1 -
 gcc/testsuite/gcc.dg/fcf-protection-1.c | 13 +++
 3 files changed, 16 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/fcf-protection-1.c

diff --git a/gcc/c-family/c-pretty-print.cc b/gcc/c-family/c-pretty-print.cc
index efa1768f4d6..2419e149333 100644
--- a/gcc/c-family/c-pretty-print.cc
+++ b/gcc/c-family/c-pretty-print.cc
@@ -850,32 +850,8 @@ c_pretty_printer::declaration (tree t)
   pp_c_init_declarator (this, t);
 }
 
-/* Pretty-print ATTRIBUTES using GNU C extension syntax.  */
-
-void
-pp_c_attributes (c_pretty_printer *pp, tree attributes)
-{
-  if (attributes == NULL_TREE)
-return;
-
-  pp_c_ws_string (pp, "__attribute__");
-  pp_c_left_paren (pp);
-  pp_c_left_paren (pp);
-  for (; attributes != NULL_TREE; attributes = TREE_CHAIN (attributes))
-{
-  pp_tree_identifier (pp, TREE_PURPOSE (attributes));
-  if (TREE_VALUE (attributes))
-   pp_c_call_argument_list (pp, TREE_VALUE (attributes));
-
-  if (TREE_CHAIN (attributes))
-   pp_separate_with (pp, ',');
-}
-  pp_c_right_paren (pp);
-  pp_c_right_paren (pp);
-}
-
 /* Pretty-print ATTRIBUTES using GNU C extension syntax for attributes
-   marked to be displayed on disgnostic.  */
+   marked to be displayed on diagnostic.  */
 
 void
 pp_c_attributes_display (c_pretty_printer *pp, tree a)
@@ -888,7 +864,7 @@ pp_c_attributes_display (c_pretty_printer *pp, tree a)
   for (; a != NULL_TREE; a = TREE_CHAIN (a))
 {
   const struct attribute_spec *as;
-  as = lookup_attribute_spec (TREE_PURPOSE (a));
+  as = lookup_attribute_spec (get_attribute_name (a));
   if (!as || as->affects_type_identity == false)
 continue;
   if (c_dialect_cxx ()
@@ -906,7 +882,7 @@ pp_c_attributes_display (c_pretty_printer *pp, tree a)
{
  pp_separate_with (pp, ',');
}
-  pp_tree_identifier (pp, TREE_PURPOSE (a));
+  pp_tree_identifier (pp, get_attribute_name (a));
   if (TREE_VALUE (a))
pp_c_call_argument_list (pp, TREE_VALUE (a));
 }
diff --git a/gcc/c-family/c-pretty-print.h b/gcc/c-family/c-pretty-print.h
index be86bed4fee..92674ab4d06 100644
--- a/gcc/c-family/c-pretty-print.h
+++ b/gcc/c-family/c-pretty-print.h
@@ -119,7 +119,6 @@ void pp_c_space_for_pointer_operator (c_pretty_printer *, 
tree);
 /* Declarations.  */
 void pp_c_tree_decl_identifier (c_pretty_printer *, tree);
 void pp_c_function_definition (c_pretty_printer *, tree);
-void pp_c_attributes (c_pretty_printer *, tree);
 void pp_c_attributes_display (c_pretty_printer *, tree);
 void pp_c_cv_qualifiers (c_pretty_printer *pp, int qualifiers, bool func_type);
 void pp_c_type_qualifier_list (c_pretty_printer *, tree);
diff --git a/gcc/testsuite/gcc.dg/fcf-protection-1.c 
b/gcc/testsuite/gcc.dg/fcf-protection-1.c
new file mode 100644
index 000..baad74cd86f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fcf-protection-1.c
@@ -0,0 +1,13 @@
+/* PR c++/106937 */
+/* { dg-options "-fcf-protection" } */
+

[pushed] c++: fix debug info for array temporary [PR107154]

2022-10-04 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, applying to trunk.

-- >8 --

In the testcase the elaboration of the array init that happens at genericize
time was getting the location info for the end of the function; fixed by
doing the expansion at the location of the original expression.

PR c++/107154

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_genericize_init_expr): Use iloc_sentinel.
(cp_genericize_target_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/debug/dwarf2/lineno-array1.C: New test.
---
 gcc/cp/cp-gimplify.cc |  2 ++
 .../g++.dg/debug/dwarf2/lineno-array1.C   | 25 +++
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/debug/dwarf2/lineno-array1.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index cca3b9fea33..404a7699a72 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -920,6 +920,7 @@ cp_genericize_init (tree *replace, tree from, tree to)
 static void
 cp_genericize_init_expr (tree *stmt_p)
 {
+  iloc_sentinel ils = EXPR_LOCATION (*stmt_p);
   tree to = TREE_OPERAND (*stmt_p, 0);
   tree from = TREE_OPERAND (*stmt_p, 1);
   if (SIMPLE_TARGET_EXPR_P (from)
@@ -935,6 +936,7 @@ cp_genericize_init_expr (tree *stmt_p)
 static void
 cp_genericize_target_expr (tree *stmt_p)
 {
+  iloc_sentinel ils = EXPR_LOCATION (*stmt_p);
   tree slot = TARGET_EXPR_SLOT (*stmt_p);
   cp_genericize_init (&TARGET_EXPR_INITIAL (*stmt_p),
  TARGET_EXPR_INITIAL (*stmt_p), slot);
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/lineno-array1.C b/gcc/testsuite/g++.dg/debug/dwarf2/lineno-array1.C
new file mode 100644
index 000..befac5f04b3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/lineno-array1.C
@@ -0,0 +1,25 @@
+// PR c++/107154
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-gno-as-loc-support -dA" }
+// Test that we emit debug info exactly once for the last line.
+// { dg-final { scan-assembler-times {:25:1} 1 } }
+
+bool dummy;
+
+struct S {
+  const char *p;
+  S(const char *p): p(p) {}
+  ~S() { dummy = true; }
+};
+
+using Sar = S[];
+
+struct X {
+  X(Sar&&) { }
+};
+
+int main()
+{
+  X x(Sar{"", ""});
+  return 0;
+}

base-commit: 49c3e9dfc5e23a335f4057efffbff2273e3c4631
-- 
2.31.1



Re: [PATCH] Set discriminators for call stmts on the same line within the same basic block

2022-10-04 Thread Jason Merrill via Gcc-patches

On 10/3/22 02:08, Eugene Rozenfeld wrote:

This change is based on commit 1e6c4a7a8fb8e20545bb9f9032d3854f3f794c18
by Dehao Chen in vendors/google/heads/gcc-4_8.

Tested on x86_64-pc-linux-gnu.


Brief rationale for the change?


gcc/ChangeLog:
 * tree-cfg.cc (assign_discriminators): Set discriminators for call stmts
 on the same line within the same basic block.
---
  gcc/tree-cfg.cc | 31 +++
  1 file changed, 31 insertions(+)

diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index ade66c54499..8e2a3a5f6c6 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -1203,8 +1203,39 @@ assign_discriminators (void)
  {
edge e;
edge_iterator ei;
+  gimple_stmt_iterator gsi;
gimple *last = last_stmt (bb);
location_t locus = last ? gimple_location (last) : UNKNOWN_LOCATION;
+  location_t curr_locus = UNKNOWN_LOCATION;
+  int curr_discr = 0;
+
+  /* Traverse the basic block, if two function calls within a basic block
+   are mapped to the same line, assign a new discriminator because a call
+   stmt could be a split point of a basic block.  */
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple *stmt = gsi_stmt (gsi);
+ expanded_location curr_locus_e;
+ if (curr_locus == UNKNOWN_LOCATION)
+   {
+ curr_locus = gimple_location (stmt);
+ curr_locus_e = expand_location (curr_locus);
+   }
+ else if (!same_line_p (curr_locus, &curr_locus_e, gimple_location (stmt)))
+   {
+ curr_locus = gimple_location (stmt);
+ curr_locus_e = expand_location (curr_locus);
+ curr_discr = 0;
+   }
+ else if (curr_discr != 0)
+   {
+ gimple_set_location (stmt, location_with_discriminator (
+ gimple_location (stmt), curr_discr));


This indentation is wonky, with an open paren at the end of the line; 
I'd suggest reformatting to


 location_t dloc = (location_with_discriminator
                    (gimple_location (stmt), curr_discr));
 gimple_set_location (stmt, dloc);

Jason



Re: [PATCH] middle-end, c++, i386, libgcc: std::bfloat16_t and __bf16 arithmetic support

2022-10-04 Thread Jason Merrill via Gcc-patches

On 10/4/22 05:06, Jakub Jelinek wrote:

On Fri, Sep 30, 2022 at 04:08:10PM +0200, Jakub Jelinek via Gcc-patches wrote:

On Fri, Sep 30, 2022 at 09:49:08AM -0400, Jason Merrill wrote:

The comment from Apple on the ABI mangling proposal suggests to me that we
might want to delay enabling C++ std::bfloat16_t (i.e. defining
__STDCPP_BFLOAT16_T__) until we have that excess precision support?


I saw that comment.  We have similar problem with _Float16 too, where C++
effectively right now works as when one uses -fexcess-precision=16 in C
(which isn't default).
I can see how hard it would be to add EXCESS_PRECISION_EXPR support to C++
FE.


I've started on that but it will take some time.  That said, it should
work, though less efficiently, even without that; even in C, users can
always request such behavior with -fexcess-precision=16.


If we're using DF32x for _Float32x, maybe we want DF16b for bfloat16?


Perhaps, I just followed what was in the pull request.  Can change it.


Changed now, added support for the builtins and ported most of the
float16 tests, so that it gets at least some test coverage.
Also, for now I've left the aarch64 and arm changes out of the patch,
because I haven't tested it on aarch64 yet and arm support was incomplete
and I haven't heard from the ARM maintainers yet what they want or don't
want.

The added testcases showed a few problems.  One is that i?86 maintains
2 kinds of fp comparisons, trivial and non-trivial, the trivial which can
be handled by just a single conditional jump or setCC are handled directly,
while the complex ones which need two are not handled and the generic
code then figures it out using the trivial ones.  Unfortunately this means
that for == and != we end up with libcalls for it.  For _Float16, we have
added __nehf2 and __eqhf2 entrypoints last year.  I wanted to avoid doing
the same for __bf16, so I've added cbranchbf4 and cstorebf4 expanders
that handle all fp comparisons and internally just shift the operands up
to construct SFmode without even handling sNaNs and then call the generic
code to handle SFmode comparisons.
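
The "shift the operands up to construct SFmode" step can be illustrated outside the compiler. The following is only a sketch under the assumption that `float` is IEEE binary32; the function names are made up for this example and do not correspond to the expanders in the patch:

```cpp
#include <cstdint>
#include <cstring>

static_assert (sizeof (float) == 4, "sketch assumes IEEE binary32 float");

// Widen a bfloat16 bit pattern to float by placing it in the upper 16 bits
// of a binary32 value.  No special handling of signaling NaNs.
float bf16_bits_to_float (std::uint16_t bits)
{
  std::uint32_t wide = static_cast<std::uint32_t> (bits) << 16;
  float f;
  std::memcpy (&f, &wide, sizeof f);
  return f;
}

// Comparisons are then ordinary float comparisons on the widened values.
bool bf16_less (std::uint16_t a, std::uint16_t b)
{
  return bf16_bits_to_float (a) < bf16_bits_to_float (b);
}

bool bf16_equal (std::uint16_t a, std::uint16_t b)
{
  return bf16_bits_to_float (a) == bf16_bits_to_float (b);
}
```

Because the comparison is done on the widened value, +0 and -0 compare equal and a NaN compares unequal to itself, as with float.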

Another problem is for HFmode comparisons, when we see we don't support
directly some HFmode comparison, we iterate on wider scalar float modes
and look for usable comparisons, but BFmode and HFmode are unordered and
one of them has to appear as wider but neither is a subset nor superset,
so I had to skip wider modes which have equal precision to the starting one.
Yet another problem is because I've only enabled the bf16/BF16 suffixes in
C++ because for C it might clash with some later extension.  Am I right to
fear about that, or do you think C will never standardize suffixes that
would clash with that because C++ standardized the bf16/BF16 suffixes for
something already?  If I could enable it, I'd always pedwarn for C for those
and could enable the __BF16_*__ macros.  Right now I had to disable some
-fbuilding-libgcc macros because of that (though nothing really uses them
right now).

Another question is the suffixes of the builtins.  For now I have added
bf16 suffix and enabled the builtins with !both_p, so one always needs to
use __builtin_* form for them.  None of the GCC builtins end with b,
so this isn't ambiguous with __builtin_*f16, but some libm functions do end
with b, in particular ilogb, logb and f{??,??x}sub.  ilogb and the subs
always have it, but is __builtin_logbf16 f16 suffixed logb or bf16 suffixed
log?  Shall the builtins use f16b suffixes instead like the mangling does?


Do we want bf16 builtins at all?  The impression I've gotten is that 
users want computation to happen in SFmode and only later truncate back 
to BFmode.
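
For illustration, "compute in SFmode and only later truncate" could look like the sketch below. It assumes IEEE binary32 `float` and round-to-nearest-even when narrowing; both the rounding choice and the function names are assumptions of this example, not something the patch specifies:

```cpp
#include <cstdint>
#include <cstring>

static_assert (sizeof (float) == 4, "sketch assumes IEEE binary32 float");

// Widen a bfloat16 bit pattern to float (upper 16 bits of a binary32).
float widen_bf16 (std::uint16_t bits)
{
  std::uint32_t wide = static_cast<std::uint32_t> (bits) << 16;
  float f;
  std::memcpy (&f, &wide, sizeof f);
  return f;
}

// Narrow a float to bfloat16 with round-to-nearest-even; NaNs are quieted.
std::uint16_t narrow_to_bf16 (float f)
{
  std::uint32_t u;
  std::memcpy (&u, &f, sizeof u);
  if ((u & 0x7FFFFFFFu) > 0x7F800000u)   // NaN: keep sign, set a quiet bit
    return static_cast<std::uint16_t> ((u >> 16) | 0x0040u);
  std::uint32_t bias = 0x7FFFu + ((u >> 16) & 1u);
  return static_cast<std::uint16_t> ((u + bias) >> 16);
}

// The arithmetic itself happens in float; only the result is narrowed.
std::uint16_t bf16_add (std::uint16_t a, std::uint16_t b)
{
  return narrow_to_bf16 (widen_bf16 (a) + widen_bf16 (b));
}
```

A longer expression would stay in float throughout and narrow once at the end, which is the behavior the question above is getting at.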



Full patch bootstrapped/regtested on x86_64-linux and i686-linux.

2022-10-04  Jakub Jelinek  

gcc/
* tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
* tree.h (bfloat16_type_node): Define.
(CASE_FLT_FN_FLOATN_NX): Also include BUILT_IN_*BF16.
* tree.cc (excess_precision_type): Promote bfloat16_type_mode
like float16_type_mode.
(build_common_tree_nodes): Initialize bfloat16_type_node if
BFmode is supported.
* expmed.h (maybe_expand_shift): Declare.
* expmed.cc (maybe_expand_shift): No longer static.
(emit_store_flag_1): Don't consider [BH]Fmode as wider mode to
narrower modes.
* expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
conversions.  If there is no optab, handle BF -> {DF,XF,TF,HF}
conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
-ffast-math generic implementation for BF -> SF and SF -> BF
conversions.
* builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16,
BT_FN_BFLOAT16_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING,
BT_FN_BFLOAT16_BFLOAT16_BFLOAT16,
BT_FN_BFLOAT16_BFLOAT16_BFLOAT16_BFLOAT16): New.
* builtins.def (DEF_GCC_FLOATN_NX_BUILTINS,
DEF_EXT_LIB_FLOATN_NX_BUILTINS): Also add 

PING^1: [PATCH] x86: Check corrupted return address when unwinding stack

2022-10-04 Thread H.J. Lu via Gcc-patches
On Wed, Sep 21, 2022 at 1:42 PM H.J. Lu  wrote:
>
> If shadow stack is enabled, when unwinding stack, we count how many stack
> frames we pop to reach the landing pad and adjust shadow stack by the same
> amount.  When counting the stack frame, we compare the return address on
> normal stack against the return address on shadow stack.  If they don't
> match, return _URC_FATAL_PHASE2_ERROR for the corrupted return address on
> normal stack.  Don't check the return address for
>
> 1. Non-catchable exception where exception_class == 0.  Process will be
> terminated.
> 2. Zero return address which marks the outermost stack frame.
> 3. Signal stack frame since kernel puts a restore token on shadow stack.
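
As a rough model of the check described above (not the libgcc macro itself), the sketch below walks a list of frames, counts every one, and compares return addresses against a simulated shadow stack unless one of the three exemptions applies. The `frame` struct, the enum and `check_frames` are hypothetical, and the model assumes one shadow-stack slot per popped frame:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum unwind_result { UNWIND_OK, UNWIND_FATAL_PHASE2_ERROR };

struct frame
{
  std::uintptr_t return_address;   // return address found on the normal stack
  bool is_signal_frame;            // kernel placed a restore token instead
};

// Count popped frames and verify return addresses against SHADOW.
unwind_result check_frames (const std::vector<frame> &frames,
                            const std::vector<std::uintptr_t> &shadow,
                            std::uint64_t exception_class,
                            std::size_t &popped)
{
  popped = 0;
  for (const frame &f : frames)
    {
      std::size_t slot = popped++;          // every frame is counted
      if (exception_class == 0)             // 1. non-catchable exception
        continue;
      if (f.return_address == 0)            // 2. outermost frame marker
        continue;
      if (f.is_signal_frame)                // 3. signal frame
        continue;
      if (slot >= shadow.size () || shadow[slot] != f.return_address)
        return UNWIND_FATAL_PHASE2_ERROR;   // corrupted return address
    }
  return UNWIND_OK;
}
```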
>
> * unwind-generic.h (_Unwind_Frames_Increment): Add the EXC
> argument.
> * unwind.inc (_Unwind_RaiseException_Phase2): Pass EXC to
> _Unwind_Frames_Increment.
> (_Unwind_ForcedUnwind_Phase2): Likewise.
> * config/i386/shadow-stack-unwind.h (_Unwind_Frames_Increment):
> Take the EXC argument.  Return _URC_FATAL_PHASE2_ERROR if the
> return address on normal stack doesn't match the return address
> on shadow stack.
> ---
>  libgcc/config/i386/shadow-stack-unwind.h | 51 ++--
>  libgcc/unwind-generic.h  |  2 +-
>  libgcc/unwind.inc|  4 +-
>  3 files changed, 50 insertions(+), 7 deletions(-)
>
> diff --git a/libgcc/config/i386/shadow-stack-unwind.h b/libgcc/config/i386/shadow-stack-unwind.h
> index 2b02682bdae..89d44165000 100644
> --- a/libgcc/config/i386/shadow-stack-unwind.h
> +++ b/libgcc/config/i386/shadow-stack-unwind.h
> @@ -54,10 +54,39 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> aligned.  If the original shadow stack is 8 byte aligned, we just
> need to pop 2 slots, one restore token, from shadow stack.  Otherwise,
> we need to pop 3 slots, one restore token + 4 byte padding, from
> -   shadow stack.  */
> -#ifndef __x86_64__
> +   shadow stack.
> +
> +   When popping a stack frame, we compare the return address on normal
> +   stack against the return address on shadow stack.  If they don't match,
> +   return _URC_FATAL_PHASE2_ERROR for the corrupted return address on
> +   normal stack.  Don't check the return address for
> +   1. Non-catchable exception where exception_class == 0.  Process will
> +  be terminated.
> +   2. Zero return address which marks the outermost stack frame.
> +   3. Signal stack frame since kernel puts a restore token on shadow
> +  stack.
> + */
>  #undef _Unwind_Frames_Increment
> -#define _Unwind_Frames_Increment(context, frames)  \
> +#ifdef __x86_64__
> +#define _Unwind_Frames_Increment(exc, context, frames) \
> +{  \
> +  frames++;\
> +  if (exc->exception_class != 0\
> + && _Unwind_GetIP (context) != 0   \
> + && !_Unwind_IsSignalFrame (context))  \
> +   {   \
> + _Unwind_Word ssp = _get_ssp ();   \
> + if (ssp != 0) \
> +   {   \
> + ssp += 8 * frames;\
> + _Unwind_Word ra = *(_Unwind_Word *) ssp;  \
> + if (ra != _Unwind_GetIP (context))\
> +   return _URC_FATAL_PHASE2_ERROR; \
> +   }   \
> +   }   \
> +}
> +#else
> +#define _Unwind_Frames_Increment(exc, context, frames) \
>if (_Unwind_IsSignalFrame (context)) \
>  do \
>{\
> @@ -83,5 +112,19 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>}\
>  while (0); \
>else \
> -frames++;
> +{  \
> +  frames++;\
> +  if (exc->exception_class != 0\
> + && _Unwind_GetIP (context) != 0)  \
> +   {   \
> + _Unwind_Word ssp = _get_ssp ();   \
> + if (ssp != 0) \
> +   {   \
> + ssp += 4 * frames;\
> + _Unwind_Word ra = *(_Unwind_Word *) ssp;  \
> + if (ra != _Unwind_GetIP (context))\
> +   return _URC_FATAL_PHASE2_ERROR; \
> + 

Re: [PATCH] fixincludes: Deal also with the _Float128x cases [PR107059]

2022-10-04 Thread Jason Merrill via Gcc-patches

On 9/30/22 03:20, Jakub Jelinek wrote:

On Wed, Sep 28, 2022 at 08:19:43PM +0200, Jakub Jelinek via Gcc-patches wrote:

Another case are the following 3 snippets:
#  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
#   error "_Float128X supported but no constant suffix"
#  else
#   define __f128x(x) x##f128x
#  endif
...
#  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
#   error "_Float128X supported but no complex type"
#  else
#   define __CFLOAT128X _Complex _Float128x
#  endif
...
#  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
#   error "_Float128x supported but no type"
#  endif
but as no target has _Float128x right now and I don't see it
coming soon, it isn't a big deal (on the glibc side it is of
course ok to adjust those).


This incremental patch handles the above 3 cases, so we
fixinclude what glibc itself changed too.
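
To make the effect of the new fix concrete, here is a sketch of the same rewrite done with `std::regex` rather than the fixincludes `format` machinery. The function name is hypothetical, and the leading `^` of the select pattern is omitted because line anchoring in `std::regex` differs from what fixincludes uses:

```cpp
#include <regex>
#include <string>

// Sketch of the glibc_cxx_floatn_5 rewrite; apply_floatn128x_fix is a
// made-up name, and std::regex stands in for the fixincludes engine.
std::string apply_floatn128x_fix (const std::string &header)
{
  // Same two capture groups as the select pattern, minus the leading '^'.
  static const std::regex select (
    R"re(([ \t]*#[ \t]*if !__GNUC_PREREQ \(7, 0\) \|\| )defined __cplusplus\n)re"
    R"re(([ \t]*#[ \t]+error "_Float128[xX] supported but no ))re");
  // $1 and $2 play the role of %1 and %2 in c_fix_arg.
  return std::regex_replace (
    header, select,
    "$1(defined __cplusplus && !__GNUC_PREREQ (13, 0))\n$2");
}
```

Applied to the third snippet quoted above, the `#if` line would become `#  if !__GNUC_PREREQ (7, 0) || (defined __cplusplus && !__GNUC_PREREQ (13, 0))` while the `#error` line stays as it was.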

Bootstrapped/regtested on x86_64-linux and i686-linux (together with the
previously posted fixincludes/ change too), ok for trunk?


Both OK on Friday if no comments before then.


2022-09-30  Jakub Jelinek  

PR bootstrap/107059
* inclhack.def (glibc_cxx_floatn_5): New.
* fixincl.x: Regenerated.
* tests/base/bits/floatn.h: Regenerated.

--- fixincludes/inclhack.def.jj 2022-09-29 22:18:47.974402688 +0200
+++ fixincludes/inclhack.def 2022-09-29 22:22:48.151145670 +0200
@@ -2131,6 +2131,23 @@ fix = {
EOT;
  };
  
+fix = {

+hackname  = glibc_cxx_floatn_5;
+files = bits/floatn.h, bits/floatn-common.h, "*/bits/floatn.h", "*/bits/floatn-common.h";
+select= "^([ \t]*#[ \t]*if !__GNUC_PREREQ \\(7, 0\\) \\|\\| )defined __cplusplus\n"
+   "([ \t]*#[ \t]+error \"_Float128[xX] supported but no )";
+c_fix = format;
+c_fix_arg = "%1(defined __cplusplus && !__GNUC_PREREQ (13, 0))\n%2";
+test_text = <<-EOT
+   #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
+   #   error "_Float128X supported but no constant suffix"
+   #  endif
+   #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
+   #   error "_Float128x supported but no type"
+   #  endif
+   EOT;
+};
+
  /*  glibc-2.3.5 defines pthread mutex initializers incorrectly,
   *  so we replace them with versions that correspond to the
   *  definition.
--- fixincludes/fixincl.x.jj 2022-09-29 22:18:47.975402675 +0200
+++ fixincludes/fixincl.x   2022-09-29 22:22:55.675909244 +0200
@@ -2,11 +2,11 @@
   *
   * DO NOT EDIT THIS FILE   (fixincl.x)
   *
- * It has been AutoGen-ed  September 28, 2022 at 07:56:15 PM by AutoGen 5.18.16
+ * It has been AutoGen-ed  September 29, 2022 at 10:22:55 PM by AutoGen 5.18.16
   * From the definitions    inclhack.def
   * and the template file   fixincl
   */
-/* DO NOT SVN-MERGE THIS FILE, EITHER Wed Sep 28 19:56:15 CEST 2022
+/* DO NOT SVN-MERGE THIS FILE, EITHER Thu Sep 29 22:22:55 CEST 2022
   *
   * You must regenerate it.  Use the ./genfixes script.
   *
@@ -15,7 +15,7 @@
   * certain ANSI-incompatible system header files which are fixed to work
   * correctly with ANSI C and placed in a directory that GNU C will search.
   *
- * This file contains 271 fixup descriptions.
+ * This file contains 272 fixup descriptions.
   *
   * See README for more information.
   *
@@ -4273,6 +4273,43 @@ static const char* apzGlibc_Cxx_Floatn_4
  
  /* * * * * * * * * * * * * * * * * * * * * * * * * *

   *
+ *  Description of Glibc_Cxx_Floatn_5 fix
+ */
+tSCC zGlibc_Cxx_Floatn_5Name[] =
+ "glibc_cxx_floatn_5";
+
+/*
+ *  File name selection pattern
+ */
+tSCC zGlibc_Cxx_Floatn_5List[] =
+  "bits/floatn.h\0bits/floatn-common.h\0*/bits/floatn.h\0*/bits/floatn-common.h\0";
+/*
+ *  Machine/OS name selection pattern
+ */
+#define apzGlibc_Cxx_Floatn_5Machs (const char**)NULL
+
+/*
+ *  content selection pattern - do fix if pattern found
+ */
+tSCC zGlibc_Cxx_Floatn_5Select0[] =
+   "^([ \t]*#[ \t]*if !__GNUC_PREREQ \\(7, 0\\) \\|\\| )defined __cplusplus\n\
+([ \t]*#[ \t]+error \"_Float128[xX] supported but no )";
+
+#define    GLIBC_CXX_FLOATN_5_TEST_CT  1
+static tTestDesc aGlibc_Cxx_Floatn_5Tests[] = {
+  { TT_EGREP,zGlibc_Cxx_Floatn_5Select0, (regex_t*)NULL }, };
+
+/*
+ *  Fix Command Arguments for Glibc_Cxx_Floatn_5
+ */
+static const char* apzGlibc_Cxx_Floatn_5Patch[] = {
+"format",
+"%1(defined __cplusplus && !__GNUC_PREREQ (13, 0))\n\
+%2",
+(char*)NULL };
+
+/* * * * * * * * * * * * * * * * * * * * * * * * * *
+ *
   *  Description of Glibc_Mutex_Init fix
   */
  tSCC zGlibc_Mutex_InitName[] =
@@ -11038,9 +11075,9 @@ static const char* apzX11_SprintfPatch[]
   *
   *  List of all fixes
   */
-#define REGEX_COUNT  309
+#define REGEX_COUNT  310
  #define MACH_LIST_SIZE_LIMIT 187
-#define FIX_COUNT            271
+#define FIX_COUNT            272
  
  /*

   *  Enumerate the fixes
@@ -11147,6 +11184,7 @@ typedef enum {
  GLIBC_CXX_FLOATN_2_FIXIDX,
  GLIBC_CXX_FLOATN_3_FIXIDX,
  GLIBC_CXX_FLOATN_4_FIXIDX,
+

[PATCH] Fortran: error recovery for invalid types in array constructors [PR107000]

2022-10-04 Thread Harald Anlauf via Gcc-patches
Dear all,

we did not recover well from bad expressions in array constructors,
especially when there was a typespec and a unary '+' or '-', and
when the array constructor was used in an arithmetic expression.

The attached patch introduces an ARITH_INVALID_TYPE that is used
when we try to recover from these errors, and tries to handle
all unary and binary arithmetic expressions.
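
The recovery strategy amounts to an early return with a dedicated status code before any reduction is attempted. A much simplified sketch of that guard follows; all names are stand-ins invented for this example rather than gfortran's own types:

```cpp
// Simplified stand-ins for gfortran's arith codes and expression fields.
enum arith_status { STATUS_OK = 1, STATUS_INVALID_TYPE };
enum expr_kind { KIND_CONSTANT, KIND_OP };
enum base_type { TYPE_UNKNOWN, TYPE_REAL };

struct expr_stub
{
  expr_kind kind;
  base_type type;
  double value;
};

// Negate OP, refusing operands that are unresolved operators of unknown
// type, i.e. the residue of an expression that already failed to resolve.
arith_status reduce_unary_minus (const expr_stub &op, double &result)
{
  if (op.kind == KIND_OP && op.type == TYPE_UNKNOWN)
    return STATUS_INVALID_TYPE;   // report instead of running into an ICE
  result = -op.value;
  return STATUS_OK;
}
```

The caller can then turn the status into a diagnostic, and `result` is left untouched on the error path.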

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From ad892a270c504def2f8f84494d5c7bcba9aef27f Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Tue, 4 Oct 2022 23:04:06 +0200
Subject: [PATCH] Fortran: error recovery for invalid types in array
 constructors [PR107000]

gcc/fortran/ChangeLog:

	PR fortran/107000
	* arith.cc (gfc_arith_error): Define error message for
	ARITH_INVALID_TYPE.
	(reduce_unary): Catch arithmetic expressions with invalid type.
	(reduce_binary_ac): Likewise.
	(reduce_binary_ca): Likewise.
	(reduce_binary_aa): Likewise.
	(gfc_real2complex): Source expression must be of type REAL.
	* gfortran.h (enum arith): Add ARITH_INVALID_TYPE.

gcc/testsuite/ChangeLog:

	PR fortran/107000
	* gfortran.dg/pr107000.f90: New test.
---
 gcc/fortran/arith.cc   | 19 ++
 gcc/fortran/gfortran.h |  2 +-
 gcc/testsuite/gfortran.dg/pr107000.f90 | 50 ++
 3 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr107000.f90

diff --git a/gcc/fortran/arith.cc b/gcc/fortran/arith.cc
index d57059a375f..e6e35ef3c42 100644
--- a/gcc/fortran/arith.cc
+++ b/gcc/fortran/arith.cc
@@ -118,6 +118,9 @@ gfc_arith_error (arith code)
 case ARITH_WRONGCONCAT:
   p = G_("Illegal type in character concatenation at %L");
   break;
+case ARITH_INVALID_TYPE:
+  p = G_("Invalid type in arithmetic operation at %L");
+  break;

 default:
   gfc_internal_error ("gfc_arith_error(): Bad error code");
@@ -1261,6 +1264,9 @@ reduce_unary (arith (*eval) (gfc_expr *, gfc_expr **), gfc_expr *op,
   gfc_expr *r;
   arith rc;

+  if (op->expr_type == EXPR_OP && op->ts.type == BT_UNKNOWN)
+return ARITH_INVALID_TYPE;
+
   if (op->expr_type == EXPR_CONSTANT)
 return eval (op, result);

@@ -1302,6 +1308,9 @@ reduce_binary_ac (arith (*eval) (gfc_expr *, gfc_expr *, gfc_expr **),
   gfc_expr *r;
   arith rc = ARITH_OK;

+  if (op1->expr_type == EXPR_OP && op1->ts.type == BT_UNKNOWN)
+return ARITH_INVALID_TYPE;
+
   head = gfc_constructor_copy (op1->value.constructor);
   for (c = gfc_constructor_first (head); c; c = gfc_constructor_next (c))
 {
@@ -1354,6 +1363,9 @@ reduce_binary_ca (arith (*eval) (gfc_expr *, gfc_expr *, gfc_expr **),
   gfc_expr *r;
   arith rc = ARITH_OK;

+  if (op2->expr_type == EXPR_OP && op2->ts.type == BT_UNKNOWN)
+return ARITH_INVALID_TYPE;
+
   head = gfc_constructor_copy (op2->value.constructor);
   for (c = gfc_constructor_first (head); c; c = gfc_constructor_next (c))
 {
@@ -1414,6 +1426,10 @@ reduce_binary_aa (arith (*eval) (gfc_expr *, gfc_expr *, gfc_expr **),
   if (!gfc_check_conformance (op1, op2, _("elemental binary operation")))
 return ARITH_INCOMMENSURATE;

+  if ((op1->expr_type == EXPR_OP && op1->ts.type == BT_UNKNOWN)
+  || (op2->expr_type == EXPR_OP && op2->ts.type == BT_UNKNOWN))
+return ARITH_INVALID_TYPE;
+
   head = gfc_constructor_copy (op1->value.constructor);
   for (c = gfc_constructor_first (head),
d = gfc_constructor_first (op2->value.constructor);
@@ -2238,6 +2254,9 @@ gfc_real2complex (gfc_expr *src, int kind)
   arith rc;
   bool did_warn = false;

+  if (src->ts.type != BT_REAL)
+return NULL;
+
   result = gfc_get_constant_expr (BT_COMPLEX, kind, &src->where);

   mpc_set_fr (result->value.complex, src->value.real, GFC_MPC_RND_MODE);
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 4babd77924b..fc0aa51df57 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -226,7 +226,7 @@ enum gfc_intrinsic_op
 enum arith
 { ARITH_OK = 1, ARITH_OVERFLOW, ARITH_UNDERFLOW, ARITH_NAN,
   ARITH_DIV0, ARITH_INCOMMENSURATE, ARITH_ASYMMETRIC, ARITH_PROHIBIT,
-  ARITH_WRONGCONCAT
+  ARITH_WRONGCONCAT, ARITH_INVALID_TYPE
 };

 /* Statements.  */
diff --git a/gcc/testsuite/gfortran.dg/pr107000.f90 b/gcc/testsuite/gfortran.dg/pr107000.f90
new file mode 100644
index 000..c13627f556b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr107000.f90
@@ -0,0 +1,50 @@
+! { dg-do compile }
+! PR fortran/107000 - ICE in gfc_real2complex, reduce_unary, reduce_binary_*
+! Contributed by G.Steinmetz
+
+program p
+  real:: y(1)
+  complex :: x(1)
+  x = (1.0, 2.0) * [real :: -'1'] ! { dg-error "Operand of unary numeric operator" }
+  x = (1.0, 2.0) * [complex :: +'1'] ! { dg-error "Invalid type" }
+  x = [complex :: -'1'] * (1.0, 2.0) ! { dg-error "Invalid type" }
+  y = [complex :: -'1'] * 2  ! { dg-error "Invalid type" }
+  y = 2 * [complex :: -'1']! { dg-error "Invalid type" }
+  y = 2 * [complex :: 

Re: [PATCH] c++, c, v2: Implement C++23 P1774R8 - Portable assumptions [PR106654]

2022-10-04 Thread Jason Merrill via Gcc-patches

On 10/3/22 15:22, Jakub Jelinek wrote:

On Fri, Sep 30, 2022 at 04:39:25PM -0400, Jason Merrill wrote:

* fold-const.h (simple_operand_p_2): Declare.


This needs a better name if it's going to be a public interface.

The usage also needs rationale for why this is the right predicate for
assume, rather than just no-side-effects.  Surely the latter is right for
constexpr, at least?


You're right that for the constexpr case !TREE_SIDE_EFFECTS is all we need,
including const/pure function calls.
For the gimplification case, TREE_SIDE_EFFECTS isn't good enough.
TREE_SIDE_EFFECTS is documented as:
/* In any expression, decl, or constant, nonzero means it has side effects or
reevaluation of the whole expression could produce a different value.
This is set if any subexpression is a function call, a side effect or a
reference to a volatile variable.  In a ..._DECL, this is set only if the
declaration said `volatile'.  This will never be set for a constant.  */
so !TREE_SIDE_EFFECTS expressions can be safely evaluated multiple times
instead of just once.
But we need more than that, we need basically the same requirements as
when trying to hoist an expression from inside of if (0) block to before
that block (or just any conditional guarded block where we don't know the
condition value).  And so we need to ensure that we don't get any traps,
raise exceptions etc. or do anything else with observable effects.
And on top of that, we'd better limit it to something small, because
if we have a condition with hundreds of non-side-effect operations in it,
it will affect inlining limits and we'd need to trust that DCE will clean up
everything as unused.
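
The three requirements above (no side effects, nothing that can trap, and small) can be modelled on a toy expression tree. This is only a sketch of the reasoning; it is not what simple_condition_p does internally, and every name in it is invented for the example:

```cpp
#include <cstddef>
#include <vector>

// Toy expression tree; none of these names are GCC's.
enum class op { constant, variable, volatile_read, call, add, divide, less };

struct expr
{
  op kind;
  std::vector<expr> operands;
};

// Rough analogue of TREE_SIDE_EFFECTS: calls and volatile reads count.
bool has_side_effects (const expr &e)
{
  if (e.kind == op::call || e.kind == op::volatile_read)
    return true;
  for (const expr &sub : e.operands)
    if (has_side_effects (sub))
      return true;
  return false;
}

// Being free of side effects is not enough: a division can still trap.
bool may_trap (const expr &e)
{
  if (e.kind == op::divide)
    return true;
  for (const expr &sub : e.operands)
    if (may_trap (sub))
      return true;
  return false;
}

std::size_t node_count (const expr &e)
{
  std::size_t n = 1;
  for (const expr &sub : e.operands)
    n += node_count (sub);
  return n;
}

// Safe to evaluate unconditionally only if all three conditions hold.
bool simple_enough_to_evaluate (const expr &e, std::size_t limit = 8)
{
  return !has_side_effects (e) && !may_trap (e) && node_count (e) <= limit;
}
```

In this model `x < 5` passes, while `x / y < 5` has no side effects yet is rejected because the division could trap if it were hoisted out of a guarded block.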


Let's factor this out of here and cp_parser_constant_expression rather than
duplicate it.


Done.


+  for (; attr; attr = lookup_attribute ("gnu", "assume", TREE_CHAIN (attr)))
+{
+  tree args = TREE_VALUE (attr);
+  int nargs = list_length (args);
+  if (nargs != 1)
+   {


Need auto_diagnostic_group.


Added (and while playing with finish_static_assert noticed that
it doesn't use that either).
Now that I look, attribs.cc (decl_attributes) doesn't do that either,
will test a separate patch for that.


+  bool in_assume;


I think it would be better to reject jumps into statement-expressions like
the C front-end.


Already committed, thanks for the review.


+ if (!*non_constant_p && !ctx->quiet)
+   error_at (EXPR_LOCATION (t),
+ "failed %<assume%> attribute assumption");


Maybe share some code for explaining the failure with finish_static_assert?


I couldn't share the find_failing_clause stuff (but fortunately it is
short), because it should call different function to evaluate it, but I can
share the reporting.


It could choose which function to call based on whether the 
constexpr_ctx parameter is null?



Here is a lightly tested updated patch which I'll bootstrap/regtest tonight.

2022-10-03  Jakub Jelinek  

PR c++/106654
gcc/
* internal-fn.def (ASSUME): New internal function.
* internal-fn.h (expand_ASSUME): Declare.
* internal-fn.cc (expand_ASSUME): Define.
* gimplify.cc (gimplify_call_expr): Gimplify IFN_ASSUME.
* fold-const.h (simple_condition_p): Declare.
* fold-const.cc (simple_operand_p_2): Rename to ...
(simple_condition_p): ... this.  Remove forward declaration.
No longer static.  Adjust function comment and fix a typo in it.
Adjust recursive call.
(simple_operand_p): Adjust function comment.
(fold_truth_andor): Adjust simple_operand_p_2 callers to call
simple_condition_p.
* attribs.h (remove_attribute): Declare overload with additional
attr_ns argument.
(private_lookup_attribute): Declare overload with additional
attr_ns and attr_ns_len arguments.
(lookup_attribute): New overload with additional attr_ns argument.
* attribs.cc (remove_attribute): New overload with additional
attr_ns argument.
(private_lookup_attribute): New overload with additional
attr_ns and attr_ns_len arguments.


I think go ahead and commit the attribs.{h,cc} changes separately.


* doc/extend.texi: Document assume attribute.  Move fallthrough
attribute example to its section.
gcc/c-family/
* c-attribs.cc (handle_assume_attribute): New function.
(c_common_attribute_table): Add entry for assume attribute.
* c-lex.cc (c_common_has_attribute): Handle
__have_cpp_attribute (assume).
gcc/c/
* c-parser.cc (handle_assume_attribute): New function.
(c_parser_declaration_or_fndef): Handle assume attribute.
(c_parser_attribute_arguments): Add assume_attr argument,
if true, parse first argument as conditional expression.
(c_parser_gnu_attribute, c_parser_std_attribute): Adjust
c_parser_attribute_arguments callers.

Re: Adding a new thread model to GCC

2022-10-04 Thread Bernhard Reutner-Fischer via Gcc-patches
On 4 October 2022 10:06:00 CEST, LIU Hao  wrote:
>On 2022-10-03 13:03, Bernhard Reutner-Fischer wrote:
>> 
>> No, sorry for my brevity.
>> Using __gthread_t like in your patch is correct.
>> 
>
>I see. In 'libgfortran/io/async.c' there is
>
>  ```
>async_unit *au = u->au;
>LOCK (>lock);
>thread_unit = u;
>au->thread = __gthread_self ();
>  ```
>
>so indeed `thread` should be `__gthread_t`.

Yes.

> By the way I reported this issue four months ago and haven't received any 
> response so far:
>
>  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105764

So, ideally, you would mention this PR in your patch.

LGTM (obvious even) but I cannot formally approve it.
thanks,


[PATCH] Fortran: reject procedures and procedure pointers as output item [PR107074]

2022-10-04 Thread Harald Anlauf via Gcc-patches
Dear all,

when looking at output item lists we didn't catch procedures
and procedure pointers and ran into a gfc_internal_error().
Such items are not allowed by the Fortran standard, e.g. for
procedure pointers there is

C1233 (R1217) An expression that is an output-item shall not
  have a value that is a procedure pointer.

Attached patch generates an error instead.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 3b15fe83830c1e75339114e0241e9d2158393017 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Tue, 4 Oct 2022 21:19:21 +0200
Subject: [PATCH] Fortran: reject procedures and procedure pointers as output
 item [PR107074]

gcc/fortran/ChangeLog:

	PR fortran/107074
	* trans-io.cc (transfer_expr): A procedure or a procedure pointer
	cannot be output items.

gcc/testsuite/ChangeLog:

	PR fortran/107074
	* gfortran.dg/pr107074.f90: New test.
---
 gcc/fortran/trans-io.cc| 14 ++
 gcc/testsuite/gfortran.dg/pr107074.f90 | 11 +++
 2 files changed, 25 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/pr107074.f90

diff --git a/gcc/fortran/trans-io.cc b/gcc/fortran/trans-io.cc
index 9f86815388c..c4e1537eed6 100644
--- a/gcc/fortran/trans-io.cc
+++ b/gcc/fortran/trans-io.cc
@@ -2430,6 +2430,20 @@ transfer_expr (gfc_se * se, gfc_typespec * ts, tree addr_expr,

   break;

+case BT_PROCEDURE:
+  if (code->expr1
+	  && code->expr1->symtree
+	  && code->expr1->symtree->n.sym)
+	{
+	  if (code->expr1->symtree->n.sym->attr.proc_pointer)
+	gfc_error ("Procedure pointer at %C cannot be an output item");
+	  else
+	gfc_error ("Procedure at %C cannot be an output item");
+	  return;
+	}
+  /* If a PROCEDURE item gets through to here, fall through and ICE.  */
+  gcc_fallthrough ();
+
 case_bt_struct:
 case BT_CLASS:
   if (gfc_bt_struct (ts->type) || ts->type == BT_CLASS)
diff --git a/gcc/testsuite/gfortran.dg/pr107074.f90 b/gcc/testsuite/gfortran.dg/pr107074.f90
new file mode 100644
index 000..a09088c2e9d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr107074.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! PR fortran/107074 - ICE: Bad IO basetype (8)
+! Contributed by G.Steinmetz
+
+program p
+  implicit none
+  integer, external:: a
+  procedure(real), pointer :: b
+  print *, merge (a, a, .true.) ! { dg-error "Procedure" }
+  print *, merge (b, b, .true.) ! { dg-error "Procedure pointer" }
+end
--
2.35.3



Re: [PATCH RESEND 1/1] p1689r5: initial support

2022-10-04 Thread Harald Anlauf via Gcc-patches

On 04.10.22 at 17:12, Ben Boeckel wrote:

This patch implements support for [P1689R5][] to communicate the C++20
module dependencies to build systems so that they may build `.gcm` files
in the proper order.


Is there a reason that you are touching so many frontends?


diff --git a/gcc/fortran/cpp.cc b/gcc/fortran/cpp.cc
index 364bd0d2a85..0b9df9c02cd 100644
--- a/gcc/fortran/cpp.cc
+++ b/gcc/fortran/cpp.cc
@@ -712,7 +712,7 @@ gfc_cpp_done (void)
  FILE *f = fopen (gfc_cpp_option.deps_filename, "w");
  if (f)
{
- cpp_finish (cpp_in, f);
+ cpp_finish (cpp_in, f, NULL);
  fclose (f);
}
  else
@@ -721,7 +721,7 @@ gfc_cpp_done (void)
 xstrerror (errno));
}
else
-   cpp_finish (cpp_in, stdout);
+   cpp_finish (cpp_in, stdout, NULL);
  }

cpp_undef_all (cpp_in);


Couldn't you simply default the third argument of cpp_finish() to NULL?
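
What is being suggested is an ordinary C++ default argument, so that existing two-argument callers compile unchanged. A sketch with a stand-in reader type (`reader_stub` and `finish_sketch` are hypothetical; whether cpplib.h can use a default argument here is for the libcpp maintainers to judge):

```cpp
#include <cstdio>

// Stand-in for cpp_reader; it only records which streams were supplied.
struct reader_stub
{
  bool wrote_deps = false;
  bool wrote_fdeps = false;
};

// Hypothetical variant of cpp_finish with the new stream defaulted, so
// callers that do not produce P1689 output need no source change.
void finish_sketch (reader_stub *reader, std::FILE *deps_stream,
                    std::FILE *fdeps_stream = nullptr)
{
  reader->wrote_deps = deps_stream != nullptr;
  reader->wrote_fdeps = fdeps_stream != nullptr;
}
```

With that shape, a call such as `finish_sketch (&r, stdout)` keeps working, and only the C++ front end would pass a third stream.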


diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 2db1e9cbdfb..90787230a9e 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -298,6 +298,9 @@ typedef CPPCHAR_SIGNED_T cppchar_signed_t;
  /* Style of header dependencies to generate.  */
  enum cpp_deps_style { DEPS_NONE = 0, DEPS_USER, DEPS_SYSTEM };

+/* Format of header dependencies to generate.  */
+enum cpp_deps_format { DEPS_FMT_NONE = 0, DEPS_FMT_P1689R5 };
+
  /* The possible normalization levels, from most restrictive to least.  */
  enum cpp_normalize_level {
/* In NFKC.  */
@@ -581,6 +584,9 @@ struct cpp_options
  /* Style of header dependencies to generate.  */
  enum cpp_deps_style style;

+/* Format of header dependencies to generate.  */
+enum cpp_deps_format format;
+
  /* Assume missing files are generated files.  */
  bool missing_files;

@@ -1104,9 +1110,9 @@ extern void cpp_post_options (cpp_reader *);
  extern void cpp_init_iconv (cpp_reader *);

  /* Call this to finish preprocessing.  If you requested dependency
-   generation, pass an open stream to write the information to,
-   otherwise NULL.  It is your responsibility to close the stream.  */
-extern void cpp_finish (cpp_reader *, FILE *deps_stream);
+   generation, pass open stream(s) to write the information to,
+   otherwise NULL.  It is your responsibility to close the stream(s).  */
+extern void cpp_finish (cpp_reader *, FILE *deps_stream, FILE *fdeps_stream);

 ^^^


  /* Call this to release the handle at the end of preprocessing.  Any
 use of the handle after this function returns is invalid.  */





Re: [COMMITTED] Remove assert from set_nonzero_bits.

2022-10-04 Thread Jeff Law via Gcc-patches



On 10/4/22 11:52, Aldy Hernandez wrote:

The assert removed by this patch was there to keep users from passing
masks of incompatible types.  The self tests are passing host wide
ints down (set_nonzero_bits (-1)), which seem to be 32 bits, whereas
some embedded targets have integer_type_node's of 16-bits.  This is
causing problems in m32c-elf, among others.

I suppose there's no harm in passing a 32-bit mask, because
set_nonzero_bits calls wide_int::from() to convert the mask to the
appropriate type.  So we can remove the assert.

Sorry for the pain Jeff.

gcc/ChangeLog:

* value-range.cc (irange::set_nonzero_bits): Remove assert.


Thanks.  I'll respin everything that failed this AM and see where we are.


jeff




Re: [RFC] libstdc++: Generate error_constants.h from [PR104883]

2022-10-04 Thread Jonathan Wakely via Gcc-patches
On Tue, 4 Oct 2022 at 19:05, Jonathan Wakely wrote:
>
> On Tue, 4 Oct 2022 at 17:51, Jonathan Wakely via Libstdc++
>  wrote:
> >
> > Does anybody see any issues with generating the list of error numbers at
> > build time?
> >
> >
> > -- >8 --
> >
> > Instead of having several very similar target-specific headers with
> > slightly different sets of enumerators, generate the error_constants.h
> > file as part of the build. This ensures that all enumerators are always
> > defined, with the value from the corresponding errno macro if present,
> > or a libstdc++-specific alternative value.
> >
> > The libstdc++-specific values will be values greater than the positive
> > integer _GLIBCXX_ERRC_ORIGIN, which defaults to  but can be set in
> > os_defines.h if a more suitable value exists for the OS (e.g. ELAST
> > could be used for BSD targets).
> ...
> > +${CXXCPP} -P -D_POSIX_C_SOURCE=200809L -x c++ "$constants_h" \
> > +  | sed -e '1,/^GLIBCXX ERROR CONSTANTS BELOW HERE$/d' \
> > +  >> "$output_h" || exit $?
>
> Gah, this is the wrong version of the script! It's supposed to replace
> unexpanded EXXX tokens with __LINE__ (which is why #line is used to
> set the origin) but I seem to have committed the wrong version.
>
> Let me dig that out of my git reflog ...

I was looking on the wrong machine. Here's the working patch that
expands undefined EXXX constants to __LINE__, which has been offset by
the _GLIBCXX_ERRC_ORIGIN value.
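
The __LINE__ trick can be sketched roughly as follows.  This is only an
illustration: it uses #ifdef where the script rewrites tokens the
preprocessor left unexpanded, and ERRC_ORIGIN_SKETCH, errc_sketch and
ESKETCH_UNDEFINED are names made up for the example.

```cpp
#include <cassert>
#include <cerrno>

// Assumed origin for this sketch only.
#define ERRC_ORIGIN_SKETCH 10000

// From the next line on, __LINE__ counts upward from the origin.
#line ERRC_ORIGIN_SKETCH
enum class errc_sketch
{
#ifdef EDOM
  argument_out_of_domain = EDOM,
#else
  argument_out_of_domain = __LINE__,
#endif
#ifdef ESKETCH_UNDEFINED   // hypothetical macro that no system defines
  sketch_only = ESKETCH_UNDEFINED,
#else
  sketch_only = __LINE__,  // unique fallback value above the origin
#endif
};

inline void
errc_sketch_demo ()
{
  // EDOM is required by ISO C, so the errno value is used here.
  assert (static_cast<int> (errc_sketch::argument_out_of_domain) == EDOM);
  // The undefined macro falls back to a line number past the origin.
  assert (static_cast<int> (errc_sketch::sketch_only) > ERRC_ORIGIN_SKETCH);
}
```

Each fallback sits on its own line, so the fallback values are distinct
from one another as well as from the real errno values below the origin.
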
commit 5e6714e4c591d390e2b8aa5c673490e99d8c906a
Author: Jonathan Wakely 
Date:   Thu Sep 22 21:54:44 2022

libstdc++: Generate error_constants.h from  [PR104883]

Instead of having several very similar target-specific headers with
slightly different sets of enumerators, generate the error_constants.h
file as part of the build. This ensures that all enumerators are always
defined, with the value from the corresponding errno macro if present,
or a libstdc++-specific alternative value.

The libstdc++-specific values will be values greater than the positive
integer _GLIBCXX_ERRC_ORIGIN, which defaults to  but can be set in
os_defines.h if a more suitable value exists for the OS (e.g. ELAST
could be used for BSD targets).

libstdc++-v3/ChangeLog:

PR libstdc++/104883
* configure.host (error_constants_dir): Remove.
* include/Makefile.am (error_constants.h): Generate using
make_errc.sh script.
* include/Makefile.in: Regenerate.
* scripts/make_errc.sh: New file.

diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
index ec32980aa0d..6d51c5f6a11 100644
--- a/libstdc++-v3/configure.host
+++ b/libstdc++-v3/configure.host
@@ -56,9 +56,6 @@
 #   cpu_opt_ext_random path name of random.h containing CPU-specific
 #  optimizations for extensions
 #
-#   error_constants_dirlocation of error_constants.h
-#  defaults to os/generic.
-#
 # It possibly modifies the following variables:
 #
 #   OPT_LDFLAGSextra flags to pass when linking the library, of
@@ -93,7 +90,6 @@ cpu_defines_dir="cpu/generic"
 try_cpu=generic
 abi_baseline_subdir_switch=--print-multi-directory
 abi_tweaks_dir="cpu/generic"
-error_constants_dir="os/generic"
 tmake_file=
 
 # HOST-SPECIFIC OVERRIDES
@@ -247,7 +243,6 @@ case "${host_os}" in
 ;;
   *djgpp*)  # leading * picks up "msdosdjgpp"
 os_include_dir="os/djgpp"
-error_constants_dir="os/djgpp"
 ;;
   dragonfly*)
 os_include_dir="os/bsd/dragonfly"
@@ -274,11 +269,9 @@ case "${host_os}" in
 case "$host" in
   *-w64-*)
 os_include_dir="os/mingw32-w64"
-error_constants_dir="os/mingw32-w64"
 ;;
   *)
 os_include_dir="os/mingw32"
-error_constants_dir="os/mingw32"
 ;;
 esac
 OPT_LDFLAGS="${OPT_LDFLAGS} \$(lt_host_flags)"
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 97542524a69..28c5b3c4453 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -1033,7 +1033,6 @@ host_headers = \
${glibcxx_srcdir}/$(ATOMIC_WORD_SRCDIR)/atomic_word.h \
${glibcxx_srcdir}/$(ABI_TWEAKS_SRCDIR)/cxxabi_tweaks.h \
${glibcxx_srcdir}/$(CPU_DEFINES_SRCDIR)/cpu_defines.h \
-   ${glibcxx_srcdir}/$(ERROR_CONSTANTS_SRCDIR)/error_constants.h \
${glibcxx_srcdir}/include/precompiled/stdc++.h \
${glibcxx_srcdir}/include/precompiled/stdtr1c++.h \
${glibcxx_srcdir}/include/precompiled/extc++.h
@@ -1051,6 +1050,7 @@ host_headers_extra = \
${host_builddir}/c++allocator.h \
${host_builddir}/c++io.h \
${host_builddir}/c++locale.h \
+   ${host_builddir}/error_constants.h \
${host_builddir}/messages_members.h \
${host_builddir}/time_members.h
 
@@ -1106,6 +1106,7 @@ allstamped = \
 # catenation.
 allcreated = \
${host_builddir}/c++config.h \
+ 

Re: [RFC] libstdc++: Generate error_constants.h from [PR104883]

2022-10-04 Thread Jonathan Wakely via Gcc-patches
On Tue, 4 Oct 2022 at 17:51, Jonathan Wakely via Libstdc++
 wrote:
>
> Does anybody see any issues with generating the list of error numbers at
> build time?
>
>
> -- >8 --
>
> Instead of having several very similar target-specific headers with
> slightly different sets of enumerators, generate the error_constants.h
> file as part of the build. This ensures that all enumerators are always
> defined, with the value from the corresponding errno macro if present,
> or a libstdc++-specific alternative value.
>
> The libstdc++-specific values will be values greater than the positive
> integer _GLIBCXX_ERRC_ORIGIN, which defaults to  but can be set in
> os_defines.h if a more suitable value exists for the OS (e.g. ELAST
> could be used for BSD targets).
...
> +${CXXCPP} -P -D_POSIX_C_SOURCE=200809L -x c++ "$constants_h" \
> +  | sed -e '1,/^GLIBCXX ERROR CONSTANTS BELOW HERE$/d' \
> +  >> "$output_h" || exit $?

Gah, this is the wrong version of the script! It's supposed to replace
unexpanded EXXX tokens with __LINE__ (which is why #line is used to
set the origin) but I seem to have committed the wrong version.

Let me dig that out of my git reflog ...



[COMMITTED] Remove assert from set_nonzero_bits.

2022-10-04 Thread Aldy Hernandez via Gcc-patches
The assert removed by this patch was there to keep users from passing
masks of incompatible types.  The self tests are passing host wide
ints down (set_nonzero_bits (-1)), which seem to be 32 bits, whereas
some embedded targets have integer_type_node's of 16-bits.  This is
causing problems in m32c-elf, among others.

I suppose there's no harm in passing a 32-bit mask, because
set_nonzero_bits calls wide_int::from() to convert the mask to the
appropriate type.  So we can remove the assert.

Sorry for the pain Jeff.

gcc/ChangeLog:

* value-range.cc (irange::set_nonzero_bits): Remove assert.
---
 gcc/value-range.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index afb26a40083..a307559b654 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2913,7 +2913,6 @@ irange::set_nonzero_bits (const wide_int_ref &bits)
 {
   gcc_checking_assert (!undefined_p ());
   unsigned prec = TYPE_PRECISION (type ());
-  gcc_checking_assert (prec == bits.get_precision ());
 
   // Drop VARYINGs with a nonzero mask to a plain range.
   if (m_kind == VR_VARYING && bits != -1)
-- 
2.37.1
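
To illustrate why a wider mask is harmless once it is converted, here is a
small sketch of narrowing a value to a given precision.  The helper
truncate_to_precision is hypothetical and only models the effect described
above; it is not wide_int::from.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: keep only the low PREC bits of VALUE, which is what
// converting to a narrower unsigned precision amounts to.
constexpr std::uint64_t
truncate_to_precision (std::uint64_t value, unsigned prec)
{
  return prec >= 64 ? value : value & ((std::uint64_t{1} << prec) - 1);
}

inline void
truncate_demo ()
{
  // A 32-bit all-ones mask is still all-ones in a 16-bit type.
  assert (truncate_to_precision (0xffffffffu, 16) == 0xffffu);
  // Bits above the target precision are simply dropped.
  assert (truncate_to_precision (0x0001ff00u, 16) == 0xff00u);
}
```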



Re: [GCC13][Patch][V5][PATCH 1/2] Add a new option -fstrict-flex-arrays[=n] and new attribute strict_flex_array

2022-10-04 Thread Joseph Myers
On Tue, 4 Oct 2022, Qing Zhao via Gcc-patches wrote:

> +  { "strict_flex_array",  1, 1, false, false, false, false,
> +   handle_strict_flex_array_attribute, NULL },

You're not requiring that the attribute be applied to a declaration here.

> +static tree
> +handle_strict_flex_array_attribute (tree *node, tree name,
> + tree args, int ARG_UNUSED (flags),
> + bool *no_add_attrs)
> +{
> +  tree decl = *node;
> +  tree argval = TREE_VALUE (args);
> +
> +  /* This attribute only applies to field decls of a structure.  */
> +  if (TREE_CODE (decl) != FIELD_DECL)
> +{
> +  error_at (DECL_SOURCE_LOCATION (decl),
> + "%qE attribute may not be specified for %q+D", name, decl);

But here you're using DECL_SOURCE_LOCATION on what might be a type, not a 
DECL.  So if you have a test such as

int [[gnu::strict_flex_array(1)]] x;

that applies the attribute to a type, you get an ICE:

t.c:1:1: internal compiler error: tree check: expected tree that contains 'decl 
minimal' structure, have 'integer_type' in handle_strict_flex_array_attribute, 
at c-family/c-attribs.cc:2526
1 | int [[gnu::strict_flex_array(1)]] x;
  | ^~~
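
The general shape of the problem, reading declaration-only data from a node
that may be a type, can be sketched with a toy model.  None of the names
below are GCC's tree API; they are assumptions made purely for illustration.

```cpp
#include <cassert>

// Toy model only; these are not GCC's tree types or macros.
enum class node_kind { field_decl, var_decl, integer_type };

struct node_sketch
{
  node_kind kind;
  int decl_location;   // meaningful only for *_decl nodes
};

constexpr bool
is_decl (const node_sketch &n)
{
  return n.kind == node_kind::field_decl || n.kind == node_kind::var_decl;
}

// Returns the location to report the error at, or -1 when the node is
// accepted.  FALLBACK_LOCATION is used when the node has no location of
// its own, so declaration-only data is never read from a type node.
inline int
reject_location (const node_sketch &n, int fallback_location)
{
  if (n.kind == node_kind::field_decl)
    return -1;
  return is_decl (n) ? n.decl_location : fallback_location;
}

inline void
reject_location_demo ()
{
  assert (reject_location ({node_kind::field_decl, 5}, 1) == -1);
  assert (reject_location ({node_kind::var_decl, 7}, 1) == 7);
  assert (reject_location ({node_kind::integer_type, 0}, 3) == 3);
}
```

Requiring a declaration up front, as the first comment suggests, would make
the extra check unnecessary.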

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] c++ modules: lazy loading from within template [PR99377]

2022-10-04 Thread Patrick Palka via Gcc-patches
Here when lazily loading the binding for f at parse time from the
template g, processing_template_decl is set and thus the call to
note_vague_linkage_fn from module_state::read_cluster has no effect,
and we never push f onto deferred_fns and end up never emitting its
definition.

ISTM the behavior of the lazy loading machinery shouldn't be sensitive
to whether we're inside a template, and therefore we should probably be
clearing processing_template_decl somewhere e.g. in lazy_load_binding.
This is sufficient to fix the testcase.
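
For readers unfamiliar with the sentinel idiom, a minimal sketch of a
save/clear/restore guard follows.  The names template_depth_sketch,
depth_sentinel_sketch and lazy_load_sketch are toy stand-ins, not the
front-end's own definitions.

```cpp
#include <cassert>

// Toy stand-in for a global "inside a template" counter.
static int template_depth_sketch = 0;

// Saves the counter, clears it for the guarded scope, restores it on exit.
struct depth_sentinel_sketch
{
  int saved;
  depth_sentinel_sketch () : saved (template_depth_sketch)
  { template_depth_sketch = 0; }
  ~depth_sentinel_sketch () { template_depth_sketch = saved; }
  depth_sentinel_sketch (const depth_sentinel_sketch &) = delete;
  depth_sentinel_sketch &operator= (const depth_sentinel_sketch &) = delete;
};

// Hypothetical loader: returns the depth it observed while running.
inline int
lazy_load_sketch ()
{
  depth_sentinel_sketch guard;
  return template_depth_sketch;
}

inline void
sentinel_demo ()
{
  template_depth_sketch = 2;               // pretend we are inside a template
  assert (lazy_load_sketch () == 0);       // the loader sees a cleared counter
  assert (template_depth_sketch == 2);     // and the caller's state is restored
}
```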

But it also seems the processing_template_decl test in
note_vague_linkage_fn, added by r8-7539-g977bc3ee11383e for PR84973, is
perhaps too strong: if the intent is to avoid deferring output for
uninstantiated templates, we should make sure that DECL in question is
an uninstantiated template by checking e.g. value_dependent_expression_p.
This too is sufficient to fix the testcase (since f isn't a template)
and survives bootstrap and regtest.

Does one or the other approach look like the correct fix for this PR?

PR c++/99377

gcc/cp/ChangeLog:

* decl2.cc (note_vague_linkage_fn): Relax processing_template_decl
test to value_dependent_expression_p.
* module.cc (lazy_load_binding): Clear processing_template_decl.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr99377-2_a.C: New test.
* g++.dg/modules/pr99377-2_b.C: New test.
---
 gcc/cp/decl2.cc| 2 +-
 gcc/cp/module.cc   | 2 ++
 gcc/testsuite/g++.dg/modules/pr99377-2_a.C | 5 +
 gcc/testsuite/g++.dg/modules/pr99377-2_b.C | 6 ++
 4 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/pr99377-2_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/pr99377-2_b.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 9f18466192f..5af4d17ee3b 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -876,7 +876,7 @@ check_classfn (tree ctype, tree function, tree 
template_parms)
 void
 note_vague_linkage_fn (tree decl)
 {
-  if (processing_template_decl)
+  if (value_dependent_expression_p (decl))
 return;
 
   DECL_DEFER_OUTPUT (decl) = 1;
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 500ac06563a..79cbb346ffa 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -19074,6 +19074,8 @@ lazy_load_binding (unsigned mod, tree ns, tree id, 
binding_slot *mslot)
 
   timevar_start (TV_MODULE_IMPORT);
 
+  processing_template_decl_sentinel ptds;
+
   /* Stop GC happening, even in outermost loads (because our caller
  could well be building up a lookup set).  */
   function_depth++;
diff --git a/gcc/testsuite/g++.dg/modules/pr99377-2_a.C 
b/gcc/testsuite/g++.dg/modules/pr99377-2_a.C
new file mode 100644
index 000..26e2bccbbbe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr99377-2_a.C
@@ -0,0 +1,5 @@
+// PR c++/99377
+// { dg-additional-options -fmodules-ts }
+// { dg-module-cmi pr99377 }
+export module pr99377;
+export inline void f() { }
diff --git a/gcc/testsuite/g++.dg/modules/pr99377-2_b.C 
b/gcc/testsuite/g++.dg/modules/pr99377-2_b.C
new file mode 100644
index 000..69571952c8a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr99377-2_b.C
@@ -0,0 +1,6 @@
+// PR c++/99377
+// { dg-additional-options -fmodules-ts }
+// { dg-do link }
+import pr99377;
+template void g() { f(); }
+int main() { g(); }
-- 
2.38.0.rc2



[PATCH][AArch64] Improve immediate expansion [PR106583]

2022-10-04 Thread Wilco Dijkstra via Gcc-patches
Improve immediate expansion of immediates which can be created from a
bitmask immediate and 2 MOVKs.  This reduces the number of 4-instruction
immediates in SPECINT/FP by 10-15%.
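
As a rough model of the idea, the sketch below rebuilds a 64-bit value from
a base value plus two 16-bit field inserts.  movk_sketch and rebuild_sketch
are hypothetical helpers that only mimic what MOVK does to a register; the
sample constants are chosen for the example and are not from the testcase.

```cpp
#include <cassert>
#include <cstdint>

// Model of MOVK: replace the 16-bit field at bit offset SHIFT (0, 16, 32
// or 48) with IMM16 and keep every other bit of REG.
constexpr std::uint64_t
movk_sketch (std::uint64_t reg, unsigned shift, std::uint64_t imm16)
{
  return (reg & ~(std::uint64_t{0xffff} << shift))
         | ((imm16 & 0xffff) << shift);
}

// Rebuild VAL from BASE, which may differ from VAL only in the two
// 16-bit fields at offsets I and (I + 16) & 63.
constexpr std::uint64_t
rebuild_sketch (std::uint64_t base, std::uint64_t val, unsigned i)
{
  return movk_sketch (movk_sketch (base, i, val >> i),
                      (i + 16) & 63, val >> ((i + 16) & 63));
}

inline void
rebuild_demo ()
{
  // 0x5555555555555555 is a repeating pattern, i.e. a bitmask immediate.
  const std::uint64_t base = 0x5555555555555555ULL;
  const std::uint64_t val = 0x55551234abcd5555ULL;
  assert (rebuild_sketch (base, val, 16) == val);   // one mov, two movk
}
```

The wrap-around in (I + 16) & 63 lets the second field be the lowest one
when the first is the highest.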

Passes regress, OK for commit?

gcc/ChangeLog:

PR target/106583
* config/aarch64/aarch64.cc (aarch64_internal_mov_immediate):
Add support for a bitmask immediate with 2 MOVKs.

gcc/testsuite:
PR target/106583
* gcc.target/aarch64/pr106583.c: Add new test.

---

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
926e81f028c82aac9a5fecc18f921f84399c24ae..1601d11710cb6132c80a77bb4fe2f8429519aa5a
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -5568,7 +5568,7 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
   one_match = ((~val & mask) == 0) + ((~val & (mask << 16)) == 0) +
 ((~val & (mask << 32)) == 0) + ((~val & (mask << 48)) == 0);
 
-  if (zero_match != 2 && one_match != 2)
+  if (zero_match < 2 && one_match < 2)
 {
   /* Try emitting a bitmask immediate with a movk replacing 16 bits.
 For a 64-bit bitmask try whether changing 16 bits to all ones or
@@ -5600,6 +5600,43 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
}
 }
 
+  /* Try a bitmask plus 2 movk to generate the immediate in 3 instructions.  */
+  if (zero_match + one_match == 0)
+{
+  mask = 0x;
+
+  for (i = 0; i < 64; i += 16)
+   {
+ val2 = val & ~mask;
+ if (aarch64_bitmask_imm (val2, mode))
+   break;
+ val2 = val | mask;
+ if (aarch64_bitmask_imm (val2, mode))
+   break;
+ val2 = val2 & ~mask;
+ val2 = val2 | (((val2 >> 32) | (val2 << 32)) & mask);
+ if (aarch64_bitmask_imm (val2, mode))
+   break;
+
+ mask = (mask << 16) | (mask >> 48);
+   }
+
+  if (i != 64)
+   {
+ if (generate)
+   {
+ emit_insn (gen_rtx_SET (dest, GEN_INT (val2)));
+ emit_insn (gen_insv_immdi (dest, GEN_INT (i),
+GEN_INT ((val >> i) & 0x)));
+ i = (i + 16) & 63;
+ emit_insn (gen_insv_immdi (dest, GEN_INT (i),
+GEN_INT ((val >> i) & 0x)));
+   }
+
+ return 3;
+   }
+}
+
   /* Generate 2-4 instructions, skipping 16 bits of all zeroes or ones which
  are emitted by the initial mov.  If one_match > zero_match, skip set bits,
  otherwise skip zero bits.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/pr106583.c 
b/gcc/testsuite/gcc.target/aarch64/pr106583.c
new file mode 100644
index 
..f0a027a0950e506d4ddaacce5e151f57070948dc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr106583.c
@@ -0,0 +1,30 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 --save-temps" } */
+
+long f1 (void)
+{
+  return 0x7efefefefefefeff;
+}
+
+long f2 (void)
+{
+  return 0x12345678;
+}
+
+long f3 (void)
+{
+  return 0x12345678;
+}
+
+long f4 (void)
+{
+  return 0x12345678;
+}
+
+long f5 (void)
+{
+  return 0x12345678;
+}
+
+/* { dg-final { scan-assembler-times {\tmovk\t} 10 } } */
+/* { dg-final { scan-assembler-times {\tmov\t} 5 } } */



[PATCH] improved const shifts for AVR targets

2022-10-04 Thread Alexander Binzberger via Gcc-patches
Hi,
recently I used an Arduino Uno for a project and noticed some areas where
GCC does not output optimal asm code, especially around shifts and function
calls.
With this as motivation, and with Hacktoberfest under way, I started
patching things.
Since patch files do not provide a good overview, and I hope for a
"hacktoberfest-accepted" label on the PR on GitHub, I also opened it there:
https://github.com/gcc-mirror/gcc/pull/73

This patch improves shifts with a constant right-hand operand. While 8-bit
and 16-bit shifts were mostly fine, 24-bit and 32-bit shifts were not
handled well.
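
For context, the 8-bit sequences that appear in the diff below ("swap" then
"andi") can be checked with a small model.  This is only a sketch: the
helper names are invented here and it models the instructions' effect on an
8-bit value, not the AVR backend itself.

```cpp
#include <cassert>
#include <cstdint>

// SWAP exchanges the high and low nibbles of an 8-bit register.
constexpr std::uint8_t
swap_nibbles (std::uint8_t r)
{
  return static_cast<std::uint8_t> ((r << 4) | (r >> 4));
}

// x << 4 as "swap; andi 0xf0".
constexpr std::uint8_t
shl4_sketch (std::uint8_t r)
{
  return static_cast<std::uint8_t> (swap_nibbles (r) & 0xf0);
}

// x << 5 as "swap; lsl; andi 0xe0".
constexpr std::uint8_t
shl5_sketch (std::uint8_t r)
{
  return static_cast<std::uint8_t> ((swap_nibbles (r) << 1) & 0xe0);
}

// Compare both sequences against the plain shift for all 256 inputs.
inline bool
shift_sketches_agree ()
{
  for (unsigned x = 0; x < 256; ++x)
    {
      const std::uint8_t r = static_cast<std::uint8_t> (x);
      if (shl4_sketch (r) != static_cast<std::uint8_t> (x << 4)
          || shl5_sketch (r) != static_cast<std::uint8_t> (x << 5))
        return false;
    }
  return true;
}
```

An exhaustive comparison like this is cheap for 8-bit operands and is in the
spirit of the unit test mentioned under Testing.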

Testing
I checked the asm output with a local installation of Compiler Explorer and
with a tiny unit test comparing shifts with mul/div by 2.
However, I did not write any GCC testcases for it.

Target
This patch only targets the Atmel AVR family of chips.

Changelog
improved const shifts for AVR targets

Patch
-
diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 4ed390e4cf9..c7b70812d5c 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -6043,9 +6043,6 @@ out_shift_with_cnt (const char *templ, rtx_insn
*insn, rtx operands[],
   op[2] = operands[2];
   op[3] = operands[3];

-  if (plen)
-*plen = 0;
-
   if (CONST_INT_P (operands[2]))
 {
   /* Operand 3 is a scratch register if this is a
@@ -6150,96 +6147,68 @@ out_shift_with_cnt (const char *templ, rtx_insn
*insn, rtx operands[],
 /* 8bit shift left ((char)x << i)   */

 const char *
-ashlqi3_out (rtx_insn *insn, rtx operands[], int *len)
+ashlqi3_out (rtx_insn *insn, rtx operands[], int *plen)
 {
   if (CONST_INT_P (operands[2]))
 {
-  int k;
-
-  if (!len)
- len = &k;
-
   switch (INTVAL (operands[2]))
  {
  default:
   if (INTVAL (operands[2]) < 8)
 break;

-  *len = 1;
-  return "clr %0";
-
- case 1:
-  *len = 1;
-  return "lsl %0";
-
- case 2:
-  *len = 2;
-  return ("lsl %0" CR_TAB
-  "lsl %0");
-
- case 3:
-  *len = 3;
-  return ("lsl %0" CR_TAB
-  "lsl %0" CR_TAB
-  "lsl %0");
+return avr_asm_len ("clr %0", operands, plen, 1);

  case 4:
   if (test_hard_reg_class (LD_REGS, operands[0]))
 {
-  *len = 2;
-  return ("swap %0" CR_TAB
-  "andi %0,0xf0");
+return avr_asm_len ("swap %0" CR_TAB
+  "andi %0,0xf0", operands, plen, 2);
 }
-  *len = 4;
-  return ("lsl %0" CR_TAB
+return avr_asm_len ("lsl %0" CR_TAB
   "lsl %0" CR_TAB
   "lsl %0" CR_TAB
-  "lsl %0");
+  "lsl %0", operands, plen, 4);

  case 5:
   if (test_hard_reg_class (LD_REGS, operands[0]))
 {
-  *len = 3;
-  return ("swap %0" CR_TAB
+return avr_asm_len ("swap %0" CR_TAB
   "lsl %0"  CR_TAB
-  "andi %0,0xe0");
+  "andi %0,0xe0", operands, plen, 3);
 }
-  *len = 5;
-  return ("lsl %0" CR_TAB
+return avr_asm_len ("lsl %0" CR_TAB
   "lsl %0" CR_TAB
   "lsl %0" CR_TAB
   "lsl %0" CR_TAB
-  "lsl %0");
+  "lsl %0", operands, plen, 5);

  case 6:
   if (test_hard_reg_class (LD_REGS, operands[0]))
 {
-  *len = 4;
-  return ("swap %0" CR_TAB
+return avr_asm_len ("swap %0" CR_TAB
   "lsl %0"  CR_TAB
   "lsl %0"  CR_TAB
-  "andi %0,0xc0");
+  "andi %0,0xc0", operands, plen, 4);
 }
-  *len = 6;
-  return ("lsl %0" CR_TAB
+return avr_asm_len ("lsl %0" CR_TAB
   "lsl %0" CR_TAB
   "lsl %0" CR_TAB
   "lsl %0" CR_TAB
   "lsl %0" CR_TAB
-  "lsl %0");
+  "lsl %0", operands, plen, 6);

  case 7:
-  *len = 3;
-  return ("ror %0" CR_TAB
+return avr_asm_len ("ror %0" CR_TAB
   "clr %0" CR_TAB
-  "ror %0");
+  "ror %0", operands, plen, 3);
  }
 }
   else if (CONSTANT_P (operands[2]))
 fatal_insn ("internal compiler error.  Incorrect shift:", insn);

   out_shift_with_cnt ("lsl %0",
-  insn, operands, len, 1);
+  insn, operands, plen, 1);
   return "";
 }

@@ -6247,7 +6216,7 @@ ashlqi3_out (rtx_insn *insn, rtx operands[], int *len)
 /* 16bit shift left ((short)x << i)   */

 const char *
-ashlhi3_out (rtx_insn *insn, rtx operands[], int *len)
+ashlhi3_out (rtx_insn *insn, rtx operands[], int *plen)
 {
   if (CONST_INT_P (operands[2]))
 {
@@ -6255,11 +6224,6 @@ ashlhi3_out (rtx_insn *insn, rtx operands[], int
*len)
  && XVECLEN (PATTERN (insn), 0) == 3
  && REG_P (operands[3]));
   int ldi_ok = test_hard_reg_class (LD_REGS, operands[0]);
-  int k;
-  int *t = len;
-
-  if (!len)
- len = &k;

   switch (INTVAL (operands[2]))
  {
@@ -6267,33 +6231,30 @@ ashlhi3_out (rtx_insn *insn, rtx operands[], int
*len)
   if (INTVAL (operands[2]) < 16)
 break;

-  *len = 2;
-  return ("clr %B0" CR_TAB
-  "clr %A0");
+return avr_asm_len ("clr %B0" CR_TAB
+  "clr %A0", operands, plen, 2);

  case 4:
   if (optimize_size && scratch)
 break;  /* 5 */
   if (ldi_ok)
 {
-  *len = 6;
-  return ("swap %A0"  CR_TAB
+return avr_asm_len ("swap %A0"  CR_TAB
   "swap %B0"  CR_TAB
   "andi %B0,0xf0" CR_TAB
   "eor 

Re: [PATCH] fixincludes: Deal also with the _Float128x cases [PR107059]

2022-10-04 Thread will schmidt via Gcc-patches
On Fri, 2022-09-30 at 09:20 +0200, Jakub Jelinek via Gcc-patches wrote:
> On Wed, Sep 28, 2022 at 08:19:43PM +0200, Jakub Jelinek via Gcc-
> patches wrote:
> > Another case are the following 3 snippets:
> > #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
> > #   error "_Float128X supported but no constant suffix"
> > #  else
> > #   define __f128x(x) x##f128x
> > #  endif
> > ...
> > #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
> > #   error "_Float128X supported but no complex type"
> > #  else
> > #   define __CFLOAT128X _Complex _Float128x
> > #  endif
> > ...
> > #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
> > #   error "_Float128x supported but no type"
> > #  endif
> > but as no target has _Float128x right now and don't see it
> > coming soon, it isn't a big deal (on the glibc side it is of
> > course ok to adjust those).
> 
> This incremental patch handles the above 3 cases, so we
> fixinclude what glibc itself changed too.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux (together with
> the
> previously posted fixincludes/ change too), ok for trunk?

Hi,

The combination of these two patches allows me to build gcc
successfully.  (PPC64LE with RHEL9).

One nit: Part 1 needed its file paths massaged (i.e. gcc/inclhack.def
versus fixincludes/inclhack.def) before it would apply.

I can't otherwise speak to the changes, aside from noting that they seem
to work for me.

Thanks
-WIll



> 
> 2022-09-30  Jakub Jelinek  
> 
>   PR bootstrap/107059
>   * inclhack.def (glibc_cxx_floatn_5): New.
>   * fixincl.x: Regenerated.
>   * tests/base/bits/floatn.h: Regenerated.
> 
> --- fixincludes/inclhack.def.jj   2022-09-29 22:18:47.974402688
> +0200
> +++ fixincludes/inclhack.def  2022-09-29 22:22:48.151145670 +0200
> @@ -2131,6 +2131,23 @@ fix = {
>   EOT;
>  };
> 
> +fix = {
> +hackname  = glibc_cxx_floatn_5;
> +files = bits/floatn.h, bits/floatn-common.h,
> "*/bits/floatn.h", "*/bits/floatn-common.h";
> +select= "^([ \t]*#[ \t]*if !__GNUC_PREREQ \\(7, 0\\) \\|\\|
> )defined __cplusplus\n"
> + "([ \t]*#[ \t]+error \"_Float128[xX] supported but no
> )";
> +c_fix = format;
> +c_fix_arg = "%1(defined __cplusplus && !__GNUC_PREREQ (13,
> 0))\n%2";
> +test_text = <<-EOT
> + #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
> + #   error "_Float128X supported but no constant suffix"
> + #  endif
> + #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
> + #   error "_Float128x supported but no type"
> + #  endif
> + EOT;
> +};
> +
>  /*  glibc-2.3.5 defines pthread mutex initializers incorrectly,
>   *  so we replace them with versions that correspond to the
>   *  definition.
> --- fixincludes/fixincl.x.jj  2022-09-29 22:18:47.975402675 +0200
> +++ fixincludes/fixincl.x 2022-09-29 22:22:55.675909244 +0200
> @@ -2,11 +2,11 @@
>   *
>   * DO NOT EDIT THIS FILE   (fixincl.x)
>   *
> - * It has been AutoGen-ed  September 28, 2022 at 07:56:15 PM by
> AutoGen 5.18.16
> + * It has been AutoGen-ed  September 29, 2022 at 10:22:55 PM by
> AutoGen 5.18.16
>   * From the definitionsinclhack.def
>   * and the template file   fixincl
>   */
> -/* DO NOT SVN-MERGE THIS FILE, EITHER Wed Sep 28 19:56:15 CEST 2022
> +/* DO NOT SVN-MERGE THIS FILE, EITHER Thu Sep 29 22:22:55 CEST 2022
>   *
>   * You must regenerate it.  Use the ./genfixes script.
>   *
> @@ -15,7 +15,7 @@
>   * certain ANSI-incompatible system header files which are fixed to
> work
>   * correctly with ANSI C and placed in a directory that GNU C will
> search.
>   *
> - * This file contains 271 fixup descriptions.
> + * This file contains 272 fixup descriptions.
>   *
>   * See README for more information.
>   *
> @@ -4273,6 +4273,43 @@ static const char* apzGlibc_Cxx_Floatn_4
> 
>  /* * * * * * * * * * * * * * * * * * * * * * * * * *
>   *
> + *  Description of Glibc_Cxx_Floatn_5 fix
> + */
> +tSCC zGlibc_Cxx_Floatn_5Name[] =
> + "glibc_cxx_floatn_5";
> +
> +/*
> + *  File name selection pattern
> + */
> +tSCC zGlibc_Cxx_Floatn_5List[] =
> +  "bits/floatn.h\0bits/floatn-
> common.h\0*/bits/floatn.h\0*/bits/floatn-common.h\0";
> +/*
> + *  Machine/OS name selection pattern
> + */
> +#define apzGlibc_Cxx_Floatn_5Machs (const char**)NULL
> +
> +/*
> + *  content selection pattern - do fix if pattern found
> + */
> +tSCC zGlibc_Cxx_Floatn_5Select0[] =
> +   "^([ \t]*#[ \t]*if !__GNUC_PREREQ \\(7, 0\\) \\|\\| )defined
> __cplusplus\n\
> +([ \t]*#[ \t]+error \"_Float128[xX] supported but no )";
> +
> +#defineGLIBC_CXX_FLOATN_5_TEST_CT  1
> +static tTestDesc aGlibc_Cxx_Floatn_5Tests[] = {
> +  { TT_EGREP,zGlibc_Cxx_Floatn_5Select0, (regex_t*)NULL }, };
> +
> +/*
> + *  Fix Command Arguments for Glibc_Cxx_Floatn_5
> + */
> +static const char* apzGlibc_Cxx_Floatn_5Patch[] = {
> +"format",
> +"%1(defined __cplusplus && !__GNUC_PREREQ (13, 0))\n\
> +%2",
> +(char*)NULL };
> +
> +/* * * * * * * * * * * * * * * * * * * * * * * * * 

[RFC] libstdc++: Generate error_constants.h from [PR104883]

2022-10-04 Thread Jonathan Wakely via Gcc-patches
Does anybody see any issues with generating the list of error numbers at
build time?


-- >8 --

Instead of having several very similar target-specific headers with
slightly different sets of enumerators, generate the error_constants.h
file as part of the build. This ensures that all enumerators are always
defined, with the value from the corresponding errno macro if present,
or a libstdc++-specific alternative value.

The libstdc++-specific values will be values greater than the positive
integer _GLIBCXX_ERRC_ORIGIN, which defaults to  but can be set in
os_defines.h if a more suitable value exists for the OS (e.g. ELAST
could be used for BSD targets).

libstdc++-v3/ChangeLog:

PR libstdc++/104883
* configure.host (error_constants_dir): Remove.
* include/Makefile.am (error_constants.h): Generate using
make_errc.sh script.
* include/Makefile.in: Regenerate.
* scripts/make_errc.sh: New file.
---
 libstdc++-v3/configure.host   |   7 --
 libstdc++-v3/include/Makefile.am  |   6 +-
 libstdc++-v3/include/Makefile.in  |   6 +-
 libstdc++-v3/scripts/make_errc.sh | 165 ++
 4 files changed, 175 insertions(+), 9 deletions(-)
 create mode 100755 libstdc++-v3/scripts/make_errc.sh

diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
index ec32980aa0d..6d51c5f6a11 100644
--- a/libstdc++-v3/configure.host
+++ b/libstdc++-v3/configure.host
@@ -56,9 +56,6 @@
 #   cpu_opt_ext_random path name of random.h containing CPU-specific
 #  optimizations for extensions
 #
-#   error_constants_dirlocation of error_constants.h
-#  defaults to os/generic.
-#
 # It possibly modifies the following variables:
 #
 #   OPT_LDFLAGSextra flags to pass when linking the library, of
@@ -93,7 +90,6 @@ cpu_defines_dir="cpu/generic"
 try_cpu=generic
 abi_baseline_subdir_switch=--print-multi-directory
 abi_tweaks_dir="cpu/generic"
-error_constants_dir="os/generic"
 tmake_file=
 
 # HOST-SPECIFIC OVERRIDES
@@ -247,7 +243,6 @@ case "${host_os}" in
 ;;
   *djgpp*)  # leading * picks up "msdosdjgpp"
 os_include_dir="os/djgpp"
-error_constants_dir="os/djgpp"
 ;;
   dragonfly*)
 os_include_dir="os/bsd/dragonfly"
@@ -274,11 +269,9 @@ case "${host_os}" in
 case "$host" in
   *-w64-*)
 os_include_dir="os/mingw32-w64"
-error_constants_dir="os/mingw32-w64"
 ;;
   *)
 os_include_dir="os/mingw32"
-error_constants_dir="os/mingw32"
 ;;
 esac
 OPT_LDFLAGS="${OPT_LDFLAGS} \$(lt_host_flags)"
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 97542524a69..28c5b3c4453 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -1033,7 +1033,6 @@ host_headers = \
${glibcxx_srcdir}/$(ATOMIC_WORD_SRCDIR)/atomic_word.h \
${glibcxx_srcdir}/$(ABI_TWEAKS_SRCDIR)/cxxabi_tweaks.h \
${glibcxx_srcdir}/$(CPU_DEFINES_SRCDIR)/cpu_defines.h \
-   ${glibcxx_srcdir}/$(ERROR_CONSTANTS_SRCDIR)/error_constants.h \
${glibcxx_srcdir}/include/precompiled/stdc++.h \
${glibcxx_srcdir}/include/precompiled/stdtr1c++.h \
${glibcxx_srcdir}/include/precompiled/extc++.h
@@ -1051,6 +1050,7 @@ host_headers_extra = \
${host_builddir}/c++allocator.h \
${host_builddir}/c++io.h \
${host_builddir}/c++locale.h \
+   ${host_builddir}/error_constants.h \
${host_builddir}/messages_members.h \
${host_builddir}/time_members.h
 
@@ -1106,6 +1106,7 @@ allstamped = \
 # catenation.
 allcreated = \
${host_builddir}/c++config.h \
+   ${host_builddir}/error_constants.h \
${host_builddir}/largefile-config.h \
${thread_host_headers} \
${pch_build}
@@ -1404,6 +1405,9 @@ ${host_builddir}/c++config.h: ${CONFIG_HEADER} \
echo "" >> $@ ;\
echo "#endif // _GLIBCXX_CXX_CONFIG_H" >> $@
 
+${host_builddir}/error_constants.h: ${glibcxx_srcdir}/scripts/make_errc.sh
+   ${glibcxx_srcdir}/scripts/make_errc.sh $@ "${CXXCPP}" ${AM_CPPFLAGS}
+
 # Host includes for threads
 uppercase = [ABCDEFGHIJKLMNOPQRSTUVWXYZ_]
 
diff --git a/libstdc++-v3/scripts/make_errc.sh 
b/libstdc++-v3/scripts/make_errc.sh
new file mode 100755
index 000..4b98abe2bed
--- /dev/null
+++ b/libstdc++-v3/scripts/make_errc.sh
@@ -0,0 +1,165 @@
+#!/bin/sh
+
+[ -z "$1" ] && exit 1
+target=$1
+shift
+CXXCPP="$@"
+
+{
+  dir=`(umask 077 && mktemp -d "$target.XX") 2>/dev/null` &&
+  test -d "$dir"
+} || {
+  dir=$target.$RANDOM$$
+  (umask 077 && mkdir "$dir")
+} || exit $?
+
+output_h=$dir/output.h
+constants_h=$dir/constants.h
+
+cat > "$output_h" << EOT
+// Definition of std::errc enumeration type  -*- C++ -*-
+
+// Copyright The GNU Toolchain Authors.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under 

[committed] libstdc++: Use new built-ins __remove_cv, __remove_reference etc.

2022-10-04 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/std/type_traits (remove_cv): Use __remove_cv built-in.
(remove_reference): Use __remove_reference built-in.
(remove_cvref): Use __remove_cvref built-in. Remove inheritance
for fallback implementation.
---
 libstdc++-v3/include/std/type_traits | 33 
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index a015fd95a71..b74565eb521 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1507,6 +1507,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { typedef _Tp type; };
 
   /// remove_cv
+#if __has_builtin(__remove_cv)
+  template
+struct remove_cv
+{ using type = __remove_cv(_Tp); };
+#else
   template
 struct remove_cv
 { using type = _Tp; };
@@ -1522,6 +1527,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct remove_cv
 { using type = _Tp; };
+#endif
 
   /// add_const
   template
@@ -1570,17 +1576,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Reference transformations.
 
   /// remove_reference
+#if __has_builtin(__remove_reference)
   template
 struct remove_reference
-{ typedef _Tp   type; };
+{ using type = __remove_reference(_Tp); };
+#else
+  template
+struct remove_reference
+{ using type = _Tp; };
 
   template
 struct remove_reference<_Tp&>
-{ typedef _Tp   type; };
+{ using type = _Tp; };
 
   template
 struct remove_reference<_Tp&&>
-{ typedef _Tp   type; };
+{ using type = _Tp; };
+#endif
 
   /// add_lvalue_reference
   template
@@ -3358,20 +3370,23 @@ template
*/
 #define __cpp_lib_remove_cvref 201711L
 
+#if __has_builtin(__remove_cvref)
   template
 struct remove_cvref
-: remove_cv<_Tp>
-{ };
+{ using type = __remove_cvref(_Tp); };
+#else
+  template
+struct remove_cvref
+{ using type = typename remove_cv<_Tp>::type; };
 
   template
 struct remove_cvref<_Tp&>
-: remove_cv<_Tp>
-{ };
+{ using type = typename remove_cv<_Tp>::type; };
 
   template
 struct remove_cvref<_Tp&&>
-: remove_cv<_Tp>
-{ };
+{ using type = typename remove_cv<_Tp>::type; };
+#endif
 
   template
 using remove_cvref_t = typename remove_cvref<_Tp>::type;
-- 
2.37.3
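
For readers following the remove_cv change above, the "#if __has_builtin / #else fallback" shape can be sketched outside the library roughly as follows.  This is only an illustration under stated assumptions: my_remove_cv is a made-up name (not the libstdc++ code), and the __has_builtin shim is an assumption added for compilers that lack the macro.  The built-in path is taken only where the compiler reports __remove_cv; everywhere else the partial specializations are used.

```cpp
#include <type_traits>

// Assumption: compilers without __has_builtin simply take the fallback path.
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

#if __has_builtin(__remove_cv)
// Built-in path: one primary template, no partial specializations.
template<typename T>
  struct my_remove_cv
  { using type = __remove_cv(T); };
#else
// Fallback path: classic partial specializations on top-level cv-qualifiers.
template<typename T>
  struct my_remove_cv
  { using type = T; };

template<typename T>
  struct my_remove_cv<const T>
  { using type = T; };

template<typename T>
  struct my_remove_cv<volatile T>
  { using type = T; };

template<typename T>
  struct my_remove_cv<const volatile T>
  { using type = T; };
#endif
```

Either path should give the same answers; only top-level qualifiers are stripped, so "const int*" is left alone while "int* const" becomes "int*".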



[committed] libstdc++: Fix test FAIL for old std::string ABI

2022-10-04 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* testsuite/std/ranges/adaptors/join_with/1.cc: Remove unused

 #include 
-#include 
-#include 
+#include 
 #include 
 #include 
 
@@ -73,7 +72,10 @@ test03()
   return true;
 }
 
-constexpr bool
+#if _GLIBCXX_USE_CXX11_ABI
+constexpr
+#endif
+bool
 test04()
 {
   std::string rs[] = {"a", "", "b", "", "c"};
@@ -93,5 +95,9 @@ main()
   static_assert(test01());
   static_assert(test02());
   static_assert(test03());
+#if _GLIBCXX_USE_CXX11_ABI
   static_assert(test04());
+#else
+  VERIFY(test04());
+#endif
 }
-- 
2.37.3



[committed] libstdc++: Refactor seed sequence constraints in

2022-10-04 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

Every use of _If_seed_seq in  and  uses it with
enable_if. We can just move the enable_if into the helper alias instead
of repeating it everywhere.
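
The refactoring described above can be sketched in isolation as below.  This is a simplified illustration, not the libstdc++ code: toy_engine, toy_seq, is_seed_seq_like, if_seed_seq_for and accepted_as_seed_seq are all hypothetical names, and the trait checks only two of the conditions (the argument is not the engine itself and is not convertible to the result type).

```cpp
#include <cstdint>
#include <type_traits>

struct toy_engine;  // hypothetical engine, defined below

// Simplified stand-in for the boolean trait: Sseq is neither the engine
// itself nor convertible to the engine's result type.
template<typename Sseq, typename Engine, typename Res>
  using is_seed_seq_like = std::integral_constant<bool,
    !std::is_same<typename std::remove_cv<Sseq>::type, Engine>::value
    && !std::is_convertible<Sseq, Res>::value>;

// The refactored shape: enable_if lives inside the helper alias, so the
// per-engine alias no longer has to spell it out.
template<typename Sseq, typename Engine, typename Res>
  using if_seed_seq_for
    = typename std::enable_if<is_seed_seq_like<Sseq, Engine, Res>::value>::type;

struct toy_seq { };  // hypothetical seed sequence type

struct toy_engine
{
  using result_type = std::uint32_t;

  // Per-engine alias just forwards to the shared helper.
  template<typename Sseq>
    using if_seed_seq = if_seed_seq_for<Sseq, toy_engine, result_type>;

  toy_engine() = default;
  explicit toy_engine(result_type) { }

  // Constrained constructor: drops out of overload resolution unless
  // Sseq passes the check.
  template<typename Sseq, typename = if_seed_seq<Sseq>>
    explicit toy_engine(Sseq&) { }
};

// Detection helper used only to observe the constraint from outside.
template<typename Sseq, typename = void>
  struct accepted_as_seed_seq : std::false_type { };

template<typename Sseq>
  struct accepted_as_seed_seq<Sseq,
                              if_seed_seq_for<Sseq, toy_engine, std::uint32_t>>
  : std::true_type { };
```

In this sketch toy_seq is accepted, while toy_engine itself and plain int are rejected, which keeps the template constructor from competing with the copy constructor and the result_type constructor.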

libstdc++-v3/ChangeLog:

* include/bits/random.h (__is_seed_seq): Replace with ...
(_If_seed_seq_for): ... this.
* include/ext/random: Adjust to use _If_seed_seq_for.
---
 libstdc++-v3/include/bits/random.h | 39 ++
 libstdc++-v3/include/ext/random|  6 ++---
 2 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/libstdc++-v3/include/bits/random.h 
b/libstdc++-v3/include/bits/random.h
index 3e6eb9de7d9..28b37a9e5a5 100644
--- a/libstdc++-v3/include/bits/random.h
+++ b/libstdc++-v3/include/bits/random.h
@@ -198,16 +198,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_Engine& _M_g;
   };
 
+// Detect whether a template argument _Sseq is a valid seed sequence for
+// a random number engine _Engine with result type _Res.
+// Used to constrain _Engine::_Engine(_Sseq&) and _Engine::seed(_Sseq&)
+// as required by [rand.eng.general].
+
 template
   using __seed_seq_generate_t = decltype(
  std::declval<_Sseq&>().generate(std::declval(),
  std::declval()));
 
-// Detect whether _Sseq is a valid seed sequence for
-// a random number engine _Engine with result type _Res.
 template>
-  using __is_seed_seq = __and_<
+  using _If_seed_seq_for = _Require<
 __not_, _Engine>>,
is_unsigned,
__not_>
@@ -263,8 +266,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
"template argument substituting __m out of bounds");
 
   template
-   using _If_seed_seq = typename enable_if<__detail::__is_seed_seq<
- _Sseq, linear_congruential_engine, _UIntType>::value>::type;
+   using _If_seed_seq
+ = __detail::_If_seed_seq_for<_Sseq, linear_congruential_engine,
+  _UIntType>;
 
 public:
   /** The type of the generated random value. */
@@ -502,8 +506,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
"template argument substituting __f out of bound");
 
   template
-   using _If_seed_seq = typename enable_if<__detail::__is_seed_seq<
- _Sseq, mersenne_twister_engine, _UIntType>::value>::type;
+   using _If_seed_seq
+ = __detail::_If_seed_seq_for<_Sseq, mersenne_twister_engine,
+  _UIntType>;
 
 public:
   /** The type of the generated random value. */
@@ -702,8 +707,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
"template argument substituting __w out of bounds");
 
   template
-   using _If_seed_seq = typename enable_if<__detail::__is_seed_seq<
- _Sseq, subtract_with_carry_engine, _UIntType>::value>::type;
+   using _If_seed_seq
+ = __detail::_If_seed_seq_for<_Sseq, subtract_with_carry_engine,
+  _UIntType>;
 
 public:
   /** The type of the generated random value. */
@@ -894,8 +900,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef typename _RandomNumberEngine::result_type result_type;
 
   template
-   using _If_seed_seq = typename enable_if<__detail::__is_seed_seq<
- _Sseq, discard_block_engine, result_type>::value>::type;
+   using _If_seed_seq
+ = __detail::_If_seed_seq_for<_Sseq, discard_block_engine,
+  result_type>;
 
   // parameter values
   static constexpr size_t block_size = __p;
@@ -1113,8 +1120,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
"template argument substituting __w out of bounds");
 
   template
-   using _If_seed_seq = typename enable_if<__detail::__is_seed_seq<
- _Sseq, independent_bits_engine, _UIntType>::value>::type;
+   using _If_seed_seq
+ = __detail::_If_seed_seq_for<_Sseq, independent_bits_engine,
+  _UIntType>;
 
 public:
   /** The type of the generated random value. */
@@ -1336,8 +1344,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef typename _RandomNumberEngine::result_type result_type;
 
   template
-   using _If_seed_seq = typename enable_if<__detail::__is_seed_seq<
- _Sseq, shuffle_order_engine, result_type>::value>::type;
+   using _If_seed_seq
+ = __detail::_If_seed_seq_for<_Sseq, shuffle_order_engine,
+  result_type>;
 
   static constexpr size_t table_size = __k;
 
diff --git a/libstdc++-v3/include/ext/random b/libstdc++-v3/include/ext/random
index 4cc0e25e025..406b12b5d23 100644
--- a/libstdc++-v3/include/ext/random
+++ b/libstdc++-v3/include/ext/random
@@ -89,9 +89,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
using _If_seed_seq
- = typename std::enable_if::value
-   >::type;
+ = 

Re: [PATCH] aarch64: update Ampere-1 core definition

2022-10-04 Thread Richard Sandiford via Gcc-patches
Philipp Tomsich  writes:
> This brings the extensions detected by -mcpu=native on Ampere-1 systems
> in sync with the defaults generated for -mcpu=ampere1.
>
> Note that some kernel versions may misreport the presence of PAUTH and
> PREDRES (i.e., -mcpu=native will add 'nopauth' and 'nopredres').
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-cores.def (AARCH64_CORE): Update
>   Ampere-1 core entry.
>
> Signed-off-by: Philipp Tomsich 
>
> ---
> Ok for backport?
>
>  gcc/config/aarch64/aarch64-cores.def | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index 60299160bb6..9090f80b4b7 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -69,7 +69,7 @@ AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  
> V8A,  (CRC, CRYPTO), thu
>  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  V8A,  (CRC, CRYPTO), 
> thunderx,  0x43, 0x0a3, -1)
>  
>  /* Ampere Computing ('\xC0') cores. */
> -AARCH64_CORE("ampere1", ampere1, cortexa57, V8_6A, (), ampere1, 0xC0, 0xac3, 
> -1)
> +AARCH64_CORE("ampere1", ampere1, cortexa57, V8_6A, (F16, RCPC, RNG, AES, 
> SHA3), ampere1, 0xC0, 0xac3, -1)

The fact that you had to include RCPC here shows that there was a bug
in the definition of Armv8.3-A.  I've just pushed a fix for that.

Otherwise, this seems to line up with the LLVM definition, except
that this definition enables RNG/AEK_RAND whereas the LLVM one doesn't
seem to.  Which one's right (or is it me that's wrong)?

Thanks,
Richard


>  /* Do not swap around "emag" and "xgene1",
> this order is required to handle variant correctly. */
>  AARCH64_CORE("emag",emag,  xgene1,V8A,  (CRC, CRYPTO), emag, 
> 0x50, 0x000, 3)


[PATCH] aarch64: Define __ARM_FEATURE_RCPC

2022-10-04 Thread Richard Sandiford via Gcc-patches
https://github.com/ARM-software/acle/pull/199 adds a new feature
macro for RCPC, for use in things like inline assembly.  This patch
adds the associated support to GCC.

Also, RCPC is required for Armv8.3-A and later, but the armv8.3-a
entry didn't include it.  This was probably harmless in practice
since GCC simply ignored the extension until now.  (The GAS
definition is OK.)

Tested on aarch64-linux-gnu & pushed.  Since the new thing is "only"
a macro definition and since the patch is arguably fixing the Armv8.3-A
definition, I'm intending to backport to GCC 12 soon.
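
As a rough idea of how user code might consume the new macro, here is a minimal sketch.  The helper name load_acquire_pc and the have_rcpc constant are hypothetical, and the choice of LDAPR with a plain register operand is an assumption made for illustration; on targets where the macro is not defined the sketch falls back to an ordinary acquire load.

```cpp
#include <cstdint>
#include <type_traits>

// True when the compiler advertises RCPC through the ACLE feature macro.
#if defined(__ARM_FEATURE_RCPC)
constexpr bool have_rcpc = true;
#else
constexpr bool have_rcpc = false;
#endif

// Hypothetical helper: load a 32-bit value with acquire semantics,
// using LDAPR (RCpc acquire) in inline assembly when it is available.
inline std::uint32_t
load_acquire_pc (const std::uint32_t *p)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_RCPC)
  std::uint32_t v;
  __asm__ volatile ("ldapr %w0, [%1]" : "=r" (v) : "r" (p) : "memory");
  return v;
#else
  // Fallback: a regular acquire load via the GCC/Clang atomic built-in.
  return __atomic_load_n (p, __ATOMIC_ACQUIRE);
#endif
}
```

The guard tests only whether the macro is defined, so the sketch does not depend on the macro's value.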

Richard


gcc/
* config/aarch64/aarch64.h (AARCH64_ISA_RCPC): New macro.
* config/aarch64/aarch64-arches.def (armv8.3-a): Include RCPC.
* config/aarch64/aarch64-cores.def (thunderx3t110, zeus, neoverse-v1)
(neoverse-512tvb, saphira): Remove RCPC from these Armv8.3-A+ cores.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_RCPC when appropriate.

gcc/testsuite/
* gcc.target/aarch64/pragma_cpp_predefs_1.c: Add RCPC tests.
---
 gcc/config/aarch64/aarch64-arches.def |  2 +-
 gcc/config/aarch64/aarch64-c.cc   |  1 +
 gcc/config/aarch64/aarch64-cores.def  | 10 +-
 gcc/config/aarch64/aarch64.h  |  1 +
 .../gcc.target/aarch64/pragma_cpp_predefs_1.c | 20 +++
 5 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-arches.def 
b/gcc/config/aarch64/aarch64-arches.def
index 9f82466181d..5a9eff33648 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -33,7 +33,7 @@
 AARCH64_ARCH("armv8-a",   generic,   V8A,   8,  (SIMD))
 AARCH64_ARCH("armv8.1-a", generic,   V8_1A, 8,  (V8A, LSE, CRC, 
RDMA))
 AARCH64_ARCH("armv8.2-a", generic,   V8_2A, 8,  (V8_1A))
-AARCH64_ARCH("armv8.3-a", generic,   V8_3A, 8,  (V8_2A, PAUTH))
+AARCH64_ARCH("armv8.3-a", generic,   V8_3A, 8,  (V8_2A, PAUTH, 
RCPC))
 AARCH64_ARCH("armv8.4-a", generic,   V8_4A, 8,  (V8_3A, F16FML, 
DOTPROD, FLAGM))
 AARCH64_ARCH("armv8.5-a", generic,   V8_5A, 8,  (V8_4A, SB, SSBS, 
PREDRES))
 AARCH64_ARCH("armv8.6-a", generic,   V8_6A, 8,  (V8_5A, I8MM, 
BF16))
diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
index 592af8cd729..e296c73350f 100644
--- a/gcc/config/aarch64/aarch64-c.cc
+++ b/gcc/config/aarch64/aarch64-c.cc
@@ -202,6 +202,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
"__ARM_FEATURE_BF16_SCALAR_ARITHMETIC", pfile);
   aarch64_def_or_undef (TARGET_LS64,
"__ARM_FEATURE_LS64", pfile);
+  aarch64_def_or_undef (AARCH64_ISA_RCPC, "__ARM_FEATURE_RCPC", pfile);
 
   /* Not for ACLE, but required to keep "float.h" correct if we switch
  target between implementations that do or do not support ARMv8.2-A
diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 60299160bb6..b50628d6b51 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -133,17 +133,17 @@ AARCH64_CORE("tsv110",  tsv110, tsv110, V8_2A,  (CRYPTO, 
F16), tsv110,   0x48, 0
 /* ARMv8.3-A Architecture Processors.  */
 
 /* Marvell cores (TX3). */
-AARCH64_CORE("thunderx3t110",  thunderx3t110,  thunderx3t110, V8_3A,  (CRYPTO, 
RCPC, SM4, SHA3, F16FML), thunderx3t110, 0x43, 0x0b8, 0x0a)
+AARCH64_CORE("thunderx3t110",  thunderx3t110,  thunderx3t110, V8_3A,  (CRYPTO, 
SM4, SHA3, F16FML), thunderx3t110, 0x43, 0x0b8, 0x0a)
 
 /* ARMv8.4-A Architecture Processors.  */
 
 /* Arm ('A') cores.  */
-AARCH64_CORE("zeus", zeus, cortexa57, V8_4A,  (SVE, RCPC, I8MM, BF16, PROFILE, 
SSBS, RNG), neoversev1, 0x41, 0xd40, -1)
-AARCH64_CORE("neoverse-v1", neoversev1, cortexa57, V8_4A,  (SVE, RCPC, I8MM, 
BF16, PROFILE, SSBS, RNG), neoversev1, 0x41, 0xd40, -1)
-AARCH64_CORE("neoverse-512tvb", neoverse512tvb, cortexa57, V8_4A,  (SVE, RCPC, 
I8MM, BF16, PROFILE, SSBS, RNG), neoverse512tvb, INVALID_IMP, INVALID_CORE, -1)
+AARCH64_CORE("zeus", zeus, cortexa57, V8_4A,  (SVE, I8MM, BF16, PROFILE, SSBS, 
RNG), neoversev1, 0x41, 0xd40, -1)
+AARCH64_CORE("neoverse-v1", neoversev1, cortexa57, V8_4A,  (SVE, I8MM, BF16, 
PROFILE, SSBS, RNG), neoversev1, 0x41, 0xd40, -1)
+AARCH64_CORE("neoverse-512tvb", neoverse512tvb, cortexa57, V8_4A,  (SVE, I8MM, 
BF16, PROFILE, SSBS, RNG), neoverse512tvb, INVALID_IMP, INVALID_CORE, -1)
 
 /* Qualcomm ('Q') cores. */
-AARCH64_CORE("saphira", saphira,saphira,V8_4A,  (CRYPTO, RCPC), 
saphira,   0x51, 0xC01, -1)
+AARCH64_CORE("saphira", saphira,saphira,V8_4A,  (CRYPTO), saphira, 
  0x51, 0xC01, -1)
 
 /* ARMv8-A big.LITTLE implementations.  */
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 2d6221826bb..05da9af0367 100644
--- a/gcc/config/aarch64/aarch64.h
+++ 

Re: [PATCH] middle-end, c++, i386, libgcc: std::bfloat16_t and __bf16 arithmetic support

2022-10-04 Thread Joseph Myers
On Tue, 4 Oct 2022, Jakub Jelinek via Gcc-patches wrote:

> Yet another problem is because I've only enabled the bf16/BF16 suffixes in
> C++ because for C it might clash with some later extension.  Am I right to
> fear about that, or do you think C will never standardize suffixes that
> would clash with that because C++ standardized the bf16/BF16 suffixes for
> something already?  If I could enable it, I'd always pedwarn for C for those

I think any C proposal to standardize something conflicting with C++ would 
get objections from the WG21 liaison.

> Another question is the suffixes of the builtins.  For now I have added
> bf16 suffix and enabled the builtins with !both_p, so one always needs to
> use __builtin_* form for them.  None of the GCC builtins end with b,
> so this isn't ambiguous with __builtin_*f16, but some libm functions do end
> with b, in particular ilogb, logb and f{??,??x}sub.  ilogb and the subs
> always have it, but is __builtin_logbf16 f16 suffixed logb or bf16 suffixed
> log?  Shall the builtins use f16b suffixes instead like the mangling does?

Indeed, that conflict means bf16 isn't suitable for the built-in function 
suffix.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] c++: install cp-trait.def as part of plugin headers [PR107136]

2022-10-04 Thread Jason Merrill via Gcc-patches

On 10/3/22 20:42, Patrick Palka wrote:

This is apparently needed since we include cp-trait.def from cp-tree.h
(in order to define the cp_trait_kind enum), as with operators.def.

Tested on x86_64-pc-linux-gnu by doing make install and verifying
cp-trait.def appears in

   $prefix/lib/gcc/x86_64-pc-linux-gnu/13.0.0/plugin/include/cp/

Does this look OK for trunk?


OK.


PR c++/107136

gcc/cp/ChangeLog:

* Make-lang.in (CP_PLUGIN_HEADERS): Add cp-trait.def.
---
  gcc/cp/Make-lang.in | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index 38d8eeed1f0..aa84d6827be 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -39,7 +39,7 @@ CXX_INSTALL_NAME := $(shell echo c++|sed 
'$(program_transform_name)')
  GXX_INSTALL_NAME := $(shell echo g++|sed '$(program_transform_name)')
  CXX_TARGET_INSTALL_NAME := $(target_noncanonical)-$(shell echo c++|sed 
'$(program_transform_name)')
  GXX_TARGET_INSTALL_NAME := $(target_noncanonical)-$(shell echo g++|sed 
'$(program_transform_name)')
-CP_PLUGIN_HEADERS := cp-tree.h cxx-pretty-print.h name-lookup.h type-utils.h 
operators.def
+CP_PLUGIN_HEADERS := cp-tree.h cxx-pretty-print.h name-lookup.h type-utils.h 
operators.def cp-trait.def
  
  #

  # Define the names for selecting c++ in LANGUAGES.




Re: [COMMITTED] Convert nonzero mask in irange to wide_int.

2022-10-04 Thread Andrew MacLeod via Gcc-patches



On 10/4/22 11:14, Aldy Hernandez wrote:

On Tue, Oct 4, 2022 at 4:34 PM Richard Biener
 wrote:




Am 04.10.2022 um 16:30 schrieb Aldy Hernandez :

On Tue, Oct 4, 2022 at 3:27 PM Andrew MacLeod  wrote:



On 10/4/22 08:13, Aldy Hernandez via Gcc-patches wrote:

On Tue, Oct 4, 2022, 13:28 Aldy Hernandez  wrote:
On Tue, Oct 4, 2022 at 9:55 AM Richard Biener
 wrote:

Am 04.10.2022 um 09:36 schrieb Aldy Hernandez via Gcc-patches <

gcc-patches@gcc.gnu.org>:

The reason the nonzero mask was kept in a tree was basically inertia,
as everything in irange is a tree.  However, there's no need to keep
it in a tree, as the conversions to and from wide ints are very
annoying.  That, plus special casing NULL masks to be -1 is prone
to error.

I have not only rewritten all the uses to assume a wide int, but
have corrected a few places where we weren't propagating the masks, or
rather pessimizing them to -1.  This will become more important in
upcoming patches where we make better use of the masks.

Performance testing shows a trivial improvement in VRP, as things like
irange::contains_p() are tied to a tree.  Ughh, can't wait for trees in
iranges to go away.

You want trailing wide int storage though.  A wide_int is quite large.

Absolutely, this is only for short term storage.  Any time we need
long term storage, say global ranges in SSA_NAME_RANGE_INFO, we go
through vrange_storage which will stream things in a more memory
efficient manner.  For irange, vrange_storage will stream all the
sub-ranges, including the nonzero bitmask which is the first entry in
such storage, as trailing_wide_ints.

See irange_storage_slot to see how it lives in GC memory.


That being said, the ranger's internal cache uses iranges, albeit with a
squished down number of subranges (the minimum amount to represent the
range).  So each cache entry will now be bigger by the difference between
one tree and one wide int.

I wonder if we should change the cache to use vrange_storage. If not now,
then when we convert all the subranges to wide ints.

Of course, the memory pressure of the cache is not nearly as problematic as
SSA_NAME_RANGE_INFO. The cache only stores names it cares about.

Ranger's cache can be a memory bottleneck in pathological cases.
Certainly not as bad as it used to be, but I'm sure it can still be
problematic.  It's supposed to be a memory-efficient representation
because of that.  The cache can have an entry for any live ssa-name
(which means all of them at some point in the IL) multiplied by a factor
involving the number of dominator blocks and outgoing edges ranges are
calculated on.  So while SSA_NAME_RANGE_INFO is a linear thing, the
cache lies somewhere between a logarithmic and exponential factor based
on the CFG size.

Hmmm, perhaps the ultimate goal here should be to convert the cache to
use vrange_storage, which uses trailing wide ints for all of the end
points plus the masks (known_ones included for the next release).


If you are growing the common cases of 1 to 2 endpoints to more than
double in size (and most of the time it's not needed), that would not be
very appealing :-P  If we have any wide-ints, they would need to be a
memory-efficient version.  The cache uses an irange_allocator, which is
supposed to provide memory-efficient objects, hence why it trims the
number of ranges down to only what is needed.  It seems like a trailing
wide-int might be in order based on that.

Andrew


PS. which will be more problematic if you eventually introduce a
known_ones wide_int.  I thought the mask tracking was/could be
something simple like HOST_WIDE_INT; then you only track masks in
types up to the size of a HOST_WIDE_INT, and storage and masking are
all trivial without going through a wide_int.  Is that not so/possible?
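
To make the "mask in a single host word" idea concrete, here is a small sketch.  It is a toy model only: toy_range and its helpers are hypothetical names, not GCC's irange API, and std::uint64_t merely stands in for a HOST_WIDE_INT-sized mask.  A set bit in the mask means "this bit may be 1", so all ones means "no information".

```cpp
#include <cstdint>

// Hypothetical toy type for illustration; this is not GCC's irange.
struct toy_range
{
  std::uint64_t lo;       // inclusive lower bound
  std::uint64_t hi;       // inclusive upper bound
  std::uint64_t nonzero;  // bit set => that bit may be 1
};

constexpr std::uint64_t
umin (std::uint64_t a, std::uint64_t b)
{ return a < b ? a : b; }

constexpr std::uint64_t
umax (std::uint64_t a, std::uint64_t b)
{ return a > b ? a : b; }

// Union (as a hull): a bit may be set if it may be set on either side.
constexpr toy_range
toy_union (toy_range a, toy_range b)
{ return toy_range{ umin (a.lo, b.lo), umax (a.hi, b.hi),
                    a.nonzero | b.nonzero }; }

// Intersection: a bit may be set only if both sides allow it.
// The result is empty when lo > hi; this sketch does not normalize that.
constexpr toy_range
toy_intersect (toy_range a, toy_range b)
{ return toy_range{ umax (a.lo, b.lo), umin (a.hi, b.hi),
                    a.nonzero & b.nonzero }; }

// A value is possible only if it is within bounds and sets no bit
// that the mask says must be zero.
constexpr bool
may_contain (toy_range r, std::uint64_t v)
{ return r.lo <= v && v <= r.hi && (v & ~r.nonzero) == 0; }
```

With a single word the mask operations are plain | and &, which is the simplicity being weighed here against supporting types wider than the host word.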

That's certainly easy and cheaper to do.  The hard part was fixing all
the places where we weren't keeping the masks up to date, and that's
done (sans any bugs ;-)).

Can we get consensus here on only tracking masks for type sizes less
than HOST_WIDE_INT?  I'd hate to do all the work only to realize we
need to track 512 bit masks on a 32-bit host cross :-).

64bits are not enough, 128 might be.  But there’s trailing wide int storage so 
I don’t see the point in restricting ourselves?

Fair enough.  Perhaps we should bite the bullet and convert the cache
to vrange_storage which is all set up for streaming irange's with
trailing_wide_ints.  No changes should be necessary for irange, since
we never have more than 3-4 live at any one time.  It's the cache that
needs twiddling.

Wouldn't it be irange_allocator that needs twiddling?  Its purpose in life 
is to allocate iranges for memory storage.  The cache is just a 
client, as is ranger's global cache, etc.  Wasn't the intention of 
irange_allocator to isolate clients from having to worry about memory 
storage issues?


Or is that problematic?


Andrew





[GCC13][Patch][V5][PATCH 2/2] Use array_at_struct_end_p in __builtin_object_size [PR101836]

2022-10-04 Thread Qing Zhao via Gcc-patches
Use array_at_struct_end_p to determine whether the trailing array
of a structure is flexible array member in __builtin_object_size.

gcc/ChangeLog:

PR tree-optimization/101836
* tree-object-size.cc (addr_object_size): Use array_at_struct_end_p
to determine a flexible array member reference.

gcc/testsuite/ChangeLog:

PR tree-optimization/101836
* gcc.dg/pr101836.c: New test.
* gcc.dg/pr101836_1.c: New test.
* gcc.dg/pr101836_2.c: New test.
* gcc.dg/pr101836_3.c: New test.
* gcc.dg/pr101836_4.c: New test.
* gcc.dg/pr101836_5.c: New test.
* gcc.dg/strict-flex-array-2.c: New test.
* gcc.dg/strict-flex-array-3.c: New test.
---
 gcc/testsuite/gcc.dg/pr101836.c| 60 ++
 gcc/testsuite/gcc.dg/pr101836_1.c  | 60 ++
 gcc/testsuite/gcc.dg/pr101836_2.c  | 60 ++
 gcc/testsuite/gcc.dg/pr101836_3.c  | 60 ++
 gcc/testsuite/gcc.dg/pr101836_4.c  | 60 ++
 gcc/testsuite/gcc.dg/pr101836_5.c  | 60 ++
 gcc/testsuite/gcc.dg/strict-flex-array-2.c | 60 ++
 gcc/testsuite/gcc.dg/strict-flex-array-3.c | 60 ++
 gcc/tree-object-size.cc| 16 +++---
 9 files changed, 487 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr101836.c
 create mode 100644 gcc/testsuite/gcc.dg/pr101836_1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr101836_2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr101836_3.c
 create mode 100644 gcc/testsuite/gcc.dg/pr101836_4.c
 create mode 100644 gcc/testsuite/gcc.dg/pr101836_5.c
 create mode 100644 gcc/testsuite/gcc.dg/strict-flex-array-2.c
 create mode 100644 gcc/testsuite/gcc.dg/strict-flex-array-3.c

diff --git a/gcc/testsuite/gcc.dg/pr101836.c b/gcc/testsuite/gcc.dg/pr101836.c
new file mode 100644
index 000..efad02cfe89
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr101836.c
@@ -0,0 +1,60 @@
+/* -fstrict-flex-arrays is aliased with -fstrict-flex-arrays=3, which is the
+   strictest, only [] is treated as flexible array.  */ 
+/* PR tree-optimization/101836 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fstrict-flex-arrays" } */
+
+#include 
+
+#define expect(p, _v) do { \
+size_t v = _v; \
+if (p == v) \
+printf("ok:  %s == %zd\n", #p, p); \
+else \
+   {  \
+  printf("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
+ __builtin_abort (); \
+   } \
+} while (0);
+
+struct trailing_array_1 {
+int a;
+int b;
+int c[4];
+};
+
+struct trailing_array_2 {
+int a;
+int b;
+int c[1];
+};
+
+struct trailing_array_3 {
+int a;
+int b;
+int c[0];
+};
+struct trailing_array_4 {
+int a;
+int b;
+int c[];
+};
+
+void __attribute__((__noinline__)) stuff(
+struct trailing_array_1 *normal,
+struct trailing_array_2 *trailing_1,
+struct trailing_array_3 *trailing_0,
+struct trailing_array_4 *trailing_flex)
+{
+expect(__builtin_object_size(normal->c, 1), 16);
+expect(__builtin_object_size(trailing_1->c, 1), 4);
+expect(__builtin_object_size(trailing_0->c, 1), 0);
+expect(__builtin_object_size(trailing_flex->c, 1), -1);
+}
+
+int main(int argc, char *argv[])
+{
+stuff((void *)argv[0], (void *)argv[0], (void *)argv[0], (void *)argv[0]);
+
+return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr101836_1.c 
b/gcc/testsuite/gcc.dg/pr101836_1.c
new file mode 100644
index 000..e2931ce1012
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr101836_1.c
@@ -0,0 +1,60 @@
+/* -fstrict-flex-arrays=3 is the strictest, only [] is treated as
+   flexible array.  */ 
+/* PR tree-optimization/101836 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fstrict-flex-arrays=3" } */
+
+#include 
+
+#define expect(p, _v) do { \
+size_t v = _v; \
+if (p == v) \
+printf("ok:  %s == %zd\n", #p, p); \
+else \
+   {  \
+  printf("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
+ __builtin_abort (); \
+   } \
+} while (0);
+
+struct trailing_array_1 {
+int a;
+int b;
+int c[4];
+};
+
+struct trailing_array_2 {
+int a;
+int b;
+int c[1];
+};
+
+struct trailing_array_3 {
+int a;
+int b;
+int c[0];
+};
+struct trailing_array_4 {
+int a;
+int b;
+int c[];
+};
+
+void __attribute__((__noinline__)) stuff(
+struct trailing_array_1 *normal,
+struct trailing_array_2 *trailing_1,
+struct trailing_array_3 *trailing_0,
+struct trailing_array_4 *trailing_flex)
+{
+expect(__builtin_object_size(normal->c, 1), 16);
+expect(__builtin_object_size(trailing_1->c, 1), 4);
+expect(__builtin_object_size(trailing_0->c, 1), 0);
+expect(__builtin_object_size(trailing_flex->c, 1), -1);
+}
+
+int main(int argc, char *argv[])
+{
+stuff((void *)argv[0], (void *)argv[0], (void *)argv[0], (void *)argv[0]);
+
+

[PATCH 0/2] [GCC13][Patch][V5][PATCH 0/2] Add a new option -fstrict-flex-arrays[=n] and attribute strict_flex_array(n) and use it in PR101836

2022-10-04 Thread Qing Zhao via Gcc-patches


This is the 5th version of the patch set.
Compared to the 4th version, the following are the major changes (addressing
Martin's comments):

  1. change the name of the attribute from "strict_flex_arrays" to
"strict_flex_array";
  2. update the documentation so that all mentions of flexible array member
carry the additional qualification "for the purposes of accessing the
elements of such array".

Compared to the 3rd version, the following are the major changes:

1. delete all the warnings for the conflict between -std and
-fstrict-flex-arrays per our discussion.
2. delete all the related testing cases for these warnings.
3. update all the wording changes, and documentation format changes
recommended by Joseph.

I have bootstrapped and regression tested on both aarch64 and x86, no issues.

The above changes are all in documentation and FEs.

Since the middle-end change was approved by Richard in the V3 patch
review, Joseph, could you please take a look at the FE and
doc changes and let me know whether they are good to commit?

thanks a lot.

Qing 


Qing Zhao (2):
  Add a new option -fstrict-flex-arrays[=n] and new attribute
strict_flex_array
  Use array_at_struct_end_p in __builtin_object_size [PR101836]



[GCC13][Patch][V5][PATCH 1/2] Add a new option -fstrict-flex-arrays[=n] and new attribute strict_flex_array

2022-10-04 Thread Qing Zhao via Gcc-patches
Add the following new option -fstrict-flex-arrays[=n] and a corresponding
attribute strict_flex_array to GCC:

'-fstrict-flex-arrays'
 Control when to treat the trailing array of a structure as a flexible array
 member for the purpose of accessing the elements of such an array.
 The positive form is equivalent to '-fstrict-flex-arrays=3', which is the
 strictest.  A trailing array is treated as a flexible array member only
 when it is declared as a flexible array member per C99 standard onwards.
 The negative form is equivalent to '-fstrict-flex-arrays=0', which is the
 least strict.  All trailing arrays of structures are treated as flexible
 array members.

'-fstrict-flex-arrays=LEVEL'
 Control when to treat the trailing array of a structure as a flexible array
 member for the purpose of accessing the elements of such an array.  The
 value of LEVEL controls the level of strictness.

 The possible values of LEVEL are the same as for the
 'strict_flex_array' attribute (*note Variable Attributes::).

 You can control this behavior for a specific trailing array field
 of a structure by using the variable attribute 'strict_flex_array'
 (*note Variable Attributes::).

'strict_flex_array (LEVEL)'
 The 'strict_flex_array' attribute should be attached to the trailing
 array field of a structure.  It controls when to treat the trailing array
 field of a structure as a flexible array member for the purposes of
 accessing the elements of such an array.  LEVEL must be an integer
 between 0 and 3.

 LEVEL=0 is the least strict level: all trailing arrays of
 structures are treated as flexible array members.  LEVEL=3 is the
 strictest level: a trailing array is treated as a flexible array
 member only when it is declared as a flexible array member per C99
 standard onwards ('[]').

 There are two more levels in between 0 and 3, which are provided to
 support older code that uses the GCC zero-length array extension
 ('[0]') or one-element arrays ('[1]') as flexible array members.  When
 LEVEL is 1, the trailing array is treated as a flexible array member
 when it is declared as either '[]', '[0]', or '[1]'.  When
 LEVEL is 2, the trailing array is treated as a flexible array member
 when it is declared as either '[]' or '[0]'.

 This attribute can be used with or without the
 '-fstrict-flex-arrays' option.  When both the attribute and the option
 are present at the same time, the level of strictness for the
 specific trailing array field is determined by the attribute.
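
The four levels described above can be summarized as a small decision function.  This is just a restatement of the documented rule as code, not anything from the patch: trailing_bound and treated_as_flexible are hypothetical names, and clamping out-of-range levels to 0 or 3 is a choice made only for this sketch (the documentation requires LEVEL to be between 0 and 3).

```cpp
// How the trailing array of a structure was declared.
enum class trailing_bound
{
  unspecified,  // int c[];
  zero,         // int c[0];
  one,          // int c[1];
  larger        // int c[N]; with N > 1
};

// Returns true when a trailing array declared with bound B is treated
// as a flexible array member at strictness LEVEL.
constexpr bool
treated_as_flexible (trailing_bound b, int level)
{
  return level <= 0 ? true                             // 0: every trailing array
       : level == 1 ? b != trailing_bound::larger      // 1: [], [0], [1]
       : level == 2 ? (b == trailing_bound::unspecified
                       || b == trailing_bound::zero)   // 2: [], [0]
       : b == trailing_bound::unspecified;             // 3: [] only
}
```

For example, a '[1]' trailing array counts as flexible at levels 0 and 1 but not at levels 2 and 3.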

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_strict_flex_array_attribute): New function.
(c_common_attribute_table): New item for strict_flex_array.
* c.opt: (fstrict-flex-arrays): New option.
(fstrict-flex-arrays=): New option.

gcc/c/ChangeLog:

* c-decl.cc (flexible_array_member_type_p): New function.
(one_element_array_type_p): Likewise.
(zero_length_array_type_p): Likewise.
(add_flexible_array_elts_to_size): Call new utility
routine flexible_array_member_type_p.
(is_flexible_array_member_p): New function.
(finish_struct): Set the new DECL_NOT_FLEXARRAY flag.

gcc/cp/ChangeLog:

* module.cc (trees_out::core_bools): Stream out new bit
decl_not_flexarray.
(trees_in::core_bools): Stream in new bit decl_not_flexarray.

gcc/ChangeLog:

* doc/extend.texi: Document strict_flex_array attribute.
* doc/invoke.texi: Document -fstrict-flex-arrays[=n] option.
* print-tree.cc (print_node): Print new bit decl_not_flexarray.
* tree-core.h (struct tree_decl_common): New bit field
decl_not_flexarray.
* tree-streamer-in.cc (unpack_ts_decl_common_value_fields): Stream
in new bit decl_not_flexarray.
* tree-streamer-out.cc (pack_ts_decl_common_value_fields): Stream
out new bit decl_not_flexarray.
* tree.cc (array_at_struct_end_p): Update it with the new bit field
decl_not_flexarray.
* tree.h (DECL_NOT_FLEXARRAY): New flag.

gcc/testsuite/ChangeLog:

* g++.dg/strict-flex-array-1.C: New test.
* gcc.dg/strict-flex-array-1.c: New test.
---
 gcc/c-family/c-attribs.cc  |  47 
 gcc/c-family/c.opt |   7 ++
 gcc/c/c-decl.cc| 130 +++--
 gcc/cp/module.cc   |   2 +
 gcc/doc/extend.texi|  26 +
 gcc/doc/invoke.texi|  28 -
 gcc/print-tree.cc  |   8 +-
 gcc/testsuite/g++.dg/strict-flex-array-1.C |  31 +
 gcc/testsuite/gcc.dg/strict-flex-array-1.c |  31 +
 gcc/tree-core.h|   5 +-
 gcc/tree-streamer-in.cc|   1 +
 gcc/tree-streamer-out.cc   |   1 +
 gcc/tree.cc 

[PATCH] Add --without-makeinfo

2022-10-04 Thread Tom de Vries via Gcc-patches
Hi,

Currently, we cannot build gdb without makeinfo installed.

It would be convenient to work around this by using the configure flag
MAKEINFO=/usr/bin/true or some such, but that doesn't work because top-level
configure requires a makeinfo of at least version 4.7, and that version check
fails for /usr/bin/true, so we end up with MAKEINFO=missing instead.

What does work is this:
...
$ ./configure
$ make MAKEINFO=/usr/bin/true
...
but the drawback is that it'll have to be specified for each make invocation.

Fix this by adding support for --without-makeinfo in top-level configure.

Tested by building gdb on x86_64-linux, and verifying that no .info files
were generated.

OK for trunk?

Thanks,
- Tom

Add --without-makeinfo

ChangeLog:

2022-09-05  Tom de Vries  

* configure.ac: Add --without-makeinfo.
* configure: Regenerate.

---
 configure| 4 
 configure.ac | 4 
 2 files changed, 8 insertions(+)

diff --git a/configure b/configure
index f14e0efd675..eb84add60cb 100755
--- a/configure
+++ b/configure
@@ -8399,6 +8399,9 @@ fi
 done
 test -n "$MAKEINFO" || MAKEINFO="$MISSING makeinfo"
 
+if test $with_makeinfo = "no"; then
+MAKEINFO=true
+else
 case " $build_configdirs " in
   *" texinfo "*) MAKEINFO='$$r/$(BUILD_SUBDIR)/texinfo/makeinfo/makeinfo' ;;
   *)
@@ -8414,6 +8417,7 @@ case " $build_configdirs " in
 ;;
 
 esac
+fi
 
 # FIXME: expect and dejagnu may become build tools?
 
diff --git a/configure.ac b/configure.ac
index 0152c69292e..e4a2c076674 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3441,6 +3441,9 @@ case " $build_configdirs " in
 esac
 
 AC_CHECK_PROGS([MAKEINFO], makeinfo, [$MISSING makeinfo])
+if test $with_makeinfo = "no"; then
+MAKEINFO=true
+else
 case " $build_configdirs " in
   *" texinfo "*) MAKEINFO='$$r/$(BUILD_SUBDIR)/texinfo/makeinfo/makeinfo' ;;
   *)
@@ -3456,6 +3459,7 @@ changequote(,)
 ;;
 changequote([,])
 esac
+fi
 
 # FIXME: expect and dejagnu may become build tools?
 


Re: [COMMITTED] Convert nonzero mask in irange to wide_int.

2022-10-04 Thread Aldy Hernandez via Gcc-patches
On Tue, Oct 4, 2022 at 4:34 PM Richard Biener
 wrote:
>
>
>
> > On 04.10.2022 at 16:30, Aldy Hernandez  wrote:
> >
> > On Tue, Oct 4, 2022 at 3:27 PM Andrew MacLeod  wrote:
> >>
> >>
> >>> On 10/4/22 08:13, Aldy Hernandez via Gcc-patches wrote:
> >>>> On Tue, Oct 4, 2022, 13:28 Aldy Hernandez  wrote:
> > >>>
> >>>> On Tue, Oct 4, 2022 at 9:55 AM Richard Biener
> >>>>  wrote:
> >>>>>
> >>>>> On 04.10.2022 at 09:36, Aldy Hernandez via Gcc-patches <
> >>>> gcc-patches@gcc.gnu.org> wrote:
> >>>>>> The reason the nonzero mask was kept in a tree was basically inertia,
> >>>>>> as everything in irange is a tree.  However, there's no need to keep
> >>>>>> it in a tree, as the conversions to and from wide ints are very
> >>>>>> annoying.  That, plus special casing NULL masks to be -1 is prone
> >>>>>> to error.
> >>>>>>
> >>>>>> I have not only rewritten all the uses to assume a wide int, but
> >>>>>> have corrected a few places where we weren't propagating the masks, or
> >>>>>> rather pessimizing them to -1.  This will become more important in
> >>>>>> upcoming patches where we make better use of the masks.
> >>>>>>
> >>>>>> Performance testing shows a trivial improvement in VRP, as things like
> >>>>>> irange::contains_p() are tied to a tree.  Ughh, can't wait for trees in
> >>>>>> iranges to go away.
> >>>>> You want trailing wide int storage though.  A wide_int is quite large.
> >>>> Absolutely, this is only for short term storage.  Any time we need
> >>>> long term storage, say global ranges in SSA_NAME_RANGE_INFO, we go
> >>>> through vrange_storage which will stream things in a more memory
> >>>> efficient manner.  For irange, vrange_storage will stream all the
> >>>> sub-ranges, including the nonzero bitmask which is the first entry in
> >>>> such storage, as trailing_wide_ints.
> >>>>
> >>>> See irange_storage_slot to see how it lives in GC memory.
> >>>>
> >>> That being said, the ranger's internal cache uses iranges, albeit with a
> >>> squished down number of subranges (the minimum amount to represent the
> >>> range).  So each cache entry will now be bigger by the difference between
> >>> one tree and one wide int.
> >>>
> >>> I wonder if we should change the cache to use vrange_storage. If not now,
> >>> then when we convert all the subranges to wide ints.
> >>>
> >>> Of course, the memory pressure of the cache is not nearly as problematic 
> >>> as
> >>> SSA_NAME_RANGE_INFO. The cache only stores names it cares about.
> >>
> >> Rangers cache can be a memory bottleneck in pathological cases..
> >> Certainly not as bad as it use to be, but I'm sure it can still be
> >> problematic.Its suppose to be a memory efficient representation
> >> because of that.  The cache can have an entry for any live ssa-name
> >> (which means all of them at some point in the IL) multiplied by a factor
> >> involving the number of dominator blocks and outgoing edges ranges are
> >> calculated on.   So while SSA_NAME_RANGE_INFO is a linear thing, the
> >> cache lies somewhere between a logarithmic and exponential factor based
> >> on the CFG size.
> >
> > Hmmm, perhaps the ultimate goal here should be to convert the cache to
> > use vrange_storage, which uses trailing wide ints for all of the end
> > points plus the masks (known_ones included for the next release).
> >
> >>
> >> if you are growing the common cases of 1 to 2 endpoints to more than
> >> double in size (and most of the time not be needed), that would not be
> >> very appealing :-P  If we have any wide-ints, they would need to be a
> >> memory efficient version.   The Cache uses an irange_allocator, which is
> >> suppose to provide a memory efficient objects.. hence why it trims the
> >> number of ranges down to only what is needed.  It seems like a trailing
> >> wide-Int might be in order based on that..
> >>
> >> Andrew
> >>
> >>
> >> PS. which will be more problematic if you eventually introduce a
> >> known_ones wide_int.I thought the mask tracking was/could be
> >> something simple like  HOST_WIDE_INT..  then you only tracks masks in
> >> types up to the size of a HOST_WIDE_INT.  then storage and masking is
> >> all trivial without going thru a wide_int.Is that not so/possible?
> >
> > That's certainly easy and cheaper to do.  The hard part was fixing all
> > the places where we weren't keeping the masks up to date, and that's
> > done (sans any bugs ;-)).
> >
> > Can we get consensus here on only tracking masks for type sizes less
> > than HOST_WIDE_INT?  I'd hate to do all the work only to realize we
> > need to track 512 bit masks on a 32-bit host cross :-).
>
> 64bits are not enough, 128 might be.  But there’s trailing wide int storage 
> so I don’t see the point in restricting ourselves?

Fair enough.  Perhaps we should bite the bullet and convert the cache
to vrange_storage which is all set up for streaming irange's with
trailing_wide_ints.  No changes should be necessary for irange, since
we never have more than 3-4 live at any one 

[PATCH RESEND 0/1] RFC: P1689R5 support

2022-10-04 Thread Ben Boeckel
This patch adds initial support for ISO C++'s [P1689R5][], a format for
describing C++ module requirements and provisions based on the source
code. This is required because compiling C++ with modules is not
embarrassingly parallel: compilations need to be ordered to ensure that
`import some_module;` can be satisfied in time by making sure that the TU
with `export module some_module;` is compiled first.
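
To make the ordering constraint concrete, here is a minimal sketch (not
part of the patch) of what a build system might do once it knows which
modules each TU provides and requires. The `TU` struct and `build_order`
function are hypothetical names used only for illustration; the real
P1689R5 output is JSON and carries more information than this.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified view of one scanned translation unit.
struct TU {
  std::string source;                  // path of the source file
  std::vector<std::string> provides;   // modules this TU exports
  std::vector<std::string> requires_;  // modules this TU imports
};

// Return indices into `tus` such that every TU providing a module comes
// before all TUs importing it (Kahn's topological sort).
std::vector<std::size_t> build_order(const std::vector<TU>& tus) {
  // Map each module name to the TU that provides it.
  std::map<std::string, std::size_t> provider;
  for (std::size_t i = 0; i < tus.size(); ++i)
    for (const auto& m : tus[i].provides)
      provider[m] = i;

  // dependents[p] lists the TUs that must wait for TU p;
  // pending[i] counts the imports of TU i not yet built.
  std::vector<std::vector<std::size_t>> dependents(tus.size());
  std::vector<std::size_t> pending(tus.size(), 0);
  for (std::size_t i = 0; i < tus.size(); ++i)
    for (const auto& m : tus[i].requires_) {
      auto it = provider.find(m);
      if (it == provider.end())
        throw std::runtime_error("no TU provides module " + m);
      dependents[it->second].push_back(i);
      ++pending[i];
    }

  // Start with the TUs that import nothing.
  std::queue<std::size_t> ready;
  for (std::size_t i = 0; i < tus.size(); ++i)
    if (pending[i] == 0)
      ready.push(i);

  std::vector<std::size_t> order;
  while (!ready.empty()) {
    std::size_t i = ready.front();
    ready.pop();
    order.push_back(i);
    for (std::size_t d : dependents[i])
      if (--pending[d] == 0)
        ready.push(d);
  }
  if (order.size() != tus.size())
    throw std::runtime_error("import cycle between modules");
  return order;
}
```

A real build system would schedule every TU in the ready set in parallel
rather than serialising them; the sketch only shows why the scan has to
happen before any compilation starts.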

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html

I'd like feedback on the approach taken here with respect to the
user-visible flags. I'll also note that header units are not supported
at this time because the current `-E` behavior with respect to
`import <header>;` is to search for an appropriate `.gcm` file which is not
something such a "scan" can support. A new mode will likely need to be
created (e.g., replacing `-E` with `-fc++-module-scanning` or something)
where headers are looked up "normally" and processed only as much as
scanning requires.

Testing is currently happening in CMake's CI using a prior revision of
this patch (the differences are basically the changelog, some style, and
`trtbd` instead of `p1689r5` as the format name).

For testing within GCC, I'll work on the following:

- scanning non-module source
- scanning module-importing source (`import X;`)
- scanning module-exporting source (`export module X;`)
- scanning module implementation unit (`module X;`)
- flag combinations?

Are there existing tools for handling JSON output for testing purposes?
Basically, something that I can add to the test suite that doesn't care
about whitespace, but checks the structure (with sensible replacements
for absolute paths where relevant)?

For the record, Clang has patches with similar flags and behavior by
Chuanqi Xu here:

https://reviews.llvm.org/D134269

with the same flags (though using my old `trtbd` spelling for the
format name).

Thanks,

--Ben

Ben Boeckel (1):
  p1689r5: initial support

 gcc/ChangeLog   |   9 ++
 gcc/c-family/ChangeLog  |   6 +
 gcc/c-family/c-opts.cc  |  40 ++-
 gcc/c-family/c.opt  |  12 ++
 gcc/cp/ChangeLog|   5 +
 gcc/cp/module.cc|   3 +-
 gcc/doc/invoke.texi |  15 +++
 gcc/fortran/ChangeLog   |   5 +
 gcc/fortran/cpp.cc  |   4 +-
 gcc/genmatch.cc |   2 +-
 gcc/input.cc|   4 +-
 libcpp/ChangeLog|  11 ++
 libcpp/include/cpplib.h |  12 +-
 libcpp/include/mkdeps.h |  17 ++-
 libcpp/init.cc  |  14 ++-
 libcpp/mkdeps.cc| 235 ++--
 16 files changed, 368 insertions(+), 26 deletions(-)


base-commit: d812e8cb2a920fd75768e16ca8ded59ad93c172f
-- 
2.37.3



[PATCH RESEND 1/1] p1689r5: initial support

2022-10-04 Thread Ben Boeckel
This patch implements support for [P1689R5][] to communicate the C++20
module dependencies to build systems so that they may build `.gcm` files
in the proper order.

Support is communicated through the following three new flags:

- `-fdeps-format=` specifies the format for the output. Currently named
  `p1689r5`.

- `-fdeps-file=` specifies the path to the file to write the format to.

- `-fdep-output=` specifies the `.o` that will be written for the TU
  that is scanned. This is required so that the build system can
  correlate the dependency output with the actual compilation that will
  occur.

CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
[experimental features][cmake-experimental].

Future work may include using this format for Fortran module
dependencies as well, however this is still pending work.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
[cmake-experimental]: 
https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst

TODO:

- header-unit information fields

Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.

- non-utf8 paths

The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously representable in UTF-8" case).

- figure out why junk gets placed at the end of the file

Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.

Signed-off-by: Ben Boeckel 
---
 gcc/ChangeLog   |   9 ++
 gcc/c-family/ChangeLog  |   6 +
 gcc/c-family/c-opts.cc  |  40 ++-
 gcc/c-family/c.opt  |  12 ++
 gcc/cp/ChangeLog|   5 +
 gcc/cp/module.cc|   3 +-
 gcc/doc/invoke.texi |  15 +++
 gcc/fortran/ChangeLog   |   5 +
 gcc/fortran/cpp.cc  |   4 +-
 gcc/genmatch.cc |   2 +-
 gcc/input.cc|   4 +-
 libcpp/ChangeLog|  11 ++
 libcpp/include/cpplib.h |  12 +-
 libcpp/include/mkdeps.h |  17 ++-
 libcpp/init.cc  |  14 ++-
 libcpp/mkdeps.cc| 235 ++--
 16 files changed, 368 insertions(+), 26 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6dded16c0e3..2d61de6adde 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2022-09-20  Ben Boeckel  
+
+   * doc/invoke.texi: Document -fdeps-format=, -fdep-file=, and
+   -fdep-output= flags.
+   * genmatch.cc (main): Add new preprocessor parameter used for C++
+   module tracking.
+   * input.cc (test_lexer): Add new preprocessor parameter used for C++
+   module tracking.
+
 2022-09-19  Torbjörn SVENSSON  
 
* targhooks.cc (default_zero_call_used_regs): Improve sorry
diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
index ba3d76dd6cb..569dcd96e8c 100644
--- a/gcc/c-family/ChangeLog
+++ b/gcc/c-family/ChangeLog
@@ -1,3 +1,9 @@
+2022-09-20  Ben Boeckel  
+
+   * c-opts.cc (c_common_handle_option): Add fdeps_file variable and
+   -fdeps-format=, -fdep-file=, and -fdep-output= parsing.
+   * c.opt: Add -fdeps-format=, -fdep-file=, and -fdep-output= flags.
+
 2022-09-15  Richard Biener  
 
* c-common.h (build_void_list_node): Remove.
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index babaa2fc157..617d0e93696 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -77,6 +77,9 @@ static bool verbose;
 /* Dependency output file.  */
 static const char *deps_file;
 
+/* Enhanced dependency output file.  */
+static const char *fdeps_file;
+
 /* The prefix given by -iprefix, if any.  */
 static const char *iprefix;
 
@@ -360,6 +363,23 @@ c_common_handle_option (size_t scode, const char *arg, 
HOST_WIDE_INT value,
   deps_file = arg;
   break;
 
+case OPT_fdep_format_:
+  if (!strcmp (arg, "p1689r5"))
+   cpp_opts->deps.format = DEPS_FMT_P1689R5;
+  else
+   error ("%<-fdep-format=%> unknown format %s", arg);
+  break;
+
+case OPT_fdep_file_:
+  deps_seen = true;
+  fdeps_file = arg;
+  break;
+
+case OPT_fdep_output_:
+  deps_seen = true;
+  defer_opt (code, arg);
+  break;
+
 case OPT_MF:
   deps_seen = true;
   

Re: [PATCH] libstdc++: Implement ranges::join_with_view from P2441R2

2022-10-04 Thread Jonathan Wakely via Gcc-patches
On Tue, 4 Oct 2022 at 15:09, Patrick Palka wrote:
>
> On Tue, 4 Oct 2022, Jonathan Wakely wrote:
>
> > On Tue, 4 Oct 2022 at 02:11, Patrick Palka via Libstdc++
> >  wrote:
> > >
> > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk?  FWIW using
> >
> > OK, thanks.
>
> Thanks a lot, patch committed.
>
> >
> > > variant<_PatternIter, _InnerIter> in the implementation means we need to
> > > include <variant> from <ranges>, which increases the preprocessed size
> > > of <ranges> by 3% (51.5k vs 53k).  I suppose that's an acceptable cost?
> >
> > Yeah, I don't think we want to reimplement a lightweight std::variant,
> > because that would just add even more code.
>
> Sounds good.
>
> >
> > As I mentioned on IRC, maybe we could optimize the compilation time
> > for some of the visitation using P2637R0, but that can be done later.
>
> Ah, I didn't consider the compile time impact of using std::visit.
> Since we already use/instantiate std::get elsewhere in the implementation,
> what do you think about doing the visitation manually via index() and
> std::get like so?  Seems to reduce compile time/memory usage for
> join_with/1.cc by around 6% and doesn't look too messy since we're
> dealing with only two alternatives.  (And IIUC this should be equivalent
> to std::visit wrt valueless_by_exception handling, since the call to
> std::get<1> in each else branch will throw bad_variant_access for us
> like std::visit would.)

Nice, 6% seems worth it, and I agree it's not too messy. Please check
this in too!
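
For readers without the patch at hand, the idea discussed above can be
sketched roughly as follows. This is not the libstdc++ code: `two_alt`,
`length`, `length_via_visit` and `length_via_index` are made-up names, and
the sketch only assumes a variant with exactly two alternatives, as in the
join_with_view iterator.

```cpp
#include <cstddef>
#include <string>
#include <variant>

// Hypothetical two-alternative variant standing in for
// variant<_PatternIter, _InnerIter>.
using two_alt = std::variant<int, std::string>;

// Some operation that is overloaded for both alternatives.
inline std::size_t length(int) { return 1; }
inline std::size_t length(const std::string& s) { return s.size(); }

// Generic visitation: std::visit builds its dispatch machinery for the
// visitor, which is where the extra compile-time cost comes from.
inline std::size_t length_via_visit(const two_alt& v) {
  return std::visit([](const auto& x) { return length(x); }, v);
}

// Manual visitation via index() and std::get.  If the variant is
// valueless_by_exception, index() is variant_npos, so the else branch
// runs and std::get<1> throws std::bad_variant_access, matching what
// std::visit would do.
inline std::size_t length_via_index(const two_alt& v) {
  if (v.index() == 0)
    return length(std::get<0>(v));
  else
    return length(std::get<1>(v));
}
```

With only two alternatives the if/else stays readable; for larger
variants the manual form would get unwieldy quickly.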



Re: [COMMITTED] Convert nonzero mask in irange to wide_int.

2022-10-04 Thread Richard Biener via Gcc-patches



> On 04.10.2022 at 16:30, Aldy Hernandez  wrote:
> 
> On Tue, Oct 4, 2022 at 3:27 PM Andrew MacLeod  wrote:
>> 
>> 
>>> On 10/4/22 08:13, Aldy Hernandez via Gcc-patches wrote:
>>>> On Tue, Oct 4, 2022, 13:28 Aldy Hernandez  wrote:
>>>
>>>> On Tue, Oct 4, 2022 at 9:55 AM Richard Biener
>>>>  wrote:
>>>>>
>>>>> On 04.10.2022 at 09:36, Aldy Hernandez via Gcc-patches <
>>>> gcc-patches@gcc.gnu.org> wrote:
>>>>>> The reason the nonzero mask was kept in a tree was basically inertia,
>>>>>> as everything in irange is a tree.  However, there's no need to keep
>>>>>> it in a tree, as the conversions to and from wide ints are very
>>>>>> annoying.  That, plus special casing NULL masks to be -1 is prone
>>>>>> to error.
>>>>>>
>>>>>> I have not only rewritten all the uses to assume a wide int, but
>>>>>> have corrected a few places where we weren't propagating the masks, or
>>>>>> rather pessimizing them to -1.  This will become more important in
>>>>>> upcoming patches where we make better use of the masks.
>>>>>>
>>>>>> Performance testing shows a trivial improvement in VRP, as things like
>>>>>> irange::contains_p() are tied to a tree.  Ughh, can't wait for trees in
>>>>>> iranges to go away.
>>>>> You want trailing wide int storage though.  A wide_int is quite large.
>>>> Absolutely, this is only for short term storage.  Any time we need
>>>> long term storage, say global ranges in SSA_NAME_RANGE_INFO, we go
>>>> through vrange_storage which will stream things in a more memory
>>>> efficient manner.  For irange, vrange_storage will stream all the
>>>> sub-ranges, including the nonzero bitmask which is the first entry in
>>>> such storage, as trailing_wide_ints.
>>>>
>>>> See irange_storage_slot to see how it lives in GC memory.
>>>>
>>> That being said, the ranger's internal cache uses iranges, albeit with a
>>> squished down number of subranges (the minimum amount to represent the
>>> range).  So each cache entry will now be bigger by the difference between
>>> one tree and one wide int.
>>> 
>>> I wonder if we should change the cache to use vrange_storage. If not now,
>>> then when we convert all the subranges to wide ints.
>>> 
>>> Of course, the memory pressure of the cache is not nearly as problematic as
>>> SSA_NAME_RANGE_INFO. The cache only stores names it cares about.
>> 
>> Rangers cache can be a memory bottleneck in pathological cases..
>> Certainly not as bad as it use to be, but I'm sure it can still be
>> problematic.Its suppose to be a memory efficient representation
>> because of that.  The cache can have an entry for any live ssa-name
>> (which means all of them at some point in the IL) multiplied by a factor
>> involving the number of dominator blocks and outgoing edges ranges are
>> calculated on.   So while SSA_NAME_RANGE_INFO is a linear thing, the
>> cache lies somewhere between a logarithmic and exponential factor based
>> on the CFG size.
> 
> Hmmm, perhaps the ultimate goal here should be to convert the cache to
> use vrange_storage, which uses trailing wide ints for all of the end
> points plus the masks (known_ones included for the next release).
> 
>> 
>> if you are growing the common cases of 1 to 2 endpoints to more than
>> double in size (and most of the time not be needed), that would not be
>> very appealing :-P  If we have any wide-ints, they would need to be a
>> memory efficient version.   The Cache uses an irange_allocator, which is
>> suppose to provide a memory efficient objects.. hence why it trims the
>> number of ranges down to only what is needed.  It seems like a trailing
>> wide-Int might be in order based on that..
>> 
>> Andrew
>> 
>> 
>> PS. which will be more problematic if you eventually introduce a
>> known_ones wide_int.I thought the mask tracking was/could be
>> something simple like  HOST_WIDE_INT..  then you only tracks masks in
>> types up to the size of a HOST_WIDE_INT.  then storage and masking is
>> all trivial without going thru a wide_int.Is that not so/possible?
> 
> That's certainly easy and cheaper to do.  The hard part was fixing all
> the places where we weren't keeping the masks up to date, and that's
> done (sans any bugs ;-)).
> 
> Can we get consensus here on only tracking masks for type sizes less
> than HOST_WIDE_INT?  I'd hate to do all the work only to realize we
> need to track 512 bit masks on a 32-bit host cross :-).

64bits are not enough, 128 might be.  But there’s trailing wide int storage so 
I don’t see the point in restricting ourselves?

> Aldy
> 


Re: [COMMITTED] Convert nonzero mask in irange to wide_int.

2022-10-04 Thread Aldy Hernandez via Gcc-patches
On Tue, Oct 4, 2022 at 3:27 PM Andrew MacLeod  wrote:
>
>
> On 10/4/22 08:13, Aldy Hernandez via Gcc-patches wrote:
> > On Tue, Oct 4, 2022, 13:28 Aldy Hernandez  wrote:
> >
> >> On Tue, Oct 4, 2022 at 9:55 AM Richard Biener
> >>  wrote:
> >>>
> >>> On 04.10.2022 at 09:36, Aldy Hernandez via Gcc-patches <
> >> gcc-patches@gcc.gnu.org> wrote:
> >>>> The reason the nonzero mask was kept in a tree was basically inertia,
> >>>> as everything in irange is a tree.  However, there's no need to keep
> >>>> it in a tree, as the conversions to and from wide ints are very
> >>>> annoying.  That, plus special casing NULL masks to be -1 is prone
> >>>> to error.
> >>>>
> >>>> I have not only rewritten all the uses to assume a wide int, but
> >>>> have corrected a few places where we weren't propagating the masks, or
> >>>> rather pessimizing them to -1.  This will become more important in
> >>>> upcoming patches where we make better use of the masks.
> >>>>
> >>>> Performance testing shows a trivial improvement in VRP, as things like
> >>>> irange::contains_p() are tied to a tree.  Ughh, can't wait for trees in
> >>>> iranges to go away.
> >>> You want trailing wide int storage though.  A wide_int is quite large.
> >> Absolutely, this is only for short term storage.  Any time we need
> >> long term storage, say global ranges in SSA_NAME_RANGE_INFO, we go
> >> through vrange_storage which will stream things in a more memory
> >> efficient manner.  For irange, vrange_storage will stream all the
> >> sub-ranges, including the nonzero bitmask which is the first entry in
> >> such storage, as trailing_wide_ints.
> >>
> >> See irange_storage_slot to see how it lives in GC memory.
> >>
> > That being said, the ranger's internal cache uses iranges, albeit with a
> > squished down number of subranges (the minimum amount to represent the
> > range).  So each cache entry will now be bigger by the difference between
> > one tree and one wide int.
> >
> > I wonder if we should change the cache to use vrange_storage. If not now,
> > then when we convert all the subranges to wide ints.
> >
> > Of course, the memory pressure of the cache is not nearly as problematic as
> > SSA_NAME_RANGE_INFO. The cache only stores names it cares about.
>
> Rangers cache can be a memory bottleneck in pathological cases..
> Certainly not as bad as it use to be, but I'm sure it can still be
> problematic.Its suppose to be a memory efficient representation
> because of that.  The cache can have an entry for any live ssa-name
> (which means all of them at some point in the IL) multiplied by a factor
> involving the number of dominator blocks and outgoing edges ranges are
> calculated on.   So while SSA_NAME_RANGE_INFO is a linear thing, the
> cache lies somewhere between a logarithmic and exponential factor based
> on the CFG size.

Hmmm, perhaps the ultimate goal here should be to convert the cache to
use vrange_storage, which uses trailing wide ints for all of the end
points plus the masks (known_ones included for the next release).

>
> if you are growing the common cases of 1 to 2 endpoints to more than
> double in size (and most of the time not be needed), that would not be
> very appealing :-P  If we have any wide-ints, they would need to be a
> memory efficient version.   The Cache uses an irange_allocator, which is
> suppose to provide a memory efficient objects.. hence why it trims the
> number of ranges down to only what is needed.  It seems like a trailing
> wide-Int might be in order based on that..
>
> Andrew
>
>
> PS. which will be more problematic if you eventually introduce a
> known_ones wide_int.I thought the mask tracking was/could be
> something simple like  HOST_WIDE_INT..  then you only tracks masks in
> types up to the size of a HOST_WIDE_INT.  then storage and masking is
> all trivial without going thru a wide_int.Is that not so/possible?

That's certainly easy and cheaper to do.  The hard part was fixing all
the places where we weren't keeping the masks up to date, and that's
done (sans any bugs ;-)).

Can we get consensus here on only tracking masks for type sizes less
than HOST_WIDE_INT?  I'd hate to do all the work only to realize we
need to track 512 bit masks on a 32-bit host cross :-).

Aldy
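
As a rough illustration of the trade-off being discussed (this is not
GCC code), a mask kept in a single host word might look like the sketch
below. `nonzero_mask`, `intersect`, `unite` and `may_contain` are made-up
names, and a 64-bit word is assumed to stand in for HOST_WIDE_INT. The
point is that the bit operations are trivial, but any type wider than the
host word has to fall back to "nothing known".

```cpp
#include <cstdint>

// Hypothetical "possibly nonzero bits" mask stored in one host word.
struct nonzero_mask {
  unsigned precision;  // width of the tracked type, in bits
  std::uint64_t bits;  // bit i set => bit i of the value may be nonzero

  static constexpr unsigned host_bits = 64;

  // All-ones mask: nothing is known about the value.
  static nonzero_mask unknown(unsigned prec) {
    std::uint64_t all = prec >= host_bits
                            ? ~std::uint64_t{0}
                            : (std::uint64_t{1} << prec) - 1;
    return {prec, all};
  }

  // True if the type does not fit in one host word, e.g. a 128-bit
  // integer; such masks cannot be tracked precisely here.
  bool too_wide() const { return precision > host_bits; }
};

// Both constraints hold at once: a bit may be nonzero only if both
// masks allow it.
inline nonzero_mask intersect(const nonzero_mask& a, const nonzero_mask& b) {
  return {a.precision, a.bits & b.bits};
}

// Either constraint may hold: a bit may be nonzero if either mask
// allows it.
inline nonzero_mask unite(const nonzero_mask& a, const nonzero_mask& b) {
  return {a.precision, a.bits | b.bits};
}

// Could a value with these low bits satisfy the mask?  Too-wide types
// are answered conservatively.
inline bool may_contain(const nonzero_mask& m, std::uint64_t v) {
  return m.too_wide() || (v & ~m.bits) == 0;
}
```

The conservative answer for wide types is exactly the pessimization to
all-ones that a wide_int based mask avoids, at the cost of storage.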



[committed] libstdc++: Define functions for freestanding [PR107135]

2022-10-04 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, and x86_64-linux with -ffreestanding.
Pushed to trunk.

-- >8 --

We don't compile src/c++11/functexcept.cc for freestanding, so just
define the functions used by freestanding entities as inline calls to
std::terminate.
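
The pattern can be sketched outside libstdc++ like this. The names
`SKETCH_HOSTED`, `sketch::throw_invalid_argument` and `sketch::parse_bit`
are invented for the example and merely stand in for _GLIBCXX_HOSTED,
__throw_invalid_argument and a caller such as bitset::_M_copy_from_ptr.

```cpp
#include <exception>
#include <stdexcept>

// Stand-in for _GLIBCXX_HOSTED; defaults to a hosted build here.
#ifndef SKETCH_HOSTED
# define SKETCH_HOSTED 1
#endif

namespace sketch {

#if SKETCH_HOSTED
// Hosted: really throw.  (libstdc++ defines this out of line in a
// compiled source file; it is inline here to keep the sketch short.)
[[noreturn]] inline void throw_invalid_argument(const char* what) {
  throw std::invalid_argument(what);
}
#else
// Freestanding: no exception classes are assumed to be available, so
// the helper is an inline call that terminates the program.
[[noreturn]] inline void throw_invalid_argument(const char*) {
  std::terminate();
}
#endif

// The caller no longer needs its own hosted/freestanding #if.
inline bool parse_bit(char c) {
  if (c == '0')
    return false;
  if (c == '1')
    return true;
  throw_invalid_argument("sketch::parse_bit");
}

} // namespace sketch
```

Keeping the conditional in one header is what lets the bitset hunk below
drop its own #if block.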

libstdc++-v3/ChangeLog:

PR libstdc++/107135
* include/bits/functexcept.h [!_GLIBCXX_HOSTED]
(__throw_invalid_argument, __throw_out_of_range)
(__throw_out_of_range_fmt, __throw_runtime_error)
(__throw_overflow_error): Define inline.
* include/std/bitset (_M_copy_from_ptr) [!_GLIBCXX_HOSTED]:
Replace __builtin_abort with __throw_invalid_argument.
---
 libstdc++-v3/include/bits/functexcept.h | 25 +
 libstdc++-v3/include/std/bitset |  8 +---
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/include/bits/functexcept.h 
b/libstdc++-v3/include/bits/functexcept.h
index a78a17b2e04..7fad4b13316 100644
--- a/libstdc++-v3/include/bits/functexcept.h
+++ b/libstdc++-v3/include/bits/functexcept.h
@@ -43,6 +43,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
+#if _GLIBCXX_HOSTED
   // Helper for exception objects in 
   void
   __throw_bad_exception(void) __attribute__((__noreturn__));
@@ -112,6 +113,30 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   void
   __throw_bad_function_call() __attribute__((__noreturn__));
 
+#else // ! HOSTED
+
+  __attribute__((__noreturn__)) inline void
+  __throw_invalid_argument(const char*)
+  { std::__terminate(); }
+
+  __attribute__((__noreturn__)) inline void
+  __throw_out_of_range(const char*)
+  { std::__terminate(); }
+
+  __attribute__((__noreturn__)) inline void
+  __throw_out_of_range_fmt(const char*, ...)
+  { std::__terminate(); }
+
+  __attribute__((__noreturn__)) inline void
+  __throw_runtime_error(const char*)
+  { std::__terminate(); }
+
+  __attribute__((__noreturn__)) inline void
+  __throw_overflow_error(const char*)
+  { std::__terminate(); }
+
+#endif // HOSTED
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
diff --git a/libstdc++-v3/include/std/bitset b/libstdc++-v3/include/std/bitset
index afabeb4ba01..1f3f68fefce 100644
--- a/libstdc++-v3/include/std/bitset
+++ b/libstdc++-v3/include/std/bitset
@@ -1514,13 +1514,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
else if (_Traits::eq(__c, __one))
  _Unchecked_set(__i - 1);
else
- {
-#if _GLIBCXX_HOSTED
-   __throw_invalid_argument(__N("bitset::_M_copy_from_ptr"));
-#else
-   __builtin_abort();
-#endif
- }
+ __throw_invalid_argument(__N("bitset::_M_copy_from_ptr"));
  }
   }
 
-- 
2.37.3



[committed] libstdc++: Disable test for freestanding

2022-10-04 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux with -ffreestanding.
Pushed to trunk.

-- >8 --

This test checks the exception-safety of std::stable_sort if copying a
value throws. For freestanding we don't allocate in std::stable_sort
anyway, and the exception thrown via __throw_runtime_error terminates,
so disable the test.

libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/stable_sort/mem_check.cc: Do not run
for freestanding.
---
 libstdc++-v3/testsuite/25_algorithms/stable_sort/mem_check.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libstdc++-v3/testsuite/25_algorithms/stable_sort/mem_check.cc 
b/libstdc++-v3/testsuite/25_algorithms/stable_sort/mem_check.cc
index d1f76906890..9dde4fb2b38 100644
--- a/libstdc++-v3/testsuite/25_algorithms/stable_sort/mem_check.cc
+++ b/libstdc++-v3/testsuite/25_algorithms/stable_sort/mem_check.cc
@@ -15,6 +15,8 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
+// { dg-require-effective-target hosted }
+
 // 25.3.1.2 [lib.stable.sort]
 
 #include 
-- 
2.37.3



[committed] libstdc++: Make <cstdint> work freestanding [PR107134]

2022-10-04 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, and x86_64-linux with -ffreestanding.
Pushed to trunk.

-- >8 --

When gcc/config.gcc defines use_gcc_stdint=wrap, GCC's <stdint.h> tries
to use libc's <stdint.h> unless -ffreestanding is used.

When libstdc++ is configured --disable-hosted-libstdcxx we want
<cstdint> to work even without -ffreestanding being given. This is a
kluge to make it include GCC's <stdint-gcc.h> directly even without
-ffreestanding.
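
A stripped-down sketch of the general technique, for illustration only:
prefer the compiler's predefined type macros so that no libc header is
needed, and only fall back to <stdint.h> when it can actually be found.
`sketch::int32` and `SKETCH_HAVE_STDINT_H` are made-up names; the
__INT32_TYPE__ macro is predefined by GCC and Clang.

```cpp
#include <climits>

// Probe for <stdint.h> without failing when it is absent.
#if defined(__has_include)
# if __has_include(<stdint.h>)
#  include <stdint.h>
#  define SKETCH_HAVE_STDINT_H 1
# endif
#endif

namespace sketch {

#ifdef __INT32_TYPE__
  // Compiler-provided: works even when no C library is installed.
  using int32 = __INT32_TYPE__;
#elif defined(SKETCH_HAVE_STDINT_H)
  // Fallback for compilers that do not predefine the macro.
  using int32 = ::int32_t;
#else
# error "no 32-bit integer type available in this sketch"
#endif

  static_assert(sizeof(int32) * CHAR_BIT == 32,
                "int32 must be exactly 32 bits wide");

} // namespace sketch
```

The actual patch goes the other way round for the header choice (it
includes GCC's own header first) and uses the type macros only in the
branch that is compiled when _GLIBCXX_USE_C99_STDINT_TR1 is not set.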

libstdc++-v3/ChangeLog:

PR libstdc++/107134
* include/c_global/cstdint [!_GLIBCXX_HOSTED]: Include
<stdint-gcc.h> directly.
---
 libstdc++-v3/include/c_global/cstdint | 59 ++-
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/c_global/cstdint 
b/libstdc++-v3/include/c_global/cstdint
index 4490d06f099..4b9df45a9ee 100644
--- a/libstdc++-v3/include/c_global/cstdint
+++ b/libstdc++-v3/include/c_global/cstdint
@@ -37,7 +37,11 @@
 
 #include 
 
-#if _GLIBCXX_HAVE_STDINT_H
+#if ! _GLIBCXX_HOSTED && __has_include(<stdint-gcc.h>)
+// For --disable-hosted-libstdcxx we want GCC's own stdint-gcc.h header
+// even when -ffreestanding isn't used.
+# include <stdint-gcc.h>
+#elif __has_include(<stdint.h>)
 # include <stdint.h>
 #endif
 
@@ -80,9 +84,60 @@ namespace std
   using ::uintmax_t;
   using ::uintptr_t;
 #else // !_GLIBCXX_USE_C99_STDINT_TR1
-  // Define the minimum needed for ,  etc.
+
   using intmax_t = __INTMAX_TYPE__;
   using uintmax_t = __UINTMAX_TYPE__;
+
+#ifdef __INT8_TYPE__
+  using int8_t = __INT8_TYPE__;
+#endif
+#ifdef __INT16_TYPE__
+  using int16_t = __INT16_TYPE__;
+#endif
+#ifdef __INT32_TYPE__
+  using int32_t = __INT32_TYPE__;
+#endif
+#ifdef __INT64_TYPE__
+  using int64_t = __INT64_TYPE__;
+#endif
+
+  using int_least8_t = __INT_LEAST8_TYPE__;
+  using int_least16_t = __INT_LEAST16_TYPE__;
+  using int_least32_t = __INT_LEAST32_TYPE__;
+  using int_least64_t = __INT_LEAST64_TYPE__;
+  using int_fast8_t = __INT_FAST8_TYPE__;
+  using int_fast16_t = __INT_FAST16_TYPE__;
+  using int_fast32_t = __INT_FAST32_TYPE__;
+  using int_fast64_t = __INT_FAST64_TYPE__;
+
+#ifdef __INTPTR_TYPE__
+  using intptr_t = __INTPTR_TYPE__;
+#endif
+
+#ifdef __UINT8_TYPE__
+  using uint8_t = __UINT8_TYPE__;
+#endif
+#ifdef __UINT16_TYPE__
+  using uint16_t = __UINT16_TYPE__;
+#endif
+#ifdef __UINT32_TYPE__
+  using uint32_t = __UINT32_TYPE__;
+#endif
+#ifdef __UINT64_TYPE__
+  using uint64_t = __UINT64_TYPE__;
+#endif
+  using uint_least8_t = __UINT_LEAST8_TYPE__;
+  using uint_least16_t = __UINT_LEAST16_TYPE__;
+  using uint_least32_t = __UINT_LEAST32_TYPE__;
+  using uint_least64_t = __UINT_LEAST64_TYPE__;
+  using uint_fast8_t = __UINT_FAST8_TYPE__;
+  using uint_fast16_t = __UINT_FAST16_TYPE__;
+  using uint_fast32_t = __UINT_FAST32_TYPE__;
+  using uint_fast64_t = __UINT_FAST64_TYPE__;
+#ifdef __UINTPTR_TYPE__
+  using uintptr_t = __UINTPTR_TYPE__;
+#endif
+
 #endif // _GLIBCXX_USE_C99_STDINT_TR1
 } // namespace std
 
-- 
2.37.3



[committed] libstdc++: Enable std::hash> [PR107139]

2022-10-04 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, and x86_64-linux with -ffreestanding.
Pushed to trunk.

-- >8 --

Everything that <coroutine> depends on is available for freestanding
now.

libstdc++-v3/ChangeLog:

PR libstdc++/107139
* include/std/coroutine: Remove all _GLIBCXX_HOSTED preprocessor
conditionals.
---
 libstdc++-v3/include/std/coroutine | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/std/coroutine 
b/libstdc++-v3/include/std/coroutine
index f4189c7e3fc..ebaf11d701f 100644
--- a/libstdc++-v3/include/std/coroutine
+++ b/libstdc++-v3/include/std/coroutine
@@ -39,7 +39,7 @@
 # include 
 #endif
 
-#if !defined __cpp_lib_three_way_comparison && _GLIBCXX_HOSTED
+#if !defined __cpp_lib_three_way_comparison
 # include  // for std::less
 #endif
 
@@ -165,11 +165,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr bool
   operator<(coroutine_handle<> __a, coroutine_handle<> __b) noexcept
   {
-#if _GLIBCXX_HOSTED
 return less()(__a.address(), __b.address());
-#else
-return (__UINTPTR_TYPE__)__a.address() < (__UINTPTR_TYPE__)__b.address();
-#endif
   }
 
   constexpr bool
@@ -343,7 +339,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   } // namespace __n4861
 
-#if _GLIBCXX_HOSTED
   template struct hash;
 
   template
@@ -355,10 +350,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return reinterpret_cast(__h.address());
   }
 };
-#endif
 
 #else
-#error "the coroutine header requires -fcoroutines"
+#error "the  header requires -fcoroutines"
 #endif
 
   _GLIBCXX_END_NAMESPACE_VERSION
-- 
2.37.3



Re: [PATCH] libstdc++: Implement ranges::join_with_view from P2441R2

2022-10-04 Thread Patrick Palka via Gcc-patches
On Tue, 4 Oct 2022, Jonathan Wakely wrote:

> On Tue, 4 Oct 2022 at 02:11, Patrick Palka via Libstdc++
>  wrote:
> >
> > Tested on x86_64-pc-linux-gnu, does this look OK for trunk?  FWIW using
> 
> OK, thanks.

Thanks a lot, patch committed.

> 
> > variant<_PatternIter, _InnerIter> in the implementation means we need to
> > include  from , which increases the preprocessed size
> > of  by 3% (51.5k vs 53k).  I suppose that's an acceptable cost?
> 
> Yeah, I don't think we want to reimplement a lightweight std::variant,
> because that would just add even more code.

Sounds good.

> 
> As I mentioned on IRC, maybe we could optimize the compilation time
> for some of the visitation using P2637R0, but that can be done later.

Ah, I didn't consider the compile time impact of using std::visit.
Since we already use/instantiate std::get elsewhere in the implementation,
what do you think about doing the visitation manually via index() and
std::get like so?  Seems to reduce compile time/memory usage for
join_with/1.cc by around 6% and doesn't look too messy since we're
dealing with only two alternatives.  (And IIUC this should be equivalent
to std::visit wrt valueless_by_exception handling, since the call to
std::get<1> in each else branch will throw bad_variant_access for us
like std::visit would.)

-- >8 --

Subject: [PATCH] libstdc++: Avoid std::visit in ranges::join_with_view

libstdc++-v3/ChangeLog:

* include/std/ranges (join_with_view::_Iterator::operator*):
Replace use of std::visit with manual visitation.
(join_with_view::_Iterator::operator++): Likewise.
(join_with_view::_Iterator::operator--): Likewise.
(join_with_view::_Iterator::iter_move): Likewise.
(join_with_view::_Iterator::iter_swap): Likewise.
---
 libstdc++-v3/include/std/ranges | 47 +
 1 file changed, 36 insertions(+), 11 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index d0d6ce61a87..1f821128d2d 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -7165,18 +7165,23 @@ namespace views::__adaptor
_M_inner_it.template 
emplace<1>(std::get<1>(std::move(__i._M_inner_it)));
 }
 
-constexpr decltype(auto)
+constexpr common_reference_t,
+iter_reference_t<_PatternIter>>
 operator*() const
 {
-  using reference = common_reference_t,
-  iter_reference_t<_PatternIter>>;
-  return std::visit([](auto& __it) -> reference { return *__it; }, 
_M_inner_it);
+  if (_M_inner_it.index() == 0)
+   return *std::get<0>(_M_inner_it);
+  else
+   return *std::get<1>(_M_inner_it);
 }
 
 constexpr _Iterator&
 operator++()
 {
-  std::visit([](auto& __it){ ++__it; }, _M_inner_it);
+  if (_M_inner_it.index() == 0)
+   ++std::get<0>(_M_inner_it);
+  else
+   ++std::get<1>(_M_inner_it);
   _M_satisfy();
   return *this;
 }
@@ -7232,7 +7237,10 @@ namespace views::__adaptor
}
}
 
-  std::visit([](auto& __it){ --__it; }, _M_inner_it);
+  if (_M_inner_it.index() == 0)
+   --std::get<0>(_M_inner_it);
+  else
+   --std::get<1>(_M_inner_it);
   return *this;
 }
 
@@ -7253,18 +7261,35 @@ namespace views::__adaptor
&& equality_comparable<_OuterIter> && equality_comparable<_InnerIter>
 { return __x._M_outer_it == __y._M_outer_it && __x._M_inner_it 
==__y._M_inner_it; }
 
-friend constexpr decltype(auto)
+friend constexpr common_reference_t,
+   iter_rvalue_reference_t<_PatternIter>>
 iter_move(const _Iterator& __x)
 {
-  using __rval_ref = 
common_reference_t,
-   
iter_rvalue_reference_t<_PatternIter>>;
-  return std::visit<__rval_ref>(ranges::iter_move, __x._M_inner_it);
+  if (__x._M_inner_it.index() == 0)
+   return ranges::iter_move(std::get<0>(__x._M_inner_it));
+  else
+   return ranges::iter_move(std::get<1>(__x._M_inner_it));
 }
 
 friend constexpr void
 iter_swap(const _Iterator& __x, const _Iterator& __y)
   requires indirectly_swappable<_InnerIter, _PatternIter>
-{ std::visit(ranges::iter_swap, __x._M_inner_it, __y._M_inner_it); }
+{
+  if (__x._M_inner_it.index() == 0)
+   {
+ if (__y._M_inner_it.index() == 0)
+   ranges::iter_swap(std::get<0>(__x._M_inner_it), 
std::get<0>(__y._M_inner_it));
+ else
+   ranges::iter_swap(std::get<0>(__x._M_inner_it), 
std::get<1>(__y._M_inner_it));
+   }
+  else
+   {
+ if (__y._M_inner_it.index() == 0)
+   ranges::iter_swap(std::get<1>(__x._M_inner_it), 
std::get<0>(__y._M_inner_it));
+ else
+   ranges::iter_swap(std::get<1>(__x._M_inner_it), 
std::get<1>(__y._M_inner_it));
+   }
+}
   };
 
   template
-- 
2.38.0.rc2



Re: Adding a new thread model to GCC

2022-10-04 Thread LIU Hao via Gcc-patches

On 2022-10-04 21:13, Xi Ruoyao wrote:


> In GCC development we usually include the configure regeneration in the
> patch because the scripts are also version controlled.



There is a reason for not doing that: Generated contents can't be reviewed.

In mingw-w64 we do the opposite: The person who commits a patch is responsible for updating configure,
Makefile.in, etc. The patch itself doesn't include generated contents.




> It's better to include the ID in the subject and ChangeLog of the patch.
> Like:
>
>    [PATCH 1/3] libgfortran: Use `__gthread_t` instead of `pthread_t` [PR 105764]
>
>    It used to cause errors if a thread model other than `posix` was selected,
>    which looks like a leftover from a79878585a1c5e32bafbc6d1e73f91fd6e4293bf.
>
>    libgfortran/ChangeLog:
>
>    PR libgfortran/105764
>    * io/async.h (struct async_unit): Use `__gthread_t` instead
>    of `pthread_t`.


Yes I think this change is good.




> And, from https://gcc.gnu.org/contribute.html#patches:
>
> "It is strongly discouraged to post patches as MIME parts of type
> application/whatever, disposition attachment or encoded as base64 or
> quoted-printable."
>
> Just try "git send-email", it will do the correct thing.  Mimicking its
> behavior in a mail client is also possible but error-prone (the mail
> client can destroy your patch by replacing your tabs with spaces, etc.)



It's 'discouraged'. It is not forbidden. I expect people everywhere who receive emails to accept
attachments. Thunderbird has a nice feature to display text attachments inline, so there is no need
to download an attachment and open it with an external editor, or whatever.



And, I have never gotten `git send-email` to work on my machine:

   ```
   Send this email? ([y]es|[n]o|[e]dit|[q]uit|[a]ll): y
   Unable to initialize SMTP properly. Check config and use --smtp-debug. VA
   LUES: server=smtp.126.com encryption=tls hello=localhost.localdomain port
   =465 at /usr/lib/git-core/git-send-email line 1684,  line 3.
   ```




--
Best regards,
LIU Hao





Re: [COMMITTED] Convert nonzero mask in irange to wide_int.

2022-10-04 Thread Andrew MacLeod via Gcc-patches



On 10/4/22 08:13, Aldy Hernandez via Gcc-patches wrote:
> On Tue, Oct 4, 2022, 13:28 Aldy Hernandez  wrote:
>
>> On Tue, Oct 4, 2022 at 9:55 AM Richard Biener
>>  wrote:
>>>
>>> On 04.10.2022 at 09:36, Aldy Hernandez via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>
>>>> The reason the nonzero mask was kept in a tree was basically inertia,
>>>> as everything in irange is a tree.  However, there's no need to keep
>>>> it in a tree, as the conversions to and from wide ints are very
>>>> annoying.  That, plus special casing NULL masks to be -1 is prone
>>>> to error.
>>>>
>>>> I have not only rewritten all the uses to assume a wide int, but
>>>> have corrected a few places where we weren't propagating the masks, or
>>>> rather pessimizing them to -1.  This will become more important in
>>>> upcoming patches where we make better use of the masks.
>>>>
>>>> Performance testing shows a trivial improvement in VRP, as things like
>>>> irange::contains_p() are tied to a tree.  Ughh, can't wait for trees in
>>>> iranges to go away.
>>>
>>> You want trailing wide int storage though.  A wide_int is quite large.
>>
>> Absolutely, this is only for short term storage.  Any time we need
>> long term storage, say global ranges in SSA_NAME_RANGE_INFO, we go
>> through vrange_storage which will stream things in a more memory
>> efficient manner.  For irange, vrange_storage will stream all the
>> sub-ranges, including the nonzero bitmask which is the first entry in
>> such storage, as trailing_wide_ints.
>>
>> See irange_storage_slot to see how it lives in GC memory.
>
> That being said, the ranger's internal cache uses iranges, albeit with a
> squished down number of subranges (the minimum amount to represent the
> range).  So each cache entry will now be bigger by the difference between
> one tree and one wide int.
>
> I wonder if we should change the cache to use vrange_storage. If not now,
> then when we convert all the subranges to wide ints.
>
> Of course, the memory pressure of the cache is not nearly as problematic as
> SSA_NAME_RANGE_INFO. The cache only stores names it cares about.


Ranger's cache can be a memory bottleneck in pathological cases.
Certainly not as bad as it used to be, but I'm sure it can still be
problematic.  It's supposed to be a memory-efficient representation
because of that.  The cache can have an entry for any live ssa-name
(which means all of them at some point in the IL) multiplied by a factor
involving the number of dominator blocks and outgoing edges that ranges
are calculated on.  So while SSA_NAME_RANGE_INFO is a linear thing, the
cache lies somewhere between a logarithmic and exponential factor based
on the CFG size.


If you are growing the common cases of 1 to 2 endpoints to more than
double in size (and most of the time the extra space is not needed), that
would not be very appealing :-P  If we have any wide-ints, they would need
to be a memory-efficient version.  The cache uses an irange_allocator,
which is supposed to provide memory-efficient objects, hence why it trims
the number of ranges down to only what is needed.  It seems like a
trailing wide-int might be in order based on that.


Andrew


PS. This will be more problematic if you eventually introduce a
known_ones wide_int.  I thought the mask tracking was/could be
something simple like HOST_WIDE_INT; then you only track masks in
types up to the size of a HOST_WIDE_INT, and storage and masking are
all trivial without going through a wide_int.  Is that not so/possible?






Aldy





Re: Adding a new thread model to GCC

2022-10-04 Thread Xi Ruoyao via Gcc-patches
I don't really understand MinGW, but some "non-technical" things:

On Tue, 2022-10-04 at 20:44 +0800, LIU Hao via Gcc-patches wrote:
> After applying these patches, configure scripts in these
> subdirectories need to be regenerated:
> 
>    * gcc
>    * libgcc
>    * libatomic
>    * libstdc++-v3

In GCC development we usually include the configure regeneration in the
patch because the scripts are also version controlled.

> The patch for libgfortran fixes
> 
>* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105764

It's better to include the ID in the subject and ChangeLog of the patch.
Like:

   [PATCH 1/3] libgfortran: Use `__gthread_t` instead of `pthread_t` [PR 105764]
   
   It used to cause errors if a thread model other than `posix` was selected,
   which looks like a leftover from a79878585a1c5e32bafbc6d1e73f91fd6e4293bf.
   
   libgfortran/ChangeLog:
   
PR libgfortran/105764
* io/async.h (struct async_unit): Use `__gthread_t` instead
of `pthread_t`.

This allows a git hook to append a message into the PR 105764 entry in
bugzilla once the patch is committed into trunk.

Normally I leave an empty line after "ChangeLog:" but I'm not sure if
it's strictly needed.

> gcc/config/ChangeLog:
>   * i386/mingw-mcfgthread.h: New file
>   * i386/mingw32.h: Add builtin macro and default libraries for
>   mcfgthread when thread model is `mcf`

Normally I leave a "." for each ChangeLog entry, but I'm not sure if
it's strictly needed.  However there is no gcc/config/ChangeLog, use
gcc/ChangeLog instead.

And, from https://gcc.gnu.org/contribute.html#patches:

"It is strongly discouraged to post patches as MIME parts of type
application/whatever, disposition attachment or encoded as base64 or
quoted-printable."

Just try "git send-email", it will do the correct thing.  Mimicking its
behavior in a mail client is also possible but error-prone (the mail
client can destroy your patch by replacing your tabs with spaces, etc.)

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [Patch] Fortran: Add OpenMP's assume(s) directives

2022-10-04 Thread Jakub Jelinek via Gcc-patches
On Tue, Oct 04, 2022 at 02:26:13PM +0200, Tobias Burnus wrote:
> Hi Jakub,
> 
> On 04.10.22 12:19, Jakub Jelinek wrote:
> 
> On Sun, Oct 02, 2022 at 07:47:18PM +0200, Tobias Burnus wrote:
> 
> 
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -2749,9 +2749,9 @@ have support for @option{-pthread}. @option{-fopenmp} 
> implies
> @opindex fopenmp-simd
> @cindex OpenMP SIMD
> @cindex SIMD
> -Enable handling of OpenMP's SIMD directives with @code{#pragma omp}
> -in C/C++ and @code{!$omp} in Fortran. Other OpenMP directives
> -are ignored.
> +Enable handling of OpenMP's SIMD directives and OPENMP's @code{assume} 
> directive
> 
> 
> s/OPENMP/OpenMP/
> 
> We actually handle more directives, @code{declare reduction},
> @code{ordered}, @code{scan}, @code{loop} and combined/composite directives
> with @code{simd} as constituent.
> ...
> And now in C++ we handle also the attribute syntax (guess we should update
> the text for that here as well as in -fopenmp entry).
> 
> Updated suggestion attached – I still need to update the main patch.
> 
> (I also added 'declare simd' to the list. And I updated Fortran for scan + 
> loop.)
> 
> OK?

Ok, thanks.

> Wouldn't this be better table driven (like c_omp_directives
> in c-family, though guess for Fortran you can just use spaces
> in the name, don't need 3 strings for the separate tokens)?
> Because I think absent/contains isn't the only spot where
> you need directive names, metadirective is another.
> 
> Maybe – I think there are already way too many string repetitions. One problem 
> is that metadirectives permit combined/composite constructs while 'assume(s)' 
> does not. I on purpose did not parse them, but probably in light of 
> Metadirectives, I should.
> 
> I will take a look.

It is true that metadirective supports combined/composite constructs,
and so do we in the C++ attribute case, still we IMHO can use the C/C++
table as is, and it doesn't need to include combined/composite constructs.

The thing is that for the metadirective/C++ attribute case, all we need to
know is to discover the directive category (declarative, stand-alone,
construct, informational, ...) and for that it is enough to parse the
first directive-name in combined/composite constructs.  Both metadirectives
and C++ attributes then have the name of the directive followed by clauses
so we effectively have to use the normal parsing of directives/clauses
there (except perhaps not end on end of directive but on unbalanced closing
paren).  And then there is the absent/contains case, where we only
allow non-combined/composite, so there we need to try to match the directive
name from the table and make sure it is followed by , or ).

> But also, resizing each time a single entry is added to the list isn't
> good for compile time, would be nice to grow the allocation size in
> powers of 2 or so.
> 
> I only expect a very small number – and did not want to keep track of yet 
> another number.
> 
> However, the following should work:
> 
> 
>  if (old_n_absent == 0)
>    absent = ... sizeof() * 1
>  else if (popcount (old_n_absent) == 1)
>    absent = ... sizeof() * (old_n_absent * 2)

Yeah.  Or for 0 allocate say 8 and
use (pow2p_hwi (old_n_absent) && old_n_absent >= 8)
in the else if.

> that allocates: 1, 2, 4, 8, 16, 32, ... without keeping track of the number.
> 
> 
> 
> +gfc_match_omp_assumes (void)
> +{
> +  locus loc = gfc_current_locus;
> +  gfc_omp_clauses *c = gfc_get_omp_clauses ();
> +  c->assume = gfc_current_ns->omp_assumes;
> +  if (!gfc_current_ns->proc_name
> +  || (gfc_current_ns->proc_name->attr.flavor != FL_MODULE
> + && !gfc_current_ns->proc_name->attr.subroutine
> + && !gfc_current_ns->proc_name->attr.function))
> +{
> +  gfc_error ("!OMP ASSUMES at %C must be in the specification part of a "
> +"subprogram or module");
> +  return MATCH_ERROR;
> +}
> +  if (gfc_match_omp_clauses (&c, omp_mask (OMP_CLAUSE_ASSUMPTIONS), true, 
> true,
> +false, false, false, false) != MATCH_YES)
> +{
> +  gfc_current_ns->omp_assumes = NULL;
> +  return MATCH_ERROR;
> +}
> 
> 
> 
> I don't understand the point of preallocation of gfc_omp_clauses here,
> can't it be allocated inside of gfc_match_omp_clauses like everywhere else?
> Because otherwise it e.g. leaks if the first error is reported.
> 
> This is supposed to handle:
>  subroutine foo()
>!$omp assumes absent(target)
>!$omp assumes absent(teams)
>  end
> 
> I did not spot anything which states that it is invalid.
> (I might have missed it, however.) And if it is valid,
> I assume it is equivalent to:
> 
>  subroutine foo()
>!$omp assumes absent(target, teams)
>  end

It is not equivalent to that, because while we have the restriction
that the same list item can't appear multiple times on the same directive,
it can appear multiple times on multiple directives.
So,
  subroutine foo()
!$omp assumes absent(target, target)
  end
or

Re: Adding a new thread model to GCC

2022-10-04 Thread LIU Hao via Gcc-patches

Attached are revised patches. These are exported from trunk.


There is a change since my last message:

  * A new Makefile variable `SHLIB_MCFGTHREAD_LIBS` has been introduced, to keep
the other thread models from being affected.


After applying these patches, configure scripts in these subdirectories need to 
be regenerated:

  * gcc
  * libgcc
  * libatomic
  * libstdc++-v3


The patch for libgfortran fixes

  * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105764


I have successfully bootstrapped GCC 12 with these patches, on i686-w64-mingw32 (with MSVCRT) and 
x86_64-w64-mingw32 (with MSVCRT and UCRT). No errors have been observed so far.


Once these patches land in GCC, we can start the work in mingw-w64 basing on 
`__USING_MCFGTHREAD__`.



--
Best regards,
LIU Hao

From e1ab15fc95ac8180156feed410cacb64a41a9567 Mon Sep 17 00:00:00 2001
From: LIU Hao 
Date: Fri, 27 May 2022 23:12:48 +0800
Subject: [PATCH 1/3] libgfortran: Use `__gthread_t` instead of `pthread_t`

It used to cause errors if a thread model other than `posix` was selected,
which looks like a leftover from a79878585a1c5e32bafbc6d1e73f91fd6e4293bf.

libgfortran/ChangeLog:
* io/async.h (struct async_unit): Use `__gthread_t` instead
of `pthread_t`.

---
 libgfortran/io/async.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgfortran/io/async.h b/libgfortran/io/async.h
index efd542a45e82..d57722a95e44 100644
--- a/libgfortran/io/async.h
+++ b/libgfortran/io/async.h
@@ -351,7 +351,7 @@ typedef struct async_unit
   struct adv_cond work;
   struct adv_cond emptysignal;
   struct st_parameter_dt *pdt;
-  pthread_t thread;
+  __gthread_t thread;
   struct transfer_queue *head;
   struct transfer_queue *tail;
 
-- 
2.37.3

From 0376949aae74b92a7ba327881672e038c3c0d825 Mon Sep 17 00:00:00 2001
From: LIU Hao 
Date: Sun, 2 Oct 2022 00:57:08 +0800
Subject: [PATCH 2/3] libstdc++/thread: Implement `_GLIBCXX_NPROCS` for Windows

This makes `std::thread::hardware_concurrency()` return the number of
logical processors, instead of zero.

libstdc++-v3/ChangeLog:
* src/c++11/thread.cc (get_nprocs): Add new implementation
for native Windows targets

---
 libstdc++-v3/src/c++11/thread.cc | 9 +
 1 file changed, 9 insertions(+)

diff --git a/libstdc++-v3/src/c++11/thread.cc b/libstdc++-v3/src/c++11/thread.cc
index 707a4ad415b9..b39d9f4a9e29 100644
--- a/libstdc++-v3/src/c++11/thread.cc
+++ b/libstdc++-v3/src/c++11/thread.cc
@@ -68,6 +68,15 @@ static inline int get_nprocs()
 #elif defined(_GLIBCXX_USE_SC_NPROC_ONLN)
 # include 
 # define _GLIBCXX_NPROCS sysconf(_SC_NPROC_ONLN)
+#elif defined(_WIN32)
+# include 
+static inline int get_nprocs()
+{
+  SYSTEM_INFO sysinfo;
+  GetSystemInfo(&sysinfo);
+  return (int)sysinfo.dwNumberOfProcessors;
+}
+# define _GLIBCXX_NPROCS get_nprocs()
 #else
 # define _GLIBCXX_NPROCS 0
 #endif
-- 
2.37.3

From d69cbaca07cd7b0e2d725574c8d5913b1c5e0bd5 Mon Sep 17 00:00:00 2001
From: LIU Hao 
Date: Sat, 16 Apr 2022 00:46:23 +0800
Subject: [PATCH 3/3] gcc: Add 'mcf' thread model support from mcfgthread

This patch adds the new thread model `mcf`, which implements mutexes
and condition variables with the mcfgthread library.

Source code for mcfgthread is available at 
.

config/ChangeLog:
* gthr.m4 (GCC_AC_THREAD_HEADER): Add new case for `mcf` thread
model

gcc/config/ChangeLog:
* i386/mingw-mcfgthread.h: New file
* i386/mingw32.h: Add builtin macro and default libraries for
mcfgthread when thread model is `mcf`

gcc/ChangeLog:
* config.gcc: Include 'i386/mingw-mcfgthread.h' when thread model
is `mcf`
* configure.ac: Recognize `mcf` as a valid thread model

libatomic/ChangeLog:
* configure.tgt: Add new case for `mcf` thread model

libgcc/ChangeLog:
* config.host: Add new cases for `mcf` thread model
* config/i386/gthr-mcf.h: New file
* config/i386/t-mingw-mcfgthread: New file
* config/i386/t-slibgcc-cygming: Add mcfgthread for libgcc DLL

libstdc++-v3/ChangeLog:
* libsupc++/atexit_thread.cc (__cxa_thread_atexit): Use
implementation from mcfgthread if available
* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_release,
__cxa_guard_abort): Use implementations from mcfgthread if
available

---
 config/gthr.m4  |  1 +
 gcc/config.gcc  |  3 +++
 gcc/config/i386/mingw-mcfgthread.h  |  1 +
 gcc/config/i386/mingw32.h   | 11 -
 gcc/configure.ac|  2 +-
 libatomic/configure.tgt |  2 +-
 libgcc/config.host  |  6 +
 libgcc/config/i386/gthr-mcf.h   |  1 +
 libgcc/config/i386/t-mingw-mcfgthread   |  1 +
 libgcc/config/i386/t-slibgcc-cygming|  6 -
 libstdc++-v3/libsupc++/atexit_thread.cc | 20 
 libstdc++-v3/libsupc++/guard.cc | 31 

Re: [PATCH] attribs: Add missing auto_diagnostic_group 3 times

2022-10-04 Thread David Malcolm via Gcc-patches
On Tue, 2022-10-04 at 11:11 +0200, Jakub Jelinek wrote:
> Hi!
> 
> In these spots, the error/error_at has some inform afterwards which
> are
> explanation part of the same diagnostics, so should be tied with
> auto_diagnostic_group with it.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK for trunk; I think future such patches can be self-approved as per
the "obvious" rule.

Thanks
Dave

> 
> 2022-10-04  Jakub Jelinek  
> 
> * attribs.cc (handle_ignored_attributes_option,
> decl_attributes,
> common_function_versions): Use auto_diagnostic_group.
> 
> --- gcc/attribs.cc.jj   2022-10-03 18:31:16.130032566 +0200
> +++ gcc/attribs.cc  2022-10-03 21:27:02.364104260 +0200
> @@ -251,6 +251,7 @@ handle_ignored_attributes_option (vec    /* We don't accept '::attr'.  */
>    if (cln == nullptr || cln == opt)
> {
> + auto_diagnostic_group d;
>   error ("wrong argument to ignored attributes");
>   inform (input_location, "valid format is % or
> %");
>   continue;
> @@ -732,6 +733,7 @@ decl_attributes (tree *node, tree attrib
>   || (spec->max_length >= 0
>   && nargs > spec->max_length))
>     {
> + auto_diagnostic_group d;
>   error ("wrong number of arguments specified for %qE
> attribute",
>  name);
>   if (spec->max_length < 0)
> @@ -1167,6 +1169,7 @@ common_function_versions (tree fn1, tree
>   std::swap (fn1, fn2);
>   attr1 = attr2;
>     }
> + auto_diagnostic_group d;
>   error_at (DECL_SOURCE_LOCATION (fn2),
>     "missing % attribute for multi-versioned
> %qD",
>     fn2);
> 
> Jakub
> 



Re: [Patch] Fortran: Add OpenMP's assume(s) directives

2022-10-04 Thread Tobias Burnus

Hi Jakub,

On 04.10.22 12:19, Jakub Jelinek wrote:

On Sun, Oct 02, 2022 at 07:47:18PM +0200, Tobias Burnus wrote:


--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2749,9 +2749,9 @@ have support for @option{-pthread}. @option{-fopenmp} 
implies
@opindex fopenmp-simd
@cindex OpenMP SIMD
@cindex SIMD
-Enable handling of OpenMP's SIMD directives with @code{#pragma omp}
-in C/C++ and @code{!$omp} in Fortran. Other OpenMP directives
-are ignored.
+Enable handling of OpenMP's SIMD directives and OPENMP's @code{assume} 
directive


s/OPENMP/OpenMP/

We actually handle more directives, @code{declare reduction},
@code{ordered}, @code{scan}, @code{loop} and combined/composite directives
with @code{simd} as constituent.
...
And now in C++ we handle also the attribute syntax (guess we should update
the text for that here as well as in -fopenmp entry).

Updated suggestion attached – I still need to update the main patch.

(I also added 'declare simd' to the list. And I updated Fortran for scan + 
loop.)

OK?

* * *

Wouldn't this be better table driven (like c_omp_directives
in c-family, though guess for Fortran you can just use spaces
in the name, don't need 3 strings for the separate tokens)?
Because I think absent/contains isn't the only spot where
you need directive names, metadirective is another.

Maybe – I think there are already way too many string repetitions. One problem 
is that metadirectives permit combined/composite constructs while 'assume(s)' 
does not. I on purpose did not parse them, but probably in light of 
Metadirectives, I should.

I will take a look.

+  if (is_omp_declarative_stmt (st) || is_omp_informational_stmt (st))
+   {
+ gfc_error ("Invalid %qs directive at %L in %s clause: declarative, "
+"informational and meta directives not permitted",
+gfc_ascii_statement (st, true), _loc,
+is_absent ? "ABSENT" : "CONTAINS");


Do you think we should do the same for C/C++?
Right now it doesn't differentiate between invalid directive names and
names of declarative, informational or meta directives.

Maybe - it might help users to understand why something went wrong, on the 
other hand, I do not really think that a user adds 'absent(declare variant)', 
but I might be wrong.

+   = (gfc_statement *) xrealloc ((*assume)->absent,
+ sizeof (gfc_statement)
+ * (*assume)->n_absent);


XRESIZEVEC?

Aha, that's the macro name!


But also, resizing each time a single entry is added to the list isn't
good for compile time, would be nice to grow the allocation size in
powers of 2 or so.

I only expect a very small number – and did not want to keep track of yet 
another number.

However, the following should work:


 if (old_n_absent == 0)
   absent = ... sizeof() * 1
 else if (popcount (old_n_absent) == 1)
   absent = ... sizeof() * (old_n_absent * 2)
that allocates: 1, 2, 4, 8, 16, 32, ... without keeping track of the number.



+gfc_match_omp_assumes (void)
+{
+  locus loc = gfc_current_locus;
+  gfc_omp_clauses *c = gfc_get_omp_clauses ();
+  c->assume = gfc_current_ns->omp_assumes;
+  if (!gfc_current_ns->proc_name
+  || (gfc_current_ns->proc_name->attr.flavor != FL_MODULE
+ && !gfc_current_ns->proc_name->attr.subroutine
+ && !gfc_current_ns->proc_name->attr.function))
+{
+  gfc_error ("!OMP ASSUMES at %C must be in the specification part of a "
+"subprogram or module");
+  return MATCH_ERROR;
+}
+  if (gfc_match_omp_clauses (&c, omp_mask (OMP_CLAUSE_ASSUMPTIONS), true, true,
+false, false, false, false) != MATCH_YES)
+{
+  gfc_current_ns->omp_assumes = NULL;
+  return MATCH_ERROR;
+}



I don't understand the point of preallocation of gfc_omp_clauses here,
can't it be allocated inside of gfc_match_omp_clauses like everywhere else?
Because otherwise it e.g. leaks if the first error is reported.

This is supposed to handle:
 subroutine foo()
   !$omp assumes absent(target)
   !$omp assumes absent(teams)
 end

I did not spot anything which states that it is invalid.
(I might have missed it, however.) And if it is valid,
I assume it is equivalent to:

 subroutine foo()
   !$omp assumes absent(target, teams)
 end

And the simplest way to do the merge seems to use gfc_match_omp_clauses,
which already handles merging  'absent(target) absent(teams)'.

Thus, I pre-populate the clause list with the assumption values from
the previous directive.

Additionally, there shouldn't be a leak as inside gfc_omp_match_clauses is:
 gfc_free_omp_clauses (c);
 return MATCH_ERROR;
which frees the memory. To avoid double freeing, a possibly pre-existing
'gfc_current_ns->omp_assumes' has to be set to NULL.

The other question is whether the spec is clear, e.g. is the following valid?
 !$omp assumes no_openmp
 !$omp assumes no_openmp
In each directive, no_openmp 

Re: [COMMITTED] Convert nonzero mask in irange to wide_int.

2022-10-04 Thread Aldy Hernandez via Gcc-patches
On Tue, Oct 4, 2022, 13:28 Aldy Hernandez  wrote:

> On Tue, Oct 4, 2022 at 9:55 AM Richard Biener
>  wrote:
> >
> >
> > On 04.10.2022 at 09:36, Aldy Hernandez via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > The reason the nonzero mask was kept in a tree was basically inertia,
> > > as everything in irange is a tree.  However, there's no need to keep
> > > it in a tree, as the conversions to and from wide ints are very
> > > annoying.  That, plus special casing NULL masks to be -1 is prone
> > > to error.
> > >
> > > I have not only rewritten all the uses to assume a wide int, but
> > > have corrected a few places where we weren't propagating the masks, or
> > > rather pessimizing them to -1.  This will become more important in
> > > upcoming patches where we make better use of the masks.
> > >
> > > Performance testing shows a trivial improvement in VRP, as things like
> > > irange::contains_p() are tied to a tree.  Ughh, can't wait for trees in
> > > iranges to go away.
> >
> > You want trailing wide int storage though.  A wide_int is quite large.
>
> Absolutely, this is only for short term storage.  Any time we need
> long term storage, say global ranges in SSA_NAME_RANGE_INFO, we go
> through vrange_storage which will stream things in a more memory
> efficient manner.  For irange, vrange_storage will stream all the
> sub-ranges, including the nonzero bitmask which is the first entry in
> such storage, as trailing_wide_ints.
>
> See irange_storage_slot to see how it lives in GC memory.
>

That being said, the ranger's internal cache uses iranges, albeit with a
squished down number of subranges (the minimum amount to represent the
range).  So each cache entry will now be bigger by the difference between
one tree and one wide int.

I wonder if we should change the cache to use vrange_storage. If not now,
then when we convert all the subranges to wide ints.

Of course, the memory pressure of the cache is not nearly as problematic as
SSA_NAME_RANGE_INFO. The cache only stores names it cares about.

Aldy


Re: [PATCH] libstdc++: Implement ranges::join_with_view from P2441R2

2022-10-04 Thread Jonathan Wakely via Gcc-patches
On Tue, 4 Oct 2022 at 02:11, Patrick Palka via Libstdc++
 wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?  FWIW using

OK, thanks.

> variant<_PatternIter, _InnerIter> in the implementation means we need to
> include  from , which increases the preprocessed size
> of  by 3% (51.5k vs 53k).  I suppose that's an acceptable cost?

Yeah, I don't think we want to reimplement a lightweight std::variant,
because that would just add even more code.

As I mentioned on IRC, maybe we could optimize the compilation time
for some of the visitation using P2637R0, but that can be done later.



[PATCH] cselib: Skip BImode while keeping track of subvalue relations [PR107088]

2022-10-04 Thread Stefan Schulze Frielinghaus via Gcc-patches
For BImode, get_narrowest_mode evaluates to QImode, but BImode < QImode.
Thus FOR_EACH_MODE_UNTIL never reaches BImode and keeps iterating until OImode,
for which no wider mode exists, so we end up with VOIDmode and fail.
Fixed by adding a size guard so that we effectively skip BImode.

Bootstrap and regtest are currently running on x64.  Assuming they pass
ok for mainline?
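
To illustrate the failure mode described above, here is a minimal sketch on a
toy model.  The enum, the wider-mode chain and all function names are
assumptions made up for this example; this is not GCC's machine_mode API nor
the real FOR_EACH_MODE_UNTIL macro, only a loop with the same shape.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy stand-ins for a few scalar integer modes; this is not GCC's machine_mode.
enum class Mode { BI, QI, HI, SI, DI, TI, OI, VOID };

// Hypothetical wider-mode chain: it starts at QI, and OI has no wider mode,
// which is modelled here by returning VOID.
Mode wider(Mode m) {
  switch (m) {
    case Mode::QI: return Mode::HI;
    case Mode::HI: return Mode::SI;
    case Mode::SI: return Mode::DI;
    case Mode::DI: return Mode::TI;
    case Mode::TI: return Mode::OI;
    default:       return Mode::VOID;
  }
}

// Size in bytes in this toy model; BI and QI both occupy one byte.
int byte_size(Mode m) {
  switch (m) {
    case Mode::BI: case Mode::QI: return 1;
    case Mode::HI: return 2;
    case Mode::SI: return 4;
    case Mode::DI: return 8;
    case Mode::TI: return 16;
    case Mode::OI: return 32;
    default:       return 0;
  }
}

// Walk from the narrowest mode (QI) up to, but not including, `last`.
// Returns std::nullopt when the walk runs off the end of the chain first.
std::optional<std::vector<Mode>> modes_below(Mode last) {
  std::vector<Mode> visited;
  for (Mode m = Mode::QI; m != last; m = wider(m)) {
    if (m == Mode::VOID) return std::nullopt;
    visited.push_back(m);
  }
  return visited;
}

// Guard in the spirit of the patch: only walk for modes larger than one byte.
bool should_track_subvalues(Mode m) {
  assert(m != Mode::VOID);
  return byte_size(m) > 1;
}
```

In this model, modes_below(Mode::BI) never meets its end marker and reports
failure, while should_track_subvalues(Mode::BI) is false, so a caller that
checks the guard first would not start the walk at all.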

gcc/ChangeLog:

PR rtl-optimization/107088
* cselib.cc (new_cselib_val): Skip BImode while keeping track of
subvalue relations.
---
 gcc/cselib.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/cselib.cc b/gcc/cselib.cc
index 9b582e5d3d6..2abc763a3f8 100644
--- a/gcc/cselib.cc
+++ b/gcc/cselib.cc
@@ -1571,6 +1571,7 @@ new_cselib_val (unsigned int hash, machine_mode mode, rtx x)
 
   scalar_int_mode int_mode;
   if (REG_P (x) && is_int_mode (mode, &int_mode)
+  && GET_MODE_SIZE (int_mode) > 1
   && REG_VALUES (REGNO (x)) != NULL
   && (!cselib_current_insn || !DEBUG_INSN_P (cselib_current_insn)))
 {
-- 
2.37.3



Re: [COMMITTED] Convert nonzero mask in irange to wide_int.

2022-10-04 Thread Aldy Hernandez via Gcc-patches
On Tue, Oct 4, 2022 at 9:55 AM Richard Biener
 wrote:
>
>
> On 04.10.2022 at 09:36, Aldy Hernandez via Gcc-patches wrote:
> >
> > The reason the nonzero mask was kept in a tree was basically inertia,
> > as everything in irange is a tree.  However, there's no need to keep
> > it in a tree, as the conversions to and from wide ints are very
> > annoying.  That, plus special casing NULL masks to be -1 is prone
> > to error.
> >
> > I have not only rewritten all the uses to assume a wide int, but
> > have corrected a few places where we weren't propagating the masks, or
> > rather pessimizing them to -1.  This will become more important in
> > upcoming patches where we make better use of the masks.
> >
> > Performance testing shows a trivial improvement in VRP, as things like
> > irange::contains_p() are tied to a tree.  Ughh, can't wait for trees in
> > iranges to go away.
>
> You want trailing wide int storage though.  A wide_int is quite large.

Absolutely, this is only for short term storage.  Any time we need
long term storage, say global ranges in SSA_NAME_RANGE_INFO, we go
through vrange_storage which will stream things in a more memory
efficient manner.  For irange, vrange_storage will stream all the
sub-ranges, including the nonzero bitmask which is the first entry in
such storage, as trailing_wide_ints.

See irange_storage_slot to see how it lives in GC memory.

Aldy



Re: [PATCH][AArch64] Implement ACLE Data Intrinsics

2022-10-04 Thread Andre Vieira (lists) via Gcc-patches

Hi all,

Can I backport this to the gcc-11 branch? It also applies cleanly (the only
difference being the file extension: 'aarch64-builtins.cc' vs
'aarch64-builtins.c').


Bootstrapped and regression tested on aarch64-linux-gnu.

Kind regards,
Andre Vieira


Re: [PATCH] c++, c, v2: Implement C++23 P1774R8 - Portable assumptions [PR106654]

2022-10-04 Thread Jakub Jelinek via Gcc-patches
On Mon, Oct 03, 2022 at 09:22:42PM +0200, Jakub Jelinek via Gcc-patches wrote:
> Here is a lightly tested updated patch which I'll bootstrap/regtest tonight.

Bootstrap/regtest passed on both x86_64-linux and i686-linux.

Jakub



Re: [Patch] Fortran: Add OpenMP's assume(s) directives

2022-10-04 Thread Jakub Jelinek via Gcc-patches
On Sun, Oct 02, 2022 at 07:47:18PM +0200, Tobias Burnus wrote:
> gcc/ChangeLog:
> 
>   * doc/invoke.texi (-fopenmp-simd): Document that also 'assume'
>   is enabled.
> 
> libgomp/ChangeLog:
> 
>   * libgomp.texi (OpenMP 5.1 Impl. Status): Mark 'assume' as 'Y'.
> 
> gcc/fortran/ChangeLog:
> 
>   * dump-parse-tree.cc (show_omp_assumes): New.
>   (show_omp_clauses, show_namespace): Call it.
>   (show_omp_node, show_code_node): Handle OpenMP ASSUME.
>   * gfortran.h (enum gfc_statement): Add ST_OMP_ASSUME,
>   ST_OMP_END_ASSUME and ST_OMP_ASSUMES.
>   (gfc_exec_op): Add EXEC_OMP_ASSUME.
>   (gfc_omp_assumptions): New struct.
>   (gfc_get_omp_assumptions): New XCNEW #define.
>   (gfc_omp_clauses, gfc_namespace): Add assume member.
>   (gfc_resolve_omp_assumptions): New prototype.
>   * match.h (gfc_match_omp_assume, gfc_match_omp_assumes): New.
>   * openmp.cc (omp_code_to_statement): Declare.
>   (gfc_free_omp_clauses): Free assume member and its struct data.
>   (enum omp_mask2): Add OMP_CLAUSE_ASSUMPTIONS.
>   (gfc_omp_absent_contains_clause): New.
>   (gfc_match_omp_clauses): Call it; optionally use passed
>   omp_clauses argument.
>   (gfc_match_omp_assume, gfc_match_omp_assumes): New.
>   (gfc_resolve_omp_assumptions): New.
>   (resolve_omp_clauses): Call it.
>   (gfc_resolve_omp_directive, omp_code_to_statement): Handle
>   EXEC_OMP_ASSUME.
>   * parse.cc (decode_omp_directive): Parse OpenMP ASSUME(S).
>   (next_statement, parse_executable, parse_omp_structured_block):
>   Handle ST_OMP_ASSUME.
>   (case_omp_decl): Add ST_OMP_ASSUMES.
>   (gfc_ascii_statement): Handle Assumes, optional return
>   string without '!$OMP '/'!$ACC ' prefix.
>   (is_omp_declarative_stmt, is_omp_informational_stmt): New.
>   * parse.h (gfc_ascii_statement): Add optional bool arg to prototype.
>   (is_omp_declarative_stmt, is_omp_informational_stmt): New prototype.
>   * resolve.cc (gfc_resolve_blocks, gfc_resolve_code): Add
>   EXEC_OMP_ASSUME.
>   (gfc_resolve): Resolve ASSUMES directive.
>   * symbol.cc (gfc_free_namespace): Free omp_assumes member.
>   * st.cc (gfc_free_statement): Handle EXEC_OMP_ASSUME.
>   * trans-openmp.cc (gfc_trans_omp_directive): Likewise.
>   * trans.cc (trans_code): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gfortran.dg/gomp/assume-1.f90: New test.
>   * gfortran.dg/gomp/assume-2.f90: New test.
>   * gfortran.dg/gomp/assumes-1.f90: New test.
>   * gfortran.dg/gomp/assumes-2.f90: New test.

> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -2749,9 +2749,9 @@ have support for @option{-pthread}. @option{-fopenmp} implies
>  @opindex fopenmp-simd
>  @cindex OpenMP SIMD
>  @cindex SIMD
> -Enable handling of OpenMP's SIMD directives with @code{#pragma omp}
> -in C/C++ and @code{!$omp} in Fortran. Other OpenMP directives
> -are ignored.
> +Enable handling of OpenMP's SIMD directives and OPENMP's @code{assume} directive

s/OPENMP/OpenMP/

We actually handle more directives, @code{declare reduction},
@code{ordered}, @code{scan}, @code{loop} and combined/composite directives
with @code{simd} as constituent.

> +with @code{#pragma omp} in C/C++ and @code{!$omp} in Fortran.  Other OpenMP
> +directives are ignored.

And now in C++ we handle also the attribute syntax (guess we should update
the text for that here as well as in -fopenmp entry).
> @@ -3531,6 +3565,14 @@ show_namespace (gfc_namespace *ns)
>   }
>  }
>  
> +  if (ns->omp_assumes)
> +{
> +  show_indent ();
> +  fprintf (dumpfile, "!$OMP ASSUMES");
> +  show_omp_assumes (ns->omp_assumes);
> +}
> +
> +

Just one empty line?

>fputc ('\n', dumpfile);
>show_indent ();
>fputs ("code:", dumpfile);
> diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
> index 4babd77924b..29a443dcd44 100644
> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -316,7 +316,7 @@ enum gfc_statement
>ST_OMP_END_PARALLEL_MASKED_TASKLOOP_SIMD, ST_OMP_MASKED_TASKLOOP,
>ST_OMP_END_MASKED_TASKLOOP, ST_OMP_MASKED_TASKLOOP_SIMD,
>ST_OMP_END_MASKED_TASKLOOP_SIMD, ST_OMP_SCOPE, ST_OMP_END_SCOPE,
> -  ST_OMP_ERROR, ST_NONE
> +  ST_OMP_ERROR, ST_OMP_ASSUME, ST_OMP_END_ASSUME, ST_OMP_ASSUMES, ST_NONE
>  };
>  
>  /* Types of interfaces that we can have.  Assignment interfaces are
> @@ -1506,6 +1506,19 @@ enum gfc_omp_bind_type
>OMP_BIND_THREAD
>  };
>  
> +typedef struct gfc_omp_assumptions
> +{
> +  int n_absent, n_contains;
> +  enum gfc_statement *absent, *contains;
> +  gfc_expr_list *holds;
> +  locus where;
> +  bool no_openmp:1, no_openmp_routines:1, no_parallelism:1;
> +}
> +gfc_omp_assumptions;
> +
> +#define gfc_get_omp_assumptions() XCNEW (gfc_omp_assumptions)
> +
> +
>  typedef struct gfc_omp_clauses
>  {
>gfc_omp_namelist *lists[OMP_LIST_NUM];
> @@ -1529,6 +1542,7 @@ typedef struct 

Re: [patch] install.texi: gcn - update llvm reqirements, gcn/nvptx - newlib use version

2022-10-04 Thread Tobias Burnus

On 30.09.22 10:00, Tobias Burnus wrote:

That's for https://gcc.gnu.org/install/specific.html
...
   * doc/install.texi (Specific): Add missing items to bullet list.
   (amdgcn): Update LLVM requirements, use version not date for newlib.
   (nvptx): Use version not git hash for newlib.


Now committed as obvious (+ Andrew's LGTM on IRC) as 
https://gcc.gnu.org/r13-3056-ge886ebd17965d78f609b62479f4f48085108389c

Thanks,

Tobias




Re: [PATCH 1/2] Allow subtarget customization of CC1_SPEC

2022-10-04 Thread Sebastian Huber

On 08/09/2022 07:33, Sebastian Huber wrote:

On 04/08/2022 15:02, Sebastian Huber wrote:

On 22/07/2022 15:02, Sebastian Huber wrote:

gcc/ChangeLog:

* gcc.cc (SUBTARGET_CC1_SPEC): Define if not defined.
(CC1_SPEC): Define to SUBTARGET_CC1_SPEC.
* config/arm/arm.h (CC1_SPEC): Remove.
* config/arc/arc.h (CC1_SPEC): Append SUBTARGET_CC1_SPEC.
* config/cris/cris.h (CC1_SPEC): Likewise.
* config/frv/frv.h (CC1_SPEC): Likewise.
* config/i386/i386.h (CC1_SPEC): Likewise.
* config/ia64/ia64.h (CC1_SPEC): Likewise.
* config/lm32/lm32.h (CC1_SPEC): Likewise.
* config/m32r/m32r.h (CC1_SPEC): Likewise.
* config/mcore/mcore.h (CC1_SPEC): Likewise.
* config/microblaze/microblaze.h: Likewise.
* config/nds32/nds32.h (CC1_SPEC): Likewise.
* config/nios2/nios2.h (CC1_SPEC): Likewise.
* config/pa/pa.h (CC1_SPEC): Likewise.
* config/rs6000/sysv4.h (CC1_SPEC): Likewise.
* config/rx/rx.h (CC1_SPEC): Likewise.
* config/sparc/sparc.h (CC1_SPEC): Likewise.


Could someone please have a look at this patch set?


Ping.


Would someone mind having a look at this patch set? If there is a better 
approach to customize the default TLS model, then please let me know.


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08



[PATCH] attribs: Add missing auto_diagnostic_group 3 times

2022-10-04 Thread Jakub Jelinek via Gcc-patches
Hi!

In these spots, the error/error_at call is followed by inform calls that
are explanatory parts of the same diagnostic, so they should be tied
together with an auto_diagnostic_group.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
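
As an illustration of the grouping idea only, here is a small sketch of a
scope guard that ties an error and its follow-up notes together.  The
Reporter, Record and Group names are hypothetical and invented for this
example; this is not GCC's diagnostic machinery, just the same RAII shape.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// One emitted message plus the id of the group it was assigned to.
struct Record {
  int group;
  std::string kind;
  std::string text;
};

class Reporter {
 public:
  // Scope guard: every message emitted while at least one Group is alive
  // shares the group id chosen when the outermost Group was created.
  class Group {
   public:
    explicit Group(Reporter &r) : reporter_(r) {
      if (reporter_.depth_++ == 0)
        reporter_.current_group_ = ++reporter_.last_group_;
    }
    ~Group() { --reporter_.depth_; }
    Group(const Group &) = delete;
    Group &operator=(const Group &) = delete;

   private:
    Reporter &reporter_;
  };

  // Outside of any Group, each message gets a fresh group id of its own.
  void emit(std::string kind, std::string text) {
    const int group = depth_ > 0 ? current_group_ : ++last_group_;
    records_.push_back({group, std::move(kind), std::move(text)});
  }

  const std::vector<Record> &records() const { return records_; }

 private:
  int depth_ = 0;
  int current_group_ = 0;
  int last_group_ = 0;
  std::vector<Record> records_;
};

// Usage: the first error/note pair is unrelated as far as a consumer can
// tell, the second pair is tied together by the guard.
void grouping_demo() {
  Reporter r;
  r.emit("error", "wrong number of arguments");
  r.emit("note", "expected at most 1");
  {
    Reporter::Group g(r);
    r.emit("error", "wrong number of arguments");
    r.emit("note", "expected at most 1");
  }
  assert(r.records()[0].group != r.records()[1].group);
  assert(r.records()[2].group == r.records()[3].group);
}
```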

2022-10-04  Jakub Jelinek  

* attribs.cc (handle_ignored_attributes_option, decl_attributes,
common_function_versions): Use auto_diagnostic_group.

--- gcc/attribs.cc.jj   2022-10-03 18:31:16.130032566 +0200
+++ gcc/attribs.cc  2022-10-03 21:27:02.364104260 +0200
@@ -251,6 +251,7 @@ handle_ignored_attributes_option (vec or %");
  continue;
@@ -732,6 +733,7 @@ decl_attributes (tree *node, tree attrib
  || (spec->max_length >= 0
  && nargs > spec->max_length))
{
+ auto_diagnostic_group d;
  error ("wrong number of arguments specified for %qE attribute",
 name);
  if (spec->max_length < 0)
@@ -1167,6 +1169,7 @@ common_function_versions (tree fn1, tree
  std::swap (fn1, fn2);
  attr1 = attr2;
}
+ auto_diagnostic_group d;
  error_at (DECL_SOURCE_LOCATION (fn2),
"missing % attribute for multi-versioned %qD",
fn2);

Jakub



Re: [PATCH] aarch64: fix off-by-one in reading cpuinfo

2022-10-04 Thread Richard Sandiford via Gcc-patches
Philipp Tomsich  writes:
> Fixes: 341573406b39
>
> Don't subtract one from the result of strnlen() when trying to point
> to the first character after the current string.  This issue would
> cause individual characters (where the 128 byte buffers are stitched
> together) to be lost.
>
> gcc/ChangeLog:
>
>   * config/aarch64/driver-aarch64.cc (readline): Fix off-by-one.
>
> Signed-off-by: Philipp Tomsich 

Thanks for the patch.  Would it be possible to create a testcase along
the lines of gcc.target/aarch64/cpunative/native_cpu_15.c?

Richard

> ---
>
>  gcc/config/aarch64/driver-aarch64.cc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/aarch64/driver-aarch64.cc b/gcc/config/aarch64/driver-aarch64.cc
> index 52ff537908e..48250e68034 100644
> --- a/gcc/config/aarch64/driver-aarch64.cc
> +++ b/gcc/config/aarch64/driver-aarch64.cc
> @@ -203,9 +203,9 @@ readline (FILE *f)
>   return std::string ();
>/* If we're not at the end of the line then override the
>\0 added by fgets.  */
> -  last = strnlen (buf, size) - 1;
> +  last = strnlen (buf, size);
>  }
> -  while (!feof (f) && buf[last] != '\n');
> +  while (!feof (f) && (last > 0 && buf[last - 1] != '\n'));
>  
>std::string result (buf);
>free (buf);
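
The off-by-one described in the patch can be reproduced outside of GCC with a
small sketch.  The function below is not the driver's readline; its name, the
std::vector buffer and the subtract_one switch are assumptions made for this
example.  It reads a line in fixed-size chunks via fgets and stitches them
together, optionally placing the next chunk one byte too early, as the
pre-patch code did.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Read one line from F in chunks of `chunk` bytes (including fgets's NUL).
// With subtract_one == true the next chunk is written starting at the last
// character of the previous one, which overwrites that character.
std::string read_line_chunked(std::FILE *f, int chunk, bool subtract_one) {
  assert(f != nullptr && chunk >= 2);
  std::vector<char> buf;   // grows by `chunk` bytes per iteration, zero-filled
  std::size_t last = 0;    // where the next fgets call starts writing
  std::size_t len = 0;     // length of the string stitched together so far
  do {
    buf.resize(buf.size() + static_cast<std::size_t>(chunk));
    if (!std::fgets(buf.data() + last, chunk, f))
      break;               // EOF or error: keep whatever was read so far
    len = std::strlen(buf.data());
    last = (subtract_one && len > 0) ? len - 1 : len;
  } while (!std::feof(f) && len > 0 && buf[len - 1] != '\n');
  return std::string(buf.data());
}

// Helper for experiments: write `content` to a temporary file and return
// its first line as seen by read_line_chunked.
std::string first_line_of(const std::string &content, int chunk,
                          bool subtract_one) {
  std::FILE *f = std::tmpfile();
  assert(f != nullptr);
  std::fputs(content.c_str(), f);
  std::rewind(f);
  std::string line = read_line_chunked(f, chunk, subtract_one);
  std::fclose(f);
  return line;
}
```

With a chunk size of 8 and the 20-character line "0123456789abcdefghij", the
subtract_one variant drops one character at each of the two chunk boundaries
('6' and 'd'), while the other variant returns the line unchanged.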


Re: [patch] Fix thinko in powerpc default specs for -mabi

2022-10-04 Thread Olivier Hainque via Gcc-patches
Hi Segher,

> On 3 Oct 2022, at 18:13, Segher Boessenkool  
> wrote:
> 
> -mabi= does two separate things, unfortunately.
> 
> First, you can use it to set the base ABI: elfv1, elfv2.  But you can
> also use it to set ABI variants, ABI options: -mabi={no-,}altivec,
> -mabi={ieee,ibm}longdouble, -mabi=vec-{extabi,default}.  Things in that
> latter category are completely orthogonal to anything else (except that
> some only make sense together with some base ABIs).

Ooh, I see. I understood there were ABI-related options internally
(this is quite visible throughout the code, of course) but it didn't click
what all the implications were for the -mabi command line interface.

> Base ABI is not selectable for most, it is implied by your target
> triple.  -mabi=elfv[12] only makes sense for targets that have either of
> the two by default.

Yes, I can see that now.

>> We have been using this for about a year now in gcc-11 based toolchains.
>> This helps our dejagnu testsuite runs for VxWorks on powerpc and 
>> hasn't produced any ill side effect to date.
> 
> So what exactly is this meant to do?

Biased by other ports, the presence of multiple -mabi switches
just seemed wrong on its own, so the first level motivation was
simply to address that. There might have been interactions with
another change in what we observed at the time.

I understand now that multiple -mabi on the command line is not
a problem per se, so:

> But it does not seem correct.  -mabi=optionA should not override the
> -mabi=optionB set in --with-abi=, where A and B are independent, nor
> should it override the base ABI.

Agreed!

Thanks a lot for your constructive feedback, much appreciated.

With Kind Regards,

Olivier



[PATCH] middle-end, c++, i386, libgcc: std::bfloat16_t and __bf16 arithmetic support

2022-10-04 Thread Jakub Jelinek via Gcc-patches
On Fri, Sep 30, 2022 at 04:08:10PM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Fri, Sep 30, 2022 at 09:49:08AM -0400, Jason Merrill wrote:
> > The comment from Apple on the ABI mangling proposal suggests to me that we
> > might want to delay enabling C++ std::bfloat16_t (i.e. defining
> > __STDCPP_BFLOAT16_T__) until we have that excess precision support?
> 
> I saw that comment.  We have similar problem with _Float16 too, where C++
> effectively right now works as when one uses -fexcess-precision=16 in C
> (which isn't default).
> I can see how hard it would be to add EXCESS_PRECISION_EXPR support to C++
> FE.

I've started on that but it will take some time.  That said, it should
work, though less efficiently, even without that; even in C, users can always
request such behavior with -fexcess-precision=16.

> > If we're using DF32x for _Float32x, maybe we want DF16b for bfloat16?
> 
> Perhaps, I just followed what was in the pull request.  Can change it.

Changed now, added support for the builtins and ported most of the
float16 tests, so that it gets at least some test coverage.
Also, for now I've left the aarch64 and arm changes out of the patch,
because I haven't tested it on aarch64 yet and arm support was incomplete
and I haven't heard from the ARM maintainers yet what they want or don't
want.

The added testcases showed a few problems.  One is that i?86 maintains
2 kinds of fp comparisons, trivial and non-trivial, the trivial which can
be handled by just a single conditional jump or setCC are handled directly,
while the complex ones which need two are not handled and the generic
code then figures it out using the trivial ones.  Unfortunately this means
that for == and != we end up with libcalls for it.  For _Float16, we have
added __nehf2 and __eqhf2 entrypoints last year.  I wanted to avoid doing
the same for __bf16, so I've added cbranchbf4 and cstorebf4 expanders
that handle all fp comparisons and internally just shift the operands up
to construct SFmode without even handling sNaNs and then call the generic
code to handle SFmode comparisons.
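
As a rough illustration of the "shift the operands up" idea, here is a sketch
in plain C++ rather than in RTL expanders.  It assumes float is IEEE binary32
and treats a bfloat16 value as its raw 16-bit pattern; the function names are
made up for this example and the sketch makes no attempt to model sNaN
handling.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// A bfloat16 bit pattern is the upper half of a binary32 bit pattern, so
// widening is a 16-bit left shift followed by a bit-for-bit reinterpretation.
float bf16_bits_to_float(std::uint16_t bits) {
  const std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float value;
  std::memcpy(&value, &wide, sizeof value);
  return value;
}

// Comparisons are then done on the widened values.
bool bf16_equal(std::uint16_t a, std::uint16_t b) {
  return bf16_bits_to_float(a) == bf16_bits_to_float(b);
}

bool bf16_less(std::uint16_t a, std::uint16_t b) {
  return bf16_bits_to_float(a) < bf16_bits_to_float(b);
}

// Usage example with a few well-known bit patterns.
void bf16_demo() {
  assert(bf16_bits_to_float(0x3F80) == 1.0f);  // 0x3F800000 is 1.0f
  assert(bf16_less(0x3F80, 0x4000));           // 1.0 < 2.0
  assert(bf16_equal(0x0000, 0x8000));          // +0.0 == -0.0
  assert(!bf16_equal(0x7FC0, 0x7FC0));         // NaN is unequal to itself
}
```

Because the widening is exact, comparing the widened values gives the same
answer as comparing the original bfloat16 values would.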

Another problem is with HFmode comparisons: when we see that some HFmode
comparison isn't supported directly, we iterate over wider scalar float modes
and look for usable comparisons, but BFmode and HFmode are unordered.
One of them has to appear as wider, yet neither is a subset nor a superset of
the other, so I had to skip wider modes which have equal precision to the
starting one.

Yet another problem is that I've only enabled the bf16/BF16 suffixes in
C++, because for C they might clash with some later extension.  Am I right to
worry about that, or do you think C will never standardize suffixes that
would clash with that because C++ standardized the bf16/BF16 suffixes for
something already?  If I could enable it, I'd always pedwarn for C for those
and could enable the __BF16_*__ macros.  Right now I had to disable some
-fbuilding-libgcc macros because of that (though nothing really uses them
right now).

Another question is the suffixes of the builtins.  For now I have added
bf16 suffix and enabled the builtins with !both_p, so one always needs to
use __builtin_* form for them.  None of the GCC builtins end with b,
so this isn't ambiguous with __builtin_*f16, but some libm functions do end
with b, in particular ilogb, logb and f{??,??x}sub.  ilogb and the subs
always have it, but is __builtin_logbf16 f16 suffixed logb or bf16 suffixed
log?  Shall the builtins use f16b suffixes instead like the mangling does?
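
The ambiguity can be made concrete with a small sketch.  The helper below is
hypothetical and only enumerates the ways a builtin name splits into a libm
function name plus a type suffix; the function and suffix lists passed to it
are example inputs, not the sets GCC actually uses.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Return every (function, suffix) pair whose concatenation equals `name`.
std::vector<std::pair<std::string, std::string>>
split_builtin_name(const std::string &name,
                   const std::vector<std::string> &functions,
                   const std::vector<std::string> &suffixes) {
  std::vector<std::pair<std::string, std::string>> result;
  for (const auto &fn : functions)
    for (const auto &suffix : suffixes)
      if (fn + suffix == name)
        result.emplace_back(fn, suffix);
  return result;
}

// Usage: "logbf16" has two readings with an f16/bf16 suffix pair, but only
// one reading with an f16/f16b suffix pair.
void suffix_demo() {
  const std::vector<std::string> fns = {"log", "logb", "ilogb"};
  assert(split_builtin_name("logbf16", fns, {"f16", "bf16"}).size() == 2);
  assert(split_builtin_name("logbf16", fns, {"f16", "f16b"}).size() == 1);
}
```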

Full patch bootstrapped/regtested on x86_64-linux and i686-linux.

2022-10-04  Jakub Jelinek  

gcc/
* tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
* tree.h (bfloat16_type_node): Define.
(CASE_FLT_FN_FLOATN_NX): Also include BUILT_IN_*BF16.
* tree.cc (excess_precision_type): Promote bfloat16_type_mode
like float16_type_mode.
(build_common_tree_nodes): Initialize bfloat16_type_node if
BFmode is supported.
* expmed.h (maybe_expand_shift): Declare.
* expmed.cc (maybe_expand_shift): No longer static.
(emit_store_flag_1): Don't consider [BH]Fmode as wider mode to
narrower modes.
* expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
conversions.  If there is no optab, handle BF -> {DF,XF,TF,HF}
conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
-ffast-math generic implementation for BF -> SF and SF -> BF
conversions.
* builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16,
BT_FN_BFLOAT16_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING,
BT_FN_BFLOAT16_BFLOAT16_BFLOAT16,
BT_FN_BFLOAT16_BFLOAT16_BFLOAT16_BFLOAT16): New.
* builtins.def (DEF_GCC_FLOATN_NX_BUILTINS,
DEF_EXT_LIB_FLOATN_NX_BUILTINS): Also add *bf16 suffixed builtins,
but for these only __builtin_ prefixed functions.
* optabs.cc (can_compare_p, prepare_cmp_insn): Don't consider
[BH]Fmode as 

Re: Adding a new thread model to GCC

2022-10-04 Thread LIU Hao via Gcc-patches

On 2022-10-03 13:03, Bernhard Reutner-Fischer wrote:


No, sorry for my brevity.
Using __gthread_t like in your patch is correct.



I see. In 'libgfortran/io/async.c' there is

  ```
async_unit *au = u->au;
LOCK (&au->lock);
thread_unit = u;
au->thread = __gthread_self ();
  ```

so indeed `thread` should be `__gthread_t`. By the way I reported this issue four months ago and 
haven't received any response so far:


  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105764


--
Best regards,
LIU Hao





Re: [COMMITTED] Convert nonzero mask in irange to wide_int.

2022-10-04 Thread Richard Biener via Gcc-patches


On 04.10.2022 at 09:36, Aldy Hernandez via Gcc-patches wrote:
> 
> The reason the nonzero mask was kept in a tree was basically inertia,
> as everything in irange is a tree.  However, there's no need to keep
> it in a tree, as the conversions to and from wide ints are very
> annoying.  That, plus special casing NULL masks to be -1 is prone
> to error.
> 
> I have not only rewritten all the uses to assume a wide int, but
> have corrected a few places where we weren't propagating the masks, or
> rather pessimizing them to -1.  This will become more important in
> upcoming patches where we make better use of the masks.
> 
> Performance testing shows a trivial improvement in VRP, as things like
> irange::contains_p() are tied to a tree.  Ughh, can't wait for trees in
> iranges to go away.

You want trailing wide int storage though.  A wide_int is quite large.

> gcc/ChangeLog:
> 
>* value-range-storage.cc (irange_storage_slot::set_irange): Remove
>special case.
>* value-range.cc (irange::irange_set): Adjust for nonzero mask
>being a wide int.
>(irange::irange_set_anti_range): Same.
>(irange::set): Same.
>(irange::verify_range): Same.
>(irange::legacy_equal_p): Same.
>(irange::operator==): Same.
>(irange::contains_p): Same.
>(irange::legacy_intersect): Same.
>(irange::legacy_union): Same.
>(irange::irange_single_pair_union): Call union_nonzero_bits.
>(irange::irange_union): Same.
>(irange::irange_intersect): Call intersect_nonzero_bits.
>(irange::intersect): Adjust for nonzero mask being a wide int.
>(irange::invert): Same.
>(irange::set_nonzero_bits): Same.
>(irange::get_nonzero_bits_from_range): New.
>(irange::set_range_from_nonzero_bits): New.
>(irange::get_nonzero_bits): Adjust for nonzero mask being a wide
>int.
>(irange::intersect_nonzero_bits): Same.
>(irange::union_nonzero_bits): Same.
>(range_tests_nonzero_bits): Remove test.
>* value-range.h (irange::varying_compatible_p): Adjust for nonzero
>mask being a wide int.
>(gt_ggc_mx): Same.
>(gt_pch_nx): Same.
>(irange::set_undefined): Same.
>(irange::set_varying): Same.
>(irange::normalize_kind): Same.
> ---
> gcc/value-range-storage.cc |   6 +-
> gcc/value-range.cc | 270 -
> gcc/value-range.h  |  25 ++--
> 3 files changed, 130 insertions(+), 171 deletions(-)
> 
> diff --git a/gcc/value-range-storage.cc b/gcc/value-range-storage.cc
> index de7575ed48d..6e054622830 100644
> --- a/gcc/value-range-storage.cc
> +++ b/gcc/value-range-storage.cc
> @@ -150,11 +150,7 @@ irange_storage_slot::set_irange (const irange &r)
> {
>   gcc_checking_assert (fits_p (r));
> 
> -  // Avoid calling unsupported get_nonzero_bits on legacy.
> -  if (r.legacy_mode_p ())
> -m_ints[0] = -1;
> -  else
> -m_ints[0] = r.get_nonzero_bits ();
> +  m_ints[0] = r.get_nonzero_bits ();
> 
>   unsigned pairs = r.num_pairs ();
>   for (unsigned i = 0; i < pairs; ++i)
> diff --git a/gcc/value-range.cc b/gcc/value-range.cc
> index 6e196574de9..afb26a40083 100644
> --- a/gcc/value-range.cc
> +++ b/gcc/value-range.cc
> @@ -940,7 +940,7 @@ irange::irange_set (tree min, tree max)
>   m_base[1] = max;
>   m_num_ranges = 1;
>   m_kind = VR_RANGE;
> -  m_nonzero_mask = NULL;
> +  m_nonzero_mask = wi::shwi (-1, TYPE_PRECISION (TREE_TYPE (min)));
>   normalize_kind ();
> 
>   if (flag_checking)
> @@ -1014,7 +1014,7 @@ irange::irange_set_anti_range (tree min, tree max)
> }
> 
>   m_kind = VR_RANGE;
> -  m_nonzero_mask = NULL;
> +  m_nonzero_mask = wi::shwi (-1, TYPE_PRECISION (TREE_TYPE (min)));
>   normalize_kind ();
> 
>   if (flag_checking)
> @@ -1071,7 +1071,7 @@ irange::set (tree min, tree max, value_range_kind kind)
>   m_base[0] = min;
>   m_base[1] = max;
>   m_num_ranges = 1;
> -  m_nonzero_mask = NULL;
> +  m_nonzero_mask = wi::shwi (-1, TYPE_PRECISION (TREE_TYPE (min)));
>   return;
> }
> 
> @@ -1121,7 +1121,7 @@ irange::set (tree min, tree max, value_range_kind kind)
>   m_base[0] = min;
>   m_base[1] = max;
>   m_num_ranges = 1;
> -  m_nonzero_mask = NULL;
> +  m_nonzero_mask = wi::shwi (-1, TYPE_PRECISION (TREE_TYPE (min)));
>   normalize_kind ();
>   if (flag_checking)
> verify_range ();
> @@ -1136,14 +1136,11 @@ irange::verify_range ()
>   if (m_kind == VR_UNDEFINED)
> {
>   gcc_checking_assert (m_num_ranges == 0);
> -  gcc_checking_assert (!m_nonzero_mask);
>   return;
> }
> -  if (m_nonzero_mask)
> -gcc_checking_assert (wi::to_wide (m_nonzero_mask) != -1);
>   if (m_kind == VR_VARYING)
> {
> -  gcc_checking_assert (!m_nonzero_mask);
> +  gcc_checking_assert (m_nonzero_mask == -1);
>   gcc_checking_assert (m_num_ranges == 1);
>   gcc_checking_assert (varying_compatible_p ());
>   return;
> @@ -1238,7 +1235,7 @@ irange::legacy_equal_p (const irange &other) const
>   other.tree_lower_bound (0))
>  && 

[COMMITTED] Convert nonzero mask in irange to wide_int.

2022-10-04 Thread Aldy Hernandez via Gcc-patches
The reason the nonzero mask was kept in a tree was basically inertia,
as everything in irange is a tree.  However, there's no need to keep
it in a tree, as the conversions to and from wide ints are very
annoying.  That, plus special casing NULL masks to be -1 is prone
to error.

I have not only rewritten all the uses to assume a wide int, but
have corrected a few places where we weren't propagating the masks, or
rather pessimizing them to -1.  This will become more important in
upcoming patches where we make better use of the masks.

Performance testing shows a trivial improvement in VRP, as things like
irange::contains_p() are tied to a tree.  Ughh, can't wait for trees in
iranges to go away.
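
To show why the "NULL means -1" convention is easy to get wrong, here is a
toy comparison.  It uses std::optional<uint64_t> as a stand-in for a nullable
tree and a plain uint64_t as a stand-in for a wide_int; all names are invented
for this sketch and none of this is the irange implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

constexpr std::uint64_t kAllBits = ~std::uint64_t{0};

// Legacy-style mask: "no value" stands for "all bits may be nonzero", and a
// stored value must never be all-ones itself.
using LegacyMask = std::optional<std::uint64_t>;

std::uint64_t legacy_get(const LegacyMask &m) { return m ? *m : kAllBits; }

// Every producer has to remember to map all-ones back to "no value".
LegacyMask legacy_normalize(std::uint64_t bits) {
  if (bits == kAllBits) return std::nullopt;
  return bits;
}

LegacyMask legacy_union(const LegacyMask &a, const LegacyMask &b) {
  if (!a || !b) return std::nullopt;  // all-ones | x is all-ones
  return legacy_normalize(*a | *b);
}

LegacyMask legacy_intersect(const LegacyMask &a, const LegacyMask &b) {
  if (!a) return b;                   // all-ones & x is x
  if (!b) return a;
  return legacy_normalize(*a & *b);
}

// With the mask always stored as a plain integer, the same operations need
// no special cases.
std::uint64_t union_mask(std::uint64_t a, std::uint64_t b) { return a | b; }
std::uint64_t intersect_mask(std::uint64_t a, std::uint64_t b) { return a & b; }

// Usage: both representations agree, but only one needs the extra branches.
void mask_demo() {
  const std::uint64_t low = 0x0F;
  const std::uint64_t rest = ~low;
  assert(!legacy_union(LegacyMask{low}, LegacyMask{rest}).has_value());
  assert(union_mask(low, rest) == kAllBits);
  assert(legacy_get(legacy_intersect(std::nullopt, LegacyMask{low}))
         == intersect_mask(kAllBits, low));
}
```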

gcc/ChangeLog:

* value-range-storage.cc (irange_storage_slot::set_irange): Remove
special case.
* value-range.cc (irange::irange_set): Adjust for nonzero mask
being a wide int.
(irange::irange_set_anti_range): Same.
(irange::set): Same.
(irange::verify_range): Same.
(irange::legacy_equal_p): Same.
(irange::operator==): Same.
(irange::contains_p): Same.
(irange::legacy_intersect): Same.
(irange::legacy_union): Same.
(irange::irange_single_pair_union): Call union_nonzero_bits.
(irange::irange_union): Same.
(irange::irange_intersect): Call intersect_nonzero_bits.
(irange::intersect): Adjust for nonzero mask being a wide int.
(irange::invert): Same.
(irange::set_nonzero_bits): Same.
(irange::get_nonzero_bits_from_range): New.
(irange::set_range_from_nonzero_bits): New.
(irange::get_nonzero_bits): Adjust for nonzero mask being a wide
int.
(irange::intersect_nonzero_bits): Same.
(irange::union_nonzero_bits): Same.
(range_tests_nonzero_bits): Remove test.
* value-range.h (irange::varying_compatible_p): Adjust for nonzero
mask being a wide int.
(gt_ggc_mx): Same.
(gt_pch_nx): Same.
(irange::set_undefined): Same.
(irange::set_varying): Same.
(irange::normalize_kind): Same.
---
 gcc/value-range-storage.cc |   6 +-
 gcc/value-range.cc | 270 -
 gcc/value-range.h  |  25 ++--
 3 files changed, 130 insertions(+), 171 deletions(-)

diff --git a/gcc/value-range-storage.cc b/gcc/value-range-storage.cc
index de7575ed48d..6e054622830 100644
--- a/gcc/value-range-storage.cc
+++ b/gcc/value-range-storage.cc
@@ -150,11 +150,7 @@ irange_storage_slot::set_irange (const irange &r)
 {
   gcc_checking_assert (fits_p (r));
 
-  // Avoid calling unsupported get_nonzero_bits on legacy.
-  if (r.legacy_mode_p ())
-m_ints[0] = -1;
-  else
-m_ints[0] = r.get_nonzero_bits ();
+  m_ints[0] = r.get_nonzero_bits ();
 
   unsigned pairs = r.num_pairs ();
   for (unsigned i = 0; i < pairs; ++i)
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 6e196574de9..afb26a40083 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -940,7 +940,7 @@ irange::irange_set (tree min, tree max)
   m_base[1] = max;
   m_num_ranges = 1;
   m_kind = VR_RANGE;
-  m_nonzero_mask = NULL;
+  m_nonzero_mask = wi::shwi (-1, TYPE_PRECISION (TREE_TYPE (min)));
   normalize_kind ();
 
   if (flag_checking)
@@ -1014,7 +1014,7 @@ irange::irange_set_anti_range (tree min, tree max)
 }
 
   m_kind = VR_RANGE;
-  m_nonzero_mask = NULL;
+  m_nonzero_mask = wi::shwi (-1, TYPE_PRECISION (TREE_TYPE (min)));
   normalize_kind ();
 
   if (flag_checking)
@@ -1071,7 +1071,7 @@ irange::set (tree min, tree max, value_range_kind kind)
   m_base[0] = min;
   m_base[1] = max;
   m_num_ranges = 1;
-  m_nonzero_mask = NULL;
+  m_nonzero_mask = wi::shwi (-1, TYPE_PRECISION (TREE_TYPE (min)));
   return;
 }
 
@@ -1121,7 +1121,7 @@ irange::set (tree min, tree max, value_range_kind kind)
   m_base[0] = min;
   m_base[1] = max;
   m_num_ranges = 1;
-  m_nonzero_mask = NULL;
+  m_nonzero_mask = wi::shwi (-1, TYPE_PRECISION (TREE_TYPE (min)));
   normalize_kind ();
   if (flag_checking)
 verify_range ();
@@ -1136,14 +1136,11 @@ irange::verify_range ()
   if (m_kind == VR_UNDEFINED)
 {
   gcc_checking_assert (m_num_ranges == 0);
-  gcc_checking_assert (!m_nonzero_mask);
   return;
 }
-  if (m_nonzero_mask)
-gcc_checking_assert (wi::to_wide (m_nonzero_mask) != -1);
   if (m_kind == VR_VARYING)
 {
-  gcc_checking_assert (!m_nonzero_mask);
+  gcc_checking_assert (m_nonzero_mask == -1);
   gcc_checking_assert (m_num_ranges == 1);
   gcc_checking_assert (varying_compatible_p ());
   return;
@@ -1238,7 +1235,7 @@ irange::legacy_equal_p (const irange &other) const
   other.tree_lower_bound (0))
  && vrp_operand_equal_p (tree_upper_bound (0),
  other.tree_upper_bound (0))
- && vrp_operand_equal_p (m_nonzero_mask, other.m_nonzero_mask));
+