date:20220419

[PATCH] loongarch: ignore zero-size fields in calling convention

2022-04-19 Thread Xi Ruoyao via Gcc-patches

Currently, the LoongArch ELF psABI is not clear on the handling of
zero-sized fields in aggregate arguments or return values [1].  The
behavior of GCC trunk is puzzling, considering the following cases:

struct test1
{
  double a[0];
  float x;
};

struct test2
{
  float a[0];
  float x;
};

GCC trunk passes test1::x via GPR, but test2::x via FPR.  I believe no
rational Homo Sapiens can understand (or even expect) this.

And, to make things even worse, test1 behaves differently in C and C++.
GCC trunk passes test1::x via GPR, but G++ trunk passes test1::x via
FPR.
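
To make the layout side of this concrete, here is a small sketch in GNU C
(zero-length arrays are a GNU extension, so it assumes GCC or a compatible
compiler).  It only looks at offsets and sizes; it says nothing about which
registers a given ABI picks.  The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The two structures from the message above.  */
struct test1
{
  double a[0];
  float x;
};

struct test2
{
  float a[0];
  float x;
};

/* Hypothetical helper: a zero-length array occupies no storage, so `x'
   sits at offset 0 in both structures.  */
static bool
zero_size_fields_take_no_storage (void)
{
  return offsetof (struct test1, x) == 0
         && offsetof (struct test2, x) == 0;
}

/* Hypothetical helper: bytes of test1 beyond its only real member.  The
   zero-length double array may still raise the structure's alignment,
   so this can be nonzero even though `a' itself is empty.  */
static size_t
toy_test1_tail_padding (void)
{
  assert (sizeof (struct test1) >= sizeof (float));
  return sizeof (struct test1) - sizeof (float);
}
```

Calling both helpers from a small main () on the target of interest shows
that the only storage in either structure is the float; any difference
between the two comes from the element type of the empty array.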

I've written a paragraph about the current GCC behavior for the psABI [2],
but I think it's cleaner to just ignore all zero-sized fields in the ABI.
This will require only a two-line change in GCC (this patch), and a
one-line change in the ABI doc.
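
The idea can be sketched outside of GCC as follows.  This is a toy model
under my own assumptions, not the real loongarch_flatten_aggregate_field:
struct toy_field, toy_flatten and toy_all_float are hypothetical names, and
the "one or two float members" rule is a simplification of the real
classification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical field description; GCC walks FIELD_DECL
   trees instead.  */
struct toy_field
{
  size_t size_in_bits;  /* 0 for `T a[0]' or `int : 0'.  */
  bool is_float;        /* true for float/double element types.  */
};

/* Copy the fields that survive flattening to OUT and return how many
   there are, or -1 if more than MAX_FLATTENED remain.  */
static int
toy_flatten (const struct toy_field *fields, size_t n,
             struct toy_field *out, int max_flattened)
{
  assert (fields != NULL && out != NULL && max_flattened > 0);
  int count = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* Counterpart of the check added by the patch: zero-sized
         fields are skipped entirely.  */
      if (fields[i].size_in_bits == 0)
        continue;
      if (count == max_flattened)
        return -1;
      out[count++] = fields[i];
    }
  return count;
}

/* In this toy model an aggregate is a candidate for floating-point
   registers when one or two members remain and all are floats.  */
static bool
toy_all_float (const struct toy_field *flat, int count)
{
  if (count < 1 || count > 2)
    return false;
  for (int i = 0; i < count; i++)
    if (!flat[i].is_float)
      return false;
  return true;
}
```

With the skip in place, the element type of an empty array can no longer
influence the result, which is the property the new tests check.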

If there is no better idea, I'd like to see this reviewed and
applied ASAP.  If we end up applying this patch after the GCC 12
release, we'll need to add a lot more boring code to emit a -Wpsabi
inform [3].  That would be an unnecessary burden both for us and for
users of the compiler (the compiler would spend CPU time just
checking whether the note should be emitted).

[1]:https://github.com/loongson/LoongArch-Documentation/issues/48
[2]:https://github.com/loongson/LoongArch-Documentation/pull/49
[3]:https://gcc.gnu.org/PR102024

gcc/

* config/loongarch/loongarch.cc
(loongarch_flatten_aggregate_field): Ignore empty fields for
RECORD_TYPE.

gcc/testsuite/

* gcc.target/loongarch/zero-size-field-pass.c: New test.
* gcc.target/loongarch/zero-size-field-ret.c: New test.
---
 gcc/config/loongarch/loongarch.cc |  3 ++
 .../loongarch/zero-size-field-pass.c  | 30 +++
 .../loongarch/zero-size-field-ret.c   | 28 +
 3 files changed, 61 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/zero-size-field-pass.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/zero-size-field-ret.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index f22150a60cc..57e4d9f82ce 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -326,6 +326,9 @@ loongarch_flatten_aggregate_field (const_tree type,
   for (tree f = TYPE_FIELDS (type); f; f = DECL_CHAIN (f))
     if (TREE_CODE (f) == FIELD_DECL)
       {
+        if (DECL_SIZE (f) && integer_zerop (DECL_SIZE (f)))
+          continue;
+
         if (!TYPE_P (TREE_TYPE (f)))
           return -1;
 
diff --git a/gcc/testsuite/gcc.target/loongarch/zero-size-field-pass.c b/gcc/testsuite/gcc.target/loongarch/zero-size-field-pass.c
new file mode 100644
index 000..999dc913a71
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/zero-size-field-pass.c
@@ -0,0 +1,30 @@
+/* Test that LoongArch backend ignores zero-sized fields of aggregates in
+   argument passing.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdouble-float -mabi=lp64d" } */
+/* { dg-final { scan-assembler "\\\$f1" } } */
+
+struct test
+{
+  int empty1[0];
+  double empty2[0];
+  int : 0;
+  float x;
+  long empty3[0];
+  long : 0;
+  float y;
+  unsigned : 0;
+  char empty4[0];
+};
+
+extern void callee (struct test);
+
+void
+caller (void)
+{
+  struct test test;
+  test.x = 114;
+  test.y = 514;
+  callee (test);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/zero-size-field-ret.c b/gcc/testsuite/gcc.target/loongarch/zero-size-field-ret.c
new file mode 100644
index 000..40137d97555
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/zero-size-field-ret.c
@@ -0,0 +1,28 @@
+/* Test that LoongArch backend ignores zero-sized fields of aggregates in
+   returning.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdouble-float -mabi=lp64d" } */
+/* { dg-final { scan-assembler-not "\\\$r4" } } */
+
+struct test
+{
+  int empty1[0];
+  double empty2[0];
+  int : 0;
+  float x;
+  long empty3[0];
+  long : 0;
+  float y;
+  unsigned : 0;
+  char empty4[0];
+};
+
+extern struct test callee (void);
+
+float
+caller (void)
+{
+  struct test test = callee ();
+  return test.x + test.y;
+}
-- 
2.36.0

[PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-19 Thread Jakub Jelinek via Gcc-patches

Hi!

cgraph_node has a semantic_interposition flag which should mirror
opt_for_fn (decl, flag_semantic_interposition).  But it actually is
initialized not from that, but from flag_semantic_interposition in the
  explicit symtab_node (symtab_type t)
: type (t), resolution (LDPR_UNKNOWN), definition (false), alias (false),
...
  semantic_interposition (flag_semantic_interposition),
...
  x_comdat_group (NULL_TREE), x_section (NULL)
  {}
ctor.  I think that might be fine for varpool nodes, but since
flag_semantic_interposition is now implied from -Ofast it isn't correct
for cgraph nodes, unless we guarantee that cgraph node for a particular
function decl is always created while that function is
current_function_decl.  That is often the case, but not always as the
following function shows.
Because symtab_node's ctor doesn't know for which decl the cgraph node
is being created, the following patch keeps that as is, but updates the
flag from opt_for_fn (decl, flag_semantic_interposition) once the decl is
known, and for clones copies the flag from the original node (often it is
then overridden in set_new_clone_decl_and_node_flags, but not always).
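
A toy model of the problem and the fix, in plain C rather than GCC
internals.  Everything prefixed toy_ is hypothetical; the per-decl bool
stands in for what opt_for_fn (decl, flag_semantic_interposition) would
report, and the global stands in for flag_semantic_interposition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the global flag, which reflects the options of whatever
   function happens to be current.  */
static bool toy_global_flag = true;

/* Stand-in for a FUNCTION_DECL carrying its own optimization options.  */
struct toy_decl
{
  const char *name;
  bool semantic_interposition;  /* Per-function value.  */
};

struct toy_node
{
  const struct toy_decl *decl;
  bool semantic_interposition;
};

/* Like the symtab_node ctor: no decl is known yet, so only the global
   flag is available.  */
static struct toy_node
toy_node_init (void)
{
  struct toy_node n = { NULL, toy_global_flag };
  return n;
}

/* Before the patch: the value from the ctor is kept.  */
static struct toy_node
toy_create_old (const struct toy_decl *decl)
{
  assert (decl != NULL);
  struct toy_node n = toy_node_init ();
  n.decl = decl;
  return n;
}

/* After the patch: refresh the flag once the decl is known.  */
static struct toy_node
toy_create_new (const struct toy_decl *decl)
{
  struct toy_node n = toy_create_old (decl);
  n.semantic_interposition = decl->semantic_interposition;
  return n;
}

/* Clones copy the flag from the node they are cloned from.  */
static struct toy_node
toy_create_clone (const struct toy_node *orig,
                  const struct toy_decl *new_decl)
{
  assert (orig != NULL && new_decl != NULL);
  struct toy_node n = toy_node_init ();
  n.decl = new_decl;
  n.semantic_interposition = orig->semantic_interposition;
  return n;
}
```

If the global and the per-decl value disagree at creation time,
toy_create_old records the wrong one while toy_create_new and
toy_create_clone keep the per-decl value.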

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-04-20  Jakub Jelinek  

PR ipa/105306
* cgraph.cc (cgraph_node::create): Set node->semantic_interposition
to opt_for_fn (decl, flag_semantic_interposition).
* cgraphclones.cc (cgraph_node::create_clone): Copy over
semantic_interposition flag.

* g++.dg/opt/pr105306.C: New test.

--- gcc/cgraph.cc.jj2022-02-04 14:36:54.069618372 +0100
+++ gcc/cgraph.cc   2022-04-19 13:38:06.223782974 +0200
@@ -507,6 +507,7 @@ cgraph_node::create (tree decl)
   gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
 
   node->decl = decl;
+  node->semantic_interposition = opt_for_fn (decl, flag_semantic_interposition);
 
   if ((flag_openacc || flag_openmp)
   && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
--- gcc/cgraphclones.cc.jj  2022-01-18 11:58:58.948991114 +0100
+++ gcc/cgraphclones.cc 2022-04-19 13:38:43.594262397 +0200
@@ -394,6 +394,7 @@ cgraph_node::create_clone (tree new_decl
   new_node->versionable = versionable;
   new_node->can_change_signature = can_change_signature;
   new_node->redefined_extern_inline = redefined_extern_inline;
+  new_node->semantic_interposition = semantic_interposition;
   new_node->tm_may_enter_irr = tm_may_enter_irr;
   new_node->externally_visible = false;
   new_node->no_reorder = no_reorder;
--- gcc/testsuite/g++.dg/opt/pr105306.C.jj  2022-04-19 13:42:33.908054114 +0200
+++ gcc/testsuite/g++.dg/opt/pr105306.C 2022-04-19 13:42:08.859403045 +0200
@@ -0,0 +1,13 @@
+// PR ipa/105306
+// { dg-do compile }
+// { dg-options "-Ofast" }
+
+#pragma GCC optimize 0
+template  void foo (T);
+struct B { ~B () {} };
+struct C { B f; };
+template  struct E {
+  void bar () { foo (g); }
+  C g;
+};
+template class E;

Jakub

Re: [PATCH] Asan changes for RISC-V.

2022-04-19 Thread joshua via Gcc-patches

Does Asan work for RISC-V currently?  It seems that '-fsanitize=address' is
still unsupported for RISC-V.  If I add '--enable-libsanitizer' in Makefile.in
and reconfigure, there are compile errors.
Is that because of the note "libsanitizer not supported rv32, but it will
break the rv64 multi-lib build, so we disable that temporally until rv32
supported" in Makefile.in?


--
From: Jim Wilson
Sent: Thursday, October 29, 2020 07:59
To: gcc-patches
Cc: cooper.joshua; Jim Wilson

Subject: [PATCH] Asan changes for RISC-V.

We have only riscv64 asan support, there is no riscv32 support as yet.  So I
need to be able to conditionally enable asan support for the riscv target.  I
implemented this by returning zero from the asan_shadow_offset function.  This
requires a change to toplev.c and docs in target.def.

The asan support works on a 5.5 kernel, but does not work on a 4.15 kernel.
The problem is that the asan high memory region is a small wedge below
0x40.  The new kernel puts shared libraries at 0x3f and going
down which works.  But the old kernel puts shared libraries at 0x20
and going up which does not work, as it isn't in any recognized memory
region.  This might be fixable with more asan work, but we don't really need
support for old kernel versions.

The asan port is curious in that it uses 1<<29 for the shadow offset, but all
other 64-bit targets use a number larger than 1<<32.  But what we have is
working OK for now.
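
For readers unfamiliar with the shadow offset, a minimal sketch follows.
It assumes ASan's usual shadow scale of 3 (one shadow byte per 8
application bytes) and computes the shadow address by addition, which as
far as I know is what the runtime does even though the hook documentation
says "bitwise ored".  The toy_ names are hypothetical; only the 1<<29
value and the "zero means unsupported" convention come from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed shadow scale: 8 application bytes per shadow byte.  */
#define TOY_SHADOW_SCALE 3

/* Mirrors the hook in the patch: 1<<29 for RV64, 0 (meaning "not
   supported") for RV32.  */
static uint64_t
toy_asan_shadow_offset (bool target_64bit)
{
  return target_64bit ? (UINT64_C (1) << 29) : 0;
}

/* Hypothetical counterpart of the check in process_options: asan is
   only usable when the target reports a nonzero offset.  */
static bool
toy_asan_supported (bool target_64bit)
{
  return toy_asan_shadow_offset (target_64bit) != 0;
}

/* Map an application address to its shadow address.  */
static uint64_t
toy_mem_to_shadow (uint64_t addr, uint64_t offset)
{
  assert (offset != 0);
  return (addr >> TOY_SHADOW_SCALE) + offset;
}
```

For example, with the RV64 offset an address of 0x1000 maps to
(0x1000 >> 3) + (1 << 29) = 0x20000200.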

I did a make check RUNTESTFLAGS="asan.exp" on Fedora rawhide image running on
qemu and the results look reasonable.

  === gcc Summary ===

# of expected passes  1905
# of unexpected failures 11
# of unsupported tests  224

  === g++ Summary ===

# of expected passes  2002
# of unexpected failures 6
# of unresolved testcases 1
# of unsupported tests  175

OK?

Jim

2020-10-28  Jim Wilson  

 gcc/
 * config/riscv/riscv.c (riscv_asan_shadow_offset): New.
 (TARGET_ASAN_SHADOW_OFFSET): New.
 * doc/tm.texi: Regenerated.
 * target.def (asan_shadow_offset): Mention that it can return zero.
 * toplev.c (process_options): Check for and handle zero return from
 targetm.asan_shadow_offset call.

Co-Authored-By: cooper.joshua 
---
 gcc/config/riscv/riscv.c | 16 
 gcc/doc/tm.texi  |  3 ++-
 gcc/target.def   |  3 ++-
 gcc/toplev.c |  3 ++-
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 989a9f15250..6909e200de1 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -5299,6 +5299,19 @@ riscv_gpr_save_operation_p (rtx op)
   return true;
 }

+/* Implement TARGET_ASAN_SHADOW_OFFSET.  */
+
+static unsigned HOST_WIDE_INT
+riscv_asan_shadow_offset (void)
+{
+  /* We only have libsanitizer support for RV64 at present.
+
+ This number must match kRiscv*_ShadowOffset* in the file
+ libsanitizer/asan/asan_mapping.h which is currently 1<<29 for rv64,
+ even though 1<<36 makes more sense.  */
+  return TARGET_64BIT ? (HOST_WIDE_INT_1 << 29) : 0;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -5482,6 +5495,9 @@ riscv_gpr_save_operation_p (rtx op)
 #undef TARGET_NEW_ADDRESS_PROFITABLE_P
 #define TARGET_NEW_ADDRESS_PROFITABLE_P riscv_new_address_profitable_p

+#undef TARGET_ASAN_SHADOW_OFFSET
+#define TARGET_ASAN_SHADOW_OFFSET riscv_asan_shadow_offset
+
 struct gcc_target targetm = TARGET_INITIALIZER;

 #include "gt-riscv.h"
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 24c37f655c8..39c596b647a 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -12078,7 +12078,8 @@ is zero, which disables this optimization.
 @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_ASAN_SHADOW_OFFSET (void)
 Return the offset bitwise ored into shifted address to get corresponding
 Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not
-supported by the target.
+supported by the target.  May return 0 if Address Sanitizer is not supported
+by a subtarget.
 @end deftypefn

 @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_MEMMODEL_CHECK (unsigned HOST_WIDE_INT @var{val})
diff --git a/gcc/target.def b/gcc/target.def
index ed2da154e30..268b56b6ebd 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4452,7 +4452,8 @@ DEFHOOK
 (asan_shadow_offset,
  "Return the offset bitwise ored into shifted address to get corresponding\n\
 Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not\n\
-supported by the target.",
+supported by the target.  May return 0 if Address Sanitizer is not supported\n\
+by a subtarget.",
  unsigned HOST_WIDE_INT, (void),
  NULL)

diff --git a/gcc/toplev.c b/gcc/toplev.c
index 20e231f4d2a..cf89598252c 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1834,7 +1834,8 @@ process_options (void)
 }

   if ((flag_sanitize & SANITIZE_USER_ADDRESS)
-  && targetm.asan_shadow

Re: [PATCH] c++, coroutines: Avoid expanding within templates [PR103868]

2022-04-19 Thread Jason Merrill via Gcc-patches


On 4/18/22 15:49, Eric Gallager via Gcc-patches wrote:

On Mon, Apr 18, 2022 at 10:01 AM Iain Sandoe via Gcc-patches
 wrote:


From: Nathan Sidwell 

This is a forward-port of a patch by Nathan (against 10.x) which fixes an open
PR.

We are ICEing because we ended up tsubst_copying something that had already
been tsubst, leading to an assert failure (mostly such repeated tsubsting is
harmless).


I wouldn't say "mostly".  It should always be avoided, it frequently 
causes problems.  Pretty much any time there's a class prvalue.



We had a non-dependent co_await in a non-dependent-type template fn, so we
processed it at definition time, and then reprocessed at instantiation time.
We fix this here by deferring substitution while processing templates.
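
The difference between the old and new checks can be modelled in a few
lines of plain C.  This is only a sketch of the decision, not of
finish_co_await_expr itself; struct toy_ctx and the toy_ functions are
hypothetical, and the three booleans stand in for processing_template_decl,
dependent_type_p (functype) and type_dependent_expression_p (expr).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_action { TOY_DEFER, TOY_EXPAND };

/* Hypothetical summary of the state the checks look at.  */
struct toy_ctx
{
  bool processing_template;   /* Inside a template definition.  */
  bool fn_type_dependent;     /* Function type is dependent.  */
  bool expr_type_dependent;   /* Operand is type-dependent.  */
};

/* Old check: expand right away unless something is dependent.  */
static enum toy_action
toy_old_policy (const struct toy_ctx *c)
{
  assert (c != NULL);
  if (c->fn_type_dependent || c->expr_type_dependent)
    return TOY_DEFER;
  return TOY_EXPAND;
}

/* New check: always defer while processing a template.  */
static enum toy_action
toy_new_policy (const struct toy_ctx *c)
{
  assert (c != NULL);
  return c->processing_template ? TOY_DEFER : TOY_EXPAND;
}

/* How often a co_await in a template gets expanded: instantiation
   always expands once; expanding at definition time as well is the
   double processing described above.  */
static int
toy_expansions (enum toy_action at_definition)
{
  return (at_definition == TOY_EXPAND ? 1 : 0) + 1;
}
```

For a non-dependent co_await in a non-dependent template function, the old
policy yields two expansions and the new one yields exactly one.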

Additional observations (for a better future fix, in the GCC13 timescale):

Exprs only have dependent type if at least one operand is dependent which was
what the current code was intending to do.  Coroutines have the additional
wrinkle, that the current fn's type is an implicit operand.

So, if the coroutine function's type is not dependent, and the operand is not
dependent, we should determine the type of the co_await expression using the
DEPENDENT_EXPR wrapper machinery.  That allows us to determine the
subexpression type, but leave its operand unchanged and then instantiate it
later.


Sure, like what build_x_binary_op and the like do.


Tested on x86_64-darwin (it does also fix the original testcase, but that is
far too time-consuming for the testsuite).


The compiler change seems fine as a temporary workaround.  Is it not 
feasible to write a new short testcase that reproduces the problem, now 
that you understand it?



OK for master? / backports? (when?)
thanks,
Iain

 PR c++/103868

gcc/cp/ChangeLog:

 * coroutines.cc (finish_co_await_expr): Do not process non-dependent
 coroutine expressions at template definition time.
 (finish_co_yield_expr): Likewise.
 (finish_co_return_stmt): Likewise.

gcc/testsuite/ChangeLog:

 * g++.dg/coroutines/pr103868.C: New test.

Co-Authored-by: Iain Sandoe 
---
  gcc/cp/coroutines.cc   |   18 +-
  gcc/testsuite/g++.dg/coroutines/pr103868.C | 7390 
  2 files changed, 7396 insertions(+), 12 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr103868.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index cdf6503c4d3..a9ce6e050dd 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -1148,10 +1148,8 @@ finish_co_await_expr (location_t kw, tree expr)
   extraneous warnings during substitution.  */
suppress_warning (current_function_decl, OPT_Wreturn_type);

-  /* If we don't know the promise type, we can't proceed, build the
- co_await with the expression unchanged.  */
-  tree functype = TREE_TYPE (current_function_decl);
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+  /* Defer processing when we have dependent types.  */
+  if (processing_template_decl)
  {
tree aw_expr = build5_loc (kw, CO_AWAIT_EXPR, unknown_type_node, expr,
  NULL_TREE, NULL_TREE, NULL_TREE,
@@ -1222,10 +1220,8 @@ finish_co_yield_expr (location_t kw, tree expr)
   extraneous warnings during substitution.  */
suppress_warning (current_function_decl, OPT_Wreturn_type);

-  /* If we don't know the promise type, we can't proceed, build the
- co_await with the expression unchanged.  */
-  tree functype = TREE_TYPE (current_function_decl);
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+  /* Defer processing when we have dependent types.  */
+  if (processing_template_decl)
  return build2_loc (kw, CO_YIELD_EXPR, unknown_type_node, expr, NULL_TREE);

if (!coro_promise_type_found_p (current_function_decl, kw))
@@ -1307,10 +1303,8 @@ finish_co_return_stmt (location_t kw, tree expr)
&& check_for_bare_parameter_packs (expr))
  return error_mark_node;

-  /* If we don't know the promise type, we can't proceed, build the
- co_return with the expression unchanged.  */
-  tree functype = TREE_TYPE (current_function_decl);
-  if (dependent_type_p (functype) || type_dependent_expression_p (expr))
+  /* Defer processing when we have dependent types.  */
+  if (processing_template_decl)
  {
/* co_return expressions are always void type, regardless of the
  expression type.  */

Re: [PATCH] c++, coroutines: Improve check for throwing final await [PR104051].

2022-04-19 Thread Jason Merrill via Gcc-patches


On 4/18/22 11:34, Iain Sandoe wrote:

We check that the final_suspend () method returns a sane type (i.e. a class
or structure) but, unfortunately, that check has to be later than the one
for a throwing case.  If the user returns some nonsensical type from the
method, we need to handle that in the checking for noexcept.
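
A rough model of the fix, using a made-up tree type instead of GCC's.  The
point it illustrates is that the wrapper is only stripped when the node
really is a TARGET_EXPR; what the real function does for other node kinds
is not shown in the patch context, so returning NULL here is my own
simplification.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code
{
  TOY_TARGET_EXPR,
  TOY_CALL_EXPR,
  TOY_AGGR_INIT_EXPR,
  TOY_CONSTRUCTOR,
  TOY_OTHER           /* E.g. a plain constant such as `true'.  */
};

/* Hypothetical, heavily simplified tree node.  */
struct toy_tree
{
  enum toy_code code;
  const struct toy_tree *initial;  /* Only used by TOY_TARGET_EXPR.  */
  const char *fn;                  /* Callee for call-like nodes.  */
};

/* Return the callee whose exception specification should be checked,
   or NULL when there is nothing call-like to look at.  */
static const char *
toy_callee_to_check (const struct toy_tree *expr)
{
  assert (expr != NULL);
  /* Unwrap only a real TARGET_EXPR; other inputs are used as is.  */
  if (expr->code == TOY_TARGET_EXPR)
    {
      assert (expr->initial != NULL);
      expr = expr->initial;
    }
  if (expr->code == TOY_CALL_EXPR || expr->code == TOY_AGGR_INIT_EXPR)
    return expr->fn;
  return NULL;
}
```

A node that is not a TARGET_EXPR, such as the `true' returned by
final_suspend () in the new test, now goes through the same dispatch
instead of being unwrapped unconditionally.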

tested on x86_64-darwin, OK for mainline? (when?),


OK.


thanks
Iain

Signed-off-by: Iain Sandoe 

PR c++/104051

gcc/cp/ChangeLog:

* coroutines.cc (coro_diagnose_throwing_final_aw_expr): Handle
non-target expression inputs.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr104051.C: New test.
---
  gcc/cp/coroutines.cc   | 13 +-
  gcc/testsuite/g++.dg/coroutines/pr104051.C | 29 ++
  2 files changed, 36 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr104051.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index d2a765cac11..cb9bbed51e6 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -883,13 +883,14 @@ coro_diagnose_throwing_fn (tree fndecl)
  static bool
  coro_diagnose_throwing_final_aw_expr (tree expr)
  {
-  tree t = TARGET_EXPR_INITIAL (expr);
+  if (TREE_CODE (expr) == TARGET_EXPR)
+expr = TARGET_EXPR_INITIAL (expr);
tree fn = NULL_TREE;
-  if (TREE_CODE (t) == CALL_EXPR)
-fn = CALL_EXPR_FN(t);
-  else if (TREE_CODE (t) == AGGR_INIT_EXPR)
-fn = AGGR_INIT_EXPR_FN (t);
-  else if (TREE_CODE (t) == CONSTRUCTOR)
+  if (TREE_CODE (expr) == CALL_EXPR)
+fn = CALL_EXPR_FN (expr);
+  else if (TREE_CODE (expr) == AGGR_INIT_EXPR)
+fn = AGGR_INIT_EXPR_FN (expr);
+  else if (TREE_CODE (expr) == CONSTRUCTOR)
  return false;
else
  {
diff --git a/gcc/testsuite/g++.dg/coroutines/pr104051.C b/gcc/testsuite/g++.dg/coroutines/pr104051.C
new file mode 100644
index 000..ce7ae55405a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr104051.C
@@ -0,0 +1,29 @@
+// { dg-additional-options "-fsyntax-only" }
+#include 
+#include 
+template  struct promise {
+  struct final_awaitable {
+bool await_ready() noexcept;
+template 
+std::coroutine_handle<>
+await_suspend(std::coroutine_handle) noexcept;
+void await_resume() noexcept;
+  };
+  auto get_return_object() {
+return std::coroutine_handle::from_promise(*this);
+  }
+  auto initial_suspend() { return std::suspend_always(); }
+  auto final_suspend() noexcept { return true; }
+  void unhandled_exception();
+};
+template  struct task {
+  using promise_type = promise;
+  task(std::coroutine_handle>);
+  bool await_ready();
+  std::coroutine_handle<> await_suspend(std::coroutine_handle<>);
+  T await_resume();
+};
+task> foo() { // { dg-error {awaitable type 'bool' is not a structure} }
+  while ((co_await foo()).empty())
+;
+}

Re: [PATCH] c++, coroutines: Account for overloaded promise return_value() [PR105301].

2022-04-19 Thread Jason Merrill via Gcc-patches


On 4/18/22 10:03, Iain Sandoe wrote:

Whether it was intended or not, it is possible to define a coroutine promise
with multiple return_value() methods [which need not even have the same type].

We were not accounting for this possibility in the check to see whether both
return_value and return_void are specified (which is prohibited by the
standard).  Fixed thus and provided an adjusted diagnostic for the case that
multiple return_value() methods are present.
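
In toy form, the adjustment looks roughly like this.  The types are
hypothetical stand-ins for the lookup result (a single function or an
overload set), and the message strings only approximate the diagnostics
in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one declared member function.  */
struct toy_fn
{
  const char *signature;
  int line;              /* Where it was declared.  */
};

/* Hypothetical lookup result: COUNT > 1 models an overload set.  */
struct toy_lookup
{
  const struct toy_fn *fns;
  size_t count;
};

struct toy_note
{
  int line;              /* Location the note points at.  */
  bool overloaded;       /* True if there were several candidates.  */
  const char *message;
};

/* Pick the declaration to point at: the only one, or the first of an
   overload set with an adjusted message.  */
static struct toy_note
toy_return_value_note (const struct toy_lookup *found)
{
  assert (found != NULL && found->fns != NULL && found->count >= 1);
  struct toy_note note
    = { found->fns[0].line, false, "'return_value' declared here" };
  if (found->count > 1)
    {
      note.overloaded = true;
      note.message = "'return_value' first declared here";
    }
  return note;
}
```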

tested on x86_64-darwin, OK for mainline? / Backports? (when?)
thanks,
Iain

Signed-off-by: Iain Sandoe 

PR c++/105301

gcc/cp/ChangeLog:

* coroutines.cc (coro_promise_type_found_p): Account for possible
multiple overloads of the promise return_value() method.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr105301.C: New test.
---
  gcc/cp/coroutines.cc   | 10 -
  gcc/testsuite/g++.dg/coroutines/pr105301.C | 49 ++
  2 files changed, 57 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr105301.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index dcc2284171b..d2a765cac11 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -513,8 +513,14 @@ coro_promise_type_found_p (tree fndecl, location_t loc)
  coro_info->promise_type);
  inform (DECL_SOURCE_LOCATION (BASELINK_FUNCTIONS (has_ret_void)),
  "% declared here");
- inform (DECL_SOURCE_LOCATION (BASELINK_FUNCTIONS (has_ret_val)),
- "% declared here");
+ has_ret_val = BASELINK_FUNCTIONS (has_ret_val);
+ const char *message = "% declared here";
+ if (TREE_CODE (has_ret_val) == OVERLOAD)
+   {
+ has_ret_val = OVL_FIRST (has_ret_val);
+ message = "% first declared here";
+   }


You could also use get_first_fn, but the patch is OK as is.  I'm 
inclined to leave backports in coroutines.cc to your discretion, you 
probably have a better idea of how important they are.



+ inform (DECL_SOURCE_LOCATION (has_ret_val), message);
  coro_info->coro_co_return_error_emitted = true;
  return false;
}
diff --git a/gcc/testsuite/g++.dg/coroutines/pr105301.C b/gcc/testsuite/g++.dg/coroutines/pr105301.C
new file mode 100644
index 000..33a0b03cf5d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr105301.C
@@ -0,0 +1,49 @@
+// { dg-additional-options "-fsyntax-only" }
+namespace std {
+template 
+struct traits_sfinae_base {};
+
+template 
+struct coroutine_traits : public traits_sfinae_base {};
+}
+
+template struct coro {};
+template 
+struct std::coroutine_traits, Ps...> {
+  using promise_type = Promise;
+};
+
+struct awaitable {
+  bool await_ready() noexcept;
+  template 
+  void await_suspend(F) noexcept;
+  void await_resume() noexcept;
+} a;
+
+struct suspend_always {
+  bool await_ready() noexcept { return false; }
+  template 
+  void await_suspend(F) noexcept;
+  void await_resume() noexcept {}
+};
+
+namespace std {
+template 
+struct coroutine_handle {};
+}
+
+struct bad_promise_6 {
+  coro get_return_object();
+  suspend_always initial_suspend();
+  suspend_always final_suspend() noexcept;
+  void unhandled_exception();
+  void return_void();
+  void return_value(int) const;
+  void return_value(int);
+};
+
+coro
+bad_implicit_return() // { dg-error {.aka 'bad_promise_6'. declares both 'return_value' and 'return_void'} }
+{
+  co_await a;
+}

Re: [PATCH] c++, coroutines: Make sure our temporaries are in a bind expr [PR105287]

2022-04-19 Thread Jason Merrill via Gcc-patches


On 4/18/22 10:02, Iain Sandoe wrote:

There are a few cases where we can generate a temporary that does not need
to be added to the coroutine frame (i.e. these are genuinely ephemeral).  The
intent was that unnamed temporaries should not be 'promoted' to coroutine
frame entries.  However there was a thinko and these were not actually ever
added to the bind expressions being generated for the expanded awaits.  This
meant that they were showing in the global namespace, leading to an empty
DECL_CONTEXT and the ICE reported.
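
The naming part of the change can be sketched as below.  This is an
assumption-laden toy: toy_frame_entry_name is a hypothetical helper, it
ignores the special case for the outermost bind scope, and it only models
the rule that a named local gets a "name_depth_index" frame entry while an
unnamed temporary gets none.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Write the frame-entry name for a local variable into BUF and return
   true, or leave BUF empty and return false for an unnamed temporary
   (VAR_NAME == NULL), which is not promoted to the frame.  Also
   returns false if the name does not fit in BUF.  */
static bool
toy_frame_entry_name (char *buf, size_t size, const char *var_name,
                      unsigned nest_depth, unsigned bind_indx)
{
  assert (buf != NULL && size > 0);
  buf[0] = '\0';
  if (var_name == NULL)
    return false;
  int len = snprintf (buf, size, "%s_%u_%u", var_name, nest_depth,
                      bind_indx);
  return len > 0 && (size_t) len < size;
}
```

For a local named x at nest depth 1 and bind index 2 this produces
"x_1_2"; for a temporary without a name it produces nothing, so no frame
entry would be created for it.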

tested on x86_64-darwin, OK for mainline? / Backports? (when?)
thanks,
Iain

Signed-off-by: Iain Sandoe 

PR c++/105287

gcc/cp/ChangeLog:

* coroutines.cc (maybe_promote_temps): Ensure generated temporaries
are added to the bind expr.
(add_var_to_bind): Fix local var naming to use portable punctuation.
(register_local_var_uses): Do not add synthetic names to unnamed
temporaries.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr105287.C: New test.
---
  gcc/cp/coroutines.cc   | 17 
  gcc/testsuite/g++.dg/coroutines/pr105287.C | 48 ++
  2 files changed, 56 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr105287.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index a9ce6e050dd..dcc2284171b 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -3103,7 +3103,7 @@ maybe_promote_temps (tree *stmt, void *d)
 If the initializer is a conditional expression, we need to collect
 and declare any promoted variables nested within it.  DTORs for such
 variables must be run conditionally too.  */
-  if (t->var && DECL_NAME (t->var))
+  if (t->var)
{
  tree var = t->var;
  DECL_CHAIN (var) = vlist;
@@ -3304,7 +3304,7 @@ add_var_to_bind (tree& bind, tree var_type,
tree b_vars = BIND_EXPR_VARS (bind);
/* Build a variable to hold the condition, this will be included in the
   frame as a local var.  */
-  char *nam = xasprintf ("%s.%d", nam_root, nam_vers);
+  char *nam = xasprintf ("`__%s_%d", nam_root, nam_vers);


` is portable?


tree newvar = build_lang_decl (VAR_DECL, get_identifier (nam), var_type);
free (nam);
DECL_CHAIN (newvar) = b_vars;
@@ -3949,7 +3949,7 @@ register_local_var_uses (tree *stmt, int *do_subtree, void *d)
 scopes with identically named locals and still be able to
 identify them in the coroutine frame.  */
  tree lvname = DECL_NAME (lvar);
- char *buf;
+ char *buf = NULL;
  
  	  /* The outermost bind scope contains the artificial variables that

 we inject to implement the coro state machine.  We want to be able
@@ -3959,14 +3959,13 @@ register_local_var_uses (tree *stmt, int *do_subtree, void *d)
  else if (lvname != NULL_TREE)
buf = xasprintf ("%s_%u_%u", IDENTIFIER_POINTER (lvname),
 lvd->nest_depth, lvd->bind_indx);
- else
-   buf = xasprintf ("_D%u_%u_%u", DECL_UID (lvar), lvd->nest_depth,
-lvd->bind_indx);
  /* TODO: Figure out if we should build a local type that has any
 excess alignment or size from the original decl.  */
- local_var.field_id
-   = coro_make_frame_entry (lvd->field_list, buf, lvtype, lvd->loc);
- free (buf);
+ if (buf) {


Brace should be on the next line.


+   local_var.field_id
+ = coro_make_frame_entry (lvd->field_list, buf, lvtype, lvd->loc);
+   free (buf);
+ }
  /* We don't walk any of the local var sub-trees, they won't contain
 any bind exprs.  */
}
diff --git a/gcc/testsuite/g++.dg/coroutines/pr105287.C b/gcc/testsuite/g++.dg/coroutines/pr105287.C
new file mode 100644
index 000..9790945287d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr105287.C
@@ -0,0 +1,48 @@
+// { dg-additional-options "-fanalyzer" }
+// { dg-excess-errors "lots of analyzer output, but no ICE" }
+namespace std {
+template  struct coroutine_traits : _Result {};
+template  struct coroutine_handle {
+  operator coroutine_handle<>();
+};
+}
+struct coro1 {
+  using handle_type = std::coroutine_handle<>;
+  coro1(handle_type);
+  struct suspend_always_prt {
+bool await_ready() noexcept;
+void await_suspend(handle_type) noexcept;
+void await_resume() noexcept;
+  };
+  struct promise_type {
+std::coroutine_handle<> ch_;
+auto get_return_object() { return ch_; }
+auto initial_suspend() { return suspend_always_prt{}; }
+auto final_suspend() noexcept { return suspend_always_prt{}; }
+void unhandled_exception();
+  };
+};
+struct BoolAwaiter {
+  BoolAwaiter(bool);
+  bool await_ready();
+  void await_suspend(std::coroutine_handle<>);
+  bool await_resume();
+};
+struct IntAwaiter {
+  IntAwaiter(int);
+  bool await_ready();
+  void await

Re: Ping^2 [PATCH, rs6000] Correct match pattern in pr56605.c

2022-04-19 Thread HAO CHEN GUI via Gcc-patches

Hi Segher,
   Yes, the old committed patch caused it to match two insns.
So I submitted a new patch which fixes the problem. Here is
the new patch.
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590958.html

The new pattern is:
/* { dg-final { scan-rtl-dump-times {\(compare:CC \(and:SI \(subreg:SI 
\(reg:DI} 1 "combine" } } */

I tested it and it is fine on all sub-targets.
Thanks.

On 20/4/2022 5:06 AM, Segher Boessenkool wrote:
> On Tue, Apr 19, 2022 at 04:05:06PM +0800, HAO CHEN GUI wrote:
>>I tested the test case on Linux and AIX with both big and little endian.
>> The test case requires lp64 target, so it won't be tested on 32-bit targets.
>>
>> On big endian (both AIX and Linux), it should match
>> (compare:CC (and:SI (subreg:SI (reg:DI 207) 4)
>>
>> On little endian (both AIX and Linux), it should match
>> (compare:CC (and:SI (subreg:SI (reg:DI 207) 0)
>>
>> So, the pattern in my patch should work fine.
>>
>> /* { dg-final { scan-rtl-dump-times {\(compare:CC \(and:SI \(subreg:SI 
>> \(reg:DI} 1 "combine" } } */
> 
> On powerpc64-linux:
> 
> FAIL: gcc.target/powerpc/pr56605.c scan-rtl-dump-times combine "\\(compare:CC 
> \\((?:and|zero_extend):(?:[SD]I) \\((?:sub)?reg:[SD]I" 1
> 
> It matches twice instead of once, namely:
> 
> (insn 19 18 20 2 (parallel [
> (set (reg:CC 208)
> (compare:CC (and:SI (subreg:SI (reg:DI 207) 4)
> (const_int 3 [0x3]))
> (const_int 0 [0])))
> (set (reg:SI 129 [ prolog_loop_niters.5 ])
> (and:SI (subreg:SI (reg:DI 207) 4)
> (const_int 3 [0x3])))
> ]) 208 {*andsi3_imm_mask_dot2}
>  (nil))
> 
> (insn 81 80 82 11 (parallel [
> (set (reg:CC 232)
> (compare:CC (and:DI (subreg:DI (reg:SI 136 [ niters.6 ]) 0)
> (const_int 7 [0x7]))
> (const_int 0 [0])))
> (clobber (scratch:DI))
> ]) 207 {*anddi3_imm_mask_dot}
>  (expr_list:REG_DEAD (reg:SI 136 [ niters.6 ])
> (nil)))
> 
> The paradoxical subreg in the latter wasn't expected :-)
> 
> 
> Segher

Re: [PATCH] fold, simplify-rtx: Punt on non-representable floating point constants [PR104522]

2022-04-19 Thread Qing Zhao via Gcc-patches



> On Apr 14, 2022, at 1:53 AM, Richard Biener  
> wrote:
> 
> On Wed, Apr 13, 2022 at 5:22 PM Qing Zhao  wrote:
>> 
>> Hi, Richard,
>> 
>> Thanks a lot for taking a look at this issue (and Sorry that I haven’t fixed 
>> this one yet, I was distracted by other tasks then just forgot this one….)
>> 
>>> On Apr 13, 2022, at 3:41 AM, Richard Biener  
>>> wrote:
>>> 
>>> On Tue, Feb 15, 2022 at 5:31 PM Qing Zhao via Gcc-patches
>>>  wrote:
 
 
 
> On Feb 15, 2022, at 3:58 AM, Jakub Jelinek  wrote:
> 
> Hi!
> 
> For IBM double double I've added in PR95450 and PR99648 verification that
> when we at the tree/GIMPLE or RTL level interpret target bytes as a 
> REAL_CST
> or CONST_DOUBLE constant, we try to encode it back to target bytes and
> verify it is the same.
> This is because our real.c support isn't able to represent all valid 
> values
> of IBM double double which has variable precision.
> In PR104522, it has been noted that we have similar problem with the
> Intel/Motorola extended XFmode formats, our internal representation isn't
> able to record pseudo denormals, pseudo infinities, pseudo NaNs and 
> unnormal
> values.
> So, the following patch is an attempt to extend that verification to all
> floats.
> Unfortunately, it wasn't that straightforward, because the
> __builtin_clear_padding code exactly for the XFmode long doubles needs to
> discover what bits are padding and does that by interpreting memory of
> all 1s.  That is actually a valid supported value, a qNaN with negative
> sign with all mantissa bits set, but the verification includes also the
> padding bits (exactly what __builtin_clear_padding wants to figure out)
> and so fails the comparison check and so we ICE.
> The patch fixes that case by moving that verification from
> native_interpret_real to its caller, so that clear_padding_type can
> call native_interpret_real and avoid that extra check.
> 
> With this, the only thing that regresses in the testsuite is
> +FAIL: gcc.target/i386/auto-init-4.c scan-assembler-times 
> long\\t-16843010 5
> because it decides to use a pattern that has non-zero bits in the padding
> bits of the long double, so the simplify-rtx.cc change prevents folding
> a SUBREG into a constant.  We emit (the testcase is -O0 but we emit worse
> code at all opt levels) something like:
>  movabsq $-72340172838076674, %rax
>  movabsq $-72340172838076674, %rdx
>  movq%rax, -48(%rbp)
>  movq%rdx, -40(%rbp)
>  fldt-48(%rbp)
>  fstpt   -32(%rbp)
> instead of
>  fldt.LC2(%rip)
>  fstpt   -32(%rbp)
> ...
> .LC2:
>  .long   -16843010
>  .long   -16843010
>  .long   65278
>  .long   0
> Note, neither of those sequences actually stores the padding bits, fstpt
> simply doesn't touch them.
> For vars with clear_padding_real_needs_padding_p types that are allocated
> to memory at expansion time, I'd say much better would be to do the stores
> using integral modes rather than XFmode, so do that:
>  movabsq $-72340172838076674, %rax
> movq%rax, -32(%rbp)
> movq%rax, -24(%rbp)
> directly.  That is the only way to ensure the padding bits are initialized
> (or expand __builtin_clear_padding, but then you initialize separately the
> value bits and padding bits).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, though as mentioned
> above, the gcc.target/i386/auto-init-4.c case is unresolved.
 
 Thanks, I will try to fix this testing case in a later patch.
>>> 
>>> I've looked at this FAIL now and really wonder whether "pattern init" as
>>> implemented makes any sense for non-integral types.
>>> We end up with
>>> initializing a register (SSA name) with
>>> 
>>> VIEW_CONVERT_EXPR(0xfefefefefefefefefefefefefefefefe)
>>> 
>>> as we go building a TImode constant (we verified we have a TImode SET!)
>>> but then
>>> 
>>> /* Pun the LHS to make sure its type has constant size
>>>unless it is an SSA name where that's already known.  */
>>> if (TREE_CODE (lhs) != SSA_NAME)
>>>   lhs = build1 (VIEW_CONVERT_EXPR, itype, lhs);
>>> else
>>>   init = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), init);
>>> ...
>>> expand_assignment (lhs, init, false);
>>> 
>>> and generally registers do not have any padding.  This weird expansion
>>> then causes us to spill the TImode constant and reload the XFmode value,
>>> which is definitely not optimal here.
>>> 
>>> One approach to avoid the worse code generation would be to use mode
>>> specific patterns for registers (like using a NaN or a target specific
>>> value that
>>> can be loaded cheaply),
>> 
>> You mean that using “mode specific patterns” ONLY for registers?
>> Can we use

Re: Ping^2 [PATCH, rs6000] Correct match pattern in pr56605.c

2022-04-19 Thread Segher Boessenkool

On Tue, Apr 19, 2022 at 04:05:06PM +0800, HAO CHEN GUI wrote:
>I tested the test case on Linux and AIX with both big and little endian.
> The test case requires lp64 target, so it won't be tested on 32-bit targets.
> 
> On big endian (both AIX and Linux), it should match
> (compare:CC (and:SI (subreg:SI (reg:DI 207) 4)
> 
> On little endian (both AIX and Linux), it should match
> (compare:CC (and:SI (subreg:SI (reg:DI 207) 0)
> 
> So, the pattern in my patch should work fine.
> 
> /* { dg-final { scan-rtl-dump-times {\(compare:CC \(and:SI \(subreg:SI 
> \(reg:DI} 1 "combine" } } */

On powerpc64-linux:

FAIL: gcc.target/powerpc/pr56605.c scan-rtl-dump-times combine "\\(compare:CC 
\\((?:and|zero_extend):(?:[SD]I) \\((?:sub)?reg:[SD]I" 1

It matches twice instead of once, namely:

(insn 19 18 20 2 (parallel [
(set (reg:CC 208)
(compare:CC (and:SI (subreg:SI (reg:DI 207) 4)
(const_int 3 [0x3]))
(const_int 0 [0])))
(set (reg:SI 129 [ prolog_loop_niters.5 ])
(and:SI (subreg:SI (reg:DI 207) 4)
(const_int 3 [0x3])))
]) 208 {*andsi3_imm_mask_dot2}
 (nil))

(insn 81 80 82 11 (parallel [
(set (reg:CC 232)
(compare:CC (and:DI (subreg:DI (reg:SI 136 [ niters.6 ]) 0)
(const_int 7 [0x7]))
(const_int 0 [0])))
(clobber (scratch:DI))
]) 207 {*anddi3_imm_mask_dot}
 (expr_list:REG_DEAD (reg:SI 136 [ niters.6 ])
(nil)))

The paradoxical subreg in the latter wasn't expected :-)
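
To make the double match concrete, here is a rough sketch (not part of the
testsuite) that counts matches of both patterns against the two compare
expressions quoted above.  It assumes POSIX extended regexes via <regex.h>
rather than the Tcl regexes DejaGnu uses, so the non-capturing (?:...) groups
are written as plain groups; count_matches, rtl_dump, old_pattern and
new_pattern are names made up for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* The two compare expressions from insn 19 and insn 81, one per line.  */
const char *const rtl_dump =
  "(compare:CC (and:SI (subreg:SI (reg:DI 207) 4)\n"
  "(compare:CC (and:DI (subreg:DI (reg:SI 136 [ niters.6 ]) 0)\n";

/* Old, permissive pattern (plain groups instead of Tcl's (?:...)).  */
const char *const old_pattern =
  "\\(compare:CC \\((and|zero_extend):([SD]I) \\((sub)?reg:[SD]I";

/* New, stricter pattern from the follow-up patch.  */
const char *const new_pattern =
  "\\(compare:CC \\(and:SI \\(subreg:SI \\(reg:DI";

/* Count non-overlapping matches of the extended regex PATTERN in TEXT.
   Returns -1 if PATTERN does not compile.  */
int
count_matches (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m;
  int n = 0;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;
  while (*text != '\0' && regexec (&re, text, 1, &m, 0) == 0)
    {
      n++;
      /* Continue after this match; always advance at least one char.  */
      text += (m.rm_eo > 0) ? m.rm_eo : 1;
    }
  regfree (&re);
  return n;
}
```

With these inputs the old pattern should hit both insns, while the stricter
one only hits the and:SI / subreg:SI / reg:DI form from insn 19.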


Segher

Re: [PATCH] PR fortran/104812: generate error for constuct-name clash with symbols

2022-04-19 Thread Harald Anlauf via Gcc-patches


Hi Mike,

for contributing, you'd need to have a GNU copyright assignment or
DCO certification, and I cannot find your name in the usual place.

See e.g. https://gcc.gnu.org/dco.html for details.

Thanks,
Harald

On 05.04.22 at 19:33, Mike Kashkarov via Gcc-patches wrote:


Greetings,

I propose a patch for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104812 to
reject non-conforming code when a construct-name clashes with an already
defined symbol name, e.g.:

  subroutine s1
    logical :: x
    x: if (x) then ! Currently gfortran accepts 'x' as construct-name
    end if x
  end
  
Steve Kargl pointed out that (Fortran 2018, 19.4, p 498):


Identifiers of entities, other than statement or construct entities (19.4),
in the classes

  (1) named variables, ..., named constructs, ...,

   Within its scope, a local identifier of one class shall not be the
   same as another local identifier of the same class,
   


Regtested on x86_64-pc-linux-gnu, OK for mainline?

Thanks.

Re: [PATCH] libgo: Fix non-portable sed commands

2022-04-19 Thread Ian Lance Taylor via Gcc-patches

On Tue, Apr 19, 2022 at 11:06 AM Jonathan Wakely  wrote:
>
> This fixes the libgo build if /usr/bin/sed is found before
> /usr/xpg4/bin/sed on Solaris.
>
> Tested sparc-sun-solaris2.11, OK for trunk?

Thanks, already committed based on your earlier e-mail.

Ian

Re: Error when building gcc w/ Go language on Solaris

2022-04-19 Thread Ian Lance Taylor via Gcc-patches

On Tue, Apr 19, 2022 at 6:36 AM Jonathan Wakely  wrote:
>
> The 'check-tail' target in libgo/Makefile.am does:
>
> ...  | sed -n -e 's/.* \(version.*$$\)/\1/p'` >> libgo.sum
>
> This doesn't work with Solaris sed (and is documented by Autoconf as
> being non-portable). The $ needs to be outside the back-reference
> expression:
>
> ...  | sed -n -e 's/.* \(version.*\)$$/\1/p'` >> libgo.sum
>
> This should be OK to change, because the $ is just an anchor and
> doesn't need to be captured.
>
> More significantly, I see errors like:
>
> /export/home/jwakely/src/gcc/libgo/match.sh: line 114: ((: go1.13 :
> syntax error: invalid arithmetic operator (error token is ".13 ")
>
> That script uses \+ in a sed script, which is not supported by POSIX
> sed, because it's not in the BRE grammar. That seems to be the cause
> of the match.sh errors. The attached patch fixes it.

Thanks for looking.  Committed to mainline like so.

Ian
75f7b65d3f775f06be08c5d2a9573b49a4b4b1d5
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 2321f67ca5d..63238715bd0 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-22b0ccda3aa4d16f770a26a3eb251f8da615c318
+99ca6be406a5781be078ff23f45a72b4c84b16e3
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/Makefile.am b/libgo/Makefile.am
index e0a1eec52a2..a5d4b6a3525 100644
--- a/libgo/Makefile.am
+++ b/libgo/Makefile.am
@@ -1305,7 +1305,7 @@ check-tail: check-recursive check-multi
if test "$$untested" -ne "0"; then \
  echo "# of untested testcases $$untested" >> libgo.sum; \
fi; \
-   echo `echo $(GOC) | sed -e 's/ .*//'`  `$(GOC) -v 2>&1 | grep " 
version" | sed -n -e 's/.* \(version.*$$\)/\1/p'` >> libgo.sum; \
+   echo `echo $(GOC) | sed -e 's/ .*//'`  `$(GOC) -v 2>&1 | grep " 
version" | sed -n -e 's/.* \(version.*\)$$/\1/p'` >> libgo.sum; \
echo >> libgo.log; \
echo "runtest completed at `date`" >> libgo.log; \
if test "$$fail" -ne "0"; then \
diff --git a/libgo/match.sh b/libgo/match.sh
index 139d0cdbe64..7ed587ff794 100755
--- a/libgo/match.sh
+++ b/libgo/match.sh
@@ -100,7 +100,7 @@ fi
 
 gobuild() {
 line=$(echo "$1" | sed -e 's|//go:build ||')
-line=$(echo "$line" | sed -e 's/go1\.[0-9]\+/1/g' -e 
's/goexperiment\./goexperiment/')
+line=$(echo "$line" | sed -e 's/go1\.[0-9][0-9]*/1/g' -e 
's/goexperiment\./goexperiment/')
 line=" $line "
 wrap='[ ()!&|]'
 for ones in $goarch $goos $cgotag $cmdlinetag gccgo 
goexperimentfieldtrack; do

Re: [PATCH, gcc-11 backport] gcov-profile: Allow negative counts of indirect calls [PR105282]

2022-04-19 Thread Martin Liška


On 4/19/22 21:28, Sergei Trofimovich wrote:

From: Sergei Trofimovich 

TOPN metrics are histograms that contain overall count and per-bucket
count. Overall count can be negative when two profiles merge and some
of the per-bucket metrics are discarded.


I'm fine with that, but as we're close to 11.3.0, I think it's up to the
release managers to approve it.

Cheers,
Martin



Noticed as an ICE on python PGO build where gcc crashes as:

 during IPA pass: modref
 a.c:36:1: ICE: in stream_out_histogram_value, at value-prof.cc:340
36 | }
   | ^
 stream_out_histogram_value(output_block*, histogram_value_t*)
 gcc/value-prof.cc:340

gcc/ChangeLog:

PR gcov-profile/105282
* value-prof.cc (stream_out_histogram_value): Allow negative counts
on HIST_TYPE_INDIR_CALL.

(cherry picked from commit 90a29845bfe7d6002e6c2fd49a97820b00fbc4a3)
---
  gcc/value-prof.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index 42748771192..688089b04d2 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -336,6 +336,10 @@ stream_out_histogram_value (struct output_block *ob, 
histogram_value hist)
/* Note that the IOR counter tracks pointer values and these can have
   sign bit set.  */
;
+  else if (hist->type == HIST_TYPE_INDIR_CALL && i == 0)
+   /* 'all' counter overflow is stored as a negative value. Individual
+  counters and values are expected to be non-negative.  */
+   ;
else
gcc_assert (value >= 0);

[PATCH, gcc-11 backport] gcov-profile: Allow negative counts of indirect calls [PR105282]

2022-04-19 Thread Sergei Trofimovich via Gcc-patches

From: Sergei Trofimovich 

TOPN metrics are histograms that contain overall count and per-bucket
count. Overall count can be negative when two profiles merge and some
of the per-bucket metrics are discarded.
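
As a rough model of the relaxed check (not the GCC code itself), the rule
"only the overall counter of an indirect-call histogram may be negative" can
be written as a small predicate.  The enum and function names below are
invented for this sketch, and the IOR case is an assumption based on the
comment visible in the diff context further down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the histogram kinds relevant here.  */
enum hist_kind { HIST_KIND_OTHER, HIST_KIND_IOR, HIST_KIND_INDIR_CALL };

/* Return true if VALUE is acceptable for counter number INDEX of a
   histogram of kind KIND when streaming it out.  */
bool
hist_counter_value_ok (enum hist_kind kind, unsigned index, int64_t value)
{
  /* Assumption: IOR counters track pointer values, so the sign bit
     may legitimately be set.  */
  if (kind == HIST_KIND_IOR)
    return true;
  /* Counter 0 is the overall ('all') count; overflow after a merge
     is stored as a negative value.  */
  if (kind == HIST_KIND_INDIR_CALL && index == 0)
    return true;
  /* Everything else must be non-negative.  */
  return value >= 0;
}
```

Per-bucket counters (index greater than zero) are still required to be
non-negative, which matches the comment added by the patch.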

Noticed as an ICE on python PGO build where gcc crashes as:

during IPA pass: modref
a.c:36:1: ICE: in stream_out_histogram_value, at value-prof.cc:340
   36 | }
  | ^
stream_out_histogram_value(output_block*, histogram_value_t*)
gcc/value-prof.cc:340

gcc/ChangeLog:

PR gcov-profile/105282
* value-prof.cc (stream_out_histogram_value): Allow negative counts
on HIST_TYPE_INDIR_CALL.

(cherry picked from commit 90a29845bfe7d6002e6c2fd49a97820b00fbc4a3)
---
 gcc/value-prof.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index 42748771192..688089b04d2 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -336,6 +336,10 @@ stream_out_histogram_value (struct output_block *ob, 
histogram_value hist)
/* Note that the IOR counter tracks pointer values and these can have
   sign bit set.  */
;
+  else if (hist->type == HIST_TYPE_INDIR_CALL && i == 0)
+   /* 'all' counter overflow is stored as a negative value. Individual
+  counters and values are expected to be non-negative.  */
+   ;
   else
gcc_assert (value >= 0);
 
-- 
2.35.1

[PATCH] libsanitizer: asan: Always skip first object from dl_iterate_phdr

2022-04-19 Thread Michael Forney

All platforms return the main executable as the first dl_phdr_info.
FreeBSD, NetBSD, Solaris, and Linux-musl place the executable name
in the dlpi_name field of this entry. It appears that only Linux-glibc
uses the empty string.

To make this work generically on all platforms, unconditionally
skip the first object (like is currently done for FreeBSD and NetBSD).
This fixes first DSO detection on Linux-musl. It also would likely
fix detection on Solaris/Illumos if it were to gain PIE support
(since dlpi_addr would not be NULL).

Additionally, only skip the Linux vDSO on Linux.

Finally, use the empty string as the "seen first dl_phdr_info"
marker rather than (char *)-1. If there was no other object, we
would try to dereference it for a string comparison.
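
The intended behaviour can be modelled without calling dl_iterate_phdr at
all.  The sketch below is a simplified stand-in, not the libsanitizer code:
it walks an array of object names in the order the loader would report them,
and first_dso_callback / find_first_dso are names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model of the callback: DLPI_NAME is the name of the current object,
   *NAME is the running state (NULL before the first call).
   Returns 1 to stop the iteration, 0 to continue.  */
static int
first_dso_callback (const char *dlpi_name, const char **name)
{
  /* The first entry is always the main program: mark it as seen with
     the empty string (safe to dereference, unlike (char *)-1).  */
  if (*name == NULL)
    {
      *name = "";
      return 0;
    }
  /* Linux only in the real patch: skip the vDSO, including the
     empty-named variant reported by old glibc versions.  */
  if (dlpi_name[0] == '\0'
      || strncmp (dlpi_name, "linux-", sizeof ("linux-") - 1) == 0)
    return 0;
  *name = dlpi_name;
  return 1;
}

/* Return the first real DSO name, "" if only the main program was
   seen, or NULL if there were no objects at all.  */
const char *
find_first_dso (const char *const *names, size_t count)
{
  const char *first = NULL;

  assert (names != NULL || count == 0);
  for (size_t i = 0; i < count; i++)
    if (first_dso_callback (names[i], &first))
      break;
  return first;
}
```

A musl-style list ("/bin/app", "libasan.so.8", ...) and a glibc-style list
("", "linux-vdso.so.1", "libasan.so.8", ...) both yield "libasan.so.8" here,
and a list with only the executable yields the empty-string marker.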

Cherry-picked from upstream commit 795b07f5498c.

libsanitizer/

* asan/asan_linux.cpp: Always skip first object from
dl_iterate_phdr.
---
Is it possible that this change might make gcc 12? It fixes asan
on musl (without setting ASAN_OPTIONS=verify_asan_link_order=false),
which would be quite nice since this will be the first release that
libsanitizer works on musl.

 libsanitizer/asan/asan_linux.cpp | 30 --
 1 file changed, 12 insertions(+), 18 deletions(-)

diff --git a/libsanitizer/asan/asan_linux.cpp b/libsanitizer/asan/asan_linux.cpp
index ad3693d5e6a..89ee48db7a2 100644
--- a/libsanitizer/asan/asan_linux.cpp
+++ b/libsanitizer/asan/asan_linux.cpp
@@ -131,30 +131,24 @@ static int FindFirstDSOCallback(struct dl_phdr_info 
*info, size_t size,
   VReport(2, "info->dlpi_name = %s\tinfo->dlpi_addr = %p\n", info->dlpi_name,
   (void *)info->dlpi_addr);
 
-  // Continue until the first dynamic library is found
-  if (!info->dlpi_name || info->dlpi_name[0] == 0)
-return 0;
-
-  // Ignore vDSO
-  if (internal_strncmp(info->dlpi_name, "linux-", sizeof("linux-") - 1) == 0)
-return 0;
+  const char **name = (const char **)data;
 
-#if SANITIZER_FREEBSD || SANITIZER_NETBSD
   // Ignore first entry (the main program)
-  char **p = (char **)data;
-  if (!(*p)) {
-*p = (char *)-1;
+  if (!*name) {
+*name = "";
 return 0;
   }
-#endif
 
-#if SANITIZER_SOLARIS
-  // Ignore executable on Solaris
-  if (info->dlpi_addr == 0)
+#if SANITIZER_LINUX
+  // Ignore vDSO. glibc versions earlier than 2.15 (and some patched
+  // by distributors) return an empty name for the vDSO entry, so
+  // detect this as well.
+  if (!info->dlpi_name[0] ||
+  internal_strncmp(info->dlpi_name, "linux-", sizeof("linux-") - 1) == 0)
 return 0;
-#endif
+#endif
 
-  *(const char **)data = info->dlpi_name;
+  *name = info->dlpi_name;
   return 1;
 }
 
@@ -175,7 +169,7 @@ void AsanCheckDynamicRTPrereqs() {
   // Ensure that dynamic RT is the first DSO in the list
   const char *first_dso_name = nullptr;
   dl_iterate_phdr(FindFirstDSOCallback, &first_dso_name);
-  if (first_dso_name && !IsDynamicRTName(first_dso_name)) {
+  if (first_dso_name && first_dso_name[0] && !IsDynamicRTName(first_dso_name)) 
{
 Report("ASan runtime does not come first in initial library list; "
"you should either link runtime to your application or "
"manually preload it with LD_PRELOAD.\n");
-- 
2.35.1

[PATCH] libgo: Fix non-portable sed commands

2022-04-19 Thread Jonathan Wakely via Gcc-patches

This fixes the libgo build if /usr/bin/sed is found before
/usr/xpg4/bin/sed on Solaris.

Tested sparc-sun-solaris2.11, OK for trunk?

-- >8 --

Solaris sed does not allow '^' and '$' anchors inside groups, and does
not support the '+' meta-character.
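
For anyone wanting to check the portable spellings outside of sed, here is a
small sketch using the POSIX regcomp/regexec interface in basic-regex mode
(flags 0).  bre_match_length is a helper invented for this example.  Note
that some libc implementations accept \+ in basic regexes as an extension,
so this only shows that the replacement patterns work, not that the old ones
fail everywhere.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return the length of the first match of the POSIX basic regex
   PATTERN in TEXT, or -1 if there is no match or PATTERN is invalid.  */
int
bre_match_length (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t m;
  int len = -1;

  assert (pattern != NULL && text != NULL);
  /* cflags == 0 selects basic regular expressions (BRE).  */
  if (regcomp (&re, pattern, 0) != 0)
    return -1;
  if (regexec (&re, text, 1, &m, 0) == 0)
    len = (int) (m.rm_eo - m.rm_so);
  regfree (&re);
  return len;
}
```

For instance, "go1\.[0-9][0-9]*" covers all six characters of "go1.13", and
"\(version.*\)$" with the anchor outside the group matches from "version" to
the end of the line.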

ChangeLog:

* libgo/Makefile.am (check-tail): Fix non-portable sed command.
* libgo/Makefile.in: Regenerate.
* libgo/match.sh (gobuild): Fix non-portable sed command.
---
 libgo/Makefile.am | 2 +-
 libgo/Makefile.in | 2 +-
 libgo/match.sh| 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/libgo/Makefile.am b/libgo/Makefile.am
index e0a1eec52a2..a5d4b6a3525 100644
--- a/libgo/Makefile.am
+++ b/libgo/Makefile.am
@@ -1305,7 +1305,7 @@ check-tail: check-recursive check-multi
if test "$$untested" -ne "0"; then \
  echo "# of untested testcases $$untested" >> libgo.sum; \
fi; \
-   echo `echo $(GOC) | sed -e 's/ .*//'`  `$(GOC) -v 2>&1 | grep " 
version" | sed -n -e 's/.* \(version.*$$\)/\1/p'` >> libgo.sum; \
+   echo `echo $(GOC) | sed -e 's/ .*//'`  `$(GOC) -v 2>&1 | grep " 
version" | sed -n -e 's/.* \(version.*\)$$/\1/p'` >> libgo.sum; \
echo >> libgo.log; \
echo "runtest completed at `date`" >> libgo.log; \
if test "$$fail" -ne "0"; then \
diff --git a/libgo/Makefile.in b/libgo/Makefile.in
index 7bef5df90d1..22f48a52938 100644
--- a/libgo/Makefile.in
+++ b/libgo/Makefile.in
@@ -3189,7 +3189,7 @@ check-tail: check-recursive check-multi
if test "$$untested" -ne "0"; then \
  echo "# of untested testcases $$untested" >> libgo.sum; \
fi; \
-   echo `echo $(GOC) | sed -e 's/ .*//'`  `$(GOC) -v 2>&1 | grep " 
version" | sed -n -e 's/.* \(version.*$$\)/\1/p'` >> libgo.sum; \
+   echo `echo $(GOC) | sed -e 's/ .*//'`  `$(GOC) -v 2>&1 | grep " 
version" | sed -n -e 's/.* \(version.*\)$$/\1/p'` >> libgo.sum; \
echo >> libgo.log; \
echo "runtest completed at `date`" >> libgo.log; \
if test "$$fail" -ne "0"; then \
diff --git a/libgo/match.sh b/libgo/match.sh
index 139d0cdbe64..7ed587ff794 100755
--- a/libgo/match.sh
+++ b/libgo/match.sh
@@ -100,7 +100,7 @@ fi
 
 gobuild() {
 line=$(echo "$1" | sed -e 's|//go:build ||')
-line=$(echo "$line" | sed -e 's/go1\.[0-9]\+/1/g' -e 
's/goexperiment\./goexperiment/')
+line=$(echo "$line" | sed -e 's/go1\.[0-9][0-9]*/1/g' -e 
's/goexperiment\./goexperiment/')
 line=" $line "
 wrap='[ ()!&|]'
 for ones in $goarch $goos $cgotag $cmdlinetag gccgo 
goexperimentfieldtrack; do
-- 
2.34.1

[PATCH] MAINTAINERS: Update my email address.

2022-04-19 Thread Richard Henderson via Gcc-patches

2022-04-19  Richard Henderson  

* MAINTAINERS: Update my email address.
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 30f81b3dd52..15973503722 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -53,7 +53,7 @@ aarch64 port  Richard Earnshaw

 aarch64 port   Richard Sandiford   
 aarch64 port   Marcus Shawcroft
 aarch64 port   Kyrylo Tkachov  
-alpha port Richard Henderson   
+alpha port Richard Henderson   
 amdgcn portJulian Brown
 amdgcn portAndrew Stubbs   
 arc port   Joern Rennecke  
-- 
2.34.1

[PATCH v3] RISC-V: Add support for inlining subword atomic operations

2022-04-19 Thread Patrick O'Neill

RISC-V has no support for subword atomic operations; code currently
generates libatomic library calls.

This patch changes the default behavior to inline subword atomic calls 
(using the same logic as the existing library call).
Behavior can be specified using the -minline-atomics and
-mno-inline-atomics command line flags.

gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm.
This will need to stay for backwards compatibility and the
-mno-inline-atomics flag.

2022-04-19 Patrick O'Neill 

PR target/104338
* riscv-protos.h: Add helper function stubs.
* riscv.cc: Add helper functions for subword masking.
* riscv.opt: Add command-line flag.
* sync.md: Add masking logic and inline asm for fetch_and_op,
fetch_and_nand, CAS, and exchange ops.
* invoke.texi: Add blurb regarding command-line flag.
* inline-atomics-1.c: New test.
* inline-atomics-2.c: Likewise.
* inline-atomics-3.c: Likewise.
* inline-atomics-4.c: Likewise.
* inline-atomics-5.c: Likewise.
* inline-atomics-6.c: Likewise.
* inline-atomics-7.c: Likewise.
* inline-atomics-8.c: Likewise.
* atomic.c: Add reference to duplicate logic.

Signed-off-by: Patrick O'Neill 
---
There may be further concerns about the memory consistency of these 
operations, but this patch focuses on simply moving the logic inline.

See "[RFC 0/7] RISCV: Implement ISA Manual Table A.6 Mappings" sent to
the gcc-patches mailing list on 2022-04-07 for info about these
concerns.
---
See target/84568 on bugzilla for ABI break info.
---
The implementation for fetch_nand is clunky. I'm not convinced that this
is the best implementation since it duplicates all the logic in order to
change ~2 lines.
---
v2 Changelog:
 - Add texti blurb
 - Update target flag
 - add 'UNSPEC_SYNC_OLD_OP_SUBWORD' for subword ops
---
v3 Changelog:
 - Update target flag to be on by default
 - Add inline CAS and fetch&nand ops
 - Remove brittle tests & -latomic flag for inline tests
 - Add compare_and_exchange and exchange tests
 - Move duplicate masking logic to riscv.cc helper functions
---
 gcc/config/riscv/riscv-protos.h   |   2 +
 gcc/config/riscv/riscv.cc |  52 ++
 gcc/config/riscv/riscv.opt|   4 +
 gcc/config/riscv/sync.md  | 318 ++
 gcc/doc/invoke.texi   |   7 +
 .../gcc.target/riscv/inline-atomics-1.c   |  18 +
 .../gcc.target/riscv/inline-atomics-2.c   |  19 +
 .../gcc.target/riscv/inline-atomics-3.c   | 569 ++
 .../gcc.target/riscv/inline-atomics-4.c   | 566 +
 .../gcc.target/riscv/inline-atomics-5.c   |  87 +++
 .../gcc.target/riscv/inline-atomics-6.c   |  87 +++
 .../gcc.target/riscv/inline-atomics-7.c   |  69 +++
 .../gcc.target/riscv/inline-atomics-8.c   |  69 +++
 libgcc/config/riscv/atomic.c  |   2 +
 14 files changed, 1869 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-8.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 20c2381c21a..14f3c8f0d4e 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -74,6 +74,8 @@ extern bool riscv_expand_block_move (rtx, rtx, rtx);
 extern bool riscv_store_data_bypass_p (rtx_insn *, rtx_insn *);
 extern rtx riscv_gen_gpr_save_insn (struct riscv_frame_info *);
 extern bool riscv_gpr_save_operation_p (rtx);
+extern void riscv_subword_address (rtx, rtx *, rtx *, rtx *, rtx *);
+extern void riscv_lshift_subword (machine_mode, rtx, rtx, rtx *);
 
 /* Routines implemented in riscv-c.cc.  */
 void riscv_cpu_cpp_builtins (cpp_reader *);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index ee756aab694..cfd2f7710db 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5587,6 +5587,58 @@ riscv_asan_shadow_offset (void)
   return TARGET_64BIT ? (HOST_WIDE_INT_1 << 29) : 0;
 }
 
+/* Helper function for extracting a subword from memory.  */
+
+void
+riscv_subword_address (rtx mem, rtx *aligned_mem, rtx *shift, rtx *mask,
+  rtx *not_mask)
+{
+  /* Align the memory address to a word.  */
+  rtx addr = force_reg (Pmode, XEXP (mem, 0));
+
+  rtx aligned_addr = gen_reg_rtx (Pmode);
+  emit_move_insn (aligned_addr,  gen_rtx_AND (Pmode, addr,
+ gen_int_mode (-4, Pmode)));

[committed] sparc: Preserve ORIGINAL_REGNO in epilogue_renumber [PR105257]

2022-04-19 Thread Jakub Jelinek via Gcc-patches

Hi!

The following testcase ICEs, because the pic register is
(reg:DI 24 %i0 [109]) and is used in the delay slot of a return.
We invoke epilogue_renumber and that changes it to
(reg:DI 8 %o0) which no longer satisfies sparc_pic_register_p
predicate, so we don't recognize the insn anymore.

The following patch fixes that by preserving ORIGINAL_REGNO if
specified, so we get (reg:DI 8 %o0 [109]) instead.

Rainer has kindly bootstrapped/regtested this on sparcv9-sun-solaris2.11,
acked by Eric in the PR, committed to trunk.

2022-04-19  Jakub Jelinek  

PR target/105257
* config/sparc/sparc.cc (epilogue_renumber): If ORIGINAL_REGNO,
use gen_raw_REG instead of gen_rtx_REG and copy over also
ORIGINAL_REGNO.  Use return 0; instead of /* fallthrough */.

* gcc.dg/pr105257.c: New test.

--- gcc/config/sparc/sparc.cc.jj2022-01-18 11:58:59.254986743 +0100
+++ gcc/config/sparc/sparc.cc   2022-04-19 11:12:48.396795736 +0200
@@ -8884,8 +8884,20 @@ epilogue_renumber (rtx *where, int test)
   if (REGNO (*where) >= 8 && REGNO (*where) < 24)  /* oX or lX */
return 1;
   if (! test && REGNO (*where) >= 24 && REGNO (*where) < 32)
-   *where = gen_rtx_REG (GET_MODE (*where), OUTGOING_REGNO 
(REGNO(*where)));
-  /* fallthrough */
+   {
+ if (ORIGINAL_REGNO (*where))
+   {
+ rtx n = gen_raw_REG (GET_MODE (*where),
+  OUTGOING_REGNO (REGNO (*where)));
+ ORIGINAL_REGNO (n) = ORIGINAL_REGNO (*where);
+ *where = n;
+   }
+ else
+   *where = gen_rtx_REG (GET_MODE (*where),
+ OUTGOING_REGNO (REGNO (*where)));
+   }
+  return 0;
+
 case SCRATCH:
 case PC:
 case CONST_INT:
--- gcc/testsuite/gcc.dg/pr105257.c.jj  2022-04-19 11:08:15.239628246 +0200
+++ gcc/testsuite/gcc.dg/pr105257.c 2022-04-19 11:10:02.951117000 +0200
@@ -0,0 +1,16 @@
+/* PR target/105257 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-fpic" { target fpic } } */
+
+extern int sigsetjmp (void **, int);
+void *buf[32];
+void (*fn) (void);
+
+const char *
+bar (void)
+{
+  sigsetjmp (buf, 0);
+  fn ();
+  return "";
+}

Jakub

Re: [PATCH][v2] rtl-optimization/105231 - distribute_notes and REG_EH_REGION

2022-04-19 Thread Segher Boessenkool

On Tue, Apr 19, 2022 at 05:00:12PM +0200, Richard Biener wrote:
> On Tue, 19 Apr 2022, Segher Boessenkool wrote:
> > > > And that always is safe?  Why do we have REG_EH_REGION for those cases
> > > > at all, then?
> > > 
> > > It's the only "safe" thing to do at distribute_notes time I think.  If
> > > it is not "safe" (it might lose must-not-throw EH events, or lose
> > > optimization when dropping nothrow) we probably have to reject the
> > > combination earlier.
> > 
> > So assert this does not happen please?
> 
> I'm not sure how I can.

Losing optimisation is always safe.  If removing a must-not-throw is not
safe we should assert this does not happen (or if we know it does
happen, actually fix that).

> AFAICS I cannot rely on all other REG_EH_REGION
> notes be already present on 'i3' when processing notes from the
> original i1 or i2.  I can only assert we never have any REG_EH_REGION
> notes from i1 or i2 but I already know we do from the last round
> of testsuite failures ;)

The notes originally from i3 are distributed before those from i2, which
are before those from i1, and those are distributed before those from
i0.

> If you can point me to a (hopefully single) place in combine.cc
> where the set of original source insns is fixed and the original
> notes still present I can come up with a test for what should be
> safe input.

Up until
/* Get the old REG_NOTES and LOG_LINKS from all our insns and
   clear them.  */
all old notes are still intact.

The insns are i3, i2, etc.; their patterns can change during combine,
but the insns themselves don't.

> > > As I understand combining to multiple insns always happens via
> > > a split (so we combine up to three insns into one and then might
> > > end up splitting that into at most two insns).
> > 
> > Yes, but not necessarily a define_split or similar: combine itself knows
> > how to split things, too.  The obvious one is a parallel of multiple
> > sets, but it can also break the rhs of a set apart, for example.
> 
> I see.  So I was aiming at "distributing" the notes to the single
> combined insn _before_ splitting it and then making the split process
> DTRT - that's much easier to get right than the current setup where
> we receive notes from random insns a time to be distributed to another
> random insn.

That is impossible to get right, in general.  That is why there is a
from_insn argument to distribute_notes.  If you first move everything
to one combined insn you would have to pry things apart again :-(

> > > The only case we
> > > could in theory special-case is when _all_ original insns combined
> > > have the exact same REG_EH_REGION (all have it with the same
> > > landing pad number) or some have none but i3 at least has one.
> > > Then we should be able to distribute the note to all possibly
> > > two result insns.  But I can't see that distribute_note has
> > > this info readily available (that there not exist conflicting
> > > REG_EH_REGIONs, like MNT on the original i2 and a > 0 one on i3).
> > 
> > Not sure this would be worth the complexity.  Do we see this happen
> > ever, can we even test it?  :-)
> 
> We cannot test this in distribute_notes, we could test for this when
> we have all source insns and reject the cases we cannot possibly
> recover from in the current distribute_notes implementation.

Yes.

> > None of the insns other than i3 are call insns, ever.
> 
> Good.

An understatement :-)


Segher

Re: [PATCH] c++: Fix up CONSTRUCTOR_PLACEHOLDER_BOUNDARY handling [PR105256]

2022-04-19 Thread Jason Merrill via Gcc-patches

On Tue, Apr 19, 2022, 6:53 AM Jakub Jelinek  wrote:

> On Mon, Apr 18, 2022 at 09:57:12AM -0400, Patrick Palka wrote:
> > > Hmm, Patrick made a similar change and then reverted it for PR90996.
> > > But it makes sense to me; when we replace placeholders, it's appropriate
> > > to look at the whole aggregate initialization rather than the innermost
> > > CONSTRUCTOR that has DMIs.  Patrick, was there a reason that change
> > > seemed wrong to you, or was it just unnecessary for the bug you were
> > > working on?
> >
> > The reverted change and Jakub's more general patch seem right/safe to
> > me FWIW, I just couldn't come up with a testcase that demonstrated its
> > need at the time unfortunately.
>
> So is the patch ok for trunk then?
> Apparently it is also a recent regression on 11 branch (since Marek's
> r11-9711) when compiling firefox, ok for 11.3 as well?
>

Ok for both.

> > > 2022-04-15  Jakub Jelinek  
> > > >
> > > >   PR c++/105256
> > > >   * typeck2.cc (process_init_constructor_array,
> > > >   process_init_constructor_record, process_init_constructor_union): Move
> > > >   CONSTRUCTOR_PLACEHOLDER_BOUNDARY flag from CONSTRUCTOR elements to the
> > > >   containing CONSTRUCTOR.
> > > >
> > > >   * g++.dg/cpp0x/pr105256.C: New test.
>
> Jakub
>
>

Ping: [PATCH 0/2] avr: Add support AVR-DA and DB series devices

2022-04-19 Thread Joel Holdsworth via Gcc-patches

Ping patch.

Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592668.html

Thanks
Joel Holdsworth

Re: [PATCH] maintainer-scripts: Adding GIT_CUSTOMREPO parameters to gcc_release script.

2022-04-19 Thread Joseph Myers

On Tue, 12 Apr 2022, Navid Rahimi via Gcc-patches wrote:

> Hi GCC community,
> 
> I need to have ability to point to custom repository in gcc_release 
> script. This small patch 1) does add a parameter "-g" to add custom 

The purpose of this script is for building official GCC releases, release 
candidates and snapshots for the GCC project; not anything else (although 
it may sometimes be useful to have functionality that's only relevant for 
testing changes to the script rather than as part of actual release, 
release candidate or snapshot builds).  Why would a custom repository be 
relevant for such releases, release candidates or snapshots built for the 
GCC project?

In general, *everything* in the maintainer-scripts/ directory only needs 
to work for the specific limited purposes for which it's run by the 
gccadmin account on gcc.gnu.org; unlike contrib/, it's not expected or 
intended to be more generally useful.

> repository to gcc_release , 2) does add a line to download prerequisites 
> before building GCC (download_prerequisites) which is not present in 
> gcc_release right now.

Official GCC releases, release candidates and snapshots are not meant to 
include those prerequisites in the source directory, so calling that 
script (which puts them there) seems incorrect to me; the script is for 
users to call after downloading such a release, release candidate or 
snapshot, if they don't have the prerequisites built or installed in some 
other way.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 0/4] Use pointer arithmetic for array references [PR102043]

2022-04-19 Thread Richard Biener via Gcc-patches

On Sat, Apr 16, 2022 at 6:57 PM Mikael Morin via Gcc-patches
 wrote:
>
> Hello,
>
> this is a fix for PR102043, which is a wrong code bug caused by the
> middle-end concluding from array indexing that the array index is
> non-negative.  This is a wrong assumption for "reversed arrays",
> that is arrays whose descriptor makes accesses to the array from
> last element to first element.  More generally the wrong cases are
> arrays with a descriptor having a negative stride for at least one
> dimension.
>
> I have been trying to fix this by stopping the front-end from generating
> bogus code, by either stripping array-ness from descriptor data
> pointers, or by changing the initialization of data pointers to point
> to the first element in memory order instead of the first element in
> access order (which is the last in memory order for reversed arrays).
> Both ways are very impacting changes to the frontend and I haven’t
> been able to eliminate all the regressions in time using either way.
>
> However, Richi showed with a patch attached to the PR that array
> references are crucial for the problem to appear, and everything works
> if array indexing is replaced with pointer arithmetic.  This is
> much simpler and doesn’t imply invasive changes to the frontend.
>
> I have built on top of his patch to keep the array indexing in cases
> where the change to pointer arithmetic is not necessary, either because
> the array is not a fortran array with a descriptor, or because it’s
> known to be contiguous.  This has the benefit of reducing the churn
> in the dumped code patterns used in the testsuite.  It also avoids
> ICE regression such as interface_12.f90 or result_in_spec.f90, but
> I can’t exclude that those could be a real problem made latent.
>
> Patches 1 to 3 are preliminary changes to avoid regressions.  The main
> change is number 4, the last in the series.
>
> Regression tested on x86_64-pc-linux-gnu.  OK for master?

I've also tested the patch and built SPEC CPU 2017 successfully
on x86_64 with -Ofast -flto -march=znver2.  For 548.exchange2_r
I see a ~3% runtime regression caused by the change; the
other 6 Fortran-using benchmarks show no runtime behavior change.
I have not analyzed the 548.exchange2_r regression (but confirmed
it with a 3-run).

That said, I believe this is the only reasonable thing to do for GCC 12;
all other options require invasive changes in the middle-end.

So OK from my side, though I'm not familiar enough with the GFortran frontend
to review the changes besides the gfc_build_array_ref change.

Thanks,
Richard.


>
> Mikael Morin (4):
>   fortran: Pre-evaluate string pointers. [PR102043]
>   fortran: Update index extraction code. [PR102043]
>   fortran: Generate an array temporary reference [PR102043]
>   fortran: Use pointer arithmetic to index arrays [PR102043]
>
>  gcc/fortran/trans-array.cc|  60 +-
>  gcc/fortran/trans-expr.cc |   9 +-
>  gcc/fortran/trans-io.cc   |  48 -
>  gcc/fortran/trans.cc  |  42 +++-
>  gcc/fortran/trans.h   |   4 +-
>  .../gfortran.dg/array_reference_3.f90 | 195 ++
>  gcc/testsuite/gfortran.dg/c_loc_test_22.f90   |   4 +-
>  gcc/testsuite/gfortran.dg/dependency_49.f90   |   3 +-
>  gcc/testsuite/gfortran.dg/finalize_10.f90 |   2 +-
>  .../gfortran.dg/negative_stride_1.f90 |  25 +++
>  .../gfortran.dg/vector_subscript_8.f90|  16 ++
>  .../gfortran.dg/vector_subscript_9.f90|  21 ++
>  12 files changed, 401 insertions(+), 28 deletions(-)
>  create mode 100644 gcc/testsuite/gfortran.dg/array_reference_3.f90
>  create mode 100644 gcc/testsuite/gfortran.dg/negative_stride_1.f90
>  create mode 100644 gcc/testsuite/gfortran.dg/vector_subscript_8.f90
>  create mode 100644 gcc/testsuite/gfortran.dg/vector_subscript_9.f90
>
> --
> 2.35.1
>
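
For readers unfamiliar with "reversed arrays", here is a minimal C sketch of
the idea behind the fix.  It is not the gfortran implementation: the
descriptor layout and the names `descriptor` and `element_at` are hypothetical
and exist only for illustration.  The point is that with a negative stride the
offset from the data pointer is negative, which plain pointer arithmetic
handles naturally, whereas an array reference invites the middle-end to assume
a non-negative index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified one-dimensional array descriptor.  */
struct descriptor
{
  int *data;         /* First element in access order.  */
  ptrdiff_t stride;  /* Step between consecutive accesses; may be negative.  */
  size_t extent;     /* Number of elements.  */
};

/* Fetch element I in access order using pointer arithmetic only.
   For a reversed array the computed offset is negative.  */
static int
element_at (const struct descriptor *d, size_t i)
{
  assert (i < d->extent);
  return *(d->data + (ptrdiff_t) i * d->stride);
}
```

A reversed view over `int storage[4]` would set `data = &storage[3]` and
`stride = -1`, so that `element_at (&rev, 0)` reads the last element in memory
order.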

Re: [PATCH][v2] rtl-optimization/105231 - distribute_notes and REG_EH_REGION

2022-04-19 Thread Richard Biener via Gcc-patches

On Tue, 19 Apr 2022, Segher Boessenkool wrote:

> On Tue, Apr 19, 2022 at 02:58:26PM +0200, Richard Biener wrote:
> > On Tue, 19 Apr 2022, Segher Boessenkool wrote:
> > The assert was for any landing pad which obviously failed - the
> > testsuite fails were all for MUST_NOT_THROW (negative) regions
> > which do not end basic-blocks.
> 
> I see, thanks.
> 
> > > > We are also considering all REG_EH_REGION equal, including
> > > > must-not-throw and nothrow kinds but when those are not from i3
> > > > we have no good idea what should happen to them, so the following
> > > > simply drops them.
> > > 
> > > And that always is safe?  Why do we have REG_EH_REGION for those cases
> > > at all, then?
> > 
> > It's the only "safe" thing to do at distribute_notes time I think.  If
> > it is not "safe" (it might lose must-not-throw EH events, or lose
> > optimization when dropping nothrow) we probably have to reject the
> > combination earlier.
> 
> So assert this does not happen please?

I'm not sure how I can.  AFAICS I cannot rely on all other REG_EH_REGION
notes be already present on 'i3' when processing notes from the
original i1 or i2.  I can only assert we never have any REG_EH_REGION
notes from i1 or i2 but I already know we do from the last round
of testsuite failures ;)

If you can point me to a (hopefully single) place in combine.cc
where the set of original source insns is fixed and the original
notes still present I can come up with a test for what should be
safe input.

> > As I understand combining to multiple insns always happens via
> > a split (so we combine up to three insns into one and then might
> > end up splitting that into at most two insns).
> 
> Yes, but not necessarily a define_split or similar: combine itself knows
> how to split things, too.  The obvious one is a parallel of multiple
> sets, but it can also break the rhs of a set apart, for example.

I see.  So I was aiming at "distributing" the notes to the single
combined insn _before_ splitting it and then making the split process
DTRT - that's much easier to get right than the current setup where
we receive notes from random insns one at a time to be distributed to another
random insn.

> > The only case we
> > could in theory special-case is when _all_ original insns combined
> > have the exact same REG_EH_REGION (all have it with the same
> > landing pad number) or some have none but i3 at least has one.
> > Then we should be able to distribute the note to all possibly
> > two result insns.  But I can't see that distribute_note has
> > this info readily available (that there not exist conflicting
> > REG_EH_REGIONs, like MNT on the original i2 and a > 0 one on i3).
> 
> Not sure this would be worth the complexity.  Do we see this happen
> ever, can we even test it?  :-)

We cannot test this in distribute_notes, we could test for this when
we have all source insns and reject the cases we cannot possibly
recover from in the current distribute_notes implementation.

> > > > +   /* A REG_EH_REGION note transfering control can only ever come
> > > > +  from i3.  */
> > > > +   gcc_assert (lp_nr <= 0 || from_insn == i3);
> > > 
> > >   if (lp_nr > 0)
> > > gcc_assert (from_insn == i3);
> > > is less obfuscated ;-)
> > 
> > I find that less obvious, but you can have it your way if you like.
> 
> It corresponds more directly to the comment, the narrative guides the
> reader?  But please use whichever you think best.

I've adjusted to how you like it.

> > > > +   /* For nothrow (lp_nr == 0 or lp_nr == INT_MIN) and
> > > > +  for insns in a MUST_NOT_THROW region (lp_nr < 0)
> > > > +  it's difficult to decide what to do for notes
> > > > +  coming from an insn that is not i3.  Just drop
> > > > +  those.  */
> > > 
> > > That sounds like we do not know what is correct to do, so just sweep it
> > > under the carpet and hope it works out.  "Just drop those, that is
> > > always safe"?  (Is it always safe?)
> > 
> > If it is not safe we should have rejected the combination.  I fully
> > expect that we'd need to have a piece during analysis constraining
> > what cases we feed into here to be really "safe".  I'm really not
> > familiar with combine so I know nothing of the constraints it has
> > (like is only i3 ever going to be a CALL_INSN_P?)
> 
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/combine.cc;h=53dcac92abc248a80fc32dd1d3bb641a650d4d9a;hb=HEAD#l1882
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/combine.cc;h=53dcac92abc248a80fc32dd1d3bb641a650d4d9a;hb=HEAD#l2644
> 
> None of the insns other than i3 are call insns, ever.

Good.

Richard.

> combine does not consider EH_REGIONs anywhere else.  It uses
> insn_nothrow_p in some places though.
> 
> Segher

Re: [PATCH] Add condition coverage profiling

2022-04-19 Thread Jørgen Kvalsvik via Gcc-patches

On 07/04/2022 14:04, Martin Liška wrote:
> On 3/28/22 16:40, Jørgen Kvalsvik via Gcc-patches wrote:
>> ... And with another tiny change that fixes Martin's while (1); case.
> 
> Hello.
> 
> Back to this ;) Thank you for the updated version of the patch. I have a
> couple of comments/requests:

Sorry for the late response, this email was eaten by spam filtering for some 
reason.

> 1) I noticed the following cannot be linked:
> 
> $ cat errors.C
> char trim_filename_name;
> int r;
> 
> void trim_filename() {
>   if (trim_filename_name)
>     r = 123;
>   while (trim_filename_name)
>     ;
> }
> 
> int
> main() {}
> 
> $ g++ errors.C -fprofile-conditions -O2
> mold: error: undefined symbol: /tmp/ccmZANB5.o: __gcov8._Z13trim_filenamev
> collect2: error: ld returned 1 exit status

I was able to reproduce this, and I think I have it fixed (that is, I cannot
reproduce it anymore). It looks like the counters were allocated, but never used
in the absence of an output file. I changed it so that the writes are guarded
behind an output check and the counter instrumentation is unconditional, which
makes it go away.

The coverage data dependencies are more interleaved (at least currently) than
all the other counters, which makes the flow awkward. I am not convinced a
proper fix is actually worth it, or if anything it would be a separate next
step.


> Btw. how much have you tested the patch set so far? Have you tried building
> something bigger with -fprofile-conditions enabled?
> 

My own test suite plus zlib, and it looks like linux is compiling fine. I
haven't rigorously studied the _output_ in these projects yet, and it is very
time consuming without a benchmark.

> 2) As noticed by Sebastian, please support the new tag in gcov-dump:
> 
> $ gcov-dump -l a.gcno
> ...
> a.gcno:    0145:  28:LINES
> a.gcno:  block 7:`a.c':11
> a.gcno:    0147:   8:UNKNOWN
> 

Will do.

> 3) Then I have some top-level formatting comments:
> 
> a) please re-run check_GNU_style.py, I still see:
> === ERROR type #1: blocks of 8 spaces should be replaced with tabs (35 error(s)) ===
> ...
> 
> b) write comments for each newly added function (and separate it by one empty
> line from the function definition)
> 
> c) do not create visual alignment, we don't use it:
> 
> +   cond->set ("count",   new json::integer_number (count));
> +   cond->set ("covered", new json::integer_number (covered));
> 
> and similar examples
> 
> d) finish multiple comments after a dot on the same line:
> 
> +    /* Number of expressions found - this value is the number of entries in the
> +   gcov output and the parameter to gcov_counter_alloc ().
> +   */
> 
> should be ... gcov_counter_alloc ().  */
> 
> e) for long lines with a function call like:
> 
> +   const int n = find_conditions_masked_by
> +   (block, expr, flags + k, path, CONDITIONS_MAX_TERMS);
> 
> do rather
> const int n
>   = find_conditions_masked_by (block, expr,
>    next_arg, ...);
> 
> f) for function params:
> 
> +int
> +collect_reachable_conditionals (
> +    basic_block pre,
> +    basic_block post,
> +    basic_block *out,
> +    int maxsize,
> +    sbitmap expr)
> 
> use rather:
> 
> int
> collect_reachable_conditionals (basic_block pre,
>     second_arg,...
> 

Consider it done for the next revision.

> In the next round, I'm going to take a look at the CFG algorithm that
> identifies and instruments the sub-expressions.

Thank you.

Re: [PATCH][v2] rtl-optimization/105231 - distribute_notes and REG_EH_REGION

2022-04-19 Thread Segher Boessenkool

On Tue, Apr 19, 2022 at 02:58:26PM +0200, Richard Biener wrote:
> On Tue, 19 Apr 2022, Segher Boessenkool wrote:
> The assert was for any landing pad which obviously failed - the
> testsuite fails were all for MUST_NOT_THROW (negative) regions
> which do not end basic-blocks.

I see, thanks.

> > > We are also considering all REG_EH_REGION equal, including
> > > must-not-throw and nothrow kinds but when those are not from i3
> > > we have no good idea what should happen to them, so the following
> > > simply drops them.
> > 
> > And that always is safe?  Why do we have REG_EH_REGION for those cases
> > at all, then?
> 
> It's the only "safe" thing to do at distribute_notes time I think.  If
> it is not "safe" (it might lose must-not-throw EH events, or lose
> optimization when dropping nothrow) we probably have to reject the
> combination earlier.

So assert this does not happen please?

> As I understand combining to multiple insns always happens via
> a split (so we combine up to three insns into one and then might
> end up splitting that into at most two insns).

Yes, but not necessarily a define_split or similar: combine itself knows
how to split things, too.  The obvious one is a parallel of multiple
sets, but it can also break the rhs of a set apart, for example.

> The only case we
> could in theory special-case is when _all_ original insns combined
> have the exact same REG_EH_REGION (all have it with the same
> landing pad number) or some have none but i3 at least has one.
> Then we should be able to distribute the note to all possibly
> two result insns.  But I can't see that distribute_note has
> this info readily available (that there not exist conflicting
> REG_EH_REGIONs, like MNT on the original i2 and a > 0 one on i3).

Not sure this would be worth the complexity.  Do we see this happen
ever, can we even test it?  :-)

> > > + /* A REG_EH_REGION note transfering control can only ever come
> > > +from i3.  */
> > > + gcc_assert (lp_nr <= 0 || from_insn == i3);
> > 
> > if (lp_nr > 0)
> >   gcc_assert (from_insn == i3);
> > is less obfuscated ;-)
> 
> I find that less obvious, but you can have it your way if you like.

It corresponds more directly to the comment, the narrative guides the
reader?  But please use whichever you think best.

> > > + /* For nothrow (lp_nr == 0 or lp_nr == INT_MIN) and
> > > +for insns in a MUST_NOT_THROW region (lp_nr < 0)
> > > +it's difficult to decide what to do for notes
> > > +coming from an insn that is not i3.  Just drop
> > > +those.  */
> > 
> > That sounds like we do not know what is correct to do, so just sweep it
> > under the carpet and hope it works out.  "Just drop those, that is
> > always safe"?  (Is it always safe?)
> 
> If it is not safe we should have rejected the combination.  I fully
> expect that we'd need to have a piece during analysis constraining
> what cases we feed into here to be really "safe".  I'm really not
> familiar with combine so I know nothing of the constraints it has
> (like is only i3 ever going to be a CALL_INSN_P?)

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/combine.cc;h=53dcac92abc248a80fc32dd1d3bb641a650d4d9a;hb=HEAD#l1882
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/combine.cc;h=53dcac92abc248a80fc32dd1d3bb641a650d4d9a;hb=HEAD#l2644

None of the insns other than i3 are call insns, ever.

combine does not consider EH_REGIONs anywhere else.  It uses
insn_nothrow_p in some places though.


Segher

Re: [PATCH][v2] rtl-optimization/105231 - distribute_notes and REG_EH_REGION

2022-04-19 Thread Richard Biener via Gcc-patches

On Tue, 19 Apr 2022, Segher Boessenkool wrote:

> Hi!
> 
> So the assert last week was for a landing pad <= 0?  < or =?

The assert was for any landing pad which obviously failed - the
testsuite fails were all for MUST_NOT_THROW (negative) regions
which do not end basic-blocks.

> On Tue, Apr 19, 2022 at 01:02:09PM +0200, Richard Biener wrote:
> > The following mitigates a problem in combine distribute_notes which
> > places an original REG_EH_REGION based on only may_trap_p which is
> > good to test whether a non-call insn can possibly throw but not if
> > actually it does or we care.  That's something we decided at RTL
> > expansion time where we possibly still know the insn evaluates
> > to a constant.
> > 
> > In fact, the REG_EH_REGION can only come from the original i3 and
> > an assert is added to that effect.  That means we only need to
> > retain the note on i3 or, if that cannot trap, drop it but we
> > should never move it to i2.  If splitting of i3 ever becomes a
> > problem here the insn split should be rejected instead.
> > 
> > We are also considering all REG_EH_REGION equal, including
> > must-not-throw and nothrow kinds but when those are not from i3
> > we have no good idea what should happen to them, so the following
> > simply drops them.
> 
> And that always is safe?  Why do we have REG_EH_REGION for those cases
> at all, then?

It's the only "safe" thing to do at distribute_notes time I think.  If
it is not "safe" (it might lose must-not-throw EH events, or lose
optimization when dropping nothrow) we probably have to reject the
combination earlier.

As I understand combining to multiple insns always happens via
a split (so we combine up to three insns into one and then might
end up splitting that into at most two insns).  The only case we
could in theory special-case is when _all_ original insns combined
have the exact same REG_EH_REGION (all have it with the same
landing pad number) or some have none but i3 at least has one.
Then we should be able to distribute the note to all possibly
two result insns.  But I can't see that distribute_note has
this info readily available (that there not exist conflicting
REG_EH_REGIONs, like MNT on the original i2 and a > 0 one on i3).

> > + {
> > +   int lp_nr = INTVAL (XEXP (note, 0));
> > +   /* A REG_EH_REGION note transfering control can only ever come
> > +  from i3.  */
> > +   gcc_assert (lp_nr <= 0 || from_insn == i3);
> 
>   if (lp_nr > 0)
> gcc_assert (from_insn == i3);
> is less obfuscated ;-)

I find that less obvious, but you can have it your way if you like.

> > +   /* For nothrow (lp_nr == 0 or lp_nr == INT_MIN) and
> > +  for insns in a MUST_NOT_THROW region (lp_nr < 0)
> > +  it's difficult to decide what to do for notes
> > +  coming from an insn that is not i3.  Just drop
> > +  those.  */
> 
> That sounds like we do not know what is correct to do, so just sweep it
> under the carpet and hope it works out.  "Just drop those, that is
> always safe"?  (Is it always safe?)

If it is not safe we should have rejected the combination.  I fully
expect that we'd need to have a piece during analysis constraining
what cases we feed into here to be really "safe".  I'm really not
familiar with combine so I know nothing of the constraints it has
(like is only i3 ever going to be a CALL_INSN_P?)
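
To make the cases in the quoted comment concrete, here is a small C sketch of
how the landing pad number is read there.  The enum and function names are
hypothetical and not taken from combine.cc; only the value ranges (0 or
INT_MIN for nothrow, other negative values for MUST_NOT_THROW, positive for a
real landing pad) come from the patch comment above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical names for the three kinds discussed in the thread.  */
enum eh_kind
{
  EH_NOTHROW,         /* lp_nr == 0 or lp_nr == INT_MIN.  */
  EH_MUST_NOT_THROW,  /* Any other negative lp_nr.  */
  EH_LANDING_PAD      /* lp_nr > 0: the note transfers control.  */
};

static enum eh_kind
classify_lp_nr (int lp_nr)
{
  if (lp_nr == 0 || lp_nr == INT_MIN)
    return EH_NOTHROW;
  if (lp_nr < 0)
    return EH_MUST_NOT_THROW;
  return EH_LANDING_PAD;
}

/* Per the patch comment, notes of the first two kinds are simply
   dropped when they come from an insn other than i3.  A positive
   lp_nr from such an insn is asserted never to happen instead.  */
static bool
dropped_when_not_from_i3 (int lp_nr)
{
  return classify_lp_nr (lp_nr) != EH_LANDING_PAD;
}
```

Whether dropping is actually safe is exactly the open question in this
thread; the sketch only restates the rule, it does not settle it.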

> Okay for trunk with maybe a word or two more there.  Thanks!

I'll see if there are more comments before pushing.

Thanks,
Richard.

Re: [PATCH] Asan changes for RISC-V.

2022-04-19 Thread joshua via Gcc-patches

Does Asan work for RISC-V currently? It seems that '-fsanitize=address' is
still unsupported for RISC-V. If I add '--enable-libsanitizer' in Makefile.in
to reconfigure, there are compile errors.
Is that because of the comment "# libsanitizer not supported rv32, but it will
break the rv64 multi-lib build, so we disable that temporally until rv32
supported" in Makefile.in?
--
From: Jim Wilson 
Sent: Thursday, October 29, 2020, 07:59
To: gcc-patches 
Cc: cooper.joshua ; Jim Wilson 

Subject: [PATCH] Asan changes for RISC-V.

We have only riscv64 asan support, there is no riscv32 support as yet.  So I
need to be able to conditionally enable asan support for the riscv target.  I
implemented this by returning zero from the asan_shadow_offset function.  This
requires a change to toplev.c and docs in target.def.

The asan support works on a 5.5 kernel, but does not work on a 4.15 kernel.
The problem is that the asan high memory region is a small wedge below
0x40.  The new kernel puts shared libraries at 0x3f and going
down which works.  But the old kernel puts shared libraries at 0x20
and going up which does not work, as it isn't in any recognized memory
region.  This might be fixable with more asan work, but we don't really need
support for old kernel versions.

The asan port is curious in that it uses 1<<29 for the shadow offset, but all
other 64-bit targets use a number larger than 1<<32.  But what we have is
working OK for now.
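
As a rough illustration of what the shadow offset is used for, the sketch
below maps an application address to its shadow address.  It assumes the usual
AddressSanitizer scale of 3 (one shadow byte per 8 application bytes) and uses
addition, as the AddressSanitizer runtime does, although the hook
documentation speaks of the offset being "ored" in.  The macro and function
names are hypothetical; only the 1<<29 value comes from this message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the default ASan scale of 8 bytes per shadow byte.  */
#define SHADOW_SCALE 3

/* Shadow offset quoted above for rv64.  */
#define RV64_SHADOW_OFFSET ((uint64_t) 1 << 29)

/* Map an application address to the address of its shadow byte.  */
static uint64_t
shadow_address (uint64_t addr)
{
  return (addr >> SHADOW_SCALE) + RV64_SHADOW_OFFSET;
}
```

With these assumptions, address 0 maps to 0x20000000 and every 8 application
bytes share one shadow byte.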

I did a make check RUNTESTFLAGS="asan.exp" on Fedora rawhide image running on
qemu and the results look reasonable.

  === gcc Summary ===

# of expected passes  1905
# of unexpected failures 11
# of unsupported tests  224

  === g++ Summary ===

# of expected passes  2002
# of unexpected failures 6
# of unresolved testcases 1
# of unsupported tests  175

OK?

Jim

2020-10-28  Jim Wilson  

 gcc/
 * config/riscv/riscv.c (riscv_asan_shadow_offset): New.
 (TARGET_ASAN_SHADOW_OFFSET): New.
 * doc/tm.texi: Regenerated.
 * target.def (asan_shadow_offset); Mention that it can return zero.
 * toplev.c (process_options): Check for and handle zero return from
 targetm.asan_shadow_offset call.

Co-Authored-By: cooper.joshua 
---
 gcc/config/riscv/riscv.c | 16 
 gcc/doc/tm.texi  |  3 ++-
 gcc/target.def   |  3 ++-
 gcc/toplev.c |  3 ++-
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 989a9f15250..6909e200de1 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -5299,6 +5299,19 @@ riscv_gpr_save_operation_p (rtx op)
   return true;
 }

+/* Implement TARGET_ASAN_SHADOW_OFFSET.  */
+
+static unsigned HOST_WIDE_INT
+riscv_asan_shadow_offset (void)
+{
+  /* We only have libsanitizer support for RV64 at present.
+
+ This number must match kRiscv*_ShadowOffset* in the file
+ libsanitizer/asan/asan_mapping.h which is currently 1<<29 for rv64,
+ even though 1<<36 makes more sense.  */
+  return TARGET_64BIT ? (HOST_WIDE_INT_1 << 29) : 0;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -5482,6 +5495,9 @@ riscv_gpr_save_operation_p (rtx op)
 #undef TARGET_NEW_ADDRESS_PROFITABLE_P
 #define TARGET_NEW_ADDRESS_PROFITABLE_P riscv_new_address_profitable_p

+#undef TARGET_ASAN_SHADOW_OFFSET
+#define TARGET_ASAN_SHADOW_OFFSET riscv_asan_shadow_offset
+
 struct gcc_target targetm = TARGET_INITIALIZER;

 #include "gt-riscv.h"
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 24c37f655c8..39c596b647a 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -12078,7 +12078,8 @@ is zero, which disables this optimization.
 @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_ASAN_SHADOW_OFFSET 
(void)
 Return the offset bitwise ored into shifted address to get corresponding
 Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not
-supported by the target.
+supported by the target.  May return 0 if Address Sanitizer is not supported
+by a subtarget.
 @end deftypefn

 @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_MEMMODEL_CHECK 
(unsigned HOST_WIDE_INT @var{val})
diff --git a/gcc/target.def b/gcc/target.def
index ed2da154e30..268b56b6ebd 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4452,7 +4452,8 @@ DEFHOOK
 (asan_shadow_offset,
  "Return the offset bitwise ored into shifted address to get corresponding\n\
 Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not\n\
-supported by the target.",
+supported by the target.  May return 0 if Address Sanitizer is not supported\n\
+by a subtarget.",
  unsigned HOST_WIDE_INT, (void),
  NULL)

diff --git a/gcc/toplev.c b/gcc/toplev.c
index 20e231f4d2a..cf89598252c 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1834,7 +1834,8 @@ process_options (void)
 }

   if ((flag_sanitize & SANITIZE_USER_ADDRESS)
-  && targetm.asan_shadow_o

Re: [AArch64] PR105162: emit barrier for sync and atomic builtins on CPUs without LSE

2022-04-19 Thread Wilco Dijkstra via Gcc-patches

Hi Sebastian,

> Wilco pointed out in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105162#c7
> that "Only __sync needs the extra full barrier, but __atomic does not."
> The attached patch does that by adding out-of-line functions for
> MEMMODEL_SYNC_*.
> Those new functions contain a barrier on the path without LSE instructions.

Yes, adding _sync versions of the outline functions is the correct approach.
However there is no need to have separate _acq/_rel/_seq variants for every
function since all but one are _seq. Also we should ensure we generate the same
sequence as the inlined versions so that they are consistent. This means
ensuring the LDXR macro ignores the 'A' for the _sync variants and the swp
function switches to acquire semantics.
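
For context, the two builtin families under discussion look like this from
the user's side.  This is only a sketch with hypothetical function names; it
shows the source-level calls, not the outline helpers or the instruction
sequences the patch changes.  GCC documents the legacy __sync builtins as
full barriers in most cases, while the __atomic builtins take an explicit
memory order.

```c
#include <assert.h>

static int counter;

/* Legacy __sync builtin: no memory order argument.  */
static int
bump_sync (void)
{
  return __sync_fetch_and_add (&counter, 1);
}

/* __atomic builtin: the memory order is explicit.  */
static int
bump_atomic (void)
{
  return __atomic_fetch_add (&counter, 1, __ATOMIC_SEQ_CST);
}
```

Both calls return the value held before the addition; they differ only in the
ordering guarantees the compiler has to provide around them.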

Cheers,
Wilco

Re: [PATCH][v2] rtl-optimization/105231 - distribute_notes and REG_EH_REGION

2022-04-19 Thread Segher Boessenkool

Hi!

So the assert last week was for a landing pad <= 0?  < or =?

On Tue, Apr 19, 2022 at 01:02:09PM +0200, Richard Biener wrote:
> The following mitigates a problem in combine distribute_notes which
> places an original REG_EH_REGION based on only may_trap_p which is
> good to test whether a non-call insn can possibly throw but not if
> actually it does or we care.  That's something we decided at RTL
> expansion time where we possibly still know the insn evaluates
> to a constant.
> 
> In fact, the REG_EH_REGION can only come from the original i3 and
> an assert is added to that effect.  That means we only need to
> retain the note on i3 or, if that cannot trap, drop it but we
> should never move it to i2.  If splitting of i3 ever becomes a
> problem here the insn split should be rejected instead.
> 
> We are also considering all REG_EH_REGION equal, including
> must-not-throw and nothrow kinds but when those are not from i3
> we have no good idea what should happen to them, so the following
> simply drops them.

And that always is safe?  Why do we have REG_EH_REGION for those cases
at all, then?

> +   {
> + int lp_nr = INTVAL (XEXP (note, 0));
> + /* A REG_EH_REGION note transfering control can only ever come
> +from i3.  */
> + gcc_assert (lp_nr <= 0 || from_insn == i3);

if (lp_nr > 0)
  gcc_assert (from_insn == i3);
is less obfuscated ;-)
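
For what it's worth, the two spellings accept exactly the same inputs.  A
small C sketch, with hypothetical function names and a plain bool standing in
for the `from_insn == i3` comparison:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* The condition as written in the patch.  */
static bool
check_single_expr (int lp_nr, bool from_i3)
{
  return lp_nr <= 0 || from_i3;
}

/* The condition as suggested in the review: only a positive
   landing pad number constrains where the note came from.  */
static bool
check_guarded (int lp_nr, bool from_i3)
{
  if (lp_nr > 0)
    return from_i3;
  return true;
}
```

The only rejected case in either form is a positive lp_nr on a note that does
not come from i3, so the choice is purely about readability.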

> + /* For nothrow (lp_nr == 0 or lp_nr == INT_MIN) and
> +for insns in a MUST_NOT_THROW region (lp_nr < 0)
> +it's difficult to decide what to do for notes
> +coming from an insn that is not i3.  Just drop
> +those.  */

That sounds like we do not know what is correct to do, so just sweep it
under the carpet and hope it works out.  "Just drop those, that is
always safe"?  (Is it always safe?)

Okay for trunk with maybe a word or two more there.  Thanks!


Segher

[PATCH] tree-optimization/104880 - move testcase

2022-04-19 Thread Richard Biener via Gcc-patches

This renames the testcase to something picked up by the testsuite's regexp.

Tested on x86_64-unknown-linux-gnu, pushed.

2022-04-19  Richard Biener  

PR tree-optimization/104880
* g++.dg/opt/pr104880.C: Rename from pr104880.cc.
---
 gcc/testsuite/g++.dg/opt/{pr104880.cc => pr104880.C} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename gcc/testsuite/g++.dg/opt/{pr104880.cc => pr104880.C} (100%)

diff --git a/gcc/testsuite/g++.dg/opt/pr104880.cc 
b/gcc/testsuite/g++.dg/opt/pr104880.C
similarity index 100%
rename from gcc/testsuite/g++.dg/opt/pr104880.cc
rename to gcc/testsuite/g++.dg/opt/pr104880.C
-- 
2.34.1

[x86_64 PATCH] PR middle-end/105135: Catch more cmov idioms in combine.

2022-04-19 Thread Roger Sayle


This patch addresses PR middle-end/105135, a missed-optimization regression
affecting mainline.  I agree with Jakub's comment that the middle-end
optimizations are sound, reducing basic blocks and conditional expressions
at the tree-level, but requiring backends to recognize conditional move
instructions/idioms if/when beneficial.  This patch introduces two new
define_insn_and_split patterns in i386.md to recognize two additional cmove
idioms.

The first recognizes (PR105135's):

int foo(int x, int y, int z)
{
  return ((x < y) << 5) + z;
}

and transforms (the 6 insns, 13 bytes):

        xorl    %eax, %eax      ;; 2 bytes
        cmpl    %esi, %edi      ;; 2 bytes
        setl    %al             ;; 3 bytes
        sall    $5, %eax        ;; 3 bytes
        addl    %edx, %eax      ;; 2 bytes
        ret                     ;; 1 byte

into (the 4 insns, 9 bytes):

        cmpl    %esi, %edi      ;; 2 bytes
        leal    32(%rdx), %eax  ;; 3 bytes
        cmovge  %edx, %eax      ;; 3 bytes
        ret                     ;; 1 byte


The second catches the very closely related (from PR 98865):

int bar(int x, int y, int z)
{
  return -(x < y) & z;
}

and transforms (the 6 insns, 12 bytes):

        xorl    %eax, %eax      ;; 2 bytes
        cmpl    %esi, %edi      ;; 2 bytes
        setl    %al             ;; 3 bytes
        negl    %eax            ;; 2 bytes
        andl    %edx, %eax      ;; 2 bytes
        ret                     ;; 1 byte

into (the 4 insns, 8 bytes):

        xorl    %eax, %eax      ;; 2 bytes
        cmpl    %esi, %edi      ;; 2 bytes
        cmovl   %edx, %eax      ;; 3 bytes
        ret                     ;; 1 byte

Both patterns recognize a setcc followed by two instructions, and
replace them with one instruction and a cmov, which is typically a
performance win and always a size win.  Fine-tuning these decisions
based on microarchitecture is much easier in the backend than in the
middle-end.
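
As a sanity check on the two idioms above, the following C sketch compares
each source form with the conditional-select form that the split patterns
correspond to.  It is not part of the patch; foo_select, bar_select and
idioms_agree are hypothetical helper names, and the input grid is arbitrary.

```c
#include <assert.h>

/* Idiom from PR105135: setcc, shift by a constant, then add.  */
static int
foo (int x, int y, int z)
{
  return ((x < y) << 5) + z;
}

/* Select form of foo: pick z + 32 or z, which maps onto lea + cmov.  */
static int
foo_select (int x, int y, int z)
{
  return (x < y) ? z + 32 : z;
}

/* Idiom from PR 98865: setcc, negate, then and.  */
static int
bar (int x, int y, int z)
{
  return -(x < y) & z;
}

/* Select form of bar: pick z or 0, which maps onto mov $0 + cmov.  */
static int
bar_select (int x, int y, int z)
{
  return (x < y) ? z : 0;
}

/* Return 1 if both pairs agree on every combination from a small grid.  */
static int
idioms_agree (void)
{
  static const int vals[] = { -7, -1, 0, 1, 5, 1000 };
  const int n = (int) (sizeof vals / sizeof vals[0]);
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      for (int k = 0; k < n; k++)
        {
          int x = vals[i], y = vals[j], z = vals[k];
          if (foo (x, y, z) != foo_select (x, y, z)
              || bar (x, y, z) != bar_select (x, y, z))
            return 0;
        }
  return 1;
}
```

Calling idioms_agree () from a test driver should return 1; whether the
compiler actually emits cmov for the select forms depends on the target
and tuning, which is the decision the patch leaves to the backend.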

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-04-19  Roger Sayle  

gcc/ChangeLog
PR target/105135
* config/i386/i386.md (*xor_cmov): Transform setcc, negate
then and into mov $0, followed by a cmov.
(*lea_cmov): Transform setcc, ashift const then plus into
lea followed by cmov.

gcc/testsuite/ChangeLog
PR target/105135
* gcc.target/i386/cmov10.c: New test case.
* gcc.target/i386/cmov11.c: New test case.
* gcc.target/i386/pr105135.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c74edd1..5887688 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -20751,6 +20751,52 @@
   operands[9] = replace_rtx (operands[6], operands[0], operands[1], true);
 })
 
+;; Transform setcc;negate;and into mov_zero;cmov
+(define_insn_and_split "*xor_cmov"
+  [(set (match_operand:SWI248 0 "register_operand")
+   (and:SWI248
+ (neg:SWI248 (match_operator:SWI248 1 "ix86_comparison_operator"
+   [(match_operand 2 "flags_reg_operand")
+(const_int 0)]))
+ (match_operand:SWI248 3 "register_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_CMOVE && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 4) (const_int 0))
+   (set (match_dup 0)
+   (if_then_else:SWI248 (match_op_dup 1 [(match_dup 2) (const_int 0)])
+(match_dup 3) (match_dup 4)))]
+{
+  operands[4] = gen_reg_rtx (mode);
+})
+
+;; Transform setcc;ashift_const;plus into lea_const;cmov
+(define_insn_and_split "*lea_cmov"
+  [(set (match_operand:SWI 0 "register_operand")
+   (plus:SWI (ashift:SWI (match_operator:SWI 1 "ix86_comparison_operator"
+   [(match_operand 2 "flags_reg_operand")
+(const_int 0)])
+ (match_operand:SWI 3 "const_int_operand"))
+ (match_operand:SWI 4 "register_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_CMOVE && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 5) (plus: (match_dup 4) (match_dup 6)))
+   (set (match_dup 0)
+   (if_then_else: (match_op_dup 1 [(match_dup 2) (const_int 0)])
+   (match_dup 5) (match_dup 4)))]
+{
+  operands[5] = gen_reg_rtx (mode);
+  operands[6] = GEN_INT (1 << INTVAL (operands[3]));
+  if (mode != mode)
+{
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[4] = gen_lowpart (mode, operands[4]);
+}
+})
+
 (define_insn "movhf_mask"
   [(set (match_operand:HF 0 "nonimmediate_operand" "=v,m,v")
(unspec:HF
diff --git a/gcc/testsuite/gcc.target/i386/cmov10.c 
b/gcc/testsuite/gcc.target/i386/cmov10.c
new file mode 100644
index 000..c04fdd8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cmov10.c
@@ -0,0 +1,9 @@

Re: [PATCH] libstdc++: Add pretty printer for std::span

2022-04-19 Thread Jonathan Wakely via Gcc-patches

On Tue, 19 Apr 2022 at 12:33, Philipp Fent via Libstdc++
 wrote:
>
> On 19.04.22 12:28, Jonathan Wakely wrote:
> > Thanks, but we still need the DCO sign-off as I mailed about last week.
>
> Thanks for the clarification, your last mail didn't appear to have
> content, so I might have missed that part. I've now added my DCO sign-off.

Huh, I see no content in my sent copy either. Weird. I distinctly
remember pasting the https://gcc.gnu.org/dco.html link into the email,
but gmail must have had other ideas. Sorry about that!

Thanks for the sign-off, I'll get this committed today.

The GCC 11 backport will have to wait a few days because the branch is
currently frozen for the 11.3 release.

Re: [PATCH] libstdc++: Add pretty printer for std::span

2022-04-19 Thread Philipp Fent via Gcc-patches


On 19.04.22 12:28, Jonathan Wakely wrote:

Thanks, but we still need the DCO sign-off as I mailed about last week.


Thanks for the clarification, your last mail didn't appear to have 
content, so I might have missed that part. I've now added my DCO sign-off.


Best
Philipp

From 64b6779c2694f57981e15b9c1dfa59b192e99a16 Mon Sep 17 00:00:00 2001
From: Philipp Fent 
Date: Mon, 4 Apr 2022 12:52:57 +0200
Subject: [PATCH] libstdc++: Add pretty printer for std::span

This improves the debug output for C++20 spans.
Before:
{static extent = 18446744073709551615, _M_ptr = 0x7fffb9a8,
_M_extent = {_M_extent_value = 2}}
Now with StdSpanPrinter:
std::span of length 2 = {1, 2}

Signed-off-by: Philipp Fent 
---
 libstdc++-v3/python/libstdcxx/v6/printers.py  | 38 +++
 .../libstdc++-prettyprinters/cxx20.cc | 11 ++
 2 files changed, 49 insertions(+)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index f7a7f9961a7..6d8b765f2da 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1654,6 +1654,43 @@ class StdRegexStatePrinter:
 s = "{}, {}={}".format(s, v, self.val['_M_' + v])
 return "{%s}" % (s)
 
+class StdSpanPrinter:
+    "Print a std::span"
+
+    class _iterator(Iterator):
+        def __init__(self, begin, size):
+            self.count = 0
+            self.begin = begin
+            self.size = size
+
+        def __iter__ (self):
+            return self
+
+        def __next__ (self):
+            if self.count == self.size:
+                raise StopIteration
+
+            count = self.count
+            self.count = self.count + 1
+            return '[%d]' % count, (self.begin + count).dereference()
+
+    def __init__(self, typename, val):
+        self.typename = typename
+        self.val = val
+        if val.type.template_argument(1) == gdb.parse_and_eval('static_cast(-1)'):
+            self.size = val['_M_extent']['_M_extent_value']
+        else:
+            self.size = val.type.template_argument(1)
+
+    def to_string(self):
+        return '%s of length %d' % (self.typename, self.size)
+
+    def children(self):
+        return self._iterator(self.val['_M_ptr'], self.size)
+
+    def display_hint(self):
+        return 'array'
+
 # A "regular expression" printer which conforms to the
 # "SubPrettyPrinter" protocol from gdb.printing.
 class RxPrinter(object):
@@ -2170,6 +2207,7 @@ def build_libstdcxx_dictionary ():
 libstdcxx_printer.add_version('std::', 'partial_ordering', StdCmpCatPrinter)
 libstdcxx_printer.add_version('std::', 'weak_ordering', StdCmpCatPrinter)
 libstdcxx_printer.add_version('std::', 'strong_ordering', StdCmpCatPrinter)
+libstdcxx_printer.add_version('std::', 'span', StdSpanPrinter)
 
 # Extensions.
 libstdcxx_printer.add_version('__gnu_cxx::', 'slist', StdSlistPrinter)
diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc b/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
index b0de25c27ec..76023df93fa 100644
--- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
+++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
@@ -18,8 +18,10 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
+#include 
 #include 
 #include 
+#include 
 
 struct X
 {
@@ -54,6 +56,15 @@ main()
   auto c10 = 0.0 <=> __builtin_nan("");
 // { dg-final { note-test c10 "std::partial_ordering::unordered" } }
 
+  auto il = {1, 2};
+  auto s1 = std::span(il);
+  static_assert(s1.extent == std::size_t(-1));
+// { dg-final { note-test s1 {std::span of length 2 = {1, 2}} } }
+  auto a = std::array{3, 4};
+  auto s2 = std::span(a);
+  static_assert(s2.extent == std::size_t(2));
+// { dg-final { note-test s2 {std::span of length 2 = {3, 4}} } }
+
   std::cout << "\n";
   return 0;			// Mark SPOT
 }
-- 
2.35.3

[PATCH][v2] tree-optimization/104912 - ensure cost model is checked first

2022-04-19 Thread Richard Biener via Gcc-patches

The following makes sure that when we build the versioning condition
for vectorization including the cost model check, we check for the
cost model and branch over other versioning checks.  That is what
the cost modeling assumes, since the cost model check is the only
one accounted for in the scalar outside cost.  Currently we emit
all checks as straight-line code combined with bitwise ops which
can result in surprising ordering of checks in the final assembly.

Since loop_version accepts only a single versioning condition
the splitting is done after the fact.
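
The difference between the two shapes of versioning condition can be
sketched in plain C.  This is only an illustration of the control flow,
not the GIMPLE the vectorizer emits; cost_model_ok, alias_check_ok, the
threshold of 16 iterations and the counter are all made-up stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counts how often the expensive runtime check is evaluated.  */
static int expensive_checks_run;

/* Hypothetical cost-model check: vectorize only for enough iterations.  */
static bool
cost_model_ok (int niters)
{
  return niters >= 16;
}

/* Hypothetical alias check: the two n-element ranges must not overlap.  */
static bool
alias_check_ok (const int *a, const int *b, int n)
{
  expensive_checks_run++;
  uintptr_t a0 = (uintptr_t) a, a1 = (uintptr_t) (a + n);
  uintptr_t b0 = (uintptr_t) b, b1 = (uintptr_t) (b + n);
  return a1 <= b0 || b1 <= a0;
}

/* Straight-line form: every check is evaluated and combined bitwise.  */
static bool
use_vector_combined (const int *a, const int *b, int n)
{
  bool cost = cost_model_ok (n);
  bool alias = alias_check_ok (a, b, n);
  return cost & alias;
}

/* Split form: a failed cost-model check branches over the other checks.  */
static bool
use_vector_split (const int *a, const int *b, int n)
{
  if (!cost_model_ok (n))
    return false;
  return alias_check_ok (a, b, n);
}
```

With a short trip count, use_vector_combined still bumps
expensive_checks_run while use_vector_split does not, which is the
behaviour the scalar outside cost assumes.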

The result is a 1.5% speedup of 416.gamess on x86_64 when compiling
with -Ofast and tuning for generic or skylake.  That's not enough
to recover from the slowdown when vectorizing but it now cuts off
the expensive alias versioning test.

This is an update to the previously posted patch splitting the
probability between the two branches as outlined in
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592597.html

I've re-bootstrapped and tested this on x86_64-unknown-linux-gnu.

Honza - is the approach to splitting the probabilities sensible?
This fixes a piece of a P1 regression.

Thanks,
Richard.

2022-03-21  Richard Biener  

PR tree-optimization/104912
* tree-vect-loop-manip.cc (vect_loop_versioning): Split
the cost model check to a separate BB to make sure it is
checked first and not combined with other version checks.
---
 gcc/tree-vect-loop-manip.cc | 60 +++--
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 63fb6f669a0..e4381eb7079 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -3445,13 +3445,34 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
cond_expr = expr;
 }
 
+  tree cost_name = NULL_TREE;
+  profile_probability prob2 = profile_probability::uninitialized ();
+  if (cond_expr
+  && !integer_truep (cond_expr)
+  && (version_niter
+ || version_align
+ || version_alias
+ || version_simd_if_cond))
+{
+  cost_name = cond_expr = force_gimple_operand_1 (unshare_expr (cond_expr),
+ &cond_expr_stmt_list,
+ is_gimple_val, NULL_TREE);
+  /* Split prob () into two so that the overall probability of passing
+both the cost-model and versioning checks is the orig prob.  */
+  prob2 = prob.split (prob);
+}
+
   if (version_niter)
 vect_create_cond_for_niters_checks (loop_vinfo, &cond_expr);
 
   if (cond_expr)
-cond_expr = force_gimple_operand_1 (unshare_expr (cond_expr),
-   &cond_expr_stmt_list,
-   is_gimple_condexpr, NULL_TREE);
+{
+  gimple_seq tem = NULL;
+  cond_expr = force_gimple_operand_1 (unshare_expr (cond_expr),
+ &tem,
+ is_gimple_condexpr, NULL_TREE);
+  gimple_seq_add_seq (&cond_expr_stmt_list, tem);
+}
 
   if (version_align)
 vect_create_cond_for_align_checks (loop_vinfo, &cond_expr,
@@ -3655,6 +3676,39 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
   update_ssa (TODO_update_ssa);
 }
 
+  /* Split the cost model check off to a separate BB.  Costing assumes
+ this is the only thing we perform when we enter the scalar loop
+ from a failed cost decision.  */
+  if (cost_name && TREE_CODE (cost_name) == SSA_NAME)
+{
+  gimple *def = SSA_NAME_DEF_STMT (cost_name);
+  /* All uses of the cost check are 'true' after the check we
+are going to insert.  */
+  replace_uses_by (cost_name, boolean_true_node);
+  /* And we're going to build the new single use of it.  */
+  gcond *cond = gimple_build_cond (NE_EXPR, cost_name, boolean_false_node,
+  NULL_TREE, NULL_TREE);
+  edge e = split_block (gimple_bb (def), def);
+  gimple_stmt_iterator gsi = gsi_for_stmt (def);
+  gsi_insert_after (&gsi, cond, GSI_NEW_STMT);
+  edge true_e, false_e;
+  extract_true_false_edges_from_block (e->dest, &true_e, &false_e);
+  e->flags &= ~EDGE_FALLTHRU;
+  e->flags |= EDGE_TRUE_VALUE;
+  edge e2 = make_edge (e->src, false_e->dest, EDGE_FALSE_VALUE);
+  e->probability = prob2;
+  e2->probability = prob2.invert ();
+  set_immediate_dominator (CDI_DOMINATORS, false_e->dest, e->src);
+  auto_vec adj;
+  for (basic_block son = first_dom_son (CDI_DOMINATORS, e->dest);
+  son;
+  son = next_dom_son (CDI_DOMINATORS, son))
+   if (EDGE_COUNT (son->preds) > 1)
+ adj.safe_push (son);
+  for (auto son : adj)
+   set_immediate_dominator (CDI_DOMINATORS, son, e->src);
+}
+
   if (version_niter)
 {
   /* The versioned loop could be infinite, we need to clear existing

[PING] Re: [PATCH] tree-optimization/100810 - avoid undefs in IVOPT rewrites

2022-04-19 Thread Richard Biener via Gcc-patches

On Fri, 1 Apr 2022, Richard Biener wrote:

> The following attempts to avoid IVOPTs rewriting uses using
> IV candidates that involve undefined behavior by using uninitialized
> SSA names.  First we restrict the set of candidates we produce
> for such IVs to the original ones and mark them as not important.
> Second we try to only allow expressing uses with such IV if they
> originally use them.  That is to avoid rewriting all such uses
> in terms of other IVs.  Since cand->iv and use->iv seem to never
> exactly match up we resort to comparing the IV bases.
> 
> The approach ends up similar to the one posted by Roger at
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578441.html
> but it marks IV candidates rather than use groups and the cases
> we allow in determine_group_iv_cost_generic are slightly different.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> OK for trunk?

Ping?  Any opinions?

Thanks,
Richard.

> Thanks,
> Richard.
> 
> 2022-01-04  Richard Biener  
> 
>   PR tree-optimization/100810
>   * tree-ssa-loop-ivopts.cc (struct iv_cand): Add involves_undefs flag.
>   (find_ssa_undef): New function.
>   (add_candidate_1): Avoid adding derived candidates with
>   undefined SSA names and mark the original ones.
>   (determine_group_iv_cost_generic): Reject rewriting
>   uses with a different IV when that involves undefined SSA names.
> 
>   * gcc.dg/torture/pr100810.c: New testcase.
> ---
>  gcc/testsuite/gcc.dg/torture/pr100810.c | 34 +
>  gcc/tree-ssa-loop-ivopts.cc | 31 ++
>  2 files changed, 65 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr100810.c
> 
> diff --git a/gcc/testsuite/gcc.dg/torture/pr100810.c 
> b/gcc/testsuite/gcc.dg/torture/pr100810.c
> new file mode 100644
> index 000..63566f530f7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr100810.c
> @@ -0,0 +1,34 @@
> +/* { dg-do run } */
> +
> +int a, b = 1, c = 1, e, f = 1, g, h, j;
> +volatile int d;
> +static void k()
> +{
> +  int i;
> +  h = b;
> +  if (c && a >= 0) {
> +  while (a) {
> +   i++;
> +   h--;
> +  }
> +  if (g)
> + for (h = 0; h < 2; h++)
> +   ;
> +  if (!b)
> + i &&d;
> +  }
> +}
> +static void l()
> +{
> +  for (; j < 1; j++)
> +if (!e && c && f)
> +  k();
> +}
> +int main()
> +{
> +  if (f)
> +l();
> +  if (h != 1)
> +__builtin_abort();
> +  return 0;
> +}
> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> index 935d2d4d8f3..b0305c494cd 100644
> --- a/gcc/tree-ssa-loop-ivopts.cc
> +++ b/gcc/tree-ssa-loop-ivopts.cc
> @@ -452,6 +452,7 @@ struct iv_cand
>unsigned id;   /* The number of the candidate.  */
>bool important;/* Whether this is an "important" candidate, i.e. such
>  that it should be considered by all uses.  */
> +  bool involves_undefs; /* Whether the IV involves undefined values.  */
>ENUM_BITFIELD(iv_position) pos : 8;/* Where it is computed.  */
>gimple *incremented_at;/* For original biv, the statement where it is
>  incremented.  */
> @@ -3068,6 +3069,19 @@ get_loop_invariant_expr (struct ivopts_data *data, 
> tree inv_expr)
>return *slot;
>  }
>  
> +/* Find the first undefined SSA name in *TP.  */
> +
> +static tree
> +find_ssa_undef (tree *tp, int *walk_subtrees, void *)
> +{
> +  if (TREE_CODE (*tp) == SSA_NAME
> +  && ssa_undefined_value_p (*tp, false))
> +return *tp;
> +  if (!EXPR_P (*tp))
> +*walk_subtrees = 0;
> +  return NULL;
> +}
> +
>  /* Adds a candidate BASE + STEP * i.  Important field is set to IMPORTANT and
> position to POS.  If USE is not NULL, the candidate is set as related to
> it.  If both BASE and STEP are NULL, we add a pseudocandidate for the
> @@ -3095,6 +3109,17 @@ add_candidate_1 (struct ivopts_data *data, tree base, 
> tree step, bool important,
>if (flag_keep_gc_roots_live && POINTER_TYPE_P (TREE_TYPE (base)))
>  return NULL;
>  
> +  /* If BASE contains undefined SSA names make sure we only record
> + the original IV.  */
> +  bool involves_undefs = false;
> +  if (walk_tree (&base, find_ssa_undef, NULL, NULL))
> +{
> +  if (pos != IP_ORIGINAL)
> + return NULL;
> +  important = false;
> +  involves_undefs = true;
> +}
> +
>/* For non-original variables, make sure their values are computed in a 
> type
>   that does not invoke undefined behavior on overflows (since in general,
>   we cannot prove that these induction variables are non-wrapping).  */
> @@ -3143,6 +3168,7 @@ add_candidate_1 (struct ivopts_data *data, tree base, 
> tree step, bool important,
> cand->var_after = cand->var_before;
>   }
>cand->important = important;
> +  cand->involves_undefs = involves_undefs;
>cand->incremented_at = incremented_at;
>cand->doloop_p = doloop;
>data->vca

Re: [PATCH] gimple-fold: fix further missing stmt locations [PR104308]

2022-04-19 Thread Richard Biener via Gcc-patches

On Thu, Apr 14, 2022 at 3:25 PM David Malcolm via Gcc-patches
 wrote:
>
> PR analyzer/104308 initially reported about a
> -Wanalyzer-use-of-uninitialized-value diagnostic using UNKNOWN_LOCATION
> when complaining about certain memmove operations where the source
> is uninitialized.
>
> In r12-7856-g875342766d4298 I fixed the missing location for
> a stmt generated by gimple_fold_builtin_memory_op, but the reporter
> then found another way to generate such a stmt with UNKNOWN_LOCATION.
>
> I've now gone through gimple_fold_builtin_memory_op looking at all
> statement creation, and found three places in which a new statement
> doesn't have a location set on it (either directly via
> gimple_set_location, or indirectly via gsi_replace), one of which is
> the new reproducer.
>
> This patch adds a gimple_set_location to these three cases, and adds
> test coverage for one of them (the third hunk within the patch), fixing
> the new reproducer for PR analyzer/104308.
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> OK for trunk in stage 4?  Or in stage 1?

OK for stage4.

> Thanks
> Dave
>
> gcc/ChangeLog:
> PR analyzer/104308
> * gimple-fold.cc (gimple_fold_builtin_memory_op): Explicitly set
> the location of new_stmt in all places that don't already set it,
> whether explicitly, or via a call to gsi_replace.
>
> gcc/testsuite/ChangeLog:
> PR analyzer/104308
> * gcc.dg/analyzer/pr104308.c: Add test coverage.
>
> Signed-off-by: David Malcolm 
> ---
>  gcc/gimple-fold.cc   |  3 +++
>  gcc/testsuite/gcc.dg/analyzer/pr104308.c | 13 -
>  2 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
> index ac22adfd9b1..863ee3d3912 100644
> --- a/gcc/gimple-fold.cc
> +++ b/gcc/gimple-fold.cc
> @@ -1048,6 +1048,7 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator 
> *gsi,
>   gsi_replace (gsi, new_stmt, false);
>   return true;
> }
> + gimple_set_location (new_stmt, loc);
>   gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>   goto done;
> }
> @@ -1302,6 +1303,7 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator 
> *gsi,
>new_stmt);
>   gimple_assign_set_lhs (new_stmt, srcvar);
>   gimple_set_vuse (new_stmt, gimple_vuse (stmt));
> + gimple_set_location (new_stmt, loc);
>   gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
> }
>   new_stmt = gimple_build_assign (destvar, srcvar);
> @@ -1338,6 +1340,7 @@ set_vop_and_replace:
>   gsi_replace (gsi, new_stmt, false);
>   return true;
> }
> +  gimple_set_location (new_stmt, loc);
>gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>  }
>
> diff --git a/gcc/testsuite/gcc.dg/analyzer/pr104308.c 
> b/gcc/testsuite/gcc.dg/analyzer/pr104308.c
> index 9cd5ee6feee..a3a0cbb7317 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/pr104308.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/pr104308.c
> @@ -1,8 +1,19 @@
> +/* Verify that we have source locations for
> +   -Wanalyzer-use-of-uninitialized-value warnings involving folded
> +   memory ops.  */
> +
>  #include 
>
> -int main()
> +int test_memmove_within_uninit (void)
>  {
>char s[5]; /* { dg-message "region created on stack here" } */
>memmove(s, s + 1, 2); /* { dg-warning "use of uninitialized value" } */
>return 0;
>  }
> +
> +int test_memcpy_from_uninit (void)
> +{
> +  char a1[5];
> +  char a2[5]; /* { dg-message "region created on stack here" } */
> +  return (memcpy(a1, a2, 5) == a1); /* { dg-warning "use of uninitialized 
> value" } */
> +}
> --
> 2.26.3
>

[PATCH][v2] rtl-optimization/105231 - distribute_notes and REG_EH_REGION

2022-04-19 Thread Richard Biener via Gcc-patches

The following mitigates a problem in combine's distribute_notes, which
places an original REG_EH_REGION note based only on may_trap_p.  That
predicate is good for testing whether a non-call insn can possibly
throw, but not whether it actually does or whether we care.  That is
something we decided at RTL expansion time, where we possibly still
know the insn evaluates to a constant.

In fact, the REG_EH_REGION can only come from the original i3 and
an assert is added to that effect.  That means we only need to
retain the note on i3 or, if that cannot trap, drop it but we
should never move it to i2.  If splitting of i3 ever becomes a
problem here the insn split should be rejected instead.

We are also considering all REG_EH_REGION equal, including
must-not-throw and nothrow kinds but when those are not from i3
we have no good idea what should happen to them, so the following
simply drops them.
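
The placement rule described above can be modelled as a small C function.
This is a simplified sketch, not the code in combine.cc: the enum, the
function name and the boolean parameters are hypothetical stand-ins for
the RTL predicates (from_insn == i3, CALL_P (i3), may_trap_p (i3)), and
the check on can_throw_non_call_exceptions is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Where a REG_EH_REGION note ends up in this model.  */
enum note_place { DROP_NOTE, PLACE_ON_I3 };

static enum note_place
place_eh_region_note (int lp_nr, bool from_i3, bool i3_is_call,
                      bool i3_may_trap)
{
  /* A note with a landing pad (lp_nr > 0) can only come from i3.  */
  if (lp_nr > 0)
    assert (from_i3);

  /* Nothrow and must-not-throw notes from other insns are dropped.  */
  if (!from_i3)
    return DROP_NOTE;

  /* The note stays with a call.  */
  if (i3_is_call)
    return PLACE_ON_I3;

  /* Otherwise keep it only if i3 can still trap.  */
  return i3_may_trap ? PLACE_ON_I3 : DROP_NOTE;
}
```

Note that no path returns a placement on i2, matching the statement that
the note should never move there.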

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?

Thanks,
Richard.

2022-04-19  Richard Biener  

PR rtl-optimization/105231
* combine.cc (distribute_notes): Assert that a REG_EH_REGION
with landing pad > 0 is from i3 and only keep it there or drop
it if the insn can not trap.  Throw away REG_EH_REGION with
landing pad <= 0 if it does not originate from i3.

* gcc.dg/torture/pr105231.c: New testcase.
---
 gcc/combine.cc  | 44 +++--
 gcc/testsuite/gcc.dg/torture/pr105231.c | 15 +
 2 files changed, 42 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr105231.c

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 53dcac92abc..ca9d6f0e6e0 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -14175,23 +14175,33 @@ distribute_notes (rtx notes, rtx_insn *from_insn, 
rtx_insn *i3, rtx_insn *i2,
  break;
 
case REG_EH_REGION:
- /* These notes must remain with the call or trapping instruction.  */
- if (CALL_P (i3))
-   place = i3;
- else if (i2 && CALL_P (i2))
-   place = i2;
- else
-   {
- gcc_assert (cfun->can_throw_non_call_exceptions);
- if (may_trap_p (i3))
-   place = i3;
- else if (i2 && may_trap_p (i2))
-   place = i2;
- /* ??? Otherwise assume we've combined things such that we
-can now prove that the instructions can't trap.  Drop the
-note in this case.  */
-   }
- break;
+ {
+   int lp_nr = INTVAL (XEXP (note, 0));
+   /* A REG_EH_REGION note transfering control can only ever come
+  from i3.  */
+   gcc_assert (lp_nr <= 0 || from_insn == i3);
+   /* For nothrow (lp_nr == 0 or lp_nr == INT_MIN) and
+  for insns in a MUST_NOT_THROW region (lp_nr < 0)
+  it's difficult to decide what to do for notes
+  coming from an insn that is not i3.  Just drop
+  those.  */
+   if (from_insn != i3)
+ ;
+   /* Otherwise the note must remain with the call or trapping
+  instruction.  */
+   else if (CALL_P (i3))
+ place = i3;
+   else
+ {
+   gcc_assert (cfun->can_throw_non_call_exceptions);
+   /* If i3 can still trap preserve the note, otherwise we've
+  combined things such that we can now prove that the
+  instructions can't trap.  Drop the note in this case.  */
+   if (may_trap_p (i3))
+ place = i3;
+ }
+   break;
+ }
 
case REG_ARGS_SIZE:
  /* ??? How to distribute between i3-i1.  Assume i3 contains the
diff --git a/gcc/testsuite/gcc.dg/torture/pr105231.c 
b/gcc/testsuite/gcc.dg/torture/pr105231.c
new file mode 100644
index 000..50459219c08
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr105231.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int32plus } */
+/* { dg-require-effective-target dfp } */
+/* { dg-additional-options "-fsanitize-coverage=trace-pc -fnon-call-exceptions 
--param=max-cse-insns=1 -frounding-math" } */
+/* { dg-additional-options "-mstack-arg-probe" { target x86_64-*-* i?86-*-* } 
} */
+
+void baz (int *);
+void bar (double, double, _Decimal64);
+
+void
+foo (void)
+{
+  int s __attribute__((cleanup (baz)));
+  bar (0xfffe, 0xebf3fff2fbebaf7f, 0xff);
+}
-- 
2.34.1

[committed] libstdc++: Fix syntax error in libbacktrace configuration

2022-04-19 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux, pushed to trunk.

-- >8 --

Using == instead of = causes a configuration error with dash as the
shell:

checking whether to build libbacktrace support... /home/devel/building/work/src/gcc-12-20220417/libstdc++-v3/configure: 77471: test: auto: unexpected operator
/home/devel/building/work/src/gcc-12-20220417/libstdc++-v3/configure: 77474: test: auto: unexpected operator
auto

This means we fail to change the value from "auto" to "no" and so this
test passes:
GLIBCXX_CONDITIONAL(ENABLE_BACKTRACE, [test "$enable_libstdcxx_backtrace" != no])

This leads to the libbacktrace directory being included in the build
without being configured properly, and bootstrap fails.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ENABLE_BACKTRACE): Fix shell operators.
* configure: Regenerate.
---
 libstdc++-v3/acinclude.m4 | 6 +++---
 libstdc++-v3/configure| 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 6aece2adff8..138bd58d86c 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -5007,10 +5007,10 @@ esac
 BACKTRACE_CPPFLAGS="$BACKTRACE_CPPFLAGS -DBACKTRACE_ELF_SIZE=$elfsize"
 
   AC_MSG_CHECKING([whether to build libbacktrace support])
-  if test "$enable_libstdcxx_backtrace" == "auto"; then
+  if test "$enable_libstdcxx_backtrace" = "auto"; then
 enable_libstdcxx_backtrace=no
   fi
-  if test "$enable_libstdcxx_backtrace" == "yes"; then
+  if test "$enable_libstdcxx_backtrace" = "yes"; then
 BACKTRACE_SUPPORTED=1
 
 AC_CHECK_HEADERS(sys/mman.h)
@@ -5057,7 +5057,7 @@ BACKTRACE_CPPFLAGS="$BACKTRACE_CPPFLAGS 
-DBACKTRACE_ELF_SIZE=$elfsize"
 BACKTRACE_SUPPORTS_THREADS=0
   fi
   AC_MSG_RESULT($enable_libstdcxx_backtrace)
-  GLIBCXX_CONDITIONAL(ENABLE_BACKTRACE, [test "$enable_libstdcxx_backtrace" != 
no])
+  GLIBCXX_CONDITIONAL(ENABLE_BACKTRACE, [test "$enable_libstdcxx_backtrace" = 
yes])
 ])
 
 # Macros from the top-level gcc directory.

Re: [PATCH] c++: Fix up CONSTRUCTOR_PLACEHOLDER_BOUNDARY handling [PR105256]

2022-04-19 Thread Jakub Jelinek via Gcc-patches

On Mon, Apr 18, 2022 at 09:57:12AM -0400, Patrick Palka wrote:
> > Hmm, Patrick made a similar change and then reverted it for PR90996.
> > But it makes sense to me; when we replace placeholders, it's appropriate
> > to look at the whole aggregate initialization rather than the innermost
> > CONSTRUCTOR that has DMIs.  Patrick, was there a reason that change
> > seemed wrong to you, or was it just unnecessary for the bug you were
> > working on?
> 
> The reverted change and Jakub's more general patch seem right/safe to
> me FWIW, I just couldn't come up with a testcase that demonstrated its
> need at the time unfortunately.

So is the patch ok for trunk then?
Apparently it is also a recent regression on the 11 branch (since Marek's
r11-9711) when compiling Firefox; OK for 11.3 as well?

> > > 2022-04-15  Jakub Jelinek  
> > >
> > >   PR c++/105256
> > >   * typeck2.cc (process_init_constructor_array,
> > >   process_init_constructor_record, process_init_constructor_union): 
> > > Move
> > >   CONSTRUCTOR_PLACEHOLDER_BOUNDARY flag from CONSTRUCTOR elements to 
> > > the
> > >   containing CONSTRUCTOR.
> > >
> > >   * g++.dg/cpp0x/pr105256.C: New test.

Jakub

Re: [PATCH] libstdc++: Stop defining _GLIBCXX_ASSERTIONS in floating_to_chars.cc

2022-04-19 Thread Jonathan Wakely via Gcc-patches

On Thu, 14 Apr 2022 at 20:48, Patrick Palka via Libstdc++
 wrote:
>
> Assertions were originally enabled in the compiled-in floating-point
> std::to_chars implementation to help shake out any bugs, but they
> apparently impose a significant performance penalty, in particular for
> the hex formatting which is around 25% slower with assertions enabled.
> This seems too high of a cost for unconditionally enabling them.
>
> The newly added calls to __builtin_unreachable work around the compiler
> no longer knowing that the set of valid values of 'fmt' is limited (which
> was previously upheld by an assert).
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

OK, thanks.
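
For readers unfamiliar with the technique, here is a minimal C sketch of
why the quoted patch adds __builtin_unreachable once the assertion is
compiled out.  It is not the libstdc++ code: the enum and function names
are hypothetical, and only the "%.*Lg" specifier appears in the patch; the
other two strings are assumed for illustration.  __builtin_unreachable is
a GCC/Clang builtin.

```c
#include <assert.h>

/* Hypothetical stand-in for std::chars_format.  */
enum chars_format_sketch { FMT_SCIENTIFIC, FMT_FIXED, FMT_GENERAL };

static const char *
output_specifier_for (enum chars_format_sketch fmt)
{
  const char *spec;
  if (fmt == FMT_SCIENTIFIC)
    spec = "%.*Le";
  else if (fmt == FMT_FIXED)
    spec = "%.*Lf";
  else if (fmt == FMT_GENERAL)
    spec = "%.*Lg";
  else
    /* Tell the compiler no other value is possible, so it does not
       warn that spec may be used uninitialized.  */
    __builtin_unreachable ();
  return spec;
}
```

Without the final else branch, a compiler that cannot prove fmt is one of
the three enumerators may report -Wmaybe-uninitialized on the return.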

>
> libstdc++-v3/ChangeLog:
>
> * src/c++17/floating_to_chars.cc: Don't define
> _GLIBCXX_ASSERTIONS.
> (__floating_to_chars_shortest): Add __builtin_unreachable calls to
> squelch false-positive -Wmaybe-uninitialized and -Wreturn-type
> warnings.
> (__floating_to_chars_precision): Likewise.
> ---
>  libstdc++-v3/src/c++17/floating_to_chars.cc | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/libstdc++-v3/src/c++17/floating_to_chars.cc 
> b/libstdc++-v3/src/c++17/floating_to_chars.cc
> index 66bd457cbe2..4599d68a39c 100644
> --- a/libstdc++-v3/src/c++17/floating_to_chars.cc
> +++ b/libstdc++-v3/src/c++17/floating_to_chars.cc
> @@ -22,9 +22,6 @@
>  // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>  // .
>
> -// Activate __glibcxx_assert within this file to shake out any bugs.
> -#define _GLIBCXX_ASSERTIONS 1
> -
>  #include 
>
>  #include 
> @@ -1114,6 +,7 @@ template
>}
>
>  __glibcxx_assert(false);
> +__builtin_unreachable();
>}
>
>  template
> @@ -1202,6 +1200,8 @@ template
> effective_precision = min(precision, 
> max_eff_scientific_precision);
> output_specifier = "%.*Lg";
>   }
> +   else
> + __builtin_unreachable();
> const int excess_precision = (fmt != chars_format::general
>   ? precision - effective_precision : 0);
>
> @@ -1234,6 +1234,8 @@ template
>   output_length_upper_bound = sign + strlen("0");
> output_length_upper_bound += sizeof(radix) + effective_precision;
>   }
> +   else
> + __builtin_unreachable();
>
> // Do the sprintf into the local buffer.
> char buffer[output_length_upper_bound+1];
> @@ -1570,6 +1572,7 @@ template
>}
>
>  __glibcxx_assert(false);
> +__builtin_unreachable();
>}
>
>  // Define the overloads for float.
> --
> 2.36.0.rc2.10.g1ac7422e39
>

Re: [PATCH] libstdc++: Update incorrect statement about mainline in docs

2022-04-19 Thread Jonathan Wakely via Gcc-patches

On Tue, 19 Apr 2022 at 08:01, Richard Biener  wrote:
>
> On Thu, 14 Apr 2022, Jonathan Wakely wrote:
>
> > On Thu, 14 Apr 2022 at 11:55, Richard Biener  wrote:
> > >
> > > On Thu, 14 Apr 2022, Jonathan Wakely wrote:
> > >
> > > > On Thu, 14 Apr 2022 at 11:36, Richard Biener  wrote:
> > > > >
> > > > > On Thu, 14 Apr 2022, Jonathan Wakely wrote:
> > > > >
> > > > > > This fixes some misleading text in the libstdc++ manual that says 
> > > > > > the
> > > > > > docs for the gcc-11 branch refer to mainline.
> > > > > >
> > > > > > Richi, is this OK for the gcc-11 branch now? It's been wrong for 
> > > > > > 11.1
> > > > > > and 11.2, but it would still be nice to fix.
> > > > >
> > > > > Yes, it's OK.  I notice the same problem exists on the GCC 10 branch
> > > > > but GCC 9 at least mentions GCC 9 once ;)
> > > >
> > > > Yes, I fixed it for gcc-9.3.0, but forgot to do it for gcc-10 and 
> > > > gcc-11.
> > > >
> > > > I pushed r10-10534 to fix gcc-10 (since that's open for doc changes)
> > > > and have now pushed r11-9881
> > > > as well.
> > > >
> > > > Maybe this year I'll remember to do it for gcc-12 after we branch from 
> > > > trunk!
> > >
> > > Add an entry to branching.html!
> >
> > Like this? OK for wwwdocs?
>
> Maybe
>
> "Notify libstdc++ maintainers to update ..."
>
> ?
>
> OK with that change.

Yes, that's better. Pushed with that change, thanks.

Re: [PATCH] libstdc++: Add pretty printer for std::span

2022-04-19 Thread Jonathan Wakely via Gcc-patches

On Tue, 19 Apr 2022 at 10:34, Philipp Fent wrote:
>
> On 04.04.22 13:39, Jonathan Wakely wrote:
> > Nice, thanks. I'll get this committed in time for GCC 12 (and backport
> > it to release branches too).
>
> I've attached a rebased patch for trunk and tested it on x86_64-linux.
> I also backported it for the release branches, gcc-11 tests also pass,
> on gcc-10 the prettyprinters testsuite reports "unsupported", and gcc-9
> didn't have std::span yet.

Thanks, but we still need the DCO sign-off as I mailed about last week.

There's no need for you to provide the backported patches, I will take
care of that.

FW: [PATCH] PR fortran/104812: generate error for constuct-name clash with symbols

2022-04-19 Thread Mikhail Kashkarov via Gcc-patches



0001-Fortran-add-error-for-constuct-name-conflicts-with-s.patch
Description: Binary data


rcptInfo.txt
Description: Binary data

Re: [PATCH] c++, coroutines: Avoid expanding within templates [PR103868]

2022-04-19 Thread Martin Liška


On 4/18/22 21:55, Iain Sandoe wrote:

IIRC the original code takes a few minutes to compile on my laptop, so I’d 
expect it might be very time-consuming to do - perhaps Martin has some ideas?


Yes, the original testcase is huge to reduce and I'm not willing to waste time 
on it.

That said, I would rather not install the test-case to our testsuite as it's 
pretty big.

Cheers,
Martin

Re: [PATCH] gcov-profile: Allow negavive counts of indirect calls [PR105282]

2022-04-19 Thread Martin Liška


Hi.

Thank you for the patch; please apply it.

Cheers,
Martin

Re: [PATCH] libstdc++: Add pretty printer for std::span

2022-04-19 Thread Philipp Fent via Gcc-patches


On 04.04.22 13:39, Jonathan Wakely wrote:

Nice, thanks. I'll get this committed in time for GCC 12 (and backport
it to release branches too).


I've attached a rebased patch for trunk and tested it on x86_64-linux.
I also backported it for the release branches, gcc-11 tests also pass, 
on gcc-10 the prettyprinters testsuite reports "unsupported", and gcc-9 
didn't have std::span yet.

From 0f4ae81980ea1181aca4deca0508628f9f30e72b Mon Sep 17 00:00:00 2001
From: Philipp Fent 
Date: Mon, 4 Apr 2022 12:52:57 +0200
Subject: [PATCH] libstdc++: Add pretty printer for std::span

This improves the debug output for C++20 spans.
Before:
{static extent = 18446744073709551615, _M_ptr = 0x7fffb9a8,
_M_extent = {_M_extent_value = 2}}
Now with StdSpanPrinter:
std::span of length 2 = {1, 2}
---
 libstdc++-v3/python/libstdcxx/v6/printers.py  | 38 +++
 .../libstdc++-prettyprinters/cxx20.cc | 11 ++
 2 files changed, 49 insertions(+)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 74c629a710c..790d83fecff 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1535,6 +1535,43 @@ class StdErrorCodePrinter:
 pass
 return '%s = {"%s": %s}' % (self.typename, category, strval)
 
+class StdSpanPrinter:
+    "Print a std::span"
+
+    class _iterator(Iterator):
+        def __init__(self, begin, size):
+            self.count = 0
+            self.begin = begin
+            self.size = size
+
+        def __iter__ (self):
+            return self
+
+        def __next__ (self):
+            if self.count == self.size:
+                raise StopIteration
+
+            count = self.count
+            self.count = self.count + 1
+            return '[%d]' % count, (self.begin + count).dereference()
+
+    def __init__(self, typename, val):
+        self.typename = typename
+        self.val = val
+        if val.type.template_argument(1) == gdb.parse_and_eval('static_cast(-1)'):
+            self.size = val['_M_extent']['_M_extent_value']
+        else:
+            self.size = val.type.template_argument(1)
+
+    def to_string(self):
+        return '%s of length %d' % (self.typename, self.size)
+
+    def children(self):
+        return self._iterator(self.val['_M_ptr'], self.size)
+
+    def display_hint(self):
+        return 'array'
+
 # A "regular expression" printer which conforms to the
 # "SubPrettyPrinter" protocol from gdb.printing.
 class RxPrinter(object):
@@ -2043,6 +2080,7 @@ def build_libstdcxx_dictionary ():
 libstdcxx_printer.add_version('std::', 'partial_ordering', StdCmpCatPrinter)
 libstdcxx_printer.add_version('std::', 'weak_ordering', StdCmpCatPrinter)
 libstdcxx_printer.add_version('std::', 'strong_ordering', StdCmpCatPrinter)
+libstdcxx_printer.add_version('std::', 'span', StdSpanPrinter)
 
 # Extensions.
 libstdcxx_printer.add_version('__gnu_cxx::', 'slist', StdSlistPrinter)
diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc b/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
index 9a868c4baf7..0887f1868f2 100644
--- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
+++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
@@ -18,8 +18,10 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
+#include 
 #include 
 #include 
+#include 
 
 struct X
 {
@@ -54,6 +56,15 @@ main()
   auto c10 = 0.0 <=> __builtin_nan("");
 // { dg-final { note-test c10 "std::partial_ordering::unordered" } }
 
+  auto il = {1, 2};
+  auto s1 = std::span(il);
+  static_assert(s1.extent == std::size_t(-1));
+// { dg-final { note-test s1 {std::span of length 2 = {1, 2}} } }
+  auto a = std::array{3, 4};
+  auto s2 = std::span(a);
+  static_assert(s2.extent == std::size_t(2));
+// { dg-final { note-test s2 {std::span of length 2 = {3, 4}} } }
+
   std::cout << "\n";
   return 0;			// Mark SPOT
 }
-- 
2.35.3

From c4331b7532dc2825429e82e46fda1a04dd943bd4 Mon Sep 17 00:00:00 2001
From: Philipp Fent 
Date: Mon, 4 Apr 2022 12:52:57 +0200
Subject: [PATCH] libstdc++: Add pretty printer for std::span

This improves the debug output for C++20 spans.
Before:
{static extent = 18446744073709551615, _M_ptr = 0x7fffb9a8,
_M_extent = {_M_extent_value = 2}}
Now with StdSpanPrinter:
std::span of length 2 = {1, 2}
---
 libstdc++-v3/python/libstdcxx/v6/printers.py  | 38 +++
 .../libstdc++-prettyprinters/cxx20.cc | 11 ++
 2 files changed, 49 insertions(+)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index f7a7f9961a7..6d8b765f2da 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1654,6 +1654,43 @@ class StdRegexStatePrinter:
 s = "{}, {}={}".format(s, v, self.val['_M_' + v])
 return "{%s}" % (s)
 
+clas

Re: [PATCH] PR105169 Fix references to discarded sections

2022-04-19 Thread Richard Biener via Gcc-patches

On Thu, 14 Apr 2022, Giuliano Belinassi wrote:

> When -fpatchable-function-entry= is enabled, certain C++ code fails to
> link because of generated references to discarded sections in the
> __patchable_function_entry section. This commit fixes this problem by
> putting those references in a COMDAT section.
> 
> Bootstrapped and regtested on x86_64 linux.
> 
> OK for Stage4?
> 
> 2022-04-13  Giuliano Belinassi  
> 
>   PR c++/105169
>   * targhooks.cc (default_print_patchable_function_entry_1): Handle 
> COMDAT case.
>   * varasm.cc (handle_vtv_comdat_section): Rename to...
>   (switch_to_comdat_section): Generalize to also cover
>   __patchable_function_entry case.
>   (assemble_variable): Rename call from handle_vtv_comdat_section to
>   switch_to_comdat_section.
>   (output_object_block): Same as above.
>   * varasm.h: Declare switch_to_comdat_section.
> 
> 2022-04-13  Giuliano Belinassi  
> 
>   PR c++/105169
>   * g++.dg/modules/pr105169.h: New file.
>   * g++.dg/modules/pr105169_a.C: New test.
>   * g++.dg/modules/pr105169_b.C: New file.
> 
> Signed-off-by: Giuliano Belinassi 
> ---
>  gcc/targhooks.cc  |  8 ++--
>  gcc/testsuite/ChangeLog   |  7 +++
>  gcc/testsuite/g++.dg/modules/pr105169.h   | 22 
>  gcc/testsuite/g++.dg/modules/pr105169_a.C | 25 +++
>  gcc/testsuite/g++.dg/modules/pr105169_b.C | 12 +++
>  gcc/varasm.cc | 25 +--
>  gcc/varasm.h  |  1 +
>  7 files changed, 87 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/modules/pr105169.h
>  create mode 100644 gcc/testsuite/g++.dg/modules/pr105169_a.C
>  create mode 100644 gcc/testsuite/g++.dg/modules/pr105169_b.C
> 
> diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> index e22bc66a6c8..540460e7db9 100644
> --- a/gcc/targhooks.cc
> +++ b/gcc/targhooks.cc
> @@ -1995,8 +1995,12 @@ default_print_patchable_function_entry_1 (FILE *file,
>patch_area_number++;
>ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE", patch_area_number);
>  
> -  switch_to_section (get_section ("__patchable_function_entries",
> -   flags, current_function_decl));
> +  section *sect = get_section ("__patchable_function_entries",
> +   flags, current_function_decl);
> +  if (HAVE_COMDAT_GROUP && DECL_COMDAT_GROUP (current_function_decl))
> + switch_to_comdat_section (sect, current_function_decl);

You are passing a decl here, but ...

> +  else
> + switch_to_section (sect);
>assemble_align (POINTER_SIZE);
>fputs (asm_op, file);
>assemble_name_raw (file, buf);
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 9ab7a178bf8..524a546a832 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,10 @@
> +2022-04-13  Giuliano Belinassi  
> +
> + PR c++/105169
> + * g++.dg/modules/pr105169.h: New file.
> + * g++.dg/modules/pr105169_a.C: New test.
> + * g++.dg/modules/pr105169_b.C: New file.
> +
>  2022-04-12  Antoni Boucher  
>  
>   PR jit/104293
> diff --git a/gcc/testsuite/g++.dg/modules/pr105169.h 
> b/gcc/testsuite/g++.dg/modules/pr105169.h
> new file mode 100644
> index 000..a7e76270531
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/modules/pr105169.h
> @@ -0,0 +1,22 @@
> +class IPXAddressClass
> +{
> +public:
> +IPXAddressClass(void);
> +};
> +
> +class WinsockInterfaceClass
> +{
> +
> +public:
> +WinsockInterfaceClass(void);
> +
> +virtual void Set_Broadcast_Address(void*){};
> +
> +virtual int Get_Protocol(void)
> +{
> +return 0;
> +};
> +
> +protected:
> +};
> +
> diff --git a/gcc/testsuite/g++.dg/modules/pr105169_a.C 
> b/gcc/testsuite/g++.dg/modules/pr105169_a.C
> new file mode 100644
> index 000..66dc4b7901f
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/modules/pr105169_a.C
> @@ -0,0 +1,25 @@
> +/* { dg-module-do link } */
> +/* { dg-options "-std=c++11 -fpatchable-function-entry=1 -O2" } */
> +/* { dg-additional-options "-std=c++11 -fpatchable-function-entry=1 -O2" } */
> +
> +/* This test is in the "modules" package because it supports multiple files
> +   linkage.  */
> +
> +#include "pr105169.h"
> +
> +WinsockInterfaceClass* PacketTransport;
> +
> +IPXAddressClass::IPXAddressClass(void)
> +{
> +}
> +
> +int function()
> +{
> +  return PacketTransport->Get_Protocol();
> +}
> +
> +int main()
> +{
> +  IPXAddressClass ipxaddr;
> +  return 0;
> +}
> diff --git a/gcc/testsuite/g++.dg/modules/pr105169_b.C 
> b/gcc/testsuite/g++.dg/modules/pr105169_b.C
> new file mode 100644
> index 000..5f8b00dfe51
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/modules/pr105169_b.C
> @@ -0,0 +1,12 @@
> +/* { dg-module-do link } */
> +/* { dg-options "-std=c++11 -fpatchable-function-entry=1 -O2" } */
> +/* { dg-additiona

Re: Ping^2 [PATCH, rs6000] Correct match pattern in pr56605.c

2022-04-19 Thread HAO CHEN GUI via Gcc-patches

Hi,
   I tested the test case on Linux and AIX with both big and little endian.
The test case requires lp64 target, so it won't be tested on 32-bit targets.

On big endian (both AIX and Linux), it should match
(compare:CC (and:SI (subreg:SI (reg:DI 207) 4)

On little endian (both AIX and Linux), it should match
(compare:CC (and:SI (subreg:SI (reg:DI 207) 0)

So, the pattern in my patch should work fine.

/* { dg-final { scan-rtl-dump-times {\(compare:CC \(and:SI \(subreg:SI 
\(reg:DI} 1 "combine" } } */
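
As an illustration of why the new pattern matches on both endiannesses while the former one does not, here is a small sketch that runs both patterns over the two insn prefixes quoted above. It uses `std::regex` (ECMAScript syntax) as a stand-in for the Tcl regular expressions DejaGnu actually uses, which is an assumption that holds only for these simple patterns; the helper `dump_matches` and the variable names are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the pattern occurs anywhere in the dump line.
bool dump_matches(const std::string& pattern, const std::string& dump_line)
{
  return std::regex_search(dump_line, std::regex(pattern));
}

// Former pattern from pr56605.c: requires a DImode and/zero_extend.
const std::string old_pattern =
  R"(\(compare:CC \((?:and|zero_extend):(?:DI) \((?:sub)?reg:[SD]I)";

// Pattern from the patch: an SImode and of an SImode subreg of a DImode reg.
const std::string new_pattern =
  R"(\(compare:CC \(and:SI \(subreg:SI \(reg:DI)";

// Insn prefixes as quoted in the message; only the subreg offset differs.
const std::string big_endian_insn =
  "(compare:CC (and:SI (subreg:SI (reg:DI 207) 4)";
const std::string little_endian_insn =
  "(compare:CC (and:SI (subreg:SI (reg:DI 207) 0)";
```

Because the new pattern stops before the register number and the subreg offset, the differing offsets (4 and 0) do not affect the match.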

Thanks.

On 14/4/2022 5:30 AM, Segher Boessenkool wrote:
> On Mon, Apr 11, 2022 at 08:54:14PM -0300, Alexandre Oliva wrote:
>> On Apr  7, 2022, HAO CHEN GUI via Gcc-patches  
>> wrote:
>>
>>>   Gentle ping this:
>>>https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590958.html
>>> Thanks
>>
 On 28/2/2022 11:17 AM, HAO CHEN GUI wrote:
>>
> This patch corrects the match pattern in pr56605.c. The former pattern
> is wrong and the test case fails with GCC 11. It should match the following
> insn on each subtarget after mode promotion is disabled. The patch needs to
> be backported to GCC 11.
>>
> -/* { dg-final { scan-rtl-dump-times {\(compare:CC 
> \((?:and|zero_extend):(?:DI) \((?:sub)?reg:[SD]I} 1 "combine" } } */
> +/* { dg-final { scan-rtl-dump-times {\(compare:CC \(and:SI \(subreg:SI 
> \(reg:DI} 1 "combine" } } */
>>
>>
>> How about this less strict change instead?
>>
>>
>> ppc: testsuite: PROMOTE_MODE fallout pr56605 [PR102146]
>>
>> The test expects a compare of DImode values, but after the removal of
>> PROMOTE_MODE from rs6000/, we get SImode.  Adjust the expectations.
>>
>> Tested with gcc-11 targeting ppc64-vx7r2.  Ok to install?
> 
> This should have been tested on Linux as well: it is now broken on both
> -m32 and -m64 there.  Please revert?
> 
> 
> Segher

Re: [pushed] libgccjit: Fix a bootstrap break for some targets.

2022-04-19 Thread Richard Biener via Gcc-patches

On Thu, Apr 14, 2022 at 9:19 PM Iain Sandoe via Gcc-patches
 wrote:
>
> Some targets use 'long long unsigned int' for unsigned HW int, and this
> leads to a Werror=format= fail for two print cases in jit-playback.cc
> introduced in r12-8117-g30f7c83e9cfe (Add support for bitcasts [PR104071])
>
> As discussed on IRC, casting to (long) seems entirely reasonable for the
> values (since they are type sizes).
>
> tested that this fixes bootstrap on x86_64-darwin19 and running check-jit.
> pushed to master, thanks
> Iain
>
> Signed-off-by: Iain Sandoe 
>
> gcc/jit/ChangeLog:
>
> * jit-playback.cc (new_bitcast): Cast values returned by tree_to_uhwi
> to 'long' to match the print format.
> ---
>  gcc/jit/jit-playback.cc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/jit/jit-playback.cc b/gcc/jit/jit-playback.cc
> index b1e72fbcf8a..6be6bdf8dea 100644
> --- a/gcc/jit/jit-playback.cc
> +++ b/gcc/jit/jit-playback.cc
> @@ -1440,10 +1440,10 @@ new_bitcast (location *loc,
>  active_playback_ctxt->add_error (loc,
>"bitcast with types of different sizes");
>  fprintf (stderr, "input expression (size: %ld):\n",
> -  tree_to_uhwi (expr_size));
> +  (long) tree_to_uhwi (expr_size));

You could use "size: " HOST_WIDE_INT_PRINT_DEC "):\n",
see hwint.h for the full set of formats available for HOST_WIDE_INT.

>  debug_tree (t_expr);
>  fprintf (stderr, "requested type (size: %ld):\n",
> -  tree_to_uhwi (type_size));
> +  (long) tree_to_uhwi (type_size));
>  debug_tree (t_dst_type);
>}
>tree t_bitcast = build1 (VIEW_CONVERT_EXPR, t_dst_type, t_expr);
> --
> 2.24.3 (Apple Git-128)
>

Re: [PATCH] doc/invoke.texi: CRIS: Remove references to cris-axis-linux-gnu

2022-04-19 Thread Richard Biener via Gcc-patches

On Mon, Apr 18, 2022 at 6:51 PM Hans-Peter Nilsson  wrote:
>
> I'm about to commit this to master.
>
> I'd like to also install this on the gcc-11 branch.
>
> Ok?

OK.

> -- 8< --
>
> ...and related options.  These stale bits were overlooked when support
> for "Linux/GNU" and CRIS v32 was removed, before the gcc-11 release.
>
> Resulting pdf, html and info inspected for sanity.
>
> gcc:
> * doc/install.texi : Remove references to removed websites and
> adjust for cris-*-elf being the only remaining toolchain.
> ---
>  gcc/doc/invoke.texi | 29 +++--
>  1 file changed, 7 insertions(+), 22 deletions(-)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 1a51759e6e45..cee625c92dd6 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -854,12 +854,12 @@ Objective-C and Objective-C++ Dialects}.
>  -msim  -msdata=@var{sdata-type}}
>
>  @emph{CRIS Options}
> -@gccoptlist{-mcpu=@var{cpu}  -march=@var{cpu}  -mtune=@var{cpu} @gol
> --mmax-stack-frame=@var{n}  -melinux-stacksize=@var{n} @gol
> +@gccoptlist{-mcpu=@var{cpu}  -march=@var{cpu}
> +-mtune=@var{cpu} -mmax-stack-frame=@var{n} @gol
>  -metrax4  -metrax100  -mpdebug  -mcc-init  -mno-side-effects @gol
>  -mstack-align  -mdata-align  -mconst-align @gol
> --m32-bit  -m16-bit  -m8-bit  -mno-prologue-epilogue  -mno-gotplt @gol
> --melf  -maout  -melinux  -mlinux  -sim  -sim2 @gol
> +-m32-bit  -m16-bit  -m8-bit  -mno-prologue-epilogue @gol
> +-melf  -maout  -sim  -sim2 @gol
>  -mmul-bug-workaround  -mno-mul-bug-workaround}
>
>  @emph{CR16 Options}
> @@ -22365,8 +22365,7 @@ These options are defined specifically for the CRIS 
> ports.
>  Generate code for the specified architecture.  The choices for
>  @var{architecture-type} are @samp{v3}, @samp{v8} and @samp{v10} for
>  respectively ETRAX@w{ }4, ETRAX@w{ }100, and ETRAX@w{ }100@w{ }LX@.
> -Default is @samp{v0} except for cris-axis-linux-gnu, where the default is
> -@samp{v10}.
> +Default is @samp{v0}.
>
>  @item -mtune=@var{architecture-type}
>  @opindex mtune
> @@ -22450,27 +22449,13 @@ option only together with visual inspection of the 
> compiled code: no
>  warnings or errors are generated when call-saved registers must be saved,
>  or storage for local variables needs to be allocated.
>
> -@item -mno-gotplt
> -@itemx -mgotplt
> -@opindex mno-gotplt
> -@opindex mgotplt
> -With @option{-fpic} and @option{-fPIC}, don't generate (do generate)
> -instruction sequences that load addresses for functions from the PLT part
> -of the GOT rather than (traditional on other architectures) calls to the
> -PLT@.  The default is @option{-mgotplt}.
> -
>  @item -melf
>  @opindex melf
> -Legacy no-op option only recognized with the cris-axis-elf and
> -cris-axis-linux-gnu targets.
> -
> -@item -mlinux
> -@opindex mlinux
> -Legacy no-op option only recognized with the cris-axis-linux-gnu target.
> +Legacy no-op option.
>
>  @item -sim
>  @opindex sim
> -This option, recognized for the cris-axis-elf, arranges
> +This option arranges
>  to link with input-output functions from a simulator library.  Code,
>  initialized data and zero-initialized data are allocated consecutively.
>
> --
> 2.30.2
>

Re: [PATCH] doc/install.texi: CRIS: Remove gone websites. Adjust CRIS targets

2022-04-19 Thread Richard Biener via Gcc-patches

On Mon, Apr 18, 2022 at 6:48 PM Hans-Peter Nilsson  wrote:
>
> I'm about to commit this to master.
>
> I'd like to also install this on the gcc-11 branch.

OK.

> Ok?

OK.

> -- 8< --
>
> That is, support for cris-linux-gnu was removed in gcc-11, but
> install.texi wasn't adjusted accordingly.  Also, unfortunately the
> developer-related sites are gone with no replacements.  And, CRIS is
> used in other chip series as well, but allude rather than list.
>
> The generated manpages, info, pdf and html were sanity-checked.
>
> gcc:
> * doc/install.texi : Remove references to removed websites and
> adjust for cris-*-elf being the only remaining toolchain.
> ---
>  gcc/doc/install.texi | 21 -
>  1 file changed, 4 insertions(+), 17 deletions(-)
>
> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> index ab67a639836b..304785767027 100644
> --- a/gcc/doc/install.texi
> +++ b/gcc/doc/install.texi
> @@ -3901,8 +3901,8 @@ configure GCC@ for building a CR16 uclinux 
> cross-compiler.
>  @end html
>  @anchor{cris}
>  @heading CRIS
> -CRIS is the CPU architecture in Axis Communications ETRAX system-on-a-chip
> -series.  These are used in embedded applications.
> +CRIS is a CPU architecture in Axis Communications systems-on-a-chip, for
> +example the ETRAX series.  These are used in embedded applications.
>
>  @ifnothtml
>  @xref{CRIS Options,, CRIS Options, gcc, Using the GNU Compiler
> @@ -3913,21 +3913,8 @@ See ``CRIS Options'' in the main manual
>  @end ifhtml
>  for a list of CRIS-specific options.
>
> -There are a few different CRIS targets:
> -@table @code
> -@item cris-axis-elf
> -Mainly for monolithic embedded systems.  Includes a multilib for the
> -@samp{v10} core used in @samp{ETRAX 100 LX}.
> -@item cris-axis-linux-gnu
> -A GNU/Linux port for the CRIS architecture, currently targeting
> -@samp{ETRAX 100 LX} by default.
> -@end table
> -
> -Pre-packaged tools can be obtained from
> -@uref{ftp://ftp.axis.com/@/pub/@/axis/@/tools/@/cris/@/compiler-kit/}.  More
> -information about this platform is available at
> -@uref{http://developer.axis.com/}.
> -
> +Use @samp{configure --target=cris-elf} to configure GCC@ for building
> +a cross-compiler for CRIS.
>  @html
>  
>  @end html
> --
> 2.30.2
>

Re: [PATCH] testsuite: Skip target not support -pthread [pr104676].

2022-04-19 Thread Richard Biener via Gcc-patches

On Tue, 19 Apr 2022, jiawei wrote:

> The "-ftree-parallelize-loops=" option implies -pthread in gcc/gcc.cc.
> Some targets do not support pthread, such as ELF targets that use newlib,
> and they fail with an error:
> 
> "*-*-elf-gcc: error: unrecognized command-line option '-pthread'"
> 
> So we add an additional condition "{target pthread}" to make sure the
> dg-additional-options are applied only on targets that support it.

OK.

> ---
>  gcc/testsuite/gcc.dg/torture/pr104676.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/torture/pr104676.c 
> b/gcc/testsuite/gcc.dg/torture/pr104676.c
> index 50845bb9e15..0991b78f758 100644
> --- a/gcc/testsuite/gcc.dg/torture/pr104676.c
> +++ b/gcc/testsuite/gcc.dg/torture/pr104676.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-additional-options "-ftree-loop-distribution 
> -ftree-parallelize-loops=2" } */
> +/* { dg-additional-options "-ftree-loop-distribution 
> -ftree-parallelize-loops=2" { target pthread } } */
>  
>  struct S {
>int f;
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

Re: [PATCH] libstdc++: Update incorrect statement about mainline in docs

2022-04-19 Thread Richard Biener via Gcc-patches

On Thu, 14 Apr 2022, Jonathan Wakely wrote:

> On Thu, 14 Apr 2022 at 11:55, Richard Biener  wrote:
> >
> > On Thu, 14 Apr 2022, Jonathan Wakely wrote:
> >
> > > On Thu, 14 Apr 2022 at 11:36, Richard Biener  wrote:
> > > >
> > > > On Thu, 14 Apr 2022, Jonathan Wakely wrote:
> > > >
> > > > > This fixes some misleading text in the libstdc++ manual that says the
> > > > > docs for the gcc-11 branch refer to mainline.
> > > > >
> > > > > Richi, is this OK for the gcc-11 branch now? It's been wrong for 11.1
> > > > > and 11.2, but it would still be nice to fix.
> > > >
> > > > Yes, it's OK.  I notice the same problem exists on the GCC 10 branch
> > > > but GCC 9 at least mentions GCC 9 once ;)
> > >
> > > Yes, I fixed it for gcc-9.3.0, but forgot to do it for gcc-10 and gcc-11.
> > >
> > > I pushed r10-10534 to fix gcc-10 (since that's open for doc changes)
> > > and have now pushed r11-9881
> > > as well.
> > >
> > > Maybe this year I'll remember to do it for gcc-12 after we branch from 
> > > trunk!
> >
> > Add an entry to branching.html!
> 
> Like this? OK for wwwdocs?

Maybe

"Notify libstdc++ maintainers to update ..."

?

OK with that change.

Richard.

Re: [PATCH] tree-optimization/104010 - fix SLP scalar costing with patterns

2022-04-19 Thread Richard Biener via Gcc-patches

On Thu, 14 Apr 2022, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Thu, 14 Apr 2022, Richard Sandiford wrote:
> >
> >> Richard Biener  writes:
> >> > When doing BB vectorization the scalar cost compute is derailed
> >> > by patterns, causing lanes to be considered live and thus not
> >> > costed on the scalar side.  For the testcase in PR104010 this
> >> > prevents vectorization which was done by GCC 11.  PR103941
> >> > shows similar cases of missed optimizations that are fixed by
> >> > this patch.
> >> >
> >> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >> >
> >> > I'm only considering this now because PR104010 is identified
> >> > as regression on arm - Richards, what do you think?  I do think
> >> > this will enable vectorization of more stuff now which might
> >> > be good or bad - who knows, but at least it needs to involve
> >> > patterns.
> >> >
> >> > Thanks,
> >> > Richard.
> >> >
> >> > 2022-04-13  Richard Biener  
> >> >
> >> >  PR tree-optimization/104010
> >> >  PR tree-optimization/103941
> >> >  * tree-vect-slp.cc (vect_bb_slp_scalar_cost): When
> >> >  we run into stmts in patterns continue walking those
> >> >  for uses outside of the vectorized region instead of
> >> >  marking the lane live.
> >> >
> >> >  * gcc.target/i386/pr103941-1.c: New testcase.
> >> >  * gcc.target/i386/pr103941-2.c: Likewise.
> >> > ---
> >> >  gcc/testsuite/gcc.target/i386/pr103941-1.c | 14 +++
> >> >  gcc/testsuite/gcc.target/i386/pr103941-2.c | 12 ++
> >> >  gcc/tree-vect-slp.cc   | 47 --
> >> >  3 files changed, 61 insertions(+), 12 deletions(-)
> >> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr103941-1.c
> >> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr103941-2.c
> >> >
> >> > diff --git a/gcc/testsuite/gcc.target/i386/pr103941-1.c 
> >> > b/gcc/testsuite/gcc.target/i386/pr103941-1.c
> >> > new file mode 100644
> >> > index 000..524fdd0b4b1
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/i386/pr103941-1.c
> >> > @@ -0,0 +1,14 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-O2 -msse2" } */
> >> > +
> >> > +unsigned char ur[16], ua[16], ub[16];
> >> > +
> >> > +void avgu_v2qi (void)
> >> > +{
> >> > +  int i;
> >> > +
> >> > +  for (i = 0; i < 2; i++)
> >> > +ur[i] = (ua[i] + ub[i] + 1) >> 1;
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler "pavgb" } } */
> >> > diff --git a/gcc/testsuite/gcc.target/i386/pr103941-2.c 
> >> > b/gcc/testsuite/gcc.target/i386/pr103941-2.c
> >> > new file mode 100644
> >> > index 000..972a32be997
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/i386/pr103941-2.c
> >> > @@ -0,0 +1,12 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-O2 -msse2" } */
> >> > +
> >> > +void foo (int *c, float *x, float *y)
> >> > +{
> >> > +  c[0] = x[0] < y[0];
> >> > +  c[1] = x[1] < y[1];
> >> > +  c[2] = x[2] < y[2];
> >> > +  c[3] = x[3] < y[3];
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler "cmpltps" } } */
> >> > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> >> > index 4ac2b70303c..c7687065374 100644
> >> > --- a/gcc/tree-vect-slp.cc
> >> > +++ b/gcc/tree-vect-slp.cc
> >> > @@ -5185,22 +5185,45 @@ vect_bb_slp_scalar_cost (vec_info *vinfo,
> >> >   the scalar cost.  */
> >> >if (!STMT_VINFO_LIVE_P (stmt_info))
> >> >  {
> >> > -  FOR_EACH_PHI_OR_STMT_DEF (def_p, orig_stmt, op_iter, 
> >> > SSA_OP_DEF)
> >> > +  auto_vec worklist;
> >> > +  hash_set *worklist_visited = NULL;
> >> > +  worklist.quick_push (orig_stmt);
> >> > +  do
> >> >  {
> >> > -  imm_use_iterator use_iter;
> >> > -  gimple *use_stmt;
> >> > -  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR 
> >> > (def_p))
> >> > -if (!is_gimple_debug (use_stmt))
> >> > -  {
> >> > -stmt_vec_info use_stmt_info = vinfo->lookup_stmt 
> >> > (use_stmt);
> >> > -if (!use_stmt_info
> >> > -|| !vectorized_scalar_stmts.contains 
> >> > (use_stmt_info))
> >> > +  gimple *work_stmt = worklist.pop ();
> >> > +  FOR_EACH_PHI_OR_STMT_DEF (def_p, work_stmt, op_iter, 
> >> > SSA_OP_DEF)
> >> > +{
> >> > +  imm_use_iterator use_iter;
> >> > +  gimple *use_stmt;
> >> > +  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter,
> >> > + DEF_FROM_PTR (def_p))
> >> > +if (!is_gimple_debug (use_stmt))
> >> >{
> >> > -(*life)[i] = true;
> >> > -break;
> >> > +stmt_vec_info use_stmt_info
> >> > +  = vinfo->lookup_stmt (use_stmt);
> >> > +if (!use_stmt_info
> >> > +
