[PATCH] dwarf2out: Fix up split-dwarf .debug_macro handling [PR99319]

2021-03-02 Thread Jakub Jelinek via Gcc-patches
Hi!

The -gsplit-dwarf changes came a few months after .debug_macro
and the r0-120109 changes just changed the 2nd operand of
DW_MACRO_GNU_{define,undef}_indirect from the usual .debug_str
section offset argument to leb128 index into .debug_str_offsets
without changing the opcodes.

DWARF5 standardized different opcodes for those, but GCC hasn't been changed
yet for that.
This patch starts using DW_MACRO_define_strx and DW_MACRO_undef_strx
instead of DW_MACRO_define_strp and DW_MACRO_undef_strp when -gsplit-dwarf
-gdwarf-5 -g3.  I'm not sure what to do if anything with the -gdwarf-4
-gsplit-dwarf -g3 -gno-strict-dwarf case, we've been emitting it that way
for 8 years and it is an extension, so presumably the consumers that cared
have already hacks to handle DW_MACRO_GNU_{define,undef}_indirect
differently in .debug_macro 4 sections depending on if it is
.debug_macro.dwo or .debug_macro.
Another change the patch does is that it will use
DW_MACRO_{define,undef}_str{p,x} even with -gdwarf-5 -gstrict-dwarf -g3,
for DWARF 4 we were doing that only for -gno-strict-dwarf as we've emitted
.debug_macro section only in that case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-02  Jakub Jelinek  

PR debug/99319
* dwarf2out.c (output_macinfo_op): Use DW_MACRO_*_str* even with
-gdwarf-5 -gstrict-dwarf.  For -gsplit-dwarf -gdwarf-5 use
DW_MACRO_*_strx instead of DW_MACRO_*_strp.  Handle
DW_MACRO_define_strx and DW_MACRO_undef_strx.
(save_macinfo_strings): Use DW_MACRO_*_str* even with
-gdwarf-5 -gstrict-dwarf.  Handle DW_MACRO_define_strx and
DW_MACRO_undef_strx.

--- gcc/dwarf2out.c.jj  2021-02-27 10:40:22.759330948 +0100
+++ gcc/dwarf2out.c 2021-03-01 16:14:55.150163790 +0100
@@ -28267,13 +28267,17 @@ output_macinfo_op (macinfo_entry *ref)
 case DW_MACINFO_define:
 case DW_MACINFO_undef:
   len = strlen (ref->info) + 1;
-  if (!dwarf_strict
+  if ((!dwarf_strict || dwarf_version >= 5)
  && len > (size_t) dwarf_offset_size
  && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
  && (debug_str_section->common.flags & SECTION_MERGE) != 0)
{
- ref->code = ref->code == DW_MACINFO_define
- ? DW_MACRO_define_strp : DW_MACRO_undef_strp;
+ if (dwarf_split_debug_info && dwarf_version >= 5)
+   ref->code = ref->code == DW_MACINFO_define
+   ? DW_MACRO_define_strx : DW_MACRO_undef_strx;
+ else
+   ref->code = ref->code == DW_MACINFO_define
+   ? DW_MACRO_define_strp : DW_MACRO_undef_strp;
  output_macinfo_op (ref);
  return;
}
@@ -28285,7 +28289,18 @@ output_macinfo_op (macinfo_entry *ref)
   dw2_asm_output_nstring (ref->info, -1, "The macro");
   break;
 case DW_MACRO_define_strp:
+  dw2_asm_output_data (1, ref->code, "Define macro strp");
+  goto do_DW_MACRO_define_strpx;
 case DW_MACRO_undef_strp:
+  dw2_asm_output_data (1, ref->code, "Undefine macro strp");
+  goto do_DW_MACRO_define_strpx;
+case DW_MACRO_define_strx:
+  dw2_asm_output_data (1, ref->code, "Define macro strx");
+  goto do_DW_MACRO_define_strpx;
+case DW_MACRO_undef_strx:
+  dw2_asm_output_data (1, ref->code, "Undefine macro strx");
+  /* FALLTHRU */
+do_DW_MACRO_define_strpx:
   /* NB: dwarf2out_finish performs:
   1. save_macinfo_strings
   2. hash table traverse of index_string
@@ -28302,10 +28317,6 @@ output_macinfo_op (macinfo_entry *ref)
   gcc_assert (node
  && (node->form == DW_FORM_strp
  || node->form == dwarf_FORM (DW_FORM_strx)));
-  dw2_asm_output_data (1, ref->code,
-  ref->code == DW_MACRO_define_strp
-  ? "Define macro strp"
-  : "Undefine macro strp");
   dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
   (unsigned long) ref->lineno);
   if (node->form == DW_FORM_strp)
@@ -28475,7 +28486,7 @@ save_macinfo_strings (void)
   case DW_MACINFO_define:
   case DW_MACINFO_undef:
 len = strlen (ref->info) + 1;
-if (!dwarf_strict
+   if ((!dwarf_strict || dwarf_version >= 5)
 && len > (unsigned) dwarf_offset_size
 && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
 && (debug_str_section->common.flags & SECTION_MERGE) != 0)
@@ -28489,6 +28500,8 @@ save_macinfo_strings (void)
/* Fall through. */
  case DW_MACRO_define_strp:
  case DW_MACRO_undef_strp:
+ case DW_MACRO_define_strx:
+ case DW_MACRO_undef_strx:
 set_indirect_string (find_AT_string (ref->info));
 break;
   default:

Jakub



[PATCH] vrp: Improve register_edge_assert_for [PR95757]

2021-03-02 Thread Jakub Jelinek via Gcc-patches
Hi!

The Wstringop-overflow-25.c testcase doesn't emit one of the expected
warnings on targets that don't do short curcuiting due to target costs
(or e.g. with --param=logical-op-non-short-circuit=0 on all targets).

The problem is that only reassoc2 optimizes:
  _49 ={v} unsigned_value_source;
  if (_49 == 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870913]:
  if (_49 > 2)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 268435457]:
  _53 = _49 + 1;
into:
  _49 ={v} unsigned_value_source;
  _48 = _49 + 18446744073709551615;
  _1 = _48 > 1;
  if (_1 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 268435457]:
  _53 = _49 + 1;
(but, note the _1 = _48 > 1; if (_1 != 0)),
then dom3 is run and because of that if (_1 != 0) vs. if (_48 > 1) doesn't
register edge asserts for _48 and _49) and so we don't get
SSA_NAME_RANGE_INFO for _53 (and ditto for vrp2) and only afterwards comes
forwprop4 that canonicalizes it to if (_48 > 1).  While with
--param=logical-op-non-short-circuit=1 it is already reassoc1 that optimizes
it and forwprop3 that propagates it, so we have on the SSA_NAME
corresponding to _53 above SSA_NAME_RANGE_INFO and during expansion we warn.

The following patch fixes it by handling those not yet propagated
comparisons into GIMPLE_COND in register_edge_assert_for.  We already
have all the infrastructure there to handle the 
--param=logical-op-non-short-circuit=1
| and &s.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-02  Jakub Jelinek  

PR middle-end/95757
* tree-vrp.c (register_edge_assert_for): Remove superfluous ()s around
condition.  Call register_edge_assert_for_1 for == 0, != 0, == 1 and
!= 1 comparisons if name is lhs of a comparison.

--- gcc/tree-vrp.c.jj   2021-02-25 10:16:46.495355574 +0100
+++ gcc/tree-vrp.c  2021-03-01 18:50:48.586354899 +0100
@@ -2180,8 +2180,8 @@ register_edge_assert_for (tree name, edg
   /* In the case of NAME == 1 or NAME != 0, for BIT_AND_EXPR defining
  statement of NAME we can assert both operands of the BIT_AND_EXPR
  have nonzero value.  */
-  if (((comp_code == EQ_EXPR && integer_onep (val))
-   || (comp_code == NE_EXPR && integer_zerop (val
+  if ((comp_code == EQ_EXPR && integer_onep (val))
+  || (comp_code == NE_EXPR && integer_zerop (val)))
 {
   gimple *def_stmt = SSA_NAME_DEF_STMT (name);
 
@@ -2193,28 +2193,36 @@ register_edge_assert_for (tree name, edg
  register_edge_assert_for_1 (op0, NE_EXPR, e, asserts);
  register_edge_assert_for_1 (op1, NE_EXPR, e, asserts);
}
+  else if (is_gimple_assign (def_stmt)
+  && (TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt))
+  == tcc_comparison))
+   register_edge_assert_for_1 (name, NE_EXPR, e, asserts);
 }
 
   /* In the case of NAME == 0 or NAME != 1, for BIT_IOR_EXPR defining
  statement of NAME we can assert both operands of the BIT_IOR_EXPR
  have zero value.  */
-  if (((comp_code == EQ_EXPR && integer_zerop (val))
-   || (comp_code == NE_EXPR && integer_onep (val
+  if ((comp_code == EQ_EXPR && integer_zerop (val))
+  || (comp_code == NE_EXPR
+ && integer_onep (val)
+ && TYPE_PRECISION (TREE_TYPE (name)) == 1))
 {
   gimple *def_stmt = SSA_NAME_DEF_STMT (name);
 
   /* For BIT_IOR_EXPR only if NAME == 0 both operands have
 necessarily zero value, or if type-precision is one.  */
   if (is_gimple_assign (def_stmt)
- && (gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR
- && (TYPE_PRECISION (TREE_TYPE (name)) == 1
- || comp_code == EQ_EXPR)))
+ && gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR)
{
  tree op0 = gimple_assign_rhs1 (def_stmt);
  tree op1 = gimple_assign_rhs2 (def_stmt);
  register_edge_assert_for_1 (op0, EQ_EXPR, e, asserts);
  register_edge_assert_for_1 (op1, EQ_EXPR, e, asserts);
}
+  else if (is_gimple_assign (def_stmt)
+  && (TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt))
+  == tcc_comparison))
+   register_edge_assert_for_1 (name, EQ_EXPR, e, asserts);
 }
 
   /* Sometimes we can infer ranges from (NAME & MASK) == VALUE.  */

Jakub



Re: [PATCH] vrp: Improve register_edge_assert_for [PR95757]

2021-03-02 Thread Richard Biener
On Tue, 2 Mar 2021, Jakub Jelinek wrote:

> Hi!
> 
> The Wstringop-overflow-25.c testcase doesn't emit one of the expected
> warnings on targets that don't do short curcuiting due to target costs
> (or e.g. with --param=logical-op-non-short-circuit=0 on all targets).
> 
> The problem is that only reassoc2 optimizes:
>   _49 ={v} unsigned_value_source;
>   if (_49 == 0)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
> 
>[local count: 536870913]:
>   if (_49 > 2)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
> 
>[local count: 268435457]:
>   _53 = _49 + 1;
> into:
>   _49 ={v} unsigned_value_source;
>   _48 = _49 + 18446744073709551615;
>   _1 = _48 > 1;
>   if (_1 != 0)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
> 
>[local count: 268435457]:
>   _53 = _49 + 1;
> (but, note the _1 = _48 > 1; if (_1 != 0)),
> then dom3 is run and because of that if (_1 != 0) vs. if (_48 > 1) doesn't
> register edge asserts for _48 and _49) and so we don't get
> SSA_NAME_RANGE_INFO for _53 (and ditto for vrp2) and only afterwards comes
> forwprop4 that canonicalizes it to if (_48 > 1).  While with
> --param=logical-op-non-short-circuit=1 it is already reassoc1 that optimizes
> it and forwprop3 that propagates it, so we have on the SSA_NAME
> corresponding to _53 above SSA_NAME_RANGE_INFO and during expansion we warn.
> 
> The following patch fixes it by handling those not yet propagated
> comparisons into GIMPLE_COND in register_edge_assert_for.  We already
> have all the infrastructure there to handle the 
> --param=logical-op-non-short-circuit=1
> | and &s.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2021-03-02  Jakub Jelinek  
> 
>   PR middle-end/95757
>   * tree-vrp.c (register_edge_assert_for): Remove superfluous ()s around
>   condition.  Call register_edge_assert_for_1 for == 0, != 0, == 1 and
>   != 1 comparisons if name is lhs of a comparison.
> 
> --- gcc/tree-vrp.c.jj 2021-02-25 10:16:46.495355574 +0100
> +++ gcc/tree-vrp.c2021-03-01 18:50:48.586354899 +0100
> @@ -2180,8 +2180,8 @@ register_edge_assert_for (tree name, edg
>/* In the case of NAME == 1 or NAME != 0, for BIT_AND_EXPR defining
>   statement of NAME we can assert both operands of the BIT_AND_EXPR
>   have nonzero value.  */
> -  if (((comp_code == EQ_EXPR && integer_onep (val))
> -   || (comp_code == NE_EXPR && integer_zerop (val
> +  if ((comp_code == EQ_EXPR && integer_onep (val))
> +  || (comp_code == NE_EXPR && integer_zerop (val)))
>  {
>gimple *def_stmt = SSA_NAME_DEF_STMT (name);
>  
> @@ -2193,28 +2193,36 @@ register_edge_assert_for (tree name, edg
> register_edge_assert_for_1 (op0, NE_EXPR, e, asserts);
> register_edge_assert_for_1 (op1, NE_EXPR, e, asserts);
>   }
> +  else if (is_gimple_assign (def_stmt)
> +&& (TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt))
> +== tcc_comparison))
> + register_edge_assert_for_1 (name, NE_EXPR, e, asserts);
>  }
>  
>/* In the case of NAME == 0 or NAME != 1, for BIT_IOR_EXPR defining
>   statement of NAME we can assert both operands of the BIT_IOR_EXPR
>   have zero value.  */
> -  if (((comp_code == EQ_EXPR && integer_zerop (val))
> -   || (comp_code == NE_EXPR && integer_onep (val
> +  if ((comp_code == EQ_EXPR && integer_zerop (val))
> +  || (comp_code == NE_EXPR
> +   && integer_onep (val)
> +   && TYPE_PRECISION (TREE_TYPE (name)) == 1))
>  {
>gimple *def_stmt = SSA_NAME_DEF_STMT (name);
>  
>/* For BIT_IOR_EXPR only if NAME == 0 both operands have
>necessarily zero value, or if type-precision is one.  */
>if (is_gimple_assign (def_stmt)
> -   && (gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR
> -   && (TYPE_PRECISION (TREE_TYPE (name)) == 1
> -   || comp_code == EQ_EXPR)))
> +   && gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR)
>   {
> tree op0 = gimple_assign_rhs1 (def_stmt);
> tree op1 = gimple_assign_rhs2 (def_stmt);
> register_edge_assert_for_1 (op0, EQ_EXPR, e, asserts);
> register_edge_assert_for_1 (op1, EQ_EXPR, e, asserts);
>   }
> +  else if (is_gimple_assign (def_stmt)
> +&& (TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt))
> +== tcc_comparison))
> + register_edge_assert_for_1 (name, EQ_EXPR, e, asserts);
>  }
>  
>/* Sometimes we can infer ranges from (NAME & MASK) == VALUE.  */
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] RISC-V: Implment __builtin_thread_pointer

2021-03-02 Thread Kito Cheng via Gcc-patches
Hi Matthias:

Sure, backported to gcc 10 branch, let's wait 10.3 :)

On Mon, Mar 1, 2021 at 4:17 PM Matthias Klose  wrote:
>
> On 7/8/20 9:59 PM, Jim Wilson wrote:
> > On Tue, Jul 7, 2020 at 2:52 AM Kito Cheng  wrote:
> >> gcc/ChangeLog:
> >> * gcc/config/riscv/riscv.md (): New.
> >> (TP_REGNUM): Ditto.
> >> * doc/extend.texi (Target Builtins): Add RISC-V built-in section.
> >> Document __builtin_thread_pointer.
> >> gcc/testsuite/ChangeLog:
> >> * gcc.target/riscv/read-thread-pointer.c: New.
> >
> > It looks OK to me in general.
> >
> > You added builtin_thread_pointer but not builtin_set_thread_pointer.
> > Maybe we should implement both as long as we are implementing one?  If
> > clang only implements one, maybe it should implement the other also?
> > This doesn't have to be part of this patch.  This could be a separate
> > issue.
> >
> > The builtin_thread_pointer docs looks out-of-date.  It is documented
> > for alpha and SH, but it is implemented in gcc/builtins.c not in the
> > backends.  A scan of md files show that quite a few targets support it
> > but don't document it.  I think it should be documented in the generic
> > builtins section not in the target dependent builtins sections with
> > some language that says not all targets support it.  This doesn't have
> > to be part of this patch.  This could be a separate issue.
> >
> > We have two existing undocumented builtins.  __builtin_riscv_fsflags
> > and __builtin_riscv_frflags for setting or reading the FP flags.  I
> > don't know if anyone uses them though.  newlib and glbic both use
> > extended asms for these operations.  This doesn't have to be part of
> > this patch.  This could be a separate issue.
> >
> > There is a document https://github.com/riscv/riscv-c-api-doc for
> > coordinating gcc and llvm work that has an empty list of builtin
> > functions.  I'm not sure if this document is still useful.  If this is
> > a RISC-V specific builtin then it should be listed here, but I don't
> > think it should be considered a RISC-V specific builtin.  There is an
> > unresolved pull request for the frflags and fsflags builtins.  I guess
> > I forgot about that.
> >
> > Jim
>
> LLVM 12 now uses __builtin_thread_pointer, and fails to build with GCC 10 (but
> builds with GCC 11).  Could you consider backporting this one to GCC 10?
>
> Thanks, Matthias


[Committed] IBM Z: Run mul-signed-overflow tests only on z14

2021-03-02 Thread Andreas Krebbel via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/s390/mul-signed-overflow-1.c: Run only on z14.
* gcc.target/s390/mul-signed-overflow-2.c: Run only on z14.
---
 gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c | 2 +-
 gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c 
b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
index fdf56d6e695..be95acc54aa 100644
--- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
+++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { target { s390_z14_hw } } } */
 /* z14 only because we need msrkc, msc, msgrkc, msgc  */
 /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
 
diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c 
b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
index d0088188aa2..f5fbf276c5f 100644
--- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
+++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { target { s390_z14_hw } } } */
 /* z14 only because we need msrkc, msc, msgrkc, msgc  */
 /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
 
-- 
2.29.2



Re: [PATCH] C++: target attribute - local decl

2021-03-02 Thread Martin Liška

On 3/1/21 8:58 PM, Jason Merrill wrote:

On 3/1/21 11:59 AM, Martin Liška wrote:

On 3/1/21 5:36 PM, Jason Merrill wrote:

On 3/1/21 7:43 AM, Martin Liška wrote:

On 2/22/21 11:53 PM, Jason Merrill wrote:

The problem seems to be with the handling of local decls.  If 
DECL_LOCAL_DECL_P, you need to look at DECL_LOCAL_DECL_ALIAS to find the 
namespace-scope decl.  But then if there is no preceding namespace-scope 
declaration, the new decl created by push_local_extern_decl_alias doesn't have 
a cgraph node, either.  I guess maybe_function_versions also needs to look 
through DECL_LOCAL_DECL_ALIAS.


Ah, I see. Are you sure about the name 'maybe_function_versions'? I can't find 
it.


Ah, it's maybe_version_functions, sorry.


Thanks, I see the function now.
So about your guess:


I guess maybe_function_versions also needs to look through 
DECL_LOCAL_DECL_ALIAS.


Do you mean maybe_version_functions's argument 'record' should depend on 
DECL_LOCAL_DECL_ALIAS of newdecl/oldddecl
(if present)? Or that DECL_FUNCTION_VERSIONED should be set for 
DECL_LOCAL_DECL_ALIASes of the newdecl/olddecl
function declarations?


The latter.


I see, but will not help us. Problem is that
#2  0x015d8899 in ix86_get_function_versions_dispatcher 
(decl=0x77755000) at 
/home/marxin/Programming/gcc/gcc/config/i386/i386-features.c:2862

is called for a declaration for which

Breakpoint 5, maybe_version_functions (newdecl=, 
olddecl=, record=false) at 
/home/marxin/Programming/gcc/gcc/cp/decl.c:1118

is called with record=false. So that cgraph_node is not created for it.

Or is a possible solution that get_function_version_dispatcher should look 
through the DECL_LOCAL_DECL_ALIAS?

Thanks,
Martin



Jason






[PATCH] analyzer: remove dead code

2021-03-02 Thread Martin Liška

The issue is reported by Clang:

warning: private field 'm_engine' is not used [-Wunused-private-field]

gcc/analyzer/ChangeLog:

* diagnostic-manager.cc (diagnostic_manager::emit_saved_diagnostics):
Do not pass engine.
---
 gcc/analyzer/diagnostic-manager.cc | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/gcc/analyzer/diagnostic-manager.cc 
b/gcc/analyzer/diagnostic-manager.cc
index f0f447f14dc..7f20841768b 100644
--- a/gcc/analyzer/diagnostic-manager.cc
+++ b/gcc/analyzer/diagnostic-manager.cc
@@ -545,8 +545,6 @@ public:
 class dedupe_winners
 {
 public:
-  dedupe_winners (engine *eng) : m_engine (eng) {}
-
   ~dedupe_winners ()
   {
 /* Delete all keys, but not the saved_diagnostics.  */
@@ -642,8 +640,6 @@ public:
   }
 
 private:

-  engine *m_engine;
-
   /* This maps from each dedupe_key to a current best saved_diagnostic.  */
 
   typedef hash_map
@@ -679,7 +675,7 @@ diagnostic_manager::emit_saved_diagnostics (const 
exploded_graph &eg)
  instance.  This partitions the saved diagnostics by dedupe_key,
  generating exploded_paths for them, and retaining the best one in each
  partition.  */
-  dedupe_winners best_candidates (eg.get_engine ());
+  dedupe_winners best_candidates;
 
   int i;

   saved_diagnostic *sd;
--
2.30.1



Re: [PATCH] correct fix to avoid false positives for vectorized stores (PR 96963, 94655)

2021-03-02 Thread Richard Biener via Gcc-patches
On Fri, Jan 22, 2021 at 12:39 AM Martin Sebor via Gcc-patches
 wrote:
>
> The hack I put in compute_objsize() last January for pr93200 isn't
> quite correct.  It happened to suppress the false positive there
> but, due to what looks like a thinko on my part, not in some other
> test cases involving vectorized stores.
>
> The attached change adjusts the hack to have compute_objsize() give
> up on all MEM_REFs with a vector type.  This effectively disables
> -Wstringop-{overflow,overread} for vector accesses (either written
> by the user or synthesized by GCC from ordinary accesses).  It
> doesn't affect -Warray-bounds because this warning doesn't use
> compute_objsize() yet.  When it does (it should considerably
> simplify the code) some additional changes will be needed to
> preserve -Warray-bounds for out of bounds vector accesses.
> The test this patch adds should serve as a reminder to make
> it if we forget.
>
> Tested on x86_64-linux.  Since PR 94655 was reported against GCC
> 10 I'd like to apply this fix to both the trunk and the 10 branch.

The proposed code reads even more confusingly than before.
What is called 'ptr' is treated as a reference and what you
then name ptrtype is the type of the reference.

That leads to code like

+   if (code == MEM_REF && TREE_CODE (ptrtype) == VECTOR_TYPE)

what?  the pointer type is a VECTOR_TYPE?!

I think you are looking for whether the reference type is a VECTOR_TYPE.

Part of the confusion is likely because the code commons

  if (code == ARRAY_REF || code == MEM_REF)

but in one case operand zero is a pointer to the object (MEM_REF) and
in one case
it is the object (ARRAY_REF).

Please refactor this code so one can actually read it literally
without second-guessing proper variable names.

Thanks,
Richard.



> Martin


Re: [PATCH] clarify nonaliasing property of attribute malloc (PR 99295)

2021-03-02 Thread Richard Biener via Gcc-patches
On Mon, Mar 1, 2021 at 7:32 PM Martin Sebor via Gcc-patches
 wrote:
>
> The documentation change made for pr83023 - branch probabilities
> pessimize malloc, introduced an ambiguity into the description of
> attribute malloc pointed out in pr99295.
>
> The change suggests that GCC assumes that most calls only to malloc
> and calloc but not to realloc return non-null.  Subsequent changes
> exacerbated this problem by inserting additional text in between.
>
> The attached change rewords the description to avoid this unintended
> reading.

OK.

Richard.

>
> Martin


[Committed 1/2] IBM Z: arch14: Add command line options

2021-03-02 Thread Andreas Krebbel via Gcc-patches
Prepare GCC for a future architecture extension.

gcc/ChangeLog:

* common/config/s390/s390-common.c (processor_flags_table): New entry.
* config.gcc: Enable arch14 for --with-arch and --with-tune.
* config/s390/driver-native.c (s390_host_detect_local_cpu): Pick
arch14 for unknown CPU models.
* config/s390/s390-opts.h (enum processor_type): Add PROCESSOR_ARCH14.
* config/s390/s390.c (s390_issue_rate): Add case for PROCESSOR_ARCH14.
(s390_get_sched_attrmask): Likewise.
(s390_get_unit_mask): Likewise.
* config/s390/s390.h (enum processor_flags): Add PF_NNPA and PF_ARCH14.
(TARGET_CPU_ARCH14, TARGET_CPU_ARCH14_P, TARGET_CPU_NNPA)
(TARGET_CPU_NNPA_P, TARGET_ARCH14, TARGET_ARCH14_P, TARGET_NNPA)
(TARGET_NNPA_P): New macro definitions.
* config/s390/s390.md ("cpu_facility", "enabled"): Add arch14 and nnpa.
* config/s390/s390.opt: Add PROCESSOR_ARCH14.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add check for nnpa facility.
---
 gcc/common/config/s390/s390-common.c  |  4 
 gcc/config.gcc|  2 +-
 gcc/config/s390/driver-native.c   |  2 +-
 gcc/config/s390/s390-opts.h   |  1 +
 gcc/config/s390/s390.c|  4 
 gcc/config/s390/s390.h| 20 +++-
 gcc/config/s390/s390.md   | 12 ++--
 gcc/config/s390/s390.opt  |  3 +++
 gcc/testsuite/lib/target-supports.exp | 16 
 9 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index d066cf7395b..b6bc8501742 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -48,8 +48,12 @@ EXPORTED_CONST int processor_flags_table[] =
 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
 | PF_Z13 | PF_VX | PF_VXE | PF_Z14,
 /* z15 */PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
+| PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
+| PF_Z13 | PF_VX | PF_VXE | PF_Z14 | PF_VXE2 | PF_Z15,
+/* arch14 */ PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
 | PF_Z13 | PF_VX | PF_VXE | PF_Z14 | PF_VXE2 | PF_Z15
+| PF_NNPA | PF_ARCH14
   };
 
 /* Change optimizations to be performed, depending on the
diff --git a/gcc/config.gcc b/gcc/config.gcc
index c8853009e55..966cbc888cb 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -5122,7 +5122,7 @@ case "${target}" in
for which in arch tune; do
eval "val=\$with_$which"
case ${val} in
-   "" | native | z900 | z990 | z9-109 | z9-ec | z10 | z196 
| zEC12 | z13 | z14 | z15 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | 
arch11 | arch12 | arch13 )
+   "" | native | z900 | z990 | z9-109 | z9-ec | z10 | z196 
| zEC12 | z13 | z14 | z15 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | 
arch11 | arch12 | arch13 | arch14 )
# OK
;;
*)
diff --git a/gcc/config/s390/driver-native.c b/gcc/config/s390/driver-native.c
index 4a065a52c17..c0247154c0b 100644
--- a/gcc/config/s390/driver-native.c
+++ b/gcc/config/s390/driver-native.c
@@ -124,7 +124,7 @@ s390_host_detect_local_cpu (int argc, const char **argv)
  cpu = "z15";
  break;
default:
- cpu = "z15";
+ cpu = "arch14";
  break;
}
}
diff --git a/gcc/config/s390/s390-opts.h b/gcc/config/s390/s390-opts.h
index d5751809ba5..4141b4d36dd 100644
--- a/gcc/config/s390/s390-opts.h
+++ b/gcc/config/s390/s390-opts.h
@@ -38,6 +38,7 @@ enum processor_type
   PROCESSOR_2964_Z13,
   PROCESSOR_3906_Z14,
   PROCESSOR_8561_Z15,
+  PROCESSOR_ARCH14,
   PROCESSOR_NATIVE,
   PROCESSOR_max
 };
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 9d2cee950d0..fcb26316632 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -337,6 +337,7 @@ const struct s390_processor processor_table[] =
   { "z13","z13",PROCESSOR_2964_Z13,&zEC12_cost,  11 },
   { "z14","arch12", PROCESSOR_3906_Z14,&zEC12_cost,  12 },
   { "z15","arch13", PROCESSOR_8561_Z15,&zEC12_cost,  13 },
+  { "arch14", "",   PROCESSOR_ARCH14,  &zEC12_cost,  14 },
   { "native", "",   PROCESSOR_NATIVE,  NULL, 0  }
 };
 
@@ -8409,6 +8410,7 @@ s390_issue_rate (void)
 case PROCESSOR_2827_ZEC12:
 case PROCESSOR_2964_Z13:
 case PROCESSOR_3906_Z14:
+case PROCESSOR_ARCH14:
 default:
   return 1;
 }
@@ -14768,6 +14770,7 @@ s390_get_sched_attrmask (rtx_insn *insn)
mask |= S390_SCHED_ATTR_MASK_GROUPOFTWO;
   break;
 c

[Committed 2/2] IBM Z: arch14: New instrinsics

2021-03-02 Thread Andreas Krebbel via Gcc-patches
This adds support for 5 new builtins.

gcc/ChangeLog:

* config/s390/s390-builtin-types.def (BT_FN_V4SF_V8HI_UINT): New
builtin signature.
(BT_FN_V8HI_V8HI_UINT): Likewise.
(BT_FN_V8HI_V4SF_V4SF_UINT): Likewise.
* config/s390/s390-builtins.def (B_NNPA): New macro definition.
(s390_vclfnhs, s390_vclfnls, s390_vcrnfs, s390_vcfn, s390_vcnf):
New builtin definitions.
* config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Bump
vector extension version.
* config/s390/s390.c (s390_expand_builtin): Check if builtins are
available with current -march level.
* config/s390/s390.md (UNSPEC_NNPA_VCLFNHS_V8HI)
(UNSPEC_NNPA_VCLFNLS_V8HI, UNSPEC_NNPA_VCRNFS_V8HI)
(UNSPEC_NNPA_VCFN_V8HI, UNSPEC_NNPA_VCNF_V8HI): New constants.
* config/s390/vecintrin.h (vec_extend_to_fp32_hi): New macro.
(vec_extend_to_fp32_lo): Likewise.
(vec_round_from_fp32): Likewise.
(vec_convert_to_fp16): Likewise.
(vec_convert_from_fp16): Likewise.
* config/s390/vx-builtins.md (vclfnhs_v8hi): New insn pattern.
(vclfnls_v8hi): Likewise.
(vcrnfs_v8hi): Likewise.
(vcfn_v8hi): Likewise.
(vcnf_v8hi): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/s390/zvector/vec-nnpa-fp16-convert.c: New test.
* gcc.target/s390/zvector/vec-nnpa-fp32-convert-1.c: New test.
* gcc.target/s390/zvector/vec_convert_from_fp16.c: New test.
* gcc.target/s390/zvector/vec_convert_to_fp16.c: New test.
* gcc.target/s390/zvector/vec_extend_to_fp32_hi.c: New test.
* gcc.target/s390/zvector/vec_extend_to_fp32_lo.c: New test.
* gcc.target/s390/zvector/vec_round_from_fp32.c: New test.
---
 gcc/config/s390/s390-builtin-types.def|  3 +
 gcc/config/s390/s390-builtins.def | 12 
 gcc/config/s390/s390-c.c  |  2 +-
 gcc/config/s390/s390.c|  6 ++
 gcc/config/s390/s390.md   |  7 +++
 gcc/config/s390/vecintrin.h   |  6 ++
 gcc/config/s390/vx-builtins.md| 55 +++
 .../s390/zvector/vec-nnpa-fp16-convert.c  | 34 
 .../s390/zvector/vec-nnpa-fp32-convert-1.c| 27 +
 .../s390/zvector/vec_convert_from_fp16.c  | 12 
 .../s390/zvector/vec_convert_to_fp16.c| 12 
 .../s390/zvector/vec_extend_to_fp32_hi.c  | 12 
 .../s390/zvector/vec_extend_to_fp32_lo.c  | 12 
 .../s390/zvector/vec_round_from_fp32.c| 12 
 14 files changed, 211 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec-nnpa-fp16-convert.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec-nnpa-fp32-convert-1.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec_convert_from_fp16.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_convert_to_fp16.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec_extend_to_fp32_hi.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec_extend_to_fp32_lo.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_round_from_fp32.c

diff --git a/gcc/config/s390/s390-builtin-types.def 
b/gcc/config/s390/s390-builtin-types.def
index a2b7d4a9a32..52ef5728539 100644
--- a/gcc/config/s390/s390-builtin-types.def
+++ b/gcc/config/s390/s390-builtin-types.def
@@ -267,6 +267,7 @@ DEF_FN_TYPE_2 (BT_FN_V2DI_V4SI_V4SI, BT_V2DI, BT_V4SI, 
BT_V4SI)
 DEF_FN_TYPE_2 (BT_FN_V4SF_FLT_INT, BT_V4SF, BT_FLT, BT_INT)
 DEF_FN_TYPE_2 (BT_FN_V4SF_V4SF_UCHAR, BT_V4SF, BT_V4SF, BT_UCHAR)
 DEF_FN_TYPE_2 (BT_FN_V4SF_V4SF_V4SF, BT_V4SF, BT_V4SF, BT_V4SF)
+DEF_FN_TYPE_2 (BT_FN_V4SF_V8HI_UINT, BT_V4SF, BT_V8HI, BT_UINT)
 DEF_FN_TYPE_2 (BT_FN_V4SI_BV4SI_V4SI, BT_V4SI, BT_BV4SI, BT_V4SI)
 DEF_FN_TYPE_2 (BT_FN_V4SI_INT_VOIDCONSTPTR, BT_V4SI, BT_INT, BT_VOIDCONSTPTR)
 DEF_FN_TYPE_2 (BT_FN_V4SI_UV4SI_UV4SI, BT_V4SI, BT_UV4SI, BT_UV4SI)
@@ -278,6 +279,7 @@ DEF_FN_TYPE_2 (BT_FN_V8HI_BV8HI_V8HI, BT_V8HI, BT_BV8HI, 
BT_V8HI)
 DEF_FN_TYPE_2 (BT_FN_V8HI_UV8HI_UV8HI, BT_V8HI, BT_UV8HI, BT_UV8HI)
 DEF_FN_TYPE_2 (BT_FN_V8HI_V16QI_V16QI, BT_V8HI, BT_V16QI, BT_V16QI)
 DEF_FN_TYPE_2 (BT_FN_V8HI_V4SI_V4SI, BT_V8HI, BT_V4SI, BT_V4SI)
+DEF_FN_TYPE_2 (BT_FN_V8HI_V8HI_UINT, BT_V8HI, BT_V8HI, BT_UINT)
 DEF_FN_TYPE_2 (BT_FN_V8HI_V8HI_V8HI, BT_V8HI, BT_V8HI, BT_V8HI)
 DEF_FN_TYPE_2 (BT_FN_VOID_UINT64PTR_UINT64, BT_VOID, BT_UINT64PTR, BT_UINT64)
 DEF_FN_TYPE_2 (BT_FN_VOID_V2DF_FLTPTR, BT_VOID, BT_V2DF, BT_FLTPTR)
@@ -345,6 +347,7 @@ DEF_FN_TYPE_3 (BT_FN_V4SI_V4SI_V4SI_V4SI, BT_V4SI, BT_V4SI, 
BT_V4SI, BT_V4SI)
 DEF_FN_TYPE_3 (BT_FN_V4SI_V8HI_V8HI_V4SI, BT_V4SI, BT_V8HI, BT_V8HI, BT_V4SI)
 DEF_FN_TYPE_3 (BT_FN_V8HI_UV8HI_UV8HI_INTPTR, BT_V8HI, BT_UV8HI, BT_UV8HI, 
BT_INTPTR)
 DEF_FN_TYPE_3 (BT_FN_V8HI_V16QI_V16QI_V8HI, BT_V8HI, BT_V16QI, BT_V16QI, 
BT_V8HI)
+DEF_FN_TYPE_3 (BT_FN_V8HI_V4SF_V4SF_UINT, BT_V8HI, BT_V4SF, BT_V4SF, BT

Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-02 Thread Alexandre Oliva
On Feb 26, 2021, Segher Boessenkool  wrote:

> On Fri, Feb 26, 2021 at 12:31:16PM -0500, David Edelsohn wrote:
>> On Fri, Feb 26, 2021 at 11:09 AM Alexandre Oliva  wrote:
>> >
>> > This patch avoids an ICE in gimplefe-28.c, in our ppc64-vxworks7r2
>> > tests.  Tested on x86_64-linux-gnu, and on the affected platform.  Ok to
>> > install?
>> 
>> I'm sort of surprised that sqrt instruction would be available for the
>> target but not enabled by default.  Is this really the method that
>> customers would use?  It seems that it might be more appropriate to
>> xfail or skip the test for PPC64 VxWorks.

> You should essentially never use -mpowerpc-gpopt, but instead use a
> -mcpu= that has it.  You also of course whould not do that in run tests
> if the cpu does not support those insns.

I think the feedback is missing the point of the obvious bug that Eric's
patch fixes.

We have a dejagnu proc that should return any target-specific options
necessary for a sqrt insn to be available:

# Return any additional options to enable square root intructions.

powerpc has an optional sqrt insn, but the option that enables it is not
returned by that proc.  That's a blatang bug in that proc.  Do you see
that, or do you have any sensible reasoning to share to support the
position that the proc is NOT buggy?


This proc is presumably only used in compile tests; it wouldn't make
sense to assume an optional sqrt insn to be available on whatever
hardware variant you happen to be using for testing.

But the bug fixed by Eric's patch is pretty blatant, and I don't think
it makes sense to reject this fix, and insist on a fix of another bug
instead.  That would just leave *this* bug in the dejagnu proc unfixed.


Now, maybe a different flag is to be returned, but -mpowerpc-gpopt is
the flag that enables sqrt with minimal disturbance of any other cpu
selections in effect, so I believe that's the option that best serves
the purpose of making the hardware sqrt insn available, regardless of
whatever would be recommended to end users.


> But, Alex, you say this avoids an ICE?  You shouldn't avoid the ICE in
> that testcase then -- instead, fix it!  (Or report it).

Orthogonal issue, IMHO.  If you restate the response as "the proposed
patch is fine as long as a bug report is on file for the error
encountered when attempting to expand . when a sqrt intrinsic is
not available", I can go along with it.  But saying "we don't want to
fix this bug, we want to fix another vaguely related bug *instead*",
will make our stances mutually incompatible, and would likely result in
my pursuing neither bug.


While at that, if -mpowerpc-gpopt is not to be used, maybe you'll want
to fix, or file a bug report, about the multiple tests
gcc.target/powerpc/pr46728-*.c, that use it explicitly, and for the very
purpose of enabling the fsqrt insn.  Several of these are execution
tests, so they fail on systems that don't offer fsqrt.  I suppose these
should mention *_sqrt_insn target requirements and add options.  I
haven't looked into how they interact, if at all.

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


[PATCH 0/4] openacc: Worker partitioning in the middle end

2021-03-02 Thread Julian Brown
This series contains updated parts of the patch series that was previously
sent upstream in November 2019:

  https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534547.html

The purpose of the series is to enable multiple workers for OpenACC
(workers being one of the dimensions of parallelism supported by the
standard) on targets such as AMD GCN. (NVPTX uses its own scheme for
supporting multiple workers, implemented mostly in the backend.)

Tested with offloading to AMD GCN and (separately) to NVPTX.

Further commentary is provided alongside individual patches. I'm posting
these patches for review now, but I don't expect to commit them until
stage 1.

Thanks,

Julian

Julian Brown (4):
  openacc: Middle-end worker-partitioning support
  openacc: Fix async bugs in several OpenACC test cases
  amdgcn: Enable OpenACC worker partitioning for AMD GCN
  openacc: Reference-typed reduction and private variable rewriting

 gcc/Makefile.in   |1 +
 gcc/config/gcn/gcn-protos.h   |2 +-
 gcc/config/gcn/gcn-tree.c |6 +-
 gcc/config/gcn/gcn.c  |   23 +-
 gcc/config/gcn/gcn.opt|5 -
 gcc/doc/tm.texi   |   10 +
 gcc/doc/tm.texi.in|4 +
 gcc/gimplify.c|  117 ++
 gcc/oacc-neuter-bcast.c   | 1471 +
 gcc/oacc-neuter-bcast.h   |   26 +
 gcc/omp-builtins.def  |8 +
 gcc/omp-low.c |   47 +-
 gcc/omp-offload.c |  159 +-
 gcc/omp-offload.h |1 +
 gcc/passes.def|2 +
 gcc/target.def|   13 +
 gcc/targhooks.h   |1 +
 .../goacc/classify-kernels-unparallelized.c   |8 +-
 .../c-c++-common/goacc/classify-kernels.c |8 +-
 .../c-c++-common/goacc/classify-parallel.c|8 +-
 .../c-c++-common/goacc/classify-routine.c |8 +-
 .../c-c++-common/goacc/classify-serial.c  |8 +-
 .../gcc.dg/goacc/loop-processing-1.c  |2 +-
 .../goacc/classify-kernels-unparallelized.f95 |8 +-
 .../gfortran.dg/goacc/classify-kernels.f95|8 +-
 .../gfortran.dg/goacc/classify-parallel.f95   |8 +-
 .../gfortran.dg/goacc/classify-routine.f95|8 +-
 .../gfortran.dg/goacc/classify-serial.f95 |8 +-
 gcc/tree-core.h   |4 +-
 gcc/tree-pass.h   |2 +
 gcc/tree.c|   11 +-
 gcc/tree.h|2 +
 libgomp/plugin/plugin-gcn.c   |4 +-
 .../libgomp.oacc-c++/privatized-ref-2.C   |   64 +
 .../libgomp.oacc-c++/privatized-ref-3.C   |   64 +
 .../libgomp.oacc-c-c++-common/deep-copy-10.c  |   14 +-
 .../loop-dim-default.c|   11 +-
 .../libgomp.oacc-c-c++-common/parallel-dims.c |   13 +-
 .../libgomp.oacc-fortran/lib-16-2.f90 |5 +
 .../testsuite/libgomp.oacc-fortran/lib-16.f90 |5 +
 .../libgomp.oacc-fortran/parallel-dims-aux.c  |9 +-
 .../libgomp.oacc-fortran/privatized-ref-1.f95 |   71 +
 42 files changed, 2112 insertions(+), 145 deletions(-)
 create mode 100644 gcc/oacc-neuter-bcast.c
 create mode 100644 gcc/oacc-neuter-bcast.h
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-2.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-3.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-1.f95

-- 
2.29.2



[PATCH 2/4] openacc: Fix async bugs in several OpenACC test cases

2021-03-02 Thread Julian Brown
Enabling worker-partitioning support in the middle end (for AMD GCN)
reveals several bugs in existing tests relating to async usage.
This patch fixes those up.

Tested with offloading to AMD GCN. OK for stage 1? (Or now?)

Julian

2021-03-02  Julian Brown  

libgomp/
* testsuite/libgomp.oacc-c-c++-common/deep-copy-10.c: Fix async
behaviour and increase number of iterations.
* testsuite/libgomp.oacc-fortran/lib-16-2.f90: Fix async behaviour.
* testsuite/libgomp.oacc-fortran/lib-16.f90: Likewise.
---
 .../libgomp.oacc-c-c++-common/deep-copy-10.c   | 14 --
 .../testsuite/libgomp.oacc-fortran/lib-16-2.f90|  5 +
 libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90  |  5 +
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-10.c 
b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-10.c
index 573a8214bf0..dadb6d37942 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-10.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-10.c
@@ -1,6 +1,8 @@
 #include 
 
-/* Test asyncronous attach and detach operation.  */
+#define ITERATIONS 1023
+
+/* Test asynchronous attach and detach operation.  */
 
 typedef struct {
   int *a;
@@ -25,13 +27,13 @@ main (int argc, char* argv[])
 
 #pragma acc enter data copyin(m)
 
-  for (int i = 0; i < 99; i++)
+  for (int i = 0; i < ITERATIONS; i++)
 {
   int j;
-#pragma acc parallel loop copy(m.a[0:N]) async(i % 2)
+#pragma acc parallel loop copy(m.a[0:N]) async(0)
   for (j = 0; j < N; j++)
m.a[j]++;
-#pragma acc parallel loop copy(m.b[0:N]) async((i + 1) % 2)
+#pragma acc parallel loop copy(m.b[0:N]) async(1)
   for (j = 0; j < N; j++)
m.b[j]++;
 }
@@ -40,9 +42,9 @@ main (int argc, char* argv[])
 
   for (i = 0; i < N; i++)
 {
-  if (m.a[i] != 99)
+  if (m.a[i] != ITERATIONS)
abort ();
-  if (m.b[i] != 99)
+  if (m.b[i] != ITERATIONS)
abort ();
 }
 
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90 
b/libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90
index ddd557d3be0..e2e47c967fa 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90
@@ -27,6 +27,9 @@ program main
 
   if (acc_is_present (h) .neqv. .TRUE.) stop 1
 
+  ! We must wait for the update to be done.
+  call acc_wait (async)
+
   h(:) = 0
 
   call acc_copyout_async (h, sizeof (h), async)
@@ -45,6 +48,8 @@ program main
   
   if (acc_is_present (h) .neqv. .TRUE.) stop 3
 
+  call acc_wait (async)
+
   do i = 1, N
 if (h(i) /= i + i) stop 4
   end do 
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 
b/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
index ccd1ce6ee18..ef9a6f6626c 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
@@ -27,6 +27,9 @@ program main
 
   if (acc_is_present (h) .neqv. .TRUE.) stop 1
 
+  ! We must wait for the update to be done.
+  call acc_wait (async)
+
   h(:) = 0
 
   call acc_copyout_async (h, sizeof (h), async)
@@ -45,6 +48,8 @@ program main
   
   if (acc_is_present (h) .neqv. .TRUE.) stop 3
 
+  call acc_wait (async)
+
   do i = 1, N
 if (h(i) /= i + i) stop 4
   end do 
-- 
2.29.2



[PATCH 1/4] openacc: Middle-end worker-partitioning support

2021-03-02 Thread Julian Brown
A version of this patch was previously posted here:

  https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534553.html

This patch implements worker-partitioning support in the middle end,
by rewriting gimple. The OpenACC execution model requires that code
can run in either "worker single" mode where only a single worker per
gang is active, or "worker partitioned" mode, where multiple workers
per gang are active. This means we need to do something equivalent
to spawning additional workers when transitioning from worker-single
to worker-partitioned mode. However, GPUs typically fix the number of
threads of invoked kernels at launch time, so we need to do something
with the "extra" threads when they are not wanted.

The scheme used is to conditionalise each basic block that executes
in "worker single" mode for worker 0 only. Conditional branches
are handled specially so "idle" (non-0) workers follow along with
worker 0. On transitioning to "worker partitioned" mode, any variables
modified by worker 0 are propagated to the other workers via GPU shared
memory. Special care is taken for routine calls, writes through pointers,
and so forth, as follows:

  - There are two types of function calls to consider in worker-single
mode: "normal" calls to maths library routines, etc. are called from
worker 0 only. OpenACC routines may contain worker-partitioned loops
themselves, so are called from all workers, including "idle" ones.

  - SSA names set in worker-single mode, but used in worker-partitioned
mode, are copied to shared memory in worker 0. Other workers retrieve
the value from the appropriate shared-memory location after a barrier,
and new phi nodes are introduced at the convergence point to resolve
the worker 0/other worker copies of the value.

  - Local scalar variables (on the stack) also need special handling. We
broadcast any variables that are written in the current worker-single
block, and that are read in any worker-partitioned block.  (This is
believed to be safe, and is flow-insensitive to ease analysis.)

  - Local aggregates (arrays and composites) on the stack are *not*
broadcast. Instead we force gimple stmts modifying elements/fields of
local aggregates into fully-partitioned mode. The RHS of the
assignment is a scalar, and is thus subject to broadcasting as above.

  - Writes through pointers may affect any local variable that has
its address taken. We use points-to analysis to determine the set
of potentially-affected variables for a given pointer indirection.
We broadcast any such variable which is used in worker-partitioned
mode, on a per-block basis for any block containing a write through
a pointer.

Some slides about the implementation (from 2018) are available at:

  https://jtb20.github.io/gcnworkers.pdf

This version of the patch includes several follow-on bug fixes by myself
and Kwok. This version also avoids moving SESE-region finding code out
of the NVPTX backend, since that code isn't used by the middle-end worker
partitioning neutering/broadcasting implementation yet.

This patch (and the rest of the series) should be applied on top of the
private variable patches posted previously here:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565925.html

Tested with offloading to AMD GCN. OK for stage 1?

Julian

2021-03-02  Julian Brown  
Nathan Sidwell  
Kwok Cheung Yeung  

gcc/
* Makefile.in (OBJS): Add oacc-neuter-bcast.o.
* doc/tm.texi.in (TARGET_GOACC_WORKER_PARTITIONING,
TARGET_GOACC_CREATE_PROPAGATION_RECORD): Add documentation hooks.
* doc/tm.texi: Regenerate.
* oacc-neuter-bcast.c: New file.
* oacc-neuter-bcast.h: New file.
* omp-builtins.def (BUILT_IN_GOACC_BARRIER, BUILT_IN_GOACC_SINGLE_START,
BUILT_IN_GOACC_SINGLE_COPY_START, BUILT_IN_GOACC_SINGLE_COPY_END): New
builtins.
* omp-offload.c (oacc-neuter-bcast.h): Include header.
(oacc_loop_xform_head_tail): Call update_stmt for modified builtin
calls.
(oacc_loop_process): Likewise.
(default_goacc_create_propagation_record): New function.
(execute_oacc_loop_designation): New.  Split out of oacc_device_lower.
(execute_oacc_gimple_workers): New.  Likewise.
(execute_oacc_device_lower): Recreate dims array.
(pass_data_oacc_loop_designation, pass_data_oacc_gimple_workers): New.
(pass_oacc_loop_designation, pass_oacc_gimple_workers): New.
(make_pass_oacc_loop_designation, make_pass_oacc_gimple_workers): New.
* omp-offload.h (oacc_fn_attrib_level): Add prototype.
* passes.def (pass_oacc_loop_designation, pass_oacc_gimple_workers):
Add passes.
* target.def (worker_partitioning, create_propagation_record): Add
target hooks.
* targhooks.h (default_goacc_create_propagation_record): Add prototype.
* tree-pass.h (make_

[PATCH 3/4] amdgcn: Enable OpenACC worker partitioning for AMD GCN

2021-03-02 Thread Julian Brown
This patch enables worker-partitioning support via gimple rewriting for
AMD GCN. Older (and currently unused) parts of this support are already
present in the AMD GCN backend: those vestigial parts are enabled or
updated, as appropriate.

I can probably self-approve this -- I will commit if/when the other
patches in the series are committed in stage 1.

Julian

2021-03-02  Julian Brown  
Kwok Cheung Yeung  

gcc/
* config/gcn/gcn-protos.h (gcn_goacc_adjust_propagation_record):
Rename prototype to...
(gcn_goacc_create_propagation_record): This.
* config/gcn/gcn-tree.c (gcn_goacc_adjust_propagation_record): Rename
function to...
(gcn_goacc_create_propagation_record): This.  Adjust comment.
* config/gcn/gcn.c (gcn_init_builtins): Override decls for
BUILT_IN_GOACC_SINGLE_START, BUILT_IN_GOACC_SINGLE_COPY_START,
BUILT_IN_GOACC_SINGLE_COPY_END and BUILT_IN_GOACC_BARRIER.
(gcn_goacc_validate_dims): Turn on worker partitioning unconditionally.
(gcn_fork_join): Update comment.
(TARGET_GOACC_ADJUST_PROPAGATION_RECORD): Rename to...
(TARGET_GOACC_CREATE_PROPAGATION_RECORD): This.
(TARGET_GOACC_WORKER_PARTITIONING): Define target hook.
* config/gcn/gcn.opt (flag_worker_partitioning): Remove.
(macc_experimental_workers): Remove unused option.

libgomp/
* plugin/plugin-gcn.c (gcn_exec): Change default number of workers to
16.
* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c (check): Skip
vector dimension test for AMD GCN.  Enable multiple workers.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Enable multiple
workers.  Update line numbers for scan tests.
* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: Support AMD GCN.
---
 gcc/config/gcn/gcn-protos.h   |  2 +-
 gcc/config/gcn/gcn-tree.c |  6 ++---
 gcc/config/gcn/gcn.c  | 23 +++
 gcc/config/gcn/gcn.opt|  5 
 libgomp/plugin/plugin-gcn.c   |  4 +---
 .../loop-dim-default.c| 11 +
 .../libgomp.oacc-c-c++-common/parallel-dims.c | 13 ---
 .../libgomp.oacc-fortran/parallel-dims-aux.c  |  9 +---
 8 files changed, 31 insertions(+), 42 deletions(-)

diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index 7ef7ae8af46..6238bdc8a96 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -38,7 +38,7 @@ extern rtx gcn_full_exec ();
 extern rtx gcn_full_exec_reg ();
 extern rtx gcn_gen_undef (machine_mode);
 extern bool gcn_global_address_p (rtx);
-extern tree gcn_goacc_adjust_propagation_record (tree record_type, bool sender,
+extern tree gcn_goacc_create_propagation_record (tree record_type, bool sender,
 const char *name);
 extern tree gcn_goacc_adjust_private_decl (tree var, int level);
 extern void gcn_goacc_reduction (gcall *call);
diff --git a/gcc/config/gcn/gcn-tree.c b/gcc/config/gcn/gcn-tree.c
index 75ea50c59dd..a457121c72b 100644
--- a/gcc/config/gcn/gcn-tree.c
+++ b/gcc/config/gcn/gcn-tree.c
@@ -548,12 +548,12 @@ gcn_goacc_reduction (gcall *call)
 }
 }
 
-/* Implement TARGET_GOACC_ADJUST_PROPAGATION_RECORD.
+/* Implement TARGET_GOACC_CREATE_PROPAGATION_RECORD.
  
-   Tweak (worker) propagation record, e.g. to put it in shared memory.  */
+   Create (worker) propagation record in shared memory.  */
 
 tree
-gcn_goacc_adjust_propagation_record (tree record_type, bool sender,
+gcn_goacc_create_propagation_record (tree record_type, bool sender,
 const char *name)
 {
   tree type = record_type;
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 1ea919bf058..fe4fa68f4ce 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -3588,8 +3588,6 @@ gcn_init_builtins (void)
   TREE_NOTHROW (gcn_builtin_decls[i]) = 1;
 }
 
-/* FIXME: remove the ifdef once OpenACC support is merged upstream.  */
-#ifdef BUILT_IN_GOACC_SINGLE_START
   /* These builtins need to take/return an LDS pointer: override the generic
  versions here.  */
 
@@ -3606,7 +3604,6 @@ gcn_init_builtins (void)
 
   set_builtin_decl (BUILT_IN_GOACC_BARRIER,
gcn_builtin_decls[GCN_BUILTIN_ACC_BARRIER], false);
-#endif
 }
 
 /* Expand the CMP_SWAP GCN builtins.  We have our own versions that do
@@ -4865,11 +4862,7 @@ gcn_goacc_validate_dims (tree decl, int dims[], int 
fn_level,
 unsigned /*used*/)
 {
   bool changed = false;
-
-  /* FIXME: remove -facc-experimental-workers when they're ready.  */
-  int max_workers = flag_worker_partitioning ? 16 : 1;
-
-  gcc_assert (!flag_worker_partitioning);
+  const int max_workers = 16;
 
   /* The vector size must appear to be 64, to the user, unless this is a
  SEQ routine.  The real, internal value is

[PATCH 4/4] openacc: Reference-typed reduction and private variable rewriting

2021-03-02 Thread Julian Brown
A version of this patch was previously posted for mainline here:

  https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534552.html

Reference-type private variables or reference-type variables used as
reduction targets do not work well with the scheme to implement worker
partitioning on AMD GCN. This patch (originally by Cesar Philippidis, but
modified somewhat since) provides support for replacing such variables
with new non-reference-typed temporary versions within partitioned
offload regions.

In more detail, the problem with reductions is as follows.  The expansion
of reduction operations (or similarly use of private variables) may
cause the bits of a reference variable (i.e. a pointer to a stack slot)
formed in worker-single mode to be broadcast to worker-partitioned mode,
and then dereferenced. Thus all workers will try to access the same
variable on worker 0's stack, which is not what was intended -- rather,
the reference in each worker should have been a pointer to a slot in
that worker's own stack.

A better solution to this problem might be to avoid trying to broadcast
pointers formed by taking the address of a stack slot somehow, but that
could prove tricky or perhaps impossible in the general case.

(I noticed during testing that Tobias has a couple of follow-up patches
to this one on the og10 branch. It might make sense to fold those into
this one too, else they'll need applying separately.)

Tested with offloading to AMD GCN (and separately to NVPTX). OK for
stage 1?

Julian

2021-03-02  Cesar Philippidis  
Julian Brown  
Kwok Cheung Yeung  

gcc/
* gimplify.c (privatize_reduction): New struct.
(localize_reductions_r, localize_reductions): New functions.
(gimplify_omp_for): Call localize_reductions.
(gimplify_omp_workshare): Likewise.
* omp-low.c (lower_oacc_reductions): Handle localized reductions.
Create fewer temp vars.
* tree-core.h (omp_clause_code): Add OMP_CLAUSE_REDUCTION_PRIVATE_DECL
documentation.
* tree.c (omp_clause_num_ops): Bump number of ops for
OMP_CLAUSE_REDUCTION to 6.
(walk_tree_1): Adjust accordingly.
* tree.h (OMP_CLAUSE_REDUCTION_PRIVATE_DECL): Add macro.

libgomp/
* testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-2.C: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-3.C: New test.
---
 gcc/gimplify.c| 117 ++
 gcc/omp-low.c |  47 +++
 gcc/tree-core.h   |   4 +-
 gcc/tree.c|  11 +-
 gcc/tree.h|   2 +
 .../libgomp.oacc-c++/privatized-ref-2.C   |  64 ++
 .../libgomp.oacc-c++/privatized-ref-3.C   |  64 ++
 .../libgomp.oacc-fortran/privatized-ref-1.f95 |  71 +++
 8 files changed, 343 insertions(+), 37 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-2.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-3.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-1.f95

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index caf25ccdd5c..e092b7be723 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -236,6 +236,11 @@ struct gimplify_omp_ctx
   int defaultmap[4];
 };
 
+struct privatize_reduction
+{
+  tree ref_var, local_var;
+};
+
 static struct gimplify_ctx *gimplify_ctxp;
 static struct gimplify_omp_ctx *gimplify_omp_ctxp;
 static bool in_omp_construct;
@@ -11381,6 +11386,95 @@ gimplify_omp_taskloop_expr (tree type, tree *tp, 
gimple_seq *pre_p,
   OMP_FOR_CLAUSES (orig_for_stmt) = c;
 }
 
+/* Helper function for localize_reductions.  Replace all uses of REF_VAR with
+   LOCAL_VAR.  */
+
+static tree
+localize_reductions_r (tree *tp, int *walk_subtrees, void *data)
+{
+  enum tree_code tc = TREE_CODE (*tp);
+  struct privatize_reduction *pr = (struct privatize_reduction *) data;
+
+  if (TYPE_P (*tp))
+*walk_subtrees = 0;
+
+  switch (tc)
+{
+case INDIRECT_REF:
+case MEM_REF:
+  if (TREE_OPERAND (*tp, 0) == pr->ref_var)
+   *tp = pr->local_var;
+
+  *walk_subtrees = 0;
+  break;
+
+case VAR_DECL:
+case PARM_DECL:
+case RESULT_DECL:
+  if (*tp == pr->ref_var)
+   *tp = pr->local_var;
+
+  *walk_subtrees = 0;
+  break;
+
+default:
+  break;
+}
+
+  return NULL_TREE;
+}
+
+/* OpenACC worker and vector loop state propagation requires reductions
+   to be inside local variables.  This function replaces all reference-type
+   reductions variables associated with the loop with a local copy.  It is
+   also used to create private copies of reduction variables for those
+   which are not associated with acc loops.  */
+
+static void
+localize_reductions (tree clauses, tree body)
+{
+  tree c, var, type, new_var;
+  struct pri

Re: [RFC] ldist: Recognize rawmemchr loop patterns

2021-03-02 Thread Richard Biener via Gcc-patches
On Sun, Feb 14, 2021 at 11:27 AM Stefan Schulze Frielinghaus
 wrote:
>
> On Tue, Feb 09, 2021 at 09:57:58AM +0100, Richard Biener wrote:
> > On Mon, Feb 8, 2021 at 3:11 PM Stefan Schulze Frielinghaus via
> > Gcc-patches  wrote:
> > >
> > > This patch adds support for recognizing loops which mimic the behaviour
> > > of function rawmemchr, and replaces those with an internal function call
> > > in case a target provides them.  In contrast to the original rawmemchr
> > > function, this patch also supports different instances where the memory
> > > pointed to and the pattern are interpreted as 8, 16, and 32 bit sized,
> > > respectively.
> > >
> > > This patch is not final and I'm looking for some feedback:
> > >
> > > Previously, only loops which mimic the behaviours of functions memset,
> > > memcpy, and memmove have been detected and replaced by corresponding
> > > function calls.  One characteristic of those loops/partitions is that
> > > they don't have a reduction.  In contrast, loops which mimic the
> > > behaviour of rawmemchr compute a result and therefore have a reduction.
> > > My current attempt is to ensure that the reduction statement is not used
> > > in any other partition and only in that case ignore the reduction and
> > > replace the loop by a function call.  We then only need to replace the
> > > reduction variable of the loop which contained the loop result by the
> > > variable of the lhs of the internal function call.  This should ensure
> > > that the transformation is correct independently of how partitions are
> > > fused/distributed in the end.  Any thoughts about this?
> >
> > Currently we're forcing reduction partitions last (and force to have a 
> > single
> > one by fusing all partitions containing a reduction) because code-generation
> > does not properly update SSA form for the reduction results.  ISTR that
> > might be just because we do not copy the LC PHI nodes or do not adjust
> > them when copying.  That might not be an issue in case you replace the
> > partition with a call.  I guess you can try to have a testcase with
> > two rawmemchr patterns and a regular loop part that has to be scheduled
> > inbetween both for correctness.
>
> Ah ok, in that case I updated my patch by removing the constraint that
> the reduction statement must be in precisely one partition.  Please find
> attached the testcases I came up so far.  Since transforming a loop into
> a rawmemchr function call is backend dependend, I planned to include
> those only in my backend patch.  I wasn't able to come up with any
> testcase where a loop is distributed into multiple partitions and where
> one is classified as a rawmemchr builtin.  The latter boils down to a
> for loop with an empty body only in which case I suspect that loop
> distribution shouldn't be done anyway.
>
> > > Furthermore, I simply added two new members (pattern, fn) to structure
> > > builtin_info which I consider rather hacky.  For the long run I thought
> > > about to split up structure builtin_info into a union where each member
> > > is a structure for a particular builtin of a partition, i.e., something
> > > like this:
> > >
> > > union builtin_info
> > > {
> > >   struct binfo_memset *memset;
> > >   struct binfo_memcpymove *memcpymove;
> > >   struct binfo_rawmemchr *rawmemchr;
> > > };
> > >
> > > Such that a structure for one builtin does not get "polluted" by a
> > > different one.  Any thoughts about this?
> >
> > Probably makes sense if the list of recognized patterns grow further.
> >
> > I see you use internal functions rather than builtin functions.  I guess
> > that's OK.  But you use new target hooks for expansion where I think
> > new optab entries similar to cmpmem would be more appropriate
> > where the distinction between 8, 16 or 32 bits can be encoded in
> > the modes.
>
> The optab implementation is really nice which allows me to use iterators
> in the backend which in the end saves me some boiler plate code compared
> to the previous implementation :)
>
> While using optabs now, I only require one additional member (pattern)
> in the builtin_info struct.  Thus I didn't want to overcomplicate things
> and kept the single struct approach as is.
>
> For the long run, should I resubmit this patch once stage 1 opens or how
> would you propose to proceed?

Yes, and sorry for the delay.  Few comments on the patch given I had a
quick look:

+void
+expand_RAWMEMCHR (internal_fn, gcall *stmt)
+{
+  expand_operand ops[3];
+
+  tree lhs = gimple_call_lhs (stmt);

I think that give we have people testing with -fno-tree-dce you
should try to handle a NULL LHS gracefully.  I suppose by
simply doing nothing.

+  tree result_old = build_fold_addr_expr (DR_REF (dr));
+  tree result_new = copy_ssa_name (result_old);

I think you simply want

   tree result = make_ssa_name (ptr_type_node);

most definitely

+  imm_use_iterator iter;
+  gimple *stmt;
+  use_operand_p use_p;
+  FOR_EACH_IMM_USE_STMT (stmt, iter, result_old)
+{

[PATCH, OG10, OpenMP, committed] Fix array members in OpenMP map clauses

2021-03-02 Thread Chung-Lin Tang

Previous patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-February/564976.html

was reverted by Catherine when I was away, due to regressions in mapping
array members. The fix appears to be a re-placement of 
finish_non_static_data_member()
inside handle_omp_array_sections().

Tested and committed to devel/omp/gcc-10, the above patch was also re-committed 
as well.

Chung-Lin
From da047f63c601118ad875d13929453094acc6c6c9 Mon Sep 17 00:00:00 2001
From: Chung-Lin Tang 
Date: Fri, 26 Feb 2021 20:13:29 +0800
Subject: [PATCH] Fix regression of array members in OpenMP map clauses.

Fixed a regression of array members not working in OpenMP map clauses after
commit bf8605f14ec33ea31233a3567f3184fee667b695.

This patch itself probably should be considered a fix for commit aadfc9843.

2021-02-26  Chung-Lin Tang  

gcc/cp/ChangeLog:

* semantics.c (handle_omp_array_sections): Adjust position of making
COMPONENT_REF from FIELD_DECL to earlier position.
---
 gcc/cp/semantics.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 370d5831091..55a5983528e 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -5386,6 +5386,8 @@ handle_omp_array_sections (tree c, enum c_omp_region_type 
ort)
}
  OMP_CLAUSE_DECL (c) = first;
  OMP_CLAUSE_SIZE (c) = size;
+ if (TREE_CODE (t) == FIELD_DECL)
+   t = finish_non_static_data_member (t, NULL_TREE, NULL_TREE);
  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP
  || (TREE_CODE (t) == COMPONENT_REF
  && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE))
@@ -5414,8 +5416,6 @@ handle_omp_array_sections (tree c, enum c_omp_region_type 
ort)
  }
  tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c),
  OMP_CLAUSE_MAP);
- if (TREE_CODE (t) == FIELD_DECL)
-   t = finish_non_static_data_member (t, NULL_TREE, NULL_TREE);
  if ((ort & C_ORT_OMP_DECLARE_SIMD) != C_ORT_OMP && ort != C_ORT_ACC)
OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_POINTER);
  else if (TREE_CODE (t) == COMPONENT_REF)
-- 
2.17.1



Re: [PATCH] Document the GCC11 change to DWARF5 default.

2021-03-02 Thread Mark Wielaard
On Fri, 2021-02-19 at 18:04 +0100, Mark Wielaard wrote:
>   * gcc-11/changes.html (General Improvements): Add a section on
>   the DWARF version 5 default.

Ping. OK to commit?

> diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
> index 6a47b0b8..9734eee8 100644
> --- a/htdocs/gcc-11/changes.html
> +++ b/htdocs/gcc-11/changes.html
> @@ -139,6 +139,37 @@ a work-in-progress.
>-fsanitize=kernel-hwaddress to instrument kernel 
> code.
>  
>
> +  
> +
> +  For targets that produce DWARF debugging information GCC now
> +  defaults to http://dwarfstd.org/doc/DWARF5.pdf";>DWARF
> +  version 5 (with the exception of VxWorks and Darwin/Mac OS X
> +  which default to version 2 and AIX which defaults to version 4).
> +  This can produce up to 25% more compact debug information
> +  compared to earlier versions.
> +
> +  To take full advantage of DWARF version 5 GCC needs to be build
> +  against binutils version 2.35.2 or higher.  When GCC is build
> +  against earlier versions of binutils GCC will still emit DWARF
> +  version 5 for most debuginfo data, but will generate version 4
> +  debug line tables (even when explicitly given -gdwarf-5).
> +
> +  The following debug information consumers can process DWARF version 5:
> +  
> +   GDB 8.0, or higher
> +   valgrind 3.17.0
> +   elfutils 0.172, or higher (for use with systemtap,
> + dwarves/pahole, perf and libabigail)
> +   dwz 0.14
> +  
> +
> +  Programs embedding libbacktrace are urged to upgrade to the version
> +  shipping with GCC 11.
> +
> +  To make GCC 11 generate an older DWARF version
> +  use -g together with -gdwarf-2,
> +  -gdwarf-3 or -gdwarf-4.
> +  
>  
>  
>  
> 


Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-02 Thread David Edelsohn via Gcc-patches
Alex,

dejagnu should not report that sqrt_insn is available on PowerPC in
confirmations when it is not.  The correct fix and the one suggested
by Richard and Segher is to test for the availability of sqrt, not to
assume that it exists in PowerPC.

The test should not explicitly add -mpowerpc-gpopt because then it is
not testing the configuration of the compiler and/or the configuration
selected by the person running dejagnu, which makes the testcase moot.

Thanks, David

On Tue, Mar 2, 2021 at 6:14 AM Alexandre Oliva  wrote:
>
> On Feb 26, 2021, Segher Boessenkool  wrote:
>
> > On Fri, Feb 26, 2021 at 12:31:16PM -0500, David Edelsohn wrote:
> >> On Fri, Feb 26, 2021 at 11:09 AM Alexandre Oliva  wrote:
> >> >
> >> > This patch avoids an ICE in gimplefe-28.c, in our ppc64-vxworks7r2
> >> > tests.  Tested on x86_64-linux-gnu, and on the affected platform.  Ok to
> >> > install?
> >>
> >> I'm sort of surprised that sqrt instruction would be available for the
> >> target but not enabled by default.  Is this really the method that
> >> customers would use?  It seems that it might be more appropriate to
> >> xfail or skip the test for PPC64 VxWorks.
>
> > You should essentially never use -mpowerpc-gpopt, but instead use a
> > -mcpu= that has it.  You also of course whould not do that in run tests
> > if the cpu does not support those insns.
>
> I think the feedback is missing the point of the obvious bug that Eric's
> patch fixes.
>
> We have a dejagnu proc that should return any target-specific options
> necessary for a sqrt insn to be available:
>
> # Return any additional options to enable square root intructions.
>
> powerpc has an optional sqrt insn, but the option that enables it is not
> returned by that proc.  That's a blatang bug in that proc.  Do you see
> that, or do you have any sensible reasoning to share to support the
> position that the proc is NOT buggy?
>
>
> This proc is presumably only used in compile tests; it wouldn't make
> sense to assume an optional sqrt insn to be available on whatever
> hardware variant you happen to be using for testing.
>
> But the bug fixed by Eric's patch is pretty blatant, and I don't think
> it makes sense to reject this fix, and insist on a fix of another bug
> instead.  That would just leave *this* bug in the dejagnu proc unfixed.
>
>
> Now, maybe a different flag is to be returned, but -mpowerpc-gpopt is
> the flag that enables sqrt with minimal disturbance of any other cpu
> selections in effect, so I believe that's the option that best serves
> the purpose of making the hardware sqrt insn available, regardless of
> whatever would be recommended to end users.
>
>
> > But, Alex, you say this avoids an ICE?  You shouldn't avoid the ICE in
> > that testcase then -- instead, fix it!  (Or report it).
>
> Orthogonal issue, IMHO.  If you restate the response as "the proposed
> patch is fine as long as a bug report is on file for the error
> encountered when attempting to expand . when a sqrt intrinsic is
> not available", I can go along with it.  But saying "we don't want to
> fix this bug, we want to fix another vaguely related bug *instead*",
> will make our stances mutually incompatible, and would likely result in
> my pursuing neither bug.
>
>
> While at that, if -mpowerpc-gpopt is not to be used, maybe you'll want
> to fix, or file a bug report, about the multiple tests
> gcc.target/powerpc/pr46728-*.c, that use it explicitly, and for the very
> purpose of enabling the fsqrt insn.  Several of these are execution
> tests, so they fail on systems that don't offer fsqrt.  I suppose these
> should mention *_sqrt_insn target requirements and add options.  I
> haven't looked into how they interact, if at all.
>
> --
> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
>Free Software Activist GNU Toolchain Engineer
> Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: [PATCH] analyzer: remove dead code

2021-03-02 Thread David Malcolm via Gcc-patches
On Tue, 2021-03-02 at 11:38 +0100, Martin Liška wrote:
> The issue is reported by Clang:
> 
> warning: private field 'm_engine' is not used [-Wunused-private-
> field]

Thanks; FWIW I think I removed the usage in
a505fad4dd4d93b6d642995d7df320aa40949568.

LGTM
Dave




Re: [PATCH] analyzer: remove dead code

2021-03-02 Thread Martin Liška

On 3/2/21 2:38 PM, David Malcolm wrote:

On Tue, 2021-03-02 at 11:38 +0100, Martin Liška wrote:

The issue is reported by Clang:

warning: private field 'm_engine' is not used [-Wunused-private-
field]


Thanks; FWIW I think I removed the usage in
a505fad4dd4d93b6d642995d7df320aa40949568.


Good! Apparently, the clang warning is quite useful.
I've just pushed the patch.

Thanks,
Martin



LGTM
Dave






Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-02 Thread Richard Sandiford via Gcc-patches
Alexandre Oliva  writes:
> On Feb 26, 2021, Segher Boessenkool  wrote:
>
>> On Fri, Feb 26, 2021 at 12:31:16PM -0500, David Edelsohn wrote:
>>> On Fri, Feb 26, 2021 at 11:09 AM Alexandre Oliva  wrote:
>>> >
>>> > This patch avoids an ICE in gimplefe-28.c, in our ppc64-vxworks7r2
>>> > tests.  Tested on x86_64-linux-gnu, and on the affected platform.  Ok to
>>> > install?
>>> 
>>> I'm sort of surprised that sqrt instruction would be available for the
>>> target but not enabled by default.  Is this really the method that
>>> customers would use?  It seems that it might be more appropriate to
>>> xfail or skip the test for PPC64 VxWorks.
>
>> You should essentially never use -mpowerpc-gpopt, but instead use a
>> -mcpu= that has it.  You also of course whould not do that in run tests
>> if the cpu does not support those insns.
>
> I think the feedback is missing the point of the obvious bug that Eric's
> patch fixes.
>
> We have a dejagnu proc that should return any target-specific options
> necessary for a sqrt insn to be available:
>
> # Return any additional options to enable square root intructions.

Agreed FWIW.  The intention of dg-add-options was to try to increase
test coverage by supporting pairs of procs, one to answer “can I enable
this feature?” and another to say “this is how to enable the feature”.
The question isn't whether sqrt is already enabled, but whether it can be.

Both proposed fixes are correct, in that they make the pairs of procs
consistent.  Changing add_options_for_sqrt_insn is more in the spirit
of how this was supposed to work though.

Thanks,
Richard


Re: PING [PATCH] avoid -Warray-bounds checks for vtable assignments (PR 98266)

2021-03-02 Thread Jason Merrill via Gcc-patches

On 3/1/21 6:11 PM, Martin Sebor wrote:

On 2/24/21 5:35 PM, Jason Merrill wrote:

On 2/23/21 6:07 PM, Martin Sebor wrote:

On 2/23/21 2:52 PM, Jason Merrill wrote:

On 2/23/21 11:02 AM, Martin Sebor wrote:

[CC Jason for any further comments/clarification]

On 2/9/21 10:49 AM, Martin Sebor wrote:

On 2/8/21 4:11 PM, Jeff Law wrote:



On 2/8/21 3:44 PM, Martin Sebor wrote:

On 2/8/21 3:26 PM, Jeff Law wrote:



On 2/8/21 2:56 PM, Martin Sebor wrote:

On 2/8/21 12:59 PM, Jeff Law wrote:



On 1/19/21 5:56 PM, Martin Sebor via Gcc-patches wrote:
Similar to the problem reported for -Wstringop-overflow in 
pr98266

and already fixed, -Warray-bounds is also susceptible to false
positives in assignments and copies involving virtual 
inheritance.
Because the two warnings don't share code yet (hopefully in 
GCC 12)

the attached patch adds its own workaround for this problem to
gimple-array-bounds.cc, this one slightly more crude because of
the limited insight the array bounds checking has into the 
checked

expressions.

Tested on x86_64-linux.

Martin

gcc-98266.diff

PR middle-end/98266 - bogus array subscript is partly 
outside array

bounds on virtual inheritance

gcc/ChangeLog:

  PR middle-end/98266
  * gimple-array-bounds.cc
(array_bounds_checker::check_array_bounds):
  Avoid checking references involving artificial members.

gcc/testsuite/ChangeLog:

  PR middle-end/98266
  * g++.dg/warn/Warray-bounds-15.C: New test.
It seems to me that we've got the full statement at some 
point and

thus
the full expression so at some point couldn't we detect when
TYPE_SIZE_UNIT!= DECL_SIZE_UNIT?  Or should we be using 
TYPE_SIZE_UNIT

rather than DECL_SIZE_UNIT in gimple-array-bounds.cc

Am I missing something?


The expression we're looking at when the false positive is issued
is the MEM_REF in the LHS of:

MEM[(struct D *)&D.2652 + 24B]._vptr.D = &MEM  [(void
*)&_ZTC1E24_1D + 24B];

TREE_TYPE(LHS) is D, DECL_SIZE_UNIT (D.2652) is 24, and
TYPE_SIZE_UNIT(D) is also 24, so there's no discrepancy between
DECL_SIZE and TYPE_SIZE.
So that seems like it's a different issue then, unrelated to 
97595.

Right?


I think the underlying problem is the same.  We're getting a size
that doesn't correspond to what's actually being accessed, and it
happens because of the virtual inheritance.  In pr97595 Jason
suggested to use the decl/type size inequality to identify this
case but I think we could have just as well used DECL_ARTIFICIAL
instead.  At least the test cases from pr97595 both pass with
this change.
But in the 98266 case the type and decl sizes are the same.  So 
to be

true that would mean that the underlying type we're using to access
memory differs from its actual type.  Is that the case in the IL? 
And
does this have wider implications for diagnostics or 
optimizations that

rely on accurate type sizing?

I'm just trying to make sure I understand, not accepting or 
rejecting

the patch yet.


The part of the IL with the MEM_REF is this:

void g ()
{
   void * D.2789;
   struct E D.2652;

    [local count: 1073741824]:
   E::E (&D.2652, "");
   f (&D.2652);

    [local count: 1073741824]:
   MEM[(struct D *)&D.2652 + 24B]._vptr.D = &MEM  
[(void *)&_ZTC1E24_1D + 24B];

   ...

The access here is to the _vptr.D pointer member of D.2652 which is
just past the end of the parent object (as reflected by its SIZE):
it sets sets up the virtual table pointer.

The access in pr97595 is to the member subobject, which, as Jason
explained (and I accordingly documented under DECL_SIZE in tree.h),
is also laid out separately from the parent object.

These cases aren't exactly the same (which is also why the test
I added for -Warray-bounds in pr97595 didn't expose this bug) but
they are closely related.  The one here can be distinguished by
DECL_ARTIFICAL.  The other by the DECL_SIZE != TYPE_SIZE member
inequality.

Might this impact other warnings?  I'd say so if they don't take
these things into account.  I just learned about this in pr97595
which was a -Wstringop-overflow false positive but I also saw
a similar instance of -Warray-bounds with my patch to improve
caching and enhance array bounds checking.  I dealt with that
instance of the warning in that patch but proactively added
a test case to the fix for pr97595.  But the test case is focused
on the subobject access and not on one to the virtual table so
(as I said above) it didn't expose this bug.

Might this also impact optimizations?  I can imagine someone
unaware of this "gotcha" making the same "naive" assumption
I did, but I'd also expect such an invalid assumption to be
found either in code review or quickly cause problems in
testing.


Jeff, does this answer your question?


I don't see how the issue here depends on the artificiality of the 
vptr; I'd expect to see the same problem with a data member.  The 
problem is that a D base subobject is smaller than a complete D 
object, and in this case the base subobject is allocated such that 
if it were a

Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-02 Thread David Edelsohn via Gcc-patches
On Tue, Mar 2, 2021 at 8:48 AM Richard Sandiford
 wrote:
>
> Alexandre Oliva  writes:
> > On Feb 26, 2021, Segher Boessenkool  wrote:
> >
> >> On Fri, Feb 26, 2021 at 12:31:16PM -0500, David Edelsohn wrote:
> >>> On Fri, Feb 26, 2021 at 11:09 AM Alexandre Oliva  
> >>> wrote:
> >>> >
> >>> > This patch avoids an ICE in gimplefe-28.c, in our ppc64-vxworks7r2
> >>> > tests.  Tested on x86_64-linux-gnu, and on the affected platform.  Ok to
> >>> > install?
> >>>
> >>> I'm sort of surprised that sqrt instruction would be available for the
> >>> target but not enabled by default.  Is this really the method that
> >>> customers would use?  It seems that it might be more appropriate to
> >>> xfail or skip the test for PPC64 VxWorks.
> >
> >> You should essentially never use -mpowerpc-gpopt, but instead use a
> >> -mcpu= that has it.  You also of course whould not do that in run tests
> >> if the cpu does not support those insns.
> >
> > I think the feedback is missing the point of the obvious bug that Eric's
> > patch fixes.
> >
> > We have a dejagnu proc that should return any target-specific options
> > necessary for a sqrt insn to be available:
> >
> > # Return any additional options to enable square root intructions.
>
> Agreed FWIW.  The intention of dg-add-options was to try to increase
> test coverage by supporting pairs of procs, one to answer “can I enable
> this feature?” and another to say “this is how to enable the feature”.
> The question isn't whether sqrt is already enabled, but whether it can be.
>
> Both proposed fixes are correct, in that they make the pairs of procs
> consistent.  Changing add_options_for_sqrt_insn is more in the spirit
> of how this was supposed to work though.

The procs are used in more than that one test.  Maybe that one test
was intended as a pair, but the use of the procs is more general than
"does the target support sqrt instruction and the complementary
options to enable the instruction".

Thanks, David


[commit][OG10] nvptx: remove erroneous stack deletion

2021-03-02 Thread Andrew Stubbs


This patch fixes an OpenMP performance issue on NVPTX.

The problem is that it deallocates the stack memory when it shouldn't, 
forcing the GOMP_OFFLOAD_run function to allocate the stack space again, 
before every kernel launch.


The memory is only meant to be deallocated when a data allocation fails, 
in the hope that memory can be reallocated more efficiently, but there's 
an additional, unconditional deallocate that looks like it may have been 
vestigial debug code, or something.


Fixing the issue gives a 3x speed-up running the BabelStream benchmark.

Andrew
nvptx: remove erroneous stack deletion

The stacks are not supposed to be deleted every time memory is allocated, only
when there is insufficient memory.  The unconditional call here seems to be in
error, and is causing a costly reallocation of the stacks before every launch.

libgomp/

	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_alloc): Remove early call to
	nvptx_stacks_free.

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 942fb989bac..21db2bd29c8 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1411,8 +1411,6 @@ GOMP_OFFLOAD_alloc (int ord, size_t size)
   ptx_dev->free_blocks = NULL;
   pthread_mutex_unlock (&ptx_dev->free_blocks_lock);
 
-  nvptx_stacks_free (ptx_dev, false);
-
   while (blocks)
 {
   tmp = blocks->next;


Re: [PATCH] Document the GCC11 change to DWARF5 default.

2021-03-02 Thread Jeff Law via Gcc-patches



On 3/2/21 5:56 AM, Mark Wielaard wrote:
> On Fri, 2021-02-19 at 18:04 +0100, Mark Wielaard wrote:
>>  * gcc-11/changes.html (General Improvements): Add a section on
>>  the DWARF version 5 default.
> Ping. OK to commit?
OK.  Sorry for the delay.
jeff



Re: [arm/testsuite]: Skip pr97969.c if -mthumb is not compatible [PR target/97969]

2021-03-02 Thread Richard Earnshaw via Gcc-patches
On 01/03/2021 15:26, Christophe Lyon via Gcc-patches wrote:
> Ping?
> 
> On Wed, 3 Feb 2021 at 10:01, Christophe Lyon  
> wrote:
>>
>> Ping?
>> I guess that's obvious enough?
>>
>> On Wed, 27 Jan 2021 at 10:03, Christophe Lyon
>>  wrote:
>>>
>>> Depending on how the toolchain is configured or how the testsuite is
>>> executed, -mthumb may not be compatible. Like for other tests, skip
>>> pr97969.c in this case.
>>>
>>> For instance arm-linux-gnueabihf and -march=armv5t in RUNTESTFLAGS.
>>>
>>> 2021-01-27  Christophe Lyon  
>>>
>>> gcc/testsuite/
>>> PR target/97969
>>> * gcc.target/arm/pr97969.c: Skip if thumb mode is not available.
>>>
>>> diff --git a/gcc/testsuite/gcc.target/arm/pr97969.c
>>> b/gcc/testsuite/gcc.target/arm/pr97969.c
>>> index 714a1d1..0b5d07f 100644
>>> --- a/gcc/testsuite/gcc.target/arm/pr97969.c
>>> +++ b/gcc/testsuite/gcc.target/arm/pr97969.c
>>> @@ -1,4 +1,5 @@
>>>  /* { dg-do compile } */
>>> +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
>>>  /* { dg-options "-std=c99 -fno-omit-frame-pointer -mthumb -w -Os" } */
>>>
>>>  typedef a[23];

I'm working on a patch to make this sort of change unnecessary (I hope).
 Just running some final checks.

R.


Re: [arm/testsuite]: Skip pr97969.c if -mthumb is not compatible [PR target/97969]

2021-03-02 Thread Richard Earnshaw via Gcc-patches
On 02/03/2021 16:19, Richard Earnshaw via Gcc-patches wrote:
> On 01/03/2021 15:26, Christophe Lyon via Gcc-patches wrote:
>> Ping?
>>
>> On Wed, 3 Feb 2021 at 10:01, Christophe Lyon  
>> wrote:
>>>
>>> Ping?
>>> I guess that's obvious enough?
>>>
>>> On Wed, 27 Jan 2021 at 10:03, Christophe Lyon
>>>  wrote:

 Depending on how the toolchain is configured or how the testsuite is
 executed, -mthumb may not be compatible. Like for other tests, skip
 pr97969.c in this case.

 For instance arm-linux-gnueabihf and -march=armv5t in RUNTESTFLAGS.

 2021-01-27  Christophe Lyon  

 gcc/testsuite/
 PR target/97969
 * gcc.target/arm/pr97969.c: Skip if thumb mode is not available.

 diff --git a/gcc/testsuite/gcc.target/arm/pr97969.c
 b/gcc/testsuite/gcc.target/arm/pr97969.c
 index 714a1d1..0b5d07f 100644
 --- a/gcc/testsuite/gcc.target/arm/pr97969.c
 +++ b/gcc/testsuite/gcc.target/arm/pr97969.c
 @@ -1,4 +1,5 @@
  /* { dg-do compile } */
 +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
  /* { dg-options "-std=c99 -fno-omit-frame-pointer -mthumb -w -Os" } */

  typedef a[23];
> 
> I'm working on a patch to make this sort of change unnecessary (I hope).
>  Just running some final checks.
> 
> R.
> 

Ah, wait.  This one already has an explicit -mthumb, so my patch won't
affect this.  But why is -mthumb needed for this test anyway?  It's just
a compilation test, so why not drop that and we'll generally get better
coverage all round.

R.


Re: [PATCH 5/6] c++: Clean up normalization / satisfaction routines

2021-03-02 Thread Patrick Palka via Gcc-patches
On Mon, 1 Mar 2021, Jason Merrill wrote:

> On 2/28/21 12:58 PM, Patrick Palka wrote:
> > This patch mostly performs some straightforward refactoring:
> > 
> >- Renamed satisfy_constraint to satisfy_normalized_constraints
> >- Renamed the three-parameter version of satisfy_constraint_expression
> >  to satisfy_nondeclaration_constraints
> >- Removed normalize_(non)?template_requirements
> >- Removed satisfy_associated_constraints (and made its callers
> >  check for dependent template args sooner, before normalization)
> >- Removed the tsubst_flags_t parameter of evaluate_concept_check
> >- Combined the two versions of constraint_satisfaction_value
> >- Combined the two versions of constraint_satisfied_p
> > 
> > Additionally, this patch removes the handling of bare
> > constraint-expressions from satisfy_nondeclaration_constraints, and
> > hence constraints_satisfied_p and constraint_satisfaction_value now only
> > take things that carry their own template information needed for
> > normalization.  In practice, this only means it's no longer possible to
> > evaluate bare REQUIRES_EXPRs via the satisfaction routines, and so this
> > patch adjusts the affected callers to instead use tsubst_requires_expr.
> 
> It's probably better to have a different entry point than tsubst_requires_expr
> for callers that have nothing to do with templates.  Whether that's a one-line
> wrapper for the call to tsubst_requires_expr ("evaluate_requires_expr"?) or
> calling tsubst_requires_expr from satisfy_nondeclaration_constraints, I don't
> have an opinion either way.
> 
> > For convenience, the function diagnose_constraints continues to accept
> > REQUIRES_EXPRs, but it now handles them by calling diagnose_require_expr
> > directly.
> 
> This might argue for the latter choice above.

Ah, I didn't consider the option of continuing to handle REQUIRES_EXPRs
from satisfy_nondeclaration_constraints by evaluating them directly.

Here's a patch that implements this.  I opted to add a
evaluate_requires_expr wrapper as well, since

  evaluate_requires_expr (t)

is much more readable than

  constant_boolean_node (constraints_satisfied_p (t),
 boolean_type_node)

(Note the added TODO in satisfy_nondeclaration_constraints, which'll be
fixed in a new version of the REQUIRES_EXPR evaluation/diagnostic
unification patch.)

-- >8 --

Subject: [PATCH] c++: Clean up normalization / satisfaction routines

This patch mostly performs some straightforward refactoring:

  - Renamed satisfy_constraint to satisfy_normalized_constraints
  - Renamed the three-parameter version of satisfy_constraint_expression
to satisfy_nondeclaration_constraints
  - Removed normalize_(non)?template_requirements
  - Removed satisfy_associated_constraints (and made its callers
check for dependent template args sooner, before normalization)
  - Removed the tsubst_flags_t parameter of evaluate_concept_check
  - Combined the two versions of constraint_satisfaction_value
  - Combined the two versions of constraint_satisfied_p

Additionally, this patch removes the handling of general
constraint-expressions from satisfy_nondeclaration_constraints, and
hence constraints_satisfied_p and constraint_satisfaction_value now take
only things that carry their own template information needed for
normalization, and, as a special case, REQUIRES_EXPRs.  But the latter
now get evaluated directly via tsubst_requires_expr rather than going
through satisfaction.

(That we used to evaluate REQUIRES_EXPR via satisfaction might even be a
correctness issue: since we cache satisfaction in special ways that don't
apply to regular evaluation, going through satisfaction could in theory
cause us to reuse a cached value for a REQUIRES_EXPR when we shouldn't
have.)

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_call_expression): Adjust call to
evaluate_concept_check.
(cxx_eval_constant_expression) : Use
evaluate_requires_expression instead of
satisfy_constraint_expression.
: Adjust call to evaluate_concept_check.
* constraint.cc (struct sat_info): Adjust comment about which
satisfaction entrypoints use noisy-unsat.
(normalize_template_requirements): Remove (and adjust callers
appropriately).
(normalize_nontemplate_requirements): Likewise.
(tsubst_nested_requirement): Use constraint_satisfaction_value
instead of satisfy_constraint_expression, which'll do the
noisy replaying of ill-formed quiet satisfaction for us.
(decl_satisfied_cache): Adjust comment.
(satisfy_constraint): Rename to ...
(satisfy_normalized_constraints): ... this.
(satisfy_associated_constraints): Remove (and make its
callers check for dependent arguments).
(satisfy_constraint_expression): Rename to ...
(satisfy_nondeclaration_constraints): ... this.  Assert that
'args' is empty when 't' is a con

Re: [PATCH 6/6] c++: Consolidate REQUIRES_EXPR evaluation/diagnostic routines

2021-03-02 Thread Patrick Palka via Gcc-patches
On Mon, 1 Mar 2021, Jason Merrill wrote:

> On 2/28/21 12:59 PM, Patrick Palka wrote:
> > This folds the diagnose_requires_expr routines into the corresponding
> > tsubst_requires_expr ones.  This is achieved by making the latter
> > routines take a sat_info instead of a subst_info, and assigning the
> > appropriate meanings to the flags sat_info::noisy and
> > sat_info::diagnose_unsatisfaction_p during tsubst_requires_expr:
> > info.noisy() controls whether to diagnose invalid types and expressions
> > inside the requires-expression, and info.diagnose_unsatisfaction_p()
> > controls whether to diagnose why the requires-expression evaluates to
> > false.
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constraint.cc (struct sat_info): Document the different
> > meanings of noisy() and diagnose_unsatisfaction_p() during
> > satisfaction and requires-expression evaluation.
> > (tsubst_valid_expression_requirement): Take a sat_info instead
> > of a subst_info.  Perform the substitution quietly first.  Fold
> > in error-replaying code from diagnose_valid_expression.
> > (tsubst_simple_requirement): Take a sat_info instead of a
> > subst_info.
> > (tsubst_type_requirement_1): New.  Fold in error-replaying code
> > from diagnose_valid_type.
> > (tsubst_type_requirement): Use the above.  Take a sat_info
> > instead of a subst_info.
> > (tsubst_compound_requirement): Likewise.  Fold in
> > error-replaying code from diagnose_compound_requirement.
> > (tsubst_nested_requirement): Take a sat_info instead of a
> > subst_info.  Fold in error-replaying code from
> > diagnose_nested_requirement.
> > (tsubst_requirement): Take a sat_info instead of a subst_info.
> > (tsubst_requires_expr): Split into two versions, one that takes
> > a sat_info argument and another that takes a complain and
> > in_decl argument.  Remove outdated documentation.  Document the
> > effects of the sat_info argument.
> > (diagnose_trait_expr): Make static.  Take a template argument
> > vector instead of a parameter mapping.
> > (diagnose_valid_expression): Remove.
> > (diagnose_valid_type): Remove.
> > (diagnose_simple_requirement): Remove.
> > (diagnose_compound_requirement): Remove.
> > (diagnose_type_requirement): Remove.
> > (diagnose_nested_requirement): Remove.
> > (diagnose_requirement): Remove.
> > (diagnose_requires_expr): Remove.
> > (diagnose_atomic_constraint): Take a sat_info instead of a
> > subst_info.  Adjust call to diagnose_trait_expr.  Call
> > tsubst_requires_expr instead of diagnose_requires_expr.
> > (diagnose_constraints): Call tsubst_requires_expr instead of
> > diagnose_requires_expr.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/concepts/diagnostic1.C: Adjust expected diagnostics
> > now that we diagnose only the first failed requirement of a
> > requires-expression.
> > ---
> >   gcc/cp/constraint.cc| 416 +---
> >   gcc/testsuite/g++.dg/concepts/diagnostic1.C |   2 +-
> >   2 files changed, 179 insertions(+), 239 deletions(-)
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index cf319b34da0..31f32c25dfe 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -100,17 +100,30 @@ struct subst_info
> > /* Provides additional context for satisfaction.
> >   -   The flag noisy() controls whether to diagnose ill-formed satisfaction,
> > -   such as the satisfaction value of an atom being non-bool or
> > non-constant.
> > -
> > -   The flag diagnose_unsatisfaction_p() controls whether to explain why
> > -   a constraint is not satisfied.
> > -
> > -   The entrypoints to satisfaction for which we set noisy+unsat are
> > -   diagnose_constraints and diagnose_nested_requirement.  The entrypoint
> > for
> > -   which we set noisy-unsat is the replay inside
> > constraint_satisfaction_value.
> > -   From constraints_satisfied_p, we enter satisfaction quietly (both flags
> > -   cleared).  */
> > +   During satisfaction:
> > +- The flag noisy() controls whether to diagnose ill-formed
> > satisfaction,
> > +  such as the satisfaction value of an atom being non-bool or
> > non-constant.
> > +- The flag diagnose_unsatisfaction_p() controls whether to explain why
> > +  a constraint is not satisfied.
> > +- We enter satisfaction with noisy+unsat from diagnose_constraints.
> > +- We enter satisfaction with noisy-unsat from the replay inside
> > +  constraint_satisfaction_value.
> > +- We enter satisfaction quietly (both flags cleared) from
> > +  constraints_satisfied_p.
> > +
> > +   During evaluation of a requires-expression:
> > +- The flag noisy() controls whether to diagnose ill-formed types and
> > +  expressions inside its requirements.
> > +- The flag diagnose_unsatisfaction_p() controls whether to explain why
> > +  the requires-expression e

Re: [PATCH] dwarf2out: Fix up split-dwarf .debug_macro handling [PR99319]

2021-03-02 Thread Jeff Law via Gcc-patches



On 3/2/21 2:06 AM, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> The -gsplit-dwarf changes came a few months after .debug_macro
> and the r0-120109 changes just changed the 2nd operand of
> DW_MACRO_GNU_{define,undef}_indirect from the usual .debug_str
> section offset argument to leb128 index into .debug_str_offsets
> without changing the opcodes.
>
> DWARF5 standardized different opcodes for those, but GCC hasn't been changed
> yet for that.
> This patch starts using DW_MACRO_define_strx and DW_MACRO_undef_strx
> instead of DW_MACRO_define_strp and DW_MACRO_undef_strp when -gsplit-dwarf
> -gdwarf-5 -g3.  I'm not sure what to do if anything with the -gdwarf-4
> -gsplit-dwarf -g3 -gno-strict-dwarf case, we've been emitting it that way
> for 8 years and it is an extension, so presumably the consumers that cared
> have already hacks to handle DW_MACRO_GNU_{define,undef}_indirect
> differently in .debug_macro 4 sections depending on if it is
> .debug_macro.dwo or .debug_macro.
> Another change the patch does is that it will use
> DW_MACRO_{define,undef}_str{p,x} even with -gdwarf-5 -gstrict-dwarf -g3,
> for DWARF 4 we were doing that only for -gno-strict-dwarf as we've emitted
> .debug_macro section only in that case.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-03-02  Jakub Jelinek  
>
>   PR debug/99319
>   * dwarf2out.c (output_macinfo_op): Use DW_MACRO_*_str* even with
>   -gdwarf-5 -gstrict-dwarf.  For -gsplit-dwarf -gdwarf-5 use
>   DW_MACRO_*_strx instead of DW_MACRO_*_strp.  Handle
>   DW_MACRO_define_strx and DW_MACRO_undef_strx.
>   (save_macinfo_strings): Use DW_MACRO_*_str* even with
>   -gdwarf-5 -gstrict-dwarf.  Handle DW_MACRO_define_strx and
>   DW_MACRO_undef_strx.
OK
jeff



Re: [PATCH] middle-end/97855 - fix diagnostic with default pretty printer

2021-03-02 Thread Jeff Law via Gcc-patches



On 3/1/21 1:39 AM, Richard Biener wrote:
> The default diagnostic tree printer relies on dump_generic_node which
> for some reason manages to clobber the diagnostic pretty-printer state
> so we see garbled diagnostics like
>
> /home/rguenther/src/trunk/gcc/calls.c: In function 'expand_call':
> D.6750.coeffs[0]'/home/rguenther/src/trunk/gcc/dojump.c:118:28: warning:  may 
> be used uninitialized in this function [-Wmaybe-uninitialized]
>
> when the diagnostic is emitted by the LTO fronted.  The following
> approach using a temporary pretty-printer for the dump_generic_node
> call fixes this for some unknown reason and we issue
>
> /home/rguenther/src/trunk/gcc/calls.c: In function 'expand_call':
> /home/rguenther/src/trunk/gcc/dojump.c:118:28: warning: 'MEM[(struct poly_int 
> *)&save].D.6750.coeffs[0]' may be used uninitialized in this function 
> [-Wmaybe-uninitialized]
>
> [LTO] Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?
>
> Thanks,
> Richard.
>
> 2021-02-26  Richard Biener  
>
>   PR middle-end/97855
>   * tree-diagnostic.c (default_tree_printer): Use a temporary
>   pretty-printer when formatting a tree via dump_generic_node.
It'd be good to know why this helps, but I trust your judgment that this
is an improvement.

OK

jeff



[Ada] Fix PR ada/99095

2021-03-02 Thread Eric Botcazou
This is a regression present on the mainline and 10 branch, where we fail
to make the bounds explicit for the return value of a function returning
an unconstrained array of a limited record type.

Tested on x86-64/Linux, applied on mainline and 10 branch.


2021-03-02  Eric Botcazou  

PR ada/99095
* sem_ch8.adb (Check_Constrained_Object): Restrict again the special
optimization for limited types to non-array types except in the case
of an extended return statement.


2021-03-02  Eric Botcazou  

* gnat.dg/limited5.adb: New test.

-- 
Eric Botcazoudiff --git a/gcc/ada/sem_ch8.adb b/gcc/ada/sem_ch8.adb
index 4689ae4ba18..efff7145337 100644
--- a/gcc/ada/sem_ch8.adb
+++ b/gcc/ada/sem_ch8.adb
@@ -830,11 +830,19 @@ package body Sem_Ch8 is
 --  that are used in iterators. This is an optimization, but it
 --  also prevents typing anomalies when the prefix is further
 --  expanded.
+
 --  Note that we cannot just use the Is_Limited_Record flag because
 --  it does not apply to records with limited components, for which
 --  this syntactic flag is not set, but whose size is also fixed.
 
-elsif Is_Limited_Type (Typ) then
+--  Note also that we need to build the constrained subtype for an
+--  array in order to make the bounds explicit in most cases, but
+--  not if the object comes from an extended return statement, as
+--  this would create dangling references to them later on.
+
+elsif Is_Limited_Type (Typ)
+  and then (not Is_Array_Type (Typ) or else Is_Return_Object (Id))
+then
null;
 
 else
--  { dg-do compile }

procedure Limited5 is

   type Command   is limited null record;
   type Command_Array is array (Positive range <>) of Command;
  
   function To_Commands return Command_Array is
   begin
  return Result : Command_Array (1 .. 2);
   end To_Commands;
   
   The_Commands : aliased Command_Array := To_Commands;

begin
   null;
end;


Re: [PATCH] C++: target attribute - local decl

2021-03-02 Thread Jason Merrill via Gcc-patches

On 3/2/21 5:34 AM, Martin Liška wrote:

On 3/1/21 8:58 PM, Jason Merrill wrote:

On 3/1/21 11:59 AM, Martin Liška wrote:

On 3/1/21 5:36 PM, Jason Merrill wrote:

On 3/1/21 7:43 AM, Martin Liška wrote:

On 2/22/21 11:53 PM, Jason Merrill wrote:
The problem seems to be with the handling of local decls.  If 
DECL_LOCAL_DECL_P, you need to look at DECL_LOCAL_DECL_ALIAS to 
find the namespace-scope decl.  But then if there is no preceding 
namespace-scope declaration, the new decl created by 
push_local_extern_decl_alias doesn't have a cgraph node, either.  
I guess maybe_function_versions 
also needs to look through DECL_LOCAL_DECL_ALIAS.


Ah, I see. Are you sure about the name 'maybe_function_versions'? I 
can't find it.


Ah, it's maybe_version_functions, sorry.


Thanks, I see the function now.
So about your guess:

I guess maybe_function_versions also needs to look through 
DECL_LOCAL_DECL_ALIAS.


Do you mean maybe_version_functions's argument 'record' should depend 
on DECL_LOCAL_DECL_ALIAS of newdecl/oldddecl
(if present)? Or that DECL_FUNCTION_VERSIONED should be set for 
DECL_LOCAL_DECL_ALIASes of the newdecl/olddecl

function declarations?


The latter.


I see, but will not help us. Problem is that
#2  0x015d8899 in ix86_get_function_versions_dispatcher 
(decl=0x77755000) at 
/home/marxin/Programming/gcc/gcc/config/i386/i386-features.c:2862


is called for a declaration for which

Breakpoint 5, maybe_version_functions (newdecl=0x77755000 f>, olddecl=, 
record=false) at /home/marxin/Programming/gcc/gcc/cp/decl.c:1118


is called with record=false. So that cgraph_node is not created for it.

Or is a possible solution that get_function_version_dispatcher should 
look through the DECL_LOCAL_DECL_ALIAS?


Yes, I was thinking both that function and maybe_version_functions.

Jason



Re: [arm/testsuite]: Skip pr97969.c if -mthumb is not compatible [PR target/97969]

2021-03-02 Thread Christophe Lyon via Gcc-patches
On Tue, 2 Mar 2021 at 17:25, Richard Earnshaw
 wrote:
>
> On 02/03/2021 16:19, Richard Earnshaw via Gcc-patches wrote:
> > On 01/03/2021 15:26, Christophe Lyon via Gcc-patches wrote:
> >> Ping?
> >>
> >> On Wed, 3 Feb 2021 at 10:01, Christophe Lyon  
> >> wrote:
> >>>
> >>> Ping?
> >>> I guess that's obvious enough?
> >>>
> >>> On Wed, 27 Jan 2021 at 10:03, Christophe Lyon
> >>>  wrote:
> 
>  Depending on how the toolchain is configured or how the testsuite is
>  executed, -mthumb may not be compatible. Like for other tests, skip
>  pr97969.c in this case.
> 
>  For instance arm-linux-gnueabihf and -march=armv5t in RUNTESTFLAGS.
> 
>  2021-01-27  Christophe Lyon  
> 
>  gcc/testsuite/
>  PR target/97969
>  * gcc.target/arm/pr97969.c: Skip if thumb mode is not available.
> 
>  diff --git a/gcc/testsuite/gcc.target/arm/pr97969.c
>  b/gcc/testsuite/gcc.target/arm/pr97969.c
>  index 714a1d1..0b5d07f 100644
>  --- a/gcc/testsuite/gcc.target/arm/pr97969.c
>  +++ b/gcc/testsuite/gcc.target/arm/pr97969.c
>  @@ -1,4 +1,5 @@
>   /* { dg-do compile } */
>  +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
>   /* { dg-options "-std=c99 -fno-omit-frame-pointer -mthumb -w -Os" } */
> 
>   typedef a[23];
> >
> > I'm working on a patch to make this sort of change unnecessary (I hope).
> >  Just running some final checks.
> >
> > R.
> >
>
> Ah, wait.  This one already has an explicit -mthumb, so my patch won't
> affect this.  But why is -mthumb needed for this test anyway?  It's just
> a compilation test, so why not drop that and we'll generally get better
> coverage all round.
>

For instance I see the test fail for target arm-none-linux-gnueabihf
--with-mode arm --with-cpu cortex-a9 --with-fpu vfp
and running the tests with -march=armv5t

We get the famous thumb-1 + hard-float ABI not supported.

I guess -mthumb is inherited from the bug report?

Christophe


Re: [arm/testsuite]: Skip pr97969.c if -mthumb is not compatible [PR target/97969]

2021-03-02 Thread Richard Earnshaw via Gcc-patches
On 02/03/2021 18:10, Christophe Lyon wrote:
> On Tue, 2 Mar 2021 at 17:25, Richard Earnshaw
>  wrote:
>>
>> On 02/03/2021 16:19, Richard Earnshaw via Gcc-patches wrote:
>>> On 01/03/2021 15:26, Christophe Lyon via Gcc-patches wrote:
 Ping?

 On Wed, 3 Feb 2021 at 10:01, Christophe Lyon  
 wrote:
>
> Ping?
> I guess that's obvious enough?
>
> On Wed, 27 Jan 2021 at 10:03, Christophe Lyon
>  wrote:
>>
>> Depending on how the toolchain is configured or how the testsuite is
>> executed, -mthumb may not be compatible. Like for other tests, skip
>> pr97969.c in this case.
>>
>> For instance arm-linux-gnueabihf and -march=armv5t in RUNTESTFLAGS.
>>
>> 2021-01-27  Christophe Lyon  
>>
>> gcc/testsuite/
>> PR target/97969
>> * gcc.target/arm/pr97969.c: Skip if thumb mode is not available.
>>
>> diff --git a/gcc/testsuite/gcc.target/arm/pr97969.c
>> b/gcc/testsuite/gcc.target/arm/pr97969.c
>> index 714a1d1..0b5d07f 100644
>> --- a/gcc/testsuite/gcc.target/arm/pr97969.c
>> +++ b/gcc/testsuite/gcc.target/arm/pr97969.c
>> @@ -1,4 +1,5 @@
>>  /* { dg-do compile } */
>> +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
>>  /* { dg-options "-std=c99 -fno-omit-frame-pointer -mthumb -w -Os" } */
>>
>>  typedef a[23];
>>>
>>> I'm working on a patch to make this sort of change unnecessary (I hope).
>>>  Just running some final checks.
>>>
>>> R.
>>>
>>
>> Ah, wait.  This one already has an explicit -mthumb, so my patch won't
>> affect this.  But why is -mthumb needed for this test anyway?  It's just
>> a compilation test, so why not drop that and we'll generally get better
>> coverage all round.
>>
> 
> For instance I see the test fail for target arm-none-linux-gnueabihf
> --with-mode arm --with-cpu cortex-a9 --with-fpu vfp
> and running the tests with -march=armv5t
> 
> We get the famous thumb-1 + hard-float ABI not supported.
> 
> I guess -mthumb is inherited from the bug report?
> 
> Christophe
> 

dropping the -mthumb should fix that though?

In fact, I'd drop -Os as well, it's not needed as -Os is just one of the
many options that are used to build this test already.

R.


Re: [arm/testsuite]: Skip pr97969.c if -mthumb is not compatible [PR target/97969]

2021-03-02 Thread Richard Earnshaw via Gcc-patches
On 02/03/2021 18:14, Richard Earnshaw via Gcc-patches wrote:
> On 02/03/2021 18:10, Christophe Lyon wrote:
>> On Tue, 2 Mar 2021 at 17:25, Richard Earnshaw
>>  wrote:
>>>
>>> On 02/03/2021 16:19, Richard Earnshaw via Gcc-patches wrote:
 On 01/03/2021 15:26, Christophe Lyon via Gcc-patches wrote:
> Ping?
>
> On Wed, 3 Feb 2021 at 10:01, Christophe Lyon  
> wrote:
>>
>> Ping?
>> I guess that's obvious enough?
>>
>> On Wed, 27 Jan 2021 at 10:03, Christophe Lyon
>>  wrote:
>>>
>>> Depending on how the toolchain is configured or how the testsuite is
>>> executed, -mthumb may not be compatible. Like for other tests, skip
>>> pr97969.c in this case.
>>>
>>> For instance arm-linux-gnueabihf and -march=armv5t in RUNTESTFLAGS.
>>>
>>> 2021-01-27  Christophe Lyon  
>>>
>>> gcc/testsuite/
>>> PR target/97969
>>> * gcc.target/arm/pr97969.c: Skip if thumb mode is not available.
>>>
>>> diff --git a/gcc/testsuite/gcc.target/arm/pr97969.c
>>> b/gcc/testsuite/gcc.target/arm/pr97969.c
>>> index 714a1d1..0b5d07f 100644
>>> --- a/gcc/testsuite/gcc.target/arm/pr97969.c
>>> +++ b/gcc/testsuite/gcc.target/arm/pr97969.c
>>> @@ -1,4 +1,5 @@
>>>  /* { dg-do compile } */
>>> +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
>>>  /* { dg-options "-std=c99 -fno-omit-frame-pointer -mthumb -w -Os" } */
>>>
>>>  typedef a[23];

 I'm working on a patch to make this sort of change unnecessary (I hope).
  Just running some final checks.

 R.

>>>
>>> Ah, wait.  This one already has an explicit -mthumb, so my patch won't
>>> affect this.  But why is -mthumb needed for this test anyway?  It's just
>>> a compilation test, so why not drop that and we'll generally get better
>>> coverage all round.
>>>
>>
>> For instance I see the test fail for target arm-none-linux-gnueabihf
>> --with-mode arm --with-cpu cortex-a9 --with-fpu vfp
>> and running the tests with -march=armv5t
>>
>> We get the famous thumb-1 + hard-float ABI not supported.
>>
>> I guess -mthumb is inherited from the bug report?
>>
>> Christophe
>>
> 
> dropping the -mthumb should fix that though?
> 
> In fact, I'd drop -Os as well, it's not needed as -Os is just one of the
> many options that are used to build this test already.
> 
> R.
> 

But maybe then we need to change dg-options into dg-add-options.

R.


Re: [arm/testsuite]: Skip pr97969.c if -mthumb is not compatible [PR target/97969]

2021-03-02 Thread Christophe Lyon via Gcc-patches
On Tue, 2 Mar 2021 at 19:18, Richard Earnshaw
 wrote:
>
> On 02/03/2021 18:14, Richard Earnshaw via Gcc-patches wrote:
> > On 02/03/2021 18:10, Christophe Lyon wrote:
> >> On Tue, 2 Mar 2021 at 17:25, Richard Earnshaw
> >>  wrote:
> >>>
> >>> On 02/03/2021 16:19, Richard Earnshaw via Gcc-patches wrote:
>  On 01/03/2021 15:26, Christophe Lyon via Gcc-patches wrote:
> > Ping?
> >
> > On Wed, 3 Feb 2021 at 10:01, Christophe Lyon 
> >  wrote:
> >>
> >> Ping?
> >> I guess that's obvious enough?
> >>
> >> On Wed, 27 Jan 2021 at 10:03, Christophe Lyon
> >>  wrote:
> >>>
> >>> Depending on how the toolchain is configured or how the testsuite is
> >>> executed, -mthumb may not be compatible. Like for other tests, skip
> >>> pr97969.c in this case.
> >>>
> >>> For instance arm-linux-gnueabihf and -march=armv5t in RUNTESTFLAGS.
> >>>
> >>> 2021-01-27  Christophe Lyon  
> >>>
> >>> gcc/testsuite/
> >>> PR target/97969
> >>> * gcc.target/arm/pr97969.c: Skip if thumb mode is not available.
> >>>
> >>> diff --git a/gcc/testsuite/gcc.target/arm/pr97969.c
> >>> b/gcc/testsuite/gcc.target/arm/pr97969.c
> >>> index 714a1d1..0b5d07f 100644
> >>> --- a/gcc/testsuite/gcc.target/arm/pr97969.c
> >>> +++ b/gcc/testsuite/gcc.target/arm/pr97969.c
> >>> @@ -1,4 +1,5 @@
> >>>  /* { dg-do compile } */
> >>> +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
> >>>  /* { dg-options "-std=c99 -fno-omit-frame-pointer -mthumb -w -Os" } 
> >>> */
> >>>
> >>>  typedef a[23];
> 
>  I'm working on a patch to make this sort of change unnecessary (I hope).
>   Just running some final checks.
> 
>  R.
> 
> >>>
> >>> Ah, wait.  This one already has an explicit -mthumb, so my patch won't
> >>> affect this.  But why is -mthumb needed for this test anyway?  It's just
> >>> a compilation test, so why not drop that and we'll generally get better
> >>> coverage all round.
> >>>
> >>
> >> For instance I see the test fail for target arm-none-linux-gnueabihf
> >> --with-mode arm --with-cpu cortex-a9 --with-fpu vfp
> >> and running the tests with -march=armv5t
> >>
> >> We get the famous thumb-1 + hard-float ABI not supported.
> >>
> >> I guess -mthumb is inherited from the bug report?
> >>
> >> Christophe
> >>
> >
> > dropping the -mthumb should fix that though?
> >
> > In fact, I'd drop -Os as well, it's not needed as -Os is just one of the
> > many options that are used to build this test already.
> >
> > R.
> >
>
> But maybe then we need to change dg-options into dg-add-options.
>

Not sure to follow: the test is compiled only once, with:
-std=c99 -fno-omit-frame-pointer -mthumb -w -Os
in my logs

> R.


Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-02 Thread Segher Boessenkool
On Tue, Mar 02, 2021 at 08:13:51AM -0300, Alexandre Oliva wrote:
> On Feb 26, 2021, Segher Boessenkool  wrote:
> > You should essentially never use -mpowerpc-gpopt, but instead use a
> > -mcpu= that has it.  You also of course whould not do that in run tests
> > if the cpu does not support those insns.
> 
> I think the feedback is missing the point of the obvious bug that Eric's
> patch fixes.
> 
> We have a dejagnu proc that should return any target-specific options
> necessary for a sqrt insn to be available:
> 
> # Return any additional options to enable square root intructions.
> 
> powerpc has an optional sqrt insn, but the option that enables it is not
> returned by that proc.  That's a blatang bug in that proc.  Do you see
> that, or do you have any sensible reasoning to share to support the
> position that the proc is NOT buggy?

You often need more than one flag to enable this instruction, or it is
simply impossible (if the CPU selected does not have it).  You can need
-mno-soft-float, -mpowerpc-gpopt, or it is simply impossible because
your cpu does not implement these instructions (if any traditional
floating point instructions, if any floating point at all!)

> This proc is presumably only used in compile tests; it wouldn't make
> sense to assume an optional sqrt insn to be available on whatever
> hardware variant you happen to be using for testing.

It also conflicts with other insns (in principle, I know no cpu that has
some otther insns with this same opcode).

> But the bug fixed by Eric's patch is pretty blatant, and I don't think
> it makes sense to reject this fix, and insist on a fix of another bug
> instead.  That would just leave *this* bug in the dejagnu proc unfixed.

The actual bug is in check_effective_target_sqrt_insn.  I'm doing a
patch, but I cannot say that in the PR because there is no PR.

> > But, Alex, you say this avoids an ICE?  You shouldn't avoid the ICE in
> > that testcase then -- instead, fix it!  (Or report it).
> 
> Orthogonal issue, IMHO.  If you restate the response as "the proposed
> patch is fine as long as a bug report is on file for the error
> encountered when attempting to expand . when a sqrt intrinsic is
> not available", I can go along with it.  But saying "we don't want to
> fix this bug, we want to fix another vaguely related bug *instead*",
> will make our stances mutually incompatible, and would likely result in
> my pursuing neither bug.

I am saying fixing an ICE is much more important than fixing a testsuite
bug.  I am not saying the latter should not be done; I am asking to
please not sweep this under the rug.

> While at that, if -mpowerpc-gpopt is not to be used, maybe you'll want
> to fix, or file a bug report, about the multiple tests
> gcc.target/powerpc/pr46728-*.c, that use it explicitly, and for the very
> purpose of enabling the fsqrt insn.  Several of these are execution
> tests, so they fail on systems that don't offer fsqrt.  I suppose these
> should mention *_sqrt_insn target requirements and add options.  I
> haven't looked into how they interact, if at all.

That there are broken tests does not mean we should break more.

Yes, they should require a sqrt_insn effective target.


Almost everything that is ever tested these days has these insns, and
the insns are not usually disabled (which they always can be,
-msoft-float disables the FP registers).  So forcing gpopt on even
*reduces* the coverage instead of increasing it!

This is PR99352 now.


Segher


Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread David Malcolm via Gcc-patches
On Tue, 2021-03-02 at 07:09 +, brian.sobulefsky wrote:
> Hi,
> 
> It may not be worth altering at this point, but it seems like it
> would leave less
> bugs open if all the constraints go in as "closed" ranges and all
> evals are
> translated to closed intervals. So, if (idx > 0) and if (idx >= 1)
> are the same
> constraint. I know this won't be an option for eventual float
> support, but
> that is a different can of worms. For integers, you can fix it at
> those entry points
> and then all other subroutines can ignore the issue of open or closed
> ranges.
> 
> I fully understand the eye glaze and did not want to have to write it
> that
> way. I am thinking if there is a cleaner way to do it. Anyway, that
> is why I
> put a comment in each case to derive the result. This issue of "sides
> of the
> condition" and "inverted operator" as you call it in some places is a
> recurring
> theme. It is especially irritating when we lose commutativity, as we
> do with MINUS.
> 
> Adding logic in my subroutine for MULT or DIV is not hard, handling
> overflow
> is a bit more involved. At the very least, we would need to know what
> the max or min
> of a particular variable is, which might be in the type tree. We
> would also need to
> define how we want to handle the issue.
> 
> The problem is (and I have been thinking about this a lot in terms of
> constraint
> merging), there are currently no "or constraints," which would be
> helpful in merging too.
> So for overflow, when you have something like
> 
> if (idx > 0)
>  {
>   idx += num;
>   __analyzer_eval(idx > num);
>  }
> 
> you have gone from a single constraint (idx > 0), to an "or
> condition"
> (idx > num || idx < MIN_INT + num). The only solution now, other than
> ignoring
> overflow as a bug that is tolerated, is to just pass it off as
> unhandled (and
> therefore UNKNOWN). Perhaps you may want to add overflow as one of
> your analyzer
> warnings if it cannot be ruled out?
> 
> I did try to run a test with a simple || in the condition before just
> to see what
> would happen, and as you probably know it is not handled at all. I
> did not watch
> in gdb, but it is obvious from constraint-manager.cc that there is
> nothing to handle
> it. I think I actually did an __analyzer_eval() of the same ||
> condition verbatim
> that was in the if() conditional and still got an UNKNOWN.
> 
> It is a pretty intrusive change to add logic for that, which is why I
> have not
> done any work on it yet. Without the concept of "or" I don't see how
> we could
> handle overflow, but maybe you don't really want to handle it anyway,
> but
> rather just emit a warning if it might be considered poor practice to
> rely
> on something that is technically machine dependent anyway.

I'm not sure how we should handle this.

One approach would be to generalize the constraint-handling so that we
store a set of "facts" about svalues, where the facts are themselves
svalues that are known to be true.

Hence we could build an svalue for the TRUTH_OR_EXPR
  (idx > num || idx < MIN_INT + num)
and store that as a fact within the constraint manager.

But that would be a big rewrite.

Somehow the constraint manager needs to be able to evaluate queries
w.r.t known facts, and probably canonicalize sets of facts, and handle
mergers.

IIRC, the clang analyzer works by exposing an "add fact" interface
(where the facts can be compound symbolic expressions), but
implementing things internally with a choice of either a set of ranges,
or a Z3-backed solver.


If we're considering overflow, I would like to -fanalyzer to eventually
support bounds checking, and e.g. detecting attacks due to buffer size
overflows (CWE-131 due to CWE-190) e.g. in this code that probably
should have used calloc:

struct foo * make_arr (size_t n)
{
  size_t sz = sizeof (struct foo) * n;
  struct foo *p = (struct foo *)malloc (sz);
  if (!p)
 return;
  memset (p, 0, sz);
  return p;
}

void test (size_t n)
{
   struct foo *f = make_arr (n);
   if (!f)
 return;
   for (i = 0; i < n; i++)
 {
   //... do stuff with f[i]
 }
}

it would be good to detect the case when sz overflows and thus the
array is smaller than expected.  I think this could work be recording
that the allocated size of *p (== *f) is
   (size_t)(sizeof (struct foo) * n)

In the loop, the access to f[i] could bifurcate the egraph into:

  outcome A: (sizeof (struct foo) * i) < allocated_size
(carry on, and have this recorded so no further checking needed on
this path)

  outcome B: (sizeof (struct foo) * i) >= allocated_size
(complain about buffer overflow and stop this path)

Outcome B thus occurs when:
  (sizeof (struct foo) * i) >= (size_t)(sizeof (struct foo) * n)
  && (i < n)
so say sizeof (struct foo) == 16
  (16 * i) <= (size_t) (16 * n)
  && (i < n)
and we could then (somehow) show that this can happen e.g. for
  n > (SIZE_MAX / 16).
(and I've probably messed up at least some of the logic in the above)

Obviously this would be hu

Re: [PATCH] correct fix to avoid false positives for vectorized stores (PR 96963, 94655)

2021-03-02 Thread Martin Sebor via Gcc-patches

On 3/2/21 3:39 AM, Richard Biener wrote:

On Fri, Jan 22, 2021 at 12:39 AM Martin Sebor via Gcc-patches
 wrote:


The hack I put in compute_objsize() last January for pr93200 isn't
quite correct.  It happened to suppress the false positive there
but, due to what looks like a thinko on my part, not in some other
test cases involving vectorized stores.

The attached change adjusts the hack to have compute_objsize() give
up on all MEM_REFs with a vector type.  This effectively disables
-Wstringop-{overflow,overread} for vector accesses (either written
by the user or synthesized by GCC from ordinary accesses).  It
doesn't affect -Warray-bounds because this warning doesn't use
compute_objsize() yet.  When it does (it should considerably
simplify the code) some additional changes will be needed to
preserve -Warray-bounds for out of bounds vector accesses.
The test this patch adds should serve as a reminder to make
it if we forget.

Tested on x86_64-linux.  Since PR 94655 was reported against GCC
10 I'd like to apply this fix to both the trunk and the 10 branch.


The proposed code reads even more confusingly than before.
What is called 'ptr' is treated as a reference and what you
then name ptrtype is the type of the reference.

That leads to code like

+   if (code == MEM_REF && TREE_CODE (ptrtype) == VECTOR_TYPE)

what?  the pointer type is a VECTOR_TYPE?!

I think you are looking for whether the reference type is a VECTOR_TYPE.

Part of the confusion is likely because the code commons

   if (code == ARRAY_REF || code == MEM_REF)

but in one case operand zero is a pointer to the object (MEM_REF) and
in one case
it is the object (ARRAY_REF).

Please refactor this code so one can actually read it literally
without second-guessing proper variable names.


I agree that handling both REFs in the same block makes the code
more difficult to follow than it needs to be.

In the attached patch I've factored the code out into two helper
functions, one for ARRAY_REF and another for MEM_REF, and chosen
(hopefully) clearer names for the local variables.

A similar refactoring could be done for COMPONENT_REF and also
for SSA_NAME to reduce the size of the function and make it much
easier to follow.  I stopped short of doing that but I can do it
for GCC 12 when I move the whole compute_objsize() implementation
into a more appropriate  place (e.g., its own file).

Martin



Thanks,
Richard.




Martin


PR middle-end/96963 - -Wstringop-overflow false positive with -ftree-vectorize when assigning consecutive char struct members

gcc/ChangeLog:

	PR middle-end/96963
	* builtins.c (compute_objsize_r): Factor out ARRAY_REF and MEM_REF
	handling into helper functions.	 Correct a workaround for vectorized
	assignments.

gcc/testsuite/ChangeLog:

	PR middle-end/96963
	* gcc.dg/Wstringop-overflow-47.c: Xfail tests.
	* gcc.dg/Wstringop-overflow-65.c: New test.
	* gcc.dg/Warray-bounds-69.c: Same.

@@ -5210,7 +5209,7 @@ gimple_call_return_array (gimple *stmt, offset_int offrng[2],
   return NULL_TREE;
 }
 
-/* A helper of compute_objsize() to determine the size from an assignment
+/* A helper of compute_objsize_r() to determine the size from an assignment
statement STMT with the RHS of either MIN_EXPR or MAX_EXPR.  */
 
 static bool
@@ -5288,6 +5287,129 @@ handle_min_max_size (gimple *stmt, int ostype, access_ref *pref,
   return true;
 }
 
+/* A helper of compute_objsize_r() to determine the size from ARRAY_REF
+   AREF.  ADDR is true if PTR is the operand of ADDR_EXPR.  Return true
+   on success and false on failure.  */
+
+static bool
+handle_array_ref (tree aref, bool addr, int ostype, access_ref *pref,
+		  ssa_name_limit_t &snlim, pointer_query *qry)
+{
+  gcc_assert (TREE_CODE (aref) == ARRAY_REF);
+
+  ++pref->deref;
+
+  tree arefop = TREE_OPERAND (aref, 0);
+  tree reftype = TREE_TYPE (arefop);
+  if (!addr && TREE_CODE (TREE_TYPE (reftype)) == POINTER_TYPE)
+/* Avoid arrays of pointers.  FIXME: Hande pointers to arrays
+   of known bound.  */
+return false;
+
+  if (!compute_objsize_r (arefop, ostype, pref, snlim, qry))
+return false;
+
+  offset_int orng[2];
+  tree off = pref->eval (TREE_OPERAND (aref, 1));
+  range_query *const rvals = qry ? qry->rvals : NULL;
+  if (!get_offset_range (off, NULL, orng, rvals))
+{
+  /* Set ORNG to the maximum offset representable in ptrdiff_t.  */
+  orng[1] = wi::to_offset (TYPE_MAX_VALUE (ptrdiff_type_node));
+  orng[0] = -orng[1] - 1;
+}
+
+  /* Convert the array index range determined above to a byte
+ offset.  */
+  tree lowbnd = array_ref_low_bound (aref);
+  if (!integer_zerop (lowbnd) && tree_fits_uhwi_p (lowbnd))
+{
+  /* Adjust the index by the low bound of the array domain
+	 (normally zero but 1 in Fortran).  */
+  unsigned HOST_WIDE_INT lb = tree_to_uhwi (lowbnd);
+  orng[0] -= lb;
+  orng[1] -= lb;
+}
+
+  tree eltype = TREE_TYPE (aref);
+  tree tpsize = TYPE_SIZE_UNIT (eltype);
+  if (!tpsize || TREE_CO

Re: [PATCH] middle-end/97855 - fix diagnostic with default pretty printer

2021-03-02 Thread Martin Sebor via Gcc-patches

On 3/2/21 9:52 AM, Jeff Law via Gcc-patches wrote:



On 3/1/21 1:39 AM, Richard Biener wrote:

The default diagnostic tree printer relies on dump_generic_node which
for some reason manages to clobber the diagnostic pretty-printer state
so we see garbled diagnostics like

/home/rguenther/src/trunk/gcc/calls.c: In function 'expand_call':
D.6750.coeffs[0]'/home/rguenther/src/trunk/gcc/dojump.c:118:28: warning:  may 
be used uninitialized in this function [-Wmaybe-uninitialized]

when the diagnostic is emitted by the LTO fronted.  The following
approach using a temporary pretty-printer for the dump_generic_node
call fixes this for some unknown reason and we issue

/home/rguenther/src/trunk/gcc/calls.c: In function 'expand_call':
/home/rguenther/src/trunk/gcc/dojump.c:118:28: warning: 'MEM[(struct poly_int 
*)&save].D.6750.coeffs[0]' may be used uninitialized in this function 
[-Wmaybe-uninitialized]

[LTO] Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?

Thanks,
Richard.

2021-02-26  Richard Biener  

PR middle-end/97855
* tree-diagnostic.c (default_tree_printer): Use a temporary
pretty-printer when formatting a tree via dump_generic_node.

It'd be good to know why this helps, but I trust your judgment that this
is an improvement.


I don't know if it's related but pr98492 tracks a problem in the C++
front end caused by reinitializing the pretty printer in a number of
functions in cp/error.c.  When one of these functions is called while
the pretty printer is formatting something, the effect of
the reinitialization is to drop the already formatted contents
of the printer's buffer.

IIRC, I tripped over this when working on the MEM_REF formatting
improvement for -Wuninitialized.

Martin



OK

jeff





[committed] remove stray warning text (PR 99276)

2021-03-02 Thread Martin Sebor via Gcc-patches

In PR 99276 a translator points out malformed warning message newly
introduced into GCC 11.  In r11-7460 I have committed the attached
patch to remove the stray text.

Martin
commit e7ca37649e4f322e7512c6d11813992c61b0a4b3
Author: Martin Sebor 
Date:   Tue Mar 2 13:37:01 2021 -0700

PR middle-end/99276 - grammar in diagnostics for overflowing the destination

gcc/ChangeLog:

PR middle-end/99276
* builtins.c (warn_for_access): Remove stray warning text.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 18be93c0f3b..da6dac8048e 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -4237,8 +4237,7 @@ warn_for_access (location_t loc, tree func, tree exp, int opt, tree range[2],
 		? warning_at (loc, opt,
   (maybe
    ? G_("%K%qD may write %E or more bytes "
-	"into a region of size %E "
-	"the destination")
+	"into a region of size %E")
    : G_("%K%qD writing %E or more bytes "
 	"into a region of size %E overflows "
 	"the destination")),


[committed] diagnostics: fix ICE on fix-it hints on very long lines [PR99323]

2021-03-02 Thread David Malcolm via Gcc-patches
PR c/99323 describes an ICE due to a failed assertion deep inside the
fix-it printing machinery, where the fix-it hints on one line have not
been properly sorted in layout's constructor.

The underlying issue occurs when multiple fix-it hints affect a line
wider that LINE_MAP_MAX_COLUMN_NUMBER, where the location_t values for
characters after that threshold fall back to having column zero.

It's not meaningful to try to handle fix-it hints without column
information, so this patch rejects them as they are added to the
rich_location, falling back to the "no fix-it hints on this diagnostic"
case, fixing the crash.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7461-g41fbacdd10305654b1d10f887fa3f4677f9b8f34.

gcc/ChangeLog:
PR c/99323
* diagnostic-show-locus.c
(selftest::test_one_liner_many_fixits_2): Fix accidental usage of
column 0.

gcc/testsuite/ChangeLog:
PR c/99323
* gcc.dg/pr99323-1.c: New test.
* gcc.dg/pr99323-2.c: New test.

libcpp/ChangeLog:
PR c/99323
* line-map.c (rich_location::maybe_add_fixit): Reject fix-it hints
at column 0.
---
 gcc/diagnostic-show-locus.c  |  4 ++--
 gcc/testsuite/gcc.dg/pr99323-1.c | 17 +
 gcc/testsuite/gcc.dg/pr99323-2.c | 11 +++
 libcpp/line-map.c|  8 
 4 files changed, 38 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr99323-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr99323-2.c

diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 458830aa2a9..4111cd66544 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -3288,14 +3288,14 @@ test_one_liner_many_fixits_2 ()
   rich_location richloc (line_table, equals);
   for (int i = 0; i < 19; i++)
 {
-  location_t loc = linemap_position_for_column (line_table, i * 2);
+  location_t loc = linemap_position_for_column (line_table, (i * 2) + 1);
   richloc.add_fixit_insert_before (loc, "a");
 }
   ASSERT_EQ (19, richloc.get_num_fixit_hints ());
   diagnostic_show_locus (&dc, &richloc, DK_ERROR);
   ASSERT_STREQ (" foo = bar.field;\n"
" ^\n"
-   "a a a a a a a a a a a a a a a a a a a\n",
+   " a a a a a a a a a a a a a a a a a a a\n",
pp_formatted_text (dc.printer));
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr99323-1.c b/gcc/testsuite/gcc.dg/pr99323-1.c
new file mode 100644
index 000..6fe14002e59
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr99323-1.c
@@ -0,0 +1,17 @@
+/* Verify that fix-it printing doesn't ICE when there are multiple
+   fix-it hints on a very long line after LINE_MAP_MAX_COLUMN_NUMBER.  */
+
+/* { dg-options "-Wall -no-integrated-cpp -fdiagnostics-show-caret" } */
+/* { dg-allow-blank-lines-in-output 1 } */
+/* { dg-prune-output ".*" } */
+
+typedef struct {
+} REFERENCE;
+#define LIM2() LIM1()
+#define LIM3() LIM2() LIM2() LIM2() LIM2() LIM2() LIM2()
+#define LIM4() 
\
+  LIM3() LIM3() LIM3() LIM3() LIM3() LIM3() LIM3() LIM3() LIM3() LIM3()
+#define LIM5() 
\
+  LIM4() LIM4() LIM4() LIM4() LIM4() LIM4() LIM4() LIM4() LIM4() LIM4()
+#define LIM1() DEF(),
+REFERENCE references[] = {LIM5()};
diff --git a/gcc/testsuite/gcc.dg/pr99323-2.c b/gcc/testsuite/gcc.dg/pr99323-2.c
new file mode 100644
index 000..d4075b61525
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr99323-2.c
@@ -0,0 +1,11 @@
+/* Verify that fix-it printing doesn't ICE when there are multiple
+   fix-it hints on a very long line after LINE_MAP_MAX_COLUMN_NUMBER.  */
+
+/* { dg-options "-Wall -fdiagnostics-show-caret" } */
+/* { dg-allow-blank-lines-in-output 1 } */
+/* { dg-prune-output ".*" } */
+
+typedef struct {
+} REFERENCE;
+
+REFERENCE references[] = {DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), DEF(), 
D

Re: [PATCH 2/3 V2] Do not include stdio.h in libgcc's Decimal/Float128 conversions.

2021-03-02 Thread Michael Meissner via Gcc-patches
On Mon, Mar 01, 2021 at 05:15:44PM -0600, Segher Boessenkool wrote:
> On Mon, Mar 01, 2021 at 12:18:52PM -0500, Michael Meissner wrote:
> > The _sprintfkf.c file was including stdio.h to get the definition of 
> > sprintf.
> 
> (declaration of)
> 
> > This patch modifies this so that stdio.h is not included in order to support
> > freestanding cross compilers that might not provide stdio.h.
> 
> So the code cannot work at all there?  Will not link?
> 
> > +   We use __builtin_sprintf so that we don't have to include stdio.h to 
> > define
> > +   sprintf.  Stdio.h might not be present for freestanding cross compilers 
> > that
> > +   do not need to include a library.  */
> 
> "declare sprintf", and the function is called stdio.g (all lowercase).
> It is often written , which makes it easier to handle in
> sentences.
> 
> > @@ -54,5 +57,5 @@ int __sprintfkf (char *restrict string,
> >if (__sprintfieee128)
> >  return __sprintfieee128 (string, format, number);
> >  
> > -  return sprintf (string, format, (long double) number);
> > +  return __builtin_sprintf (string, format, (long double) number);
> 
> sprintf as well as __builtin_sprintf do not exist exactly when 
> does not (or are not guaranteed to exist, anyway).

There are 2 issues:

The first issue is whether you get an error when you are building a cross
compiler and the target you are using does not provide a stdio.h.  This is the
error that this patch fixes.  I tend to regard not being able to build the
toolchain are higher in priority than whether it works at runtime.

The second is whether you get an error at link time because there is not
sprintf or strtold functions.  For normal archive libraries, this would only be
an issue if you actually do the conversion.  I have my doubts whether there is
any extant code that wants to convert _Float128 to one of the Decimal types or
vice versa.

Note the second issue would affect x86_64 cross compilers as well, since they
use those two functions to do the _Float128/Decimal versions.

I am open to suggestions of how to move forward.  I think we have to have a way
to build cross compilers without decimal support (i.e. my third patch).
Secondarily, I think we should allow the compiler to be built if there is no
stdio.h, but the user did not disable decimal floating point.

I don't think rewriting the Decimal conversions not to use GLIBC is really
practical.  I certainly do not have the FP math skills to do this without
losing precision, etc.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] c++: Private parent access check for using decls [PR19377]

2021-03-02 Thread Jason Merrill via Gcc-patches

On 3/1/21 5:11 PM, Anthony Sharp wrote:

Hi all,

Here is the patch as promised. Regression tested on the c++ side and 
everything seems okay. Compiles fine.


Sounds good, though strip_using_decl (parent_field) may be overloaded if
the using-decl brings in multiple functions with that name.


Could you give my new regression test a quick glance over then just to 
make sure I've not forgotten about something? It definitely seems to 
work but I'm no expert in all the different ways using statements can be 
constructed. If you were to use 'using comma' (in the testcase), it 
generates another error because it's ambiguous, so that's okay. And if 
you use a comma-separated list like I have in the test (i.e. using 
A2::comma, A1::comma) it seems to find the correct bits just fine. 
Unless I'm getting really lucky and it's actually just a coincidence.


It seems that strip_using_decl doesn't return an overloaded list. Or, if 
it does, the first entry in that list just so happens to always be the 
right answer, so it can be treated like it's a regular decl for this 
purpose. For example, even if we change up the order of baseclasses, 
using statements and class definitions, it still seems to work, e.g. the 
following *seems* to work just fine:


That's because none of the names are overloaded within a single base 
class.  But if I add



class A2
{
   protected:
   int separate(int a);
   int comma(int a);
   int alone;
};

class A1
{
   protected:
   int separate();

 int separate(int,int,int);

then strip_using_decl for A1::separate gives an OVERLOAD.

You can iterate over the result of strip_using decl with the

for (ovl_iterator iter (fns); iter; ++iter)
  {
tree fn = *iter;

pattern.

Also, you can use == instead of cp_tree_equal for comparing FUNCTION_DECLs.

Jason



libbacktrace patch committed: Pass -1 to error callback for unknown DWARF

2021-03-02 Thread Ian Lance Taylor via Gcc-patches
This libbacktrace patch passes -1 to the error callback function for
unknown DWARF versions.  This makes users of libbacktrace treat DWARF
versions that libbacktrace does not support as though no debug
information were available.  This fixes PR 98818.  Bootstrapped and
ran libbacktrace tests on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

* dwarf.c (dwarf_buf_error): Add errnum parameter.  Change all
callers.
* backtrace.h: Update backtrace_error_callback comment.
df003d1e0bf2d0a8e2ed45a323d4e974b15dc95f
diff --git a/libbacktrace/backtrace.h b/libbacktrace/backtrace.h
index 2814763f417..caaa66d3686 100644
--- a/libbacktrace/backtrace.h
+++ b/libbacktrace/backtrace.h
@@ -71,13 +71,14 @@ struct backtrace_state;
invalid after this function returns.
 
As a special case, the ERRNUM argument will be passed as -1 if no
-   debug info can be found for the executable, but the function
-   requires debug info (e.g., backtrace_full, backtrace_pcinfo).  The
-   MSG in this case will be something along the lines of "no debug
-   info".  Similarly, ERRNUM will be passed as -1 if there is no
-   symbol table, but the function requires a symbol table (e.g.,
-   backtrace_syminfo).  This may be used as a signal that some other
-   approach should be tried.  */
+   debug info can be found for the executable, or if the debug info
+   exists but has an unsupported version, but the function requires
+   debug info (e.g., backtrace_full, backtrace_pcinfo).  The MSG in
+   this case will be something along the lines of "no debug info".
+   Similarly, ERRNUM will be passed as -1 if there is no symbol table,
+   but the function requires a symbol table (e.g., backtrace_syminfo).
+   This may be used as a signal that some other approach should be
+   tried.  */
 
 typedef void (*backtrace_error_callback) (void *data, const char *msg,
  int errnum);
diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 9097df6cc76..546b4b26a32 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -410,13 +410,13 @@ struct dwarf_data
 /* Report an error for a DWARF buffer.  */
 
 static void
-dwarf_buf_error (struct dwarf_buf *buf, const char *msg)
+dwarf_buf_error (struct dwarf_buf *buf, const char *msg, int errnum)
 {
   char b[200];
 
   snprintf (b, sizeof b, "%s in %s at %d",
msg, buf->name, (int) (buf->buf - buf->start));
-  buf->error_callback (buf->data, b, 0);
+  buf->error_callback (buf->data, b, errnum);
 }
 
 /* Require at least COUNT bytes in BUF.  Return 1 if all is well, 0 on
@@ -430,7 +430,7 @@ require (struct dwarf_buf *buf, size_t count)
 
   if (!buf->reported_underflow)
 {
-  dwarf_buf_error (buf, "DWARF underflow");
+  dwarf_buf_error (buf, "DWARF underflow", 0);
   buf->reported_underflow = 1;
 }
 
@@ -592,7 +592,7 @@ read_address (struct dwarf_buf *buf, int addrsize)
 case 8:
   return read_uint64 (buf);
 default:
-  dwarf_buf_error (buf, "unrecognized address size");
+  dwarf_buf_error (buf, "unrecognized address size", 0);
   return 0;
 }
 }
@@ -643,7 +643,7 @@ read_uleb128 (struct dwarf_buf *buf)
ret |= ((uint64_t) (b & 0x7f)) << shift;
   else if (!overflow)
{
- dwarf_buf_error (buf, "LEB128 overflows uint64_t");
+ dwarf_buf_error (buf, "LEB128 overflows uint64_t", 0);
  overflow = 1;
}
   shift += 7;
@@ -678,7 +678,7 @@ read_sleb128 (struct dwarf_buf *buf)
val |= ((uint64_t) (b & 0x7f)) << shift;
   else if (!overflow)
{
- dwarf_buf_error (buf, "signed LEB128 overflows uint64_t");
+ dwarf_buf_error (buf, "signed LEB128 overflows uint64_t", 0);
  overflow = 1;
}
   shift += 7;
@@ -818,7 +818,7 @@ read_attribute (enum dwarf_form form, uint64_t implicit_val,
offset = read_offset (buf, is_dwarf64);
if (offset >= dwarf_sections->size[DEBUG_STR])
  {
-   dwarf_buf_error (buf, "DW_FORM_strp out of range");
+   dwarf_buf_error (buf, "DW_FORM_strp out of range", 0);
return 0;
  }
val->encoding = ATTR_VAL_STRING;
@@ -833,7 +833,7 @@ read_attribute (enum dwarf_form form, uint64_t implicit_val,
offset = read_offset (buf, is_dwarf64);
if (offset >= dwarf_sections->size[DEBUG_LINE_STR])
  {
-   dwarf_buf_error (buf, "DW_FORM_line_strp out of range");
+   dwarf_buf_error (buf, "DW_FORM_line_strp out of range", 0);
return 0;
  }
val->encoding = ATTR_VAL_STRING;
@@ -880,7 +880,8 @@ read_attribute (enum dwarf_form form, uint64_t implicit_val,
if (form == DW_FORM_implicit_const)
  {
dwarf_buf_error (buf,
-"DW_FORM_indirect to DW_FORM_implicit_const");
+"DW_FORM_indirect to DW_FORM_implicit_const",
+0);
return 0;
  }
return read_at

Re: [PATCH v2] c++: ICE with deduction guide in checking type-dep [PR99009, PR97034]

2021-03-02 Thread Marek Polacek via Gcc-patches
On Mon, Mar 01, 2021 at 09:29:19PM -0500, Jason Merrill via Gcc-patches wrote:
> On 3/1/21 7:59 PM, Marek Polacek wrote:
> > On Mon, Mar 01, 2021 at 03:08:58PM -0500, Jason Merrill wrote:
> > > On 3/1/21 2:54 PM, Marek Polacek wrote:
> > > > On Thu, Feb 25, 2021 at 10:45:29PM -0500, Jason Merrill via Gcc-patches 
> > > > wrote:
> > > > > On 2/25/21 5:41 PM, Marek Polacek wrote:
> > > > > > On Thu, Feb 25, 2021 at 10:59:49AM -0500, Jason Merrill wrote:
> > > > > > > On 2/12/21 6:12 PM, Marek Polacek wrote:
> > > > > > > > We represent deduction guides with FUNCTION_DECLs, but they are 
> > > > > > > > built
> > > > > > > > without DECL_CONTEXT
> > > > > > > 
> > > > > > > Hmm, that seems wrong: "A deduction-guide shall be declared in the
> > > > > > > same scope as the corresponding class template and, for a member 
> > > > > > > class
> > > > > > > template, with the same access."  But it probably isn't necessary 
> > > > > > > to change
> > > > > > > this:
> > > > > > > 
> > > > > > > > leading to an ICE in type_dependent_expression_p
> > > > > > > > on the assert that the type of a function template with no 
> > > > > > > > dependent
> > > > > > > > (innermost!) template arguments must be non-dependent.  
> > > > > > > > Consider the
> > > > > > > > attached class-deduction79.C: we create a deduction guide:
> > > > > > > > 
> > > > > > > >   template G(T)-> E::G
> > > > > > > > 
> > > > > > > > we deduce T and create a partial instantiation:
> > > > > > > > 
> > > > > > > >   G(T) -> E::G [with T = int]
> > > > > > > > 
> > > > > > > > And then do_class_deduction wants to create a CALL_EXPR from 
> > > > > > > > the above
> > > > > > > > using build_new_function_call -> build_over_call which calls 
> > > > > > > > mark_used
> > > > > > > > -> maybe_instantiate_noexcept -> type_dependent_expression_p.
> > > > > > > > 
> > > > > > > > There, the innermost template arguments are non-dependent 
> > > > > > > > (), but
> > > > > > > > the fntype is dependent -- the return type is a TYPENAME_TYPE, 
> > > > > > > > and
> > > > > > > > since we have no DECL_CONTEXT, this check holds:
> > > > > > > > 
> > > > > > > >   /* Otherwise, if the function decl isn't from a dependent 
> > > > > > > > scope, it can't be
> > > > > > > >  type-dependent.  Checking this is important for 
> > > > > > > > functions with auto return
> > > > > > > >  type, which looks like a dependent type.  */
> > > > > > > >   if (TREE_CODE (expression) == FUNCTION_DECL
> > > > > > > >   && !(DECL_CLASS_SCOPE_P (expression)
> > > > > > > >&& dependent_type_p (DECL_CONTEXT (expression)))
> > > > > > > > 
> > > > > > > > whereupon we ICE.
> > > > > > > > 
> > > > > > > > Experiments with setting DECL_CONTEXT didn't pan out.
> > > > > > > 
> > > > > > > In c8 of the PR it looks like you were using the class itself as
> > > > > > > DECL_CONTEXT; the quote above says that the right context is the 
> > > > > > > enclosing
> > > > > > > scope of the class.
> > > > > > 
> > > > > > Sadly, using CP_TYPE_CONTEXT (type) would result in a crash in
> > > > > > retrieve_specialization:
> > > > > > 
> > > > > >  /* There should be as many levels of arguments as there are
> > > > > > levels of parameters.  */
> > > > > >  gcc_assert (TMPL_ARGS_DEPTH (args)
> > > > > >  == (TREE_CODE (tmpl) == TEMPLATE_DECL
> > > > > >  ? TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (tmpl))
> > > > > >  : template_class_depth (DECL_CONTEXT (tmpl;
> > > > > 
> > > > > Yeah, probably simpler not to bother.
> > > > > 
> > > > > > > > So perhaps we
> > > > > > > > just want to skip the assert for deduction guides, because they 
> > > > > > > > are
> > > > > > > > a little special.  Better ideas solicited.
> > > > > > > 
> > > > > > > In c3 you mention that one of the variants broke with r269093; 
> > > > > > > this is
> > > > > > > because my change to check CLASSTYPE_TEMPLATE_INSTANTIATION is 
> > > > > > > false for the
> > > > > > > template pattern itself (E).
> > > > > > 
> > > > > > And the original test started with my r11-1713 because using 
> > > > > > TREE_TYPE
> > > > > > directly instead of decltype (which is a non-deduced context) means 
> > > > > > we
> > > > > > can deduced from the argument.
> > > > > > > But I think probably the right answer is to defer this deduction 
> > > > > > > until the
> > > > > > > enclosing scope is non-dependent, i.e. (untested)
> > > > > > 
> > > > > > Thanks.  That mostly works, except the new 
> > > > > > class-deduction-aggr[89].C
> > > > > > tests.  Consider 8:
> > > > > > 
> > > > > > namespace N {
> > > > > > template  struct S {
> > > > > >  template  S(T, U);
> > > > > > };
> > > > > > } // namespace N
> > > > > > template  struct E {
> > > > > >  template  struct G { T t; };
> > > > > >  void fn() { G{N::S{'a', 1}}; }
> > > > > > };
> > > > > > 
> > > > > > void
> > > > > > g ()
> > > > > > {

Re: [PATCH 2/3 V2] Do not include stdio.h in libgcc's Decimal/Float128 conversions.

2021-03-02 Thread Segher Boessenkool
On Tue, Mar 02, 2021 at 04:25:33PM -0500, Michael Meissner wrote:
> On Mon, Mar 01, 2021 at 05:15:44PM -0600, Segher Boessenkool wrote:
> > On Mon, Mar 01, 2021 at 12:18:52PM -0500, Michael Meissner wrote:
> > > The _sprintfkf.c file was including stdio.h to get the definition of 
> > > sprintf.
> > 
> > (declaration of)
> > 
> > > This patch modifies this so that stdio.h is not included in order to 
> > > support
> > > freestanding cross compilers that might not provide stdio.h.
> > 
> > So the code cannot work at all there?  Will not link?
> > 
> > > +   We use __builtin_sprintf so that we don't have to include stdio.h to 
> > > define
> > > +   sprintf.  Stdio.h might not be present for freestanding cross 
> > > compilers that
> > > +   do not need to include a library.  */
> > 
> > "declare sprintf", and the function is called stdio.g (all lowercase).
> > It is often written , which makes it easier to handle in
> > sentences.
> > 
> > > @@ -54,5 +57,5 @@ int __sprintfkf (char *restrict string,
> > >if (__sprintfieee128)
> > >  return __sprintfieee128 (string, format, number);
> > >  
> > > -  return sprintf (string, format, (long double) number);
> > > +  return __builtin_sprintf (string, format, (long double) number);
> > 
> > sprintf as well as __builtin_sprintf do not exist exactly when 
> > does not (or are not guaranteed to exist, anyway).
> 
> There are 2 issues:
> 
> The first issue is whether you get an error when you are building a cross
> compiler and the target you are using does not provide a stdio.h.  This is the
> error that this patch fixes.  I tend to regard not being able to build the
> toolchain are higher in priority than whether it works at runtime.

And I see both as basic requirements.  A patch that adds X should make X
work.

If you want to make decimal and/or QP float work only on 64-bit LE Linux
you should say so.  And in that case, that is certainly not acceptable
if it doesn't "sorry" at configure time already.

> The second is whether you get an error at link time because there is not
> sprintf or strtold functions.  For normal archive libraries, this would only 
> be
> an issue if you actually do the conversion.  I have my doubts whether there is
> any extant code that wants to convert _Float128 to one of the Decimal types or
> vice versa.

So what are these patches for at all, again?

Anyway, if some conversion doesn't work (or cannot work correctly at
all, which is the same thing), you should simply disable the conversion
routines themselves (and then cascade that through the possible callers:just 
make them say "sorry, unimplemented").

> Note the second issue would affect x86_64 cross compilers as well, since they
> use those two functions to do the _Float128/Decimal versions.

Yes, it cannot work correctly at all there, either.

> I am open to suggestions of how to move forward.  I think we have to have a 
> way
> to build cross compilers without decimal support (i.e. my third patch).
> Secondarily, I think we should allow the compiler to be built if there is no
> stdio.h, but the user did not disable decimal floating point.

Yes.  So just do not use it then.  Disable the feature that would use it.

> I don't think rewriting the Decimal conversions not to use GLIBC is really
> practical.

It is necessary if you want to support it on any other systems than the
one you use.


Segher


Go patch committed: All go:embed with import _ "embed"

2021-03-02 Thread Ian Lance Taylor via Gcc-patches
This patch by Michael Matloob fixes the Go frontend to permit
//go:embed comments in files that do
import _ "embed"

The embed spec allows for //go:embed to be used in files that
underscore-import package "embed".  This is useful for embeds to
[]byte and string vars because the embed.FS type may not be referenced
if those are the only types of embeds in a file.  Because the compiler
previously checked whether there were any aliases to the embed package
to decide if //go:embed could be used, it would reject files with only
underscore imports of embed.  Instead, record whether the embed import
is encountered at all, similar to what is done with unsafe, to decide
whether //go:embed is allowed.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
d1776b7757001d9da47efcfd64d3bb20afb8d96d
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 9806b9da1e1..5c9fc7db4e1 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-56cf388da8d04bbd3824c4df34d77a8afa69749b
+2c5188b5ad6143e791f2ba42f02a4ea7887d87b6
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/embed.cc b/gcc/go/gofrontend/embed.cc
index 0a7df0531ec..0584f707ce6 100644
--- a/gcc/go/gofrontend/embed.cc
+++ b/gcc/go/gofrontend/embed.cc
@@ -663,21 +663,6 @@ Embedcfg_reader::error(const char* msg)
this->filename_, msg);
 }
 
-// Return whether the current file imports "embed".
-
-bool
-Gogo::is_embed_imported() const
-{
-  Packages::const_iterator p = this->packages_.find("embed");
-  if (p == this->packages_.end())
-return false;
-
-  // We track current file imports in the package aliases, where a
-  // typical import will just list the package name in aliases.  So
-  // the package has been imported if there is at least one alias.
-  return !p->second->aliases().empty();
-}
-
 // Implement the sort order for a list of embedded files, as discussed
 // at the docs for embed.FS.
 
diff --git a/gcc/go/gofrontend/gogo.cc b/gcc/go/gofrontend/gogo.cc
index 38a2f6f8f9e..9d4150eff7c 100644
--- a/gcc/go/gofrontend/gogo.cc
+++ b/gcc/go/gofrontend/gogo.cc
@@ -37,6 +37,7 @@ Gogo::Gogo(Backend* backend, Linemap* linemap, int, int 
pointer_size)
 imports_(),
 imported_unsafe_(false),
 current_file_imported_unsafe_(false),
+current_file_imported_embed_(false),
 packages_(),
 init_functions_(),
 var_deps_(),
@@ -469,6 +470,9 @@ Gogo::import_package(const std::string& filename,
   return;
 }
 
+  if (filename == "embed")
+this->current_file_imported_embed_ = true;
+
   Imports::const_iterator p = this->imports_.find(filename);
   if (p != this->imports_.end())
 {
@@ -2717,6 +2721,7 @@ Gogo::clear_file_scope()
 }
 
   this->current_file_imported_unsafe_ = false;
+  this->current_file_imported_embed_ = false;
 }
 
 // Queue up a type-specific hash function for later writing.  These
diff --git a/gcc/go/gofrontend/gogo.h b/gcc/go/gofrontend/gogo.h
index 51b6575ba47..f4155a29edb 100644
--- a/gcc/go/gofrontend/gogo.h
+++ b/gcc/go/gofrontend/gogo.h
@@ -397,10 +397,6 @@ class Gogo
   void
   read_embedcfg(const char* filename);
 
-  // Return whether the current file imports "embed".
-  bool
-  is_embed_imported() const;
-
   // Build an initializer for a variable with a go:embed directive.
   Expression*
   initializer_for_embeds(Type*, const std::vector*, Location);
@@ -709,6 +705,11 @@ class Gogo
   current_file_imported_unsafe() const
   { return this->current_file_imported_unsafe_; }
 
+  // Return whether the current file imported the embed package.
+  bool
+  current_file_imported_embed() const
+  { return this->current_file_imported_embed_; }
+
   // Clear out all names in file scope.  This is called when we start
   // parsing a new file.
   void
@@ -1251,6 +1252,8 @@ class Gogo
   bool imported_unsafe_;
   // Whether the magic unsafe package was imported by the current file.
   bool current_file_imported_unsafe_;
+  // Whether the embed package was imported by the current file.
+  bool current_file_imported_embed_;
   // Mapping from package names we have seen to packages.  This does
   // not include the package we are compiling.
   Packages packages_;
diff --git a/gcc/go/gofrontend/parse.cc b/gcc/go/gofrontend/parse.cc
index fd81a85c87f..e43b5f21448 100644
--- a/gcc/go/gofrontend/parse.cc
+++ b/gcc/go/gofrontend/parse.cc
@@ -1321,7 +1321,7 @@ Parse::declaration()
   embeds = new(std::vector);
   this->lex_->get_and_clear_embeds(embeds);
 
-  if (!this->gogo_->is_embed_imported())
+  if (!this->gogo_->current_file_imported_embed())
{
  go_error_at(token->location(),
  "invalid go:embed: missing import %");


Re: [PATCH, rs6000] Add Power10 scheduling description

2021-03-02 Thread Pat Haugen via Gcc-patches
Ping3

On 2/18/21 2:31 PM, Pat Haugen via Gcc-patches wrote:
> Ping2.
> 
> On 1/26/21 11:30 AM, Pat Haugen via Gcc-patches wrote:
>> Ping.
>>
>> On 11/13/20 4:04 PM, Pat Haugen via Gcc-patches wrote:
>>> Add Power10 scheduling description.
>>>
>>> This patch adds the Power10 scheduling description. Since power10.md was 
>>> pretty much a complete rewrite (existing version of power10.md is mostly 
>>> just a copy of power9.md), I diffed power10.md with /dev/null so that the 
>>> full contents of the file are shown as opposed to a diff. This should make 
>>> it easier to read. This patch will not apply on current trunk do to that 
>>> reason.
>>>  
>>> Bootstrap/regtest on powerpc64le (Power8/Power10) with no new regressions. 
>>> Ok for trunk?
>>>
>>> -Pat
>>>
>>>
>>> 2020-11-13  Pat Haugen  
>>>
>>> gcc/
>>> * config/rs6000/rs6000.c (struct processor_costs): New.
>>> (rs6000_option_override_internal): Set Power10 costs.
>>> (rs6000_issue_rate): Set Power10 issue rate.
>>> * config/rs6000/power10.md: Rewrite for Power10.
>>>
>>
> 



Re: [PATCH] Fix Ada bootstrap failure on Cygwin since switch to C++11 (PR98590)

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/26/21 8:24 AM, Mikael Pettersson via Gcc-patches wrote:
> On Fri, Feb 5, 2021 at 2:59 PM Arnaud Charlet  wrote:
 We'd rather not have PR references in the source files, so please remove it
 (it will be there as part of the commit log and git annotate).

 OK with the comment updated.
>>> Thanks, here's the revised patch.
>> OK, thanks.
> I forgot to mention that I don't have commit rights to gcc, so can
> someone please commit this for me? Thanks.
Thanks.  Pushed to the trunk.

jeff



Re: [PATCH] middle-end: Pretty-print address space of aggregates

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/26/21 5:02 AM, Julian Brown wrote:
> It's possible for an aggregate to be declared in a non-default address
> space, but the tree pretty-printer doesn't currently show that address
> space in dumps, which can be confusing.  This patch adds printing of
> "" markers for aggregates in non-default address spaces.
>
> OK (now or for stage 1)?
>
> Thanks,
>
> Julian
>
> ChangeLog
>
> gcc/
>   * tree-pretty-print.c (dump_generic_node): Emit non-generic
>   address space info for aggregates.
Given this can't affect code generation, I think it's fine now.

OK for the trunk.

Jeff



Re: [PATCH, rs6000] Update "size" attribute for Power10

2021-03-02 Thread Pat Haugen via Gcc-patches
Ping3

On 2/18/21 2:30 PM, Pat Haugen via Gcc-patches wrote:
> Ping2
> 
> On 1/26/21 11:27 AM, Pat Haugen via Gcc-patches wrote:
>> Ping
>>
>> On 12/8/20 3:46 PM, Pat Haugen via Gcc-patches wrote:
>>> Update size attribute for Power10.
>>>
>>>
>>> This patch was broken out from my larger patch to update various attributes 
>>> for
>>> Power10, in order to make the review process hopefully easier. This patch 
>>> only
>>> updates the size attribute for various new instructions. There were no 
>>> changes
>>> requested to this portion of the original patch, so nothing is new here.
>>>
>>> Bootstrap/regtest on powerpc64le (Power8/Power10) with no new regressions. 
>>> Ok for trunk?
>>>
>>> -Pat
>>>
>>>
>>> 2020-11-08  Pat Haugen  
>>>
>>> gcc/
>>> * config/rs6000/dfp.md (extendddtd2, trunctddd2, *cmp_internal1,
>>> floatditd2, ftrunc2, fixdi2, dfp_ddedpd_,
>>> dfp_denbcd_, dfp_dxex_, dfp_diex_,
>>> *dfp_sgnfcnc_, dfp_dscli_, dfp_dscri_): Update size
>>> attribute for Power10.
>>> * config/rs6000/mma.md (*movoo): Likewise.
>>> * config/rs6000/rs6000.md (define_attr "size"): Add 256.
>>> (define_mode_attr bits): Add DD/TD modes.
>>> * config/rs6000/sync.md (load_quadpti, store_quadpti, load_lockedpti,
>>> store_conditionalpti): Update size attribute for Power10.
>>>
>>
> 



Re: [PATCH] match.pd: Use single_use for (T)(A) + CST -> (T)(A + CST) [PR95798]

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/24/21 11:59 AM, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> The r10-2806 change regressed following testcases, instead of doing
> int -> unsigned long sign-extension once and then add 8, 16, ... 56 to it
> for each of the memory access, it adds 8, 16, ... 56 in int mode and then
> sign extends each.  So that means:
> + movq$0, (%rsp,%rax,8)
> + leal1(%rdx), %eax
> + cltq
> + movq$0, (%rsp,%rax,8)
> + leal2(%rdx), %eax
> + cltq
> + movq$0, (%rsp,%rax,8)
> + leal3(%rdx), %eax
> + cltq
> + movq$0, (%rsp,%rax,8)
> + leal4(%rdx), %eax
> + cltq
> + movq$0, (%rsp,%rax,8)
> + leal5(%rdx), %eax
> + cltq
> + movq$0, (%rsp,%rax,8)
> + leal6(%rdx), %eax
> + addl$7, %edx
> + cltq
> + movslq  %edx, %rdx
> + movq$0, (%rsp,%rax,8)
>   movq$0, (%rsp,%rdx,8)
> - movq$0, 8(%rsp,%rdx,8)
> - movq$0, 16(%rsp,%rdx,8)
> - movq$0, 24(%rsp,%rdx,8)
> - movq$0, 32(%rsp,%rdx,8)
> - movq$0, 40(%rsp,%rdx,8)
> - movq$0, 48(%rsp,%rdx,8)
> - movq$0, 56(%rsp,%rdx,8)
> GCC 9 -> 10 change or:
> - movq$0, (%rsp,%rdx,8)
> - movq$0, 8(%rsp,%rdx,8)
> - movq$0, 16(%rsp,%rdx,8)
> - movq$0, 24(%rsp,%rdx,8)
> - movq$0, 32(%rsp,%rdx,8)
> - movq$0, 40(%rsp,%rdx,8)
> - movq$0, 48(%rsp,%rdx,8)
> - movq$0, 56(%rsp,%rdx,8)
> + movq$0, (%rsp,%rax,8)
> + leal1(%rdx), %eax
> + movq$0, (%rsp,%rax,8)
> + leal2(%rdx), %eax
> + movq$0, (%rsp,%rax,8)
> + leal3(%rdx), %eax
> + movq$0, (%rsp,%rax,8)
> + leal4(%rdx), %eax
> + movq$0, (%rsp,%rax,8)
> + leal5(%rdx), %eax
> + movq$0, (%rsp,%rax,8)
> + leal6(%rdx), %eax
> + movq$0, (%rsp,%rax,8)
> + leal7(%rdx), %eax
> + movq$0, (%rsp,%rax,8)
> change on the other test.  While for the former case of
> int there is due to signed integer overflow (unless -fwrapv)
> the possibility to undo it e.g. during expansion, for the unsigned
> case information is unfortunately lost.
>
> The following patch adds single_use case which restores these testcases
> but keeps the testcases the patch meant to improve as is.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-02-24  Jakub Jelinek  
>
>   PR target/95798
>   * match.pd ((T)(A) + CST -> (T)(A + CST)): Add single_use check.
>
>   * gcc.target/i386/pr95798-1.c: New test.
>   * gcc.target/i386/pr95798-2.c: New test.
OK
jeff



Re: [PATCH] outputs.exp: skip @file -save-temps if target has -L or -I

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/24/21 10:11 PM, Hans-Peter Nilsson via Gcc-patches wrote:
> The outputs.exp tests check what temporary files are created
> and left behind with e.g. -save-temps.
>
> Additional files are created in presence of @file option.
> Adding an -I or -L option causes *another* temporary file to
> appear.  I take it that's deliberate, as there are tests for
> that behavior.
>
> For native testing, the default test-framework baseboard
> file unix.exp doesn't add any -I or -L options and all tests
> pass.  For a newlib target however, you'll have a couple of
> -L options (see the nopts handling in outputs.exp), leading
> to (cris-elf with --target_board=cris-sim):
>
> Running /x/gcc/gcc/testsuite/gcc.misc-tests/outputs.exp ...
> FAIL: outputs exe savetmp namedb: extra
> outputs.args.1
> FAIL: outputs exe savetmp named2: extra
> outputs.args.1
> FAIL: outputs exe savetmp named2: extra
> outputs.args.3
> FAIL: outputs lto sing unnamed: extra
> a.args.1
>
> The failing tests are among the actual tests that check the
> behavior of @file, and are confused by the additional -L.
>
> Identify presence of -I or -L from the test framework and
> skip those tests.
>
> Tested cris-elf and x86_64-pc-linux-gnu.
>
> Ok to commit?
>
> gcc/testsuite:
>   * gcc.misc-tests/outputs.exp: Skip @file -save-temps
>   tests if target test-framework has -L or -I options.
OK
jeff



Re: [PATCH] gcc.misc-tests/outputs.exp: assert unique test-names

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/24/21 10:17 PM, Hans-Peter Nilsson via Gcc-patches wrote:
> The gcc.misc-tests/outputs.exp tests can take some effort to
> digest.
>
> Navigating and debugging causes for failing tests here isn't
> helped by the existence of tests with duplicate names.
> Let's stop that from happening.  This requires that test-run
> output is actually reviewed, as Tcl errors don't stop the
> test-run, but then again there's no such dejagnu construct
> that I know of.
>
> Tested x86_64-pc-linux-gnu.
>
> Ok to commit?  Or is a renaming patch appending a
> number-suffix, like:
>
> --- outputs.exp.orig3 2021-02-25 06:13:28.304243791 +0100
> +++ outputs.exp   2021-02-25 06:13:51.575457825 +0100
> @@ -280,8 +280,8 @@ if { "$aout" != "" } then {
>  }
>  
>  # Driver-chosen outputs.
> -outest "$b asm default 1" $sing "-S" {} {{-0.s}}
> -outest "$b asm default 2" $mult "-S" {} {{-1.s -2.s}}
> +outest "$b-1 asm default 1" $sing "-S" {} {{-0.s}}
> +outest "$b-2 asm default 2" $mult "-S" {} {{-1.s -2.s}}
> ...
>
> ...better and ok to commit?  (IMHO: yes, much easier to follow)
>
> gcc/testsuite:
>   * gcc.misc-tests/outputs.exp: Append discriminating
>   suffixes to tests with duplicate names.
>   (outest): Assert that each running test has a unique
>   name.
OK.  And I think that in general changes which fix duplicated testnames
should be considered as not needing approval.  Just fix 'em and post the
patch for the historical record :-)

jeff



[PATCH] fwprop: Fix single_use_p calculation

2021-03-02 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
and s390x-redhat-linux.  Ok for master?



Commit efb6bc55a93a ("fwprop: Allow (subreg (mem)) simplifications")
introduced a check that was supposed to look at the propagated def's
number of uses.  It uses insn_info::num_uses (), which in reality
returns the number of uses def's insn has.  The whole change therefore
works only by accident.

Fix by looking at def_info's uses instead of insn_info's uses.  This
requires passing around def_info instead of insn_info.

gcc/ChangeLog:

2021-03-02  Ilya Leoshkevich  

* fwprop.c (def_has_single_use_p): New function.
(fwprop_propagation::fwprop_propagation): Look at
def_info's uses.
(try_fwprop_subst_note): Use def_info instead of insn_info.
(try_fwprop_subst_pattern): Likewise.
(try_fwprop_subst_notes): Likewise.
(try_fwprop_subst): Likewise.
(forward_propagate_subreg): Likewise.
(forward_propagate_and_simplify): Likewise.
(forward_propagate_into): Likewise.
* iterator-utils.h (single_element_p): New function.
---
 gcc/fwprop.c | 89 ++--
 gcc/iterator-utils.h | 10 +
 2 files changed, 62 insertions(+), 37 deletions(-)

diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 4b8a554e823..478dcdd96cc 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -175,7 +175,7 @@ namespace
 static const uint16_t CONSTANT = FIRST_SPARE_RESULT << 1;
 static const uint16_t PROFITABLE = FIRST_SPARE_RESULT << 2;
 
-fwprop_propagation (insn_info *, insn_info *, rtx, rtx);
+fwprop_propagation (insn_info *, def_info *, rtx, rtx);
 
 bool changed_mem_p () const { return result_flags & CHANGED_MEM; }
 bool folded_to_constants_p () const;
@@ -191,13 +191,27 @@ namespace
   };
 }
 
-/* Prepare to replace FROM with TO in INSN.  */
+/* Return true if DEF has a single non-debug non-phi use.  */
+
+static bool
+def_has_single_use_p (def_info *def)
+{
+  if (!is_a (def))
+return false;
+
+  set_info *set = as_a (def);
+
+  return single_element_p (set->nondebug_insn_uses ())
+&& !set->has_phi_uses ();
+}
+
+/* Prepare to replace FROM with TO in USE_INSN.  */
 
 fwprop_propagation::fwprop_propagation (insn_info *use_insn,
-   insn_info *def_insn, rtx from, rtx to)
+   def_info *def, rtx from, rtx to)
   : insn_propagation (use_insn->rtl (), from, to),
-single_use_p (def_insn->num_uses () == 1),
-single_ebb_p (use_insn->ebb () == def_insn->ebb ())
+single_use_p (def_has_single_use_p (def)),
+single_ebb_p (use_insn->ebb () == def->insn ()->ebb ())
 {
   should_check_mems = true;
   should_note_simplifications = true;
@@ -368,9 +382,9 @@ contains_paradoxical_subreg_p (rtx x)
   return false;
 }
 
-/* Try to substitute (set DEST SRC) from DEF_INSN into note NOTE of USE_INSN.
-   Return the number of substitutions on success, otherwise return -1 and
-   leave USE_INSN unchanged.
+/* Try to substitute (set DEST SRC), which defines DEF, into note NOTE of
+   USE_INSN.  Return the number of substitutions on success, otherwise return
+   -1 and leave USE_INSN unchanged.
 
If REQUIRE_CONSTANT is true, require all substituted occurences of SRC
to fold to a constant, so that the note does not use any more registers
@@ -379,13 +393,14 @@ contains_paradoxical_subreg_p (rtx x)
instruction pattern.  */
 
 static int
-try_fwprop_subst_note (insn_info *use_insn, insn_info *def_insn,
+try_fwprop_subst_note (insn_info *use_insn, def_info *def,
   rtx note, rtx dest, rtx src, bool require_constant)
 {
   rtx_insn *use_rtl = use_insn->rtl ();
+  insn_info *def_insn = def->insn ();
 
   insn_change_watermark watermark;
-  fwprop_propagation prop (use_insn, def_insn, dest, src);
+  fwprop_propagation prop (use_insn, def, dest, src);
   if (!prop.apply_to_rvalue (&XEXP (note, 0)))
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -436,19 +451,20 @@ try_fwprop_subst_note (insn_info *use_insn, insn_info 
*def_insn,
   return prop.num_replacements;
 }
 
-/* Try to substitute (set DEST SRC) from DEF_INSN into location LOC of
+/* Try to substitute (set DEST SRC), which defines DEF, into location LOC of
USE_INSN's pattern.  Return true on success, otherwise leave USE_INSN
unchanged.  */
 
 static bool
 try_fwprop_subst_pattern (obstack_watermark &attempt, insn_change &use_change,
- insn_info *def_insn, rtx *loc, rtx dest, rtx src)
+ def_info *def, rtx *loc, rtx dest, rtx src)
 {
   insn_info *use_insn = use_change.insn ();
   rtx_insn *use_rtl = use_insn->rtl ();
+  insn_info *def_insn = def->insn ();
 
   insn_change_watermark watermark;
-  fwprop_propagation prop (use_insn, def_insn, dest, src);
+  fwprop_propagation prop (use_insn, def, dest, src);
   if (!prop.apply_to_pattern (loc))
 {
   if (dump_f

Re: [PATCH RFA] cgraph: flatten and same_body aliases [PR96078]

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/24/21 5:08 PM, Jason Merrill wrote:
> On 2/16/21 1:52 PM, Jeff Law wrote:
>>
>>
>> On 2/11/21 10:18 PM, Jason Merrill via Gcc-patches wrote:
>>> The patch for PR92372 made us start warning about a flatten
>>> attribute on an
>>> alias.  But in the case of C++ 'tor base/complete variants, the user
>>> didn't
>>> create the alias, so we shouldn't warn.
>>>
>>> I could also remove the attribute in maybe_clone_body, but here
>>> seems a bit
>>> better.
>>>
>>> Tested x86_64-pc-linux-gnu.  OK for trunk?
>>>
>>> gcc/ChangeLog:
>>>
>>> PR c++/96078
>>> * cgraph.c (cgraph_node::create_same_body_alias): Remove flatten
>>> attribute from alias.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> PR c++/96078
>>> * g++.dg/ext/attr-flatten1.C: New test.
>> But shouldn't we validate that we've got a C++ ctor/dtor rather than
>> blindly removing the attribute from all aliases?  ISTM like the patch
>> as-is would always suppress warnings when we create a same-body alias
>> regardless of why we created a same-body alias.
>
> Fair enough.  How about this approach, instead?  If the target also
> has attribute flatten, we're getting the desired effect even though
> the called symbol is an alias.
>
> 0001-cgraph-flatten-and-same_body-aliases-PR96078.patch
>
> From 740414f268382625525a892fd14357f694ca4391 Mon Sep 17 00:00:00 2001
> From: Jason Merrill 
> Date: Thu, 11 Feb 2021 22:01:19 -0500
> Subject: [PATCH] cgraph: flatten and same_body aliases [PR96078]
> To: gcc-patches@gcc.gnu.org
>
> The patch for PR92372 made us start warning about a flatten attribute on an
> alias.  But in the case of C++ 'tor base/complete variants, the user didn't
> create the alias.  If the alias target also has the attribute, the alias
> points to a flattened function, so we shouldn't warn.
>
> gcc/ChangeLog:
>
>   PR c++/96078
>   * cgraphunit.c (process_function_and_variable_attributes): Don't
>   warn about flatten on an alias if the target also has it.
>   * cgraph.h (symtab_node::get_alias_target_tree): New.
>
> gcc/testsuite/ChangeLog:
>
>   PR c++/96078
>   * g++.dg/ext/attr-flatten1.C: New test.
OK
jeff



Re: [PATCH] gcc.misc-tests/outputs.exp: assert unique test-names

2021-03-02 Thread Hans-Peter Nilsson via Gcc-patches
> From: Jeff Law 
> Date: Tue, 2 Mar 2021 23:39:27 +0100
> received-spf: None (smtp1.axis.com: no sender authenticity  information
>  available from domain of  postmas...@mail-il1-f172.google.com) identity=helo;
>   client-ip=209.85.166.172; receiver=smtp1.axis.com;
>   envelope-from="jeffreya...@gmail.com";
>   x-sender="postmas...@mail-il1-f172.google.com";
>   x-conformance=sidf_compatible
> Old-Content-Type: multipart/alternative;
>   boundary="_000_08d773b4a6cd248fc44aa1877542afabgmailcom_"
> Content-Type: text/plain; charset=iso-8859-1
> 
> 
> 
> On 2/24/21 10:17 PM, Hans-Peter Nilsson via Gcc-patches wrote:
> > The gcc.misc-tests/outputs.exp tests can take some effort to
> > digest.
> >
> > Navigating and debugging causes for failing tests here isn't
> > helped by the existence of tests with duplicate names.
> > Let's stop that from happening.  This requires that test-run
> > output is actually reviewed, as Tcl errors don't stop the
> > test-run, but then again there's no such dejagnu construct
> > that I know of.
> >
> > Tested x86_64-pc-linux-gnu.
> >
> > Ok to commit?  Or is a renaming patch appending a
> > number-suffix, like:
> >
> > --- outputs.exp.orig3   2021-02-25 06:13:28.304243791 +0100
> > +++ outputs.exp 2021-02-25 06:13:51.575457825 +0100
> > @@ -280,8 +280,8 @@ if { "$aout" != "" } then {
> >  }
> >  
> >  # Driver-chosen outputs.
> > -outest "$b asm default 1" $sing "-S" {} {{-0.s}}
> > -outest "$b asm default 2" $mult "-S" {} {{-1.s -2.s}}
> > +outest "$b-1 asm default 1" $sing "-S" {} {{-0.s}}
> > +outest "$b-2 asm default 2" $mult "-S" {} {{-1.s -2.s}}
> > ...
> >
> > ...better and ok to commit?  (IMHO: yes, much easier to follow)
> >
> > gcc/testsuite:
> > * gcc.misc-tests/outputs.exp: Append discriminating
> > suffixes to tests with duplicate names.
> > (outest): Assert that each running test has a unique
> > name.
> OK.  And I think that in general changes which fix duplicated testnames
> should be considered as not needing approval.  Just fix 'em and post the
> patch for the historical record :-)

Ok, thanks for the review!

Since there were two versions suggested and (to me) the "OK"
was ambiguous, I'll go ahead with the suggested (but
scripted) renaming of e.g. "$b blah", "$b foo", "$b blah
with futz", "$b frob" (etc) into "$b-1 blah", "$b-2 foo",
"$b-3 blah with futz", "$b-4 frob" (etc) ;-) But, I'm going
to wait 24h.

brgds, H-P


Re: [PATCH 5/6] c++: Clean up normalization / satisfaction routines

2021-03-02 Thread Jason Merrill via Gcc-patches

On 3/2/21 11:25 AM, Patrick Palka wrote:

On Mon, 1 Mar 2021, Jason Merrill wrote:


On 2/28/21 12:58 PM, Patrick Palka wrote:

This patch mostly performs some straightforward refactoring:

- Renamed satisfy_constraint to satisfy_normalized_constraints
- Renamed the three-parameter version of satisfy_constraint_expression
  to satisfy_nondeclaration_constraints
- Removed normalize_(non)?template_requirements
- Removed satisfy_associated_constraints (and made its callers
  check for dependent template args sooner, before normalization)
- Removed the tsubst_flags_t parameter of evaluate_concept_check
- Combined the two versions of constraint_satisfaction_value
- Combined the two versions of constraint_satisfied_p

Additionally, this patch removes the handling of bare
constraint-expressions from satisfy_nondeclaration_constraints, and
hence constraints_satisfied_p and constraint_satisfaction_value now only
take things that carry their own template information needed for
normalization.  In practice, this only means it's no longer possible to
evaluate bare REQUIRES_EXPRs via the satisfaction routines, and so this
patch adjusts the affected callers to instead use tsubst_requires_expr.


It's probably better to have a different entry point than tsubst_requires_expr
for callers that have nothing to do with templates.  Whether that's a one-line
wrapper for the call to tsubst_requires_expr ("evaluate_requires_expr"?) or
calling tsubst_requires_expr from satisfy_nondeclaration_constraints, I don't
have an opinion either way.


For convenience, the function diagnose_constraints continues to accept
REQUIRES_EXPRs, but it now handles them by calling diagnose_require_expr
directly.


This might argue for the latter choice above.


Ah, I didn't consider the option of continuing to handle REQUIRES_EXPRs
from satisfy_nondeclaration_constraints by evaluating them directly.

Here's a patch that implements this.  I opted to add a
evaluate_requires_expr wrapper as well, since

   evaluate_requires_expr (t)

is much more readable than

   constant_boolean_node (constraints_satisfied_p (t),
  boolean_type_node)

(Note the added TODO in satisfy_nondeclaration_constraints, which'll be
fixed in a new version of the REQUIRES_EXPR evaluation/diagnostic
unification patch.)


OK.


-- >8 --

Subject: [PATCH] c++: Clean up normalization / satisfaction routines

This patch mostly performs some straightforward refactoring:

   - Renamed satisfy_constraint to satisfy_normalized_constraints
   - Renamed the three-parameter version of satisfy_constraint_expression
 to satisfy_nondeclaration_constraints
   - Removed normalize_(non)?template_requirements
   - Removed satisfy_associated_constraints (and made its callers
 check for dependent template args sooner, before normalization)
   - Removed the tsubst_flags_t parameter of evaluate_concept_check
   - Combined the two versions of constraint_satisfaction_value
   - Combined the two versions of constraint_satisfied_p

Additionally, this patch removes the handling of general
constraint-expressions from satisfy_nondeclaration_constraints, and
hence constraints_satisfied_p and constraint_satisfaction_value now take
only things that carry their own template information needed for
normalization, and, as a special case, REQUIRES_EXPRs.  But the latter
now get evaluated directly via tsubst_requires_expr rather than going
through satisfaction.

(That we used to evaluate REQUIRES_EXPR via satisfaction might even be a
correctness issue: since we cache satisfaction in special ways that don't
apply to regular evaluation, going through satisfaction could in theory
cause us to reuse a cached value for a REQUIRES_EXPR when we shouldn't
have.)

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_call_expression): Adjust call to
evaluate_concept_check.
(cxx_eval_constant_expression) : Use
evaluate_requires_expression instead of
satisfy_constraint_expression.
: Adjust call to evaluate_concept_check.
* constraint.cc (struct sat_info): Adjust comment about which
satisfaction entrypoints use noisy-unsat.
(normalize_template_requirements): Remove (and adjust callers
appropriately).
(normalize_nontemplate_requirements): Likewise.
(tsubst_nested_requirement): Use constraint_satisfaction_value
instead of satisfy_constraint_expression, which'll do the
noisy replaying of ill-formed quiet satisfaction for us.
(decl_satisfied_cache): Adjust comment.
(satisfy_constraint): Rename to ...
(satisfy_normalized_constraints): ... this.
(satisfy_associated_constraints): Remove (and make its
callers check for dependent arguments).
(satisfy_constraint_expression): Rename to ...
(satisfy_nondeclaration_constraints): ... this.  Assert that
'args' is empty when 't' is a concept-id.  Removing handling
   

[PATCH] IBM Z: Run mul-signed-overflow-*.c only on z14+

2021-03-02 Thread Ilya Leoshkevich via Gcc-patches
mul-signed-overflow-*.c execution tests fail on z13, because they
contain z14-specific instructions.  Fix by requiring s390_z14_hw
target.

gcc/testsuite/ChangeLog:

* gcc.target/s390/mul-signed-overflow-1.c: Run only on z14+.
* gcc.target/s390/mul-signed-overflow-2.c: Likewise.
---
 gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c | 3 ++-
 gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c 
b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
index fdf56d6e695..e8b1938dab7 100644
--- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
+++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do compile } */
+/* { dg-do run { target { s390_z14_hw } } } */
 /* z14 only because we need msrkc, msc, msgrkc, msgc  */
 /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
 
diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c 
b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
index d0088188aa2..01328e1d286 100644
--- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
+++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do compile } */
+/* { dg-do run { target { s390_z14_hw } } } */
 /* z14 only because we need msrkc, msc, msgrkc, msgc  */
 /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
 
-- 
2.29.2



Re: [PATCH] dwarf2out: Fix -gsplit-dwarf on riscv or other non-.uleb128 targets [PR99090]

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/27/21 3:21 AM, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> As mentioned in the PR, riscv* only supports .uleb128 with constant
> arguments, doesn't support difference of two labels because of aggressive
> linker relaxations.  But I bet various other targets, especially those not
> using GNU assembler, might suffer from the same problem.
Yea.  I'm going to be working on a target with similar constraints
soon.  The 32bit PA hpux port has similar issues, but can at least emit
suitable relocations to make this work.
> As the FIXME comment in output_loc_list indicates, we ICE on
> -gsplit-dwarf on those targets whenever we need .debug_loclists, because
> we only emit DW_LLE_startx_length which requires working .uleb128 delta
> of 2 code section labels.  We can't use DW_LLE_base_addressx
> once followed by DW_LLE_offset_pair either because the latter suffers
> from the same issue - need .uleb128 difference of code section labels
> (and in that case not just for the second operand but also for the first).
>
> So, this patch implements what the comment said and emits DW_LLE_startx_endx
> instead, which wastes more space in .debug_addr, but will work.


>
> Bootstrapped/regtested on x86_64-linux and i686-linux and as written in the
> PR, Jim has tested it on riscv*linux.  Ok for trunk?
>
> BTW, for HAVE_AS_LEB128 -gdwarf-5 -gsplit-dwarf, maybe we should consider
> instead of always emitting DW_LLE_startx_length do all the optimizations
> that we do for HAVE_AS_LEB128 -gdwarf-5, or at least a subset of them.
> For !have_multiple_function_sections, we in that case emit just
> DW_LLE_offset_pair (that can certainly be a win for small TUs, we wouldn't
> need any .debug_addr entry in that case; on the other side, just using
> DW_LLE_offset_pair can be harmful for very large TUs especially if the
> loclist has many entries, emitting in that case a single DW_LLE_base_address
> or for -gsplit-dwarf DW_LLE_base_addressx followed by DW_LLE_offset_pair
> might be much smaller), and for have_multiple_function_sections figuring
> out if DW_LLE_base_address followed by DW_LLE_offset_pair entries
> or DW_LLE_start_length is bettter.  So perhaps a middle-ground for
> -gsplit-dwarf would be to always do the have_multiple_function_sections
> behavior, i.e. DW_LLE_base_addressx followed by DW_LLE_offset_pair vs.
> DW_LLE_startx_length decisions based on the ranges and their counts.
> And perhaps dwz could optimize afterwards, on linked binaries or shared
> libraries it knows all the offsets and could figure out optimal DW_LLE_*.
>
> 2021-02-26  Jakub Jelinek  
>
>   PR debug/99090
>   * dwarf2out.c (dw_loc_list_struct): Add end_entry member.
>   (new_loc_list): Clear end_entry.
>   (output_loc_list): Only use DW_LLE_startx_length for -gsplit-dwarf
>   if HAVE_AS_LEB128, otherwise use DW_LLE_startx_endx.  Fix comment
>   typo.
>   (index_location_lists): For dwarf_version >= 5 without HAVE_AS_LEB128,
>   initialize also end_entry.
OK
jeff



Re: [PATCH 6/6] c++: Consolidate REQUIRES_EXPR evaluation/diagnostic routines

2021-03-02 Thread Jason Merrill via Gcc-patches

On 3/2/21 11:45 AM, Patrick Palka wrote:

On Mon, 1 Mar 2021, Jason Merrill wrote:


On 2/28/21 12:59 PM, Patrick Palka wrote:

This folds the diagnose_requires_expr routines into the corresponding
tsubst_requires_expr ones.  This is achieved by making the latter
routines take a sat_info instead of a subst_info, and assigning the
appropriate meanings to the flags sat_info::noisy and
sat_info::diagnose_unsatisfaction_p during tsubst_requires_expr:
info.noisy() controls whether to diagnose invalid types and expressions
inside the requires-expression, and info.diagnose_unsatisfaction_p()
controls whether to diagnose why the requires-expression evaluates to
false.

gcc/cp/ChangeLog:

* constraint.cc (struct sat_info): Document the different
meanings of noisy() and diagnose_unsatisfaction_p() during
satisfaction and requires-expression evaluation.
(tsubst_valid_expression_requirement): Take a sat_info instead
of a subst_info.  Perform the substitution quietly first.  Fold
in error-replaying code from diagnose_valid_expression.
(tsubst_simple_requirement): Take a sat_info instead of a
subst_info.
(tsubst_type_requirement_1): New.  Fold in error-replaying code
from diagnose_valid_type.
(tsubst_type_requirement): Use the above.  Take a sat_info
instead of a subst_info.
(tsubst_compound_requirement): Likewise.  Fold in
error-replaying code from diagnose_compound_requirement.
(tsubst_nested_requirement): Take a sat_info instead of a
subst_info.  Fold in error-replaying code from
diagnose_nested_requirement.
(tsubst_requirement): Take a sat_info instead of a subst_info.
(tsubst_requires_expr): Split into two versions, one that takes
a sat_info argument and another that takes a complain and
in_decl argument.  Remove outdated documentation.  Document the
effects of the sat_info argument.
(diagnose_trait_expr): Make static.  Take a template argument
vector instead of a parameter mapping.
(diagnose_valid_expression): Remove.
(diagnose_valid_type): Remove.
(diagnose_simple_requirement): Remove.
(diagnose_compound_requirement): Remove.
(diagnose_type_requirement): Remove.
(diagnose_nested_requirement): Remove.
(diagnose_requirement): Remove.
(diagnose_requires_expr): Remove.
(diagnose_atomic_constraint): Take a sat_info instead of a
subst_info.  Adjust call to diagnose_trait_expr.  Call
tsubst_requires_expr instead of diagnose_requires_expr.
(diagnose_constraints): Call tsubst_requires_expr instead of
diagnose_requires_expr.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/diagnostic1.C: Adjust expected diagnostics
now that we diagnose only the first failed requirement of a
requires-expression.
---
   gcc/cp/constraint.cc| 416 +---
   gcc/testsuite/g++.dg/concepts/diagnostic1.C |   2 +-
   2 files changed, 179 insertions(+), 239 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index cf319b34da0..31f32c25dfe 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -100,17 +100,30 @@ struct subst_info
 /* Provides additional context for satisfaction.
   -   The flag noisy() controls whether to diagnose ill-formed satisfaction,
-   such as the satisfaction value of an atom being non-bool or
non-constant.
-
-   The flag diagnose_unsatisfaction_p() controls whether to explain why
-   a constraint is not satisfied.
-
-   The entrypoints to satisfaction for which we set noisy+unsat are
-   diagnose_constraints and diagnose_nested_requirement.  The entrypoint
for
-   which we set noisy-unsat is the replay inside
constraint_satisfaction_value.
-   From constraints_satisfied_p, we enter satisfaction quietly (both flags
-   cleared).  */
+   During satisfaction:
+- The flag noisy() controls whether to diagnose ill-formed
satisfaction,
+  such as the satisfaction value of an atom being non-bool or
non-constant.
+- The flag diagnose_unsatisfaction_p() controls whether to explain why
+  a constraint is not satisfied.
+- We enter satisfaction with noisy+unsat from diagnose_constraints.
+- We enter satisfaction with noisy-unsat from the replay inside
+  constraint_satisfaction_value.
+- We enter satisfaction quietly (both flags cleared) from
+  constraints_satisfied_p.
+
+   During evaluation of a requires-expression:
+- The flag noisy() controls whether to diagnose ill-formed types and
+  expressions inside its requirements.
+- The flag diagnose_unsatisfaction_p() controls whether to explain why
+  the requires-expression evaluates to false.
+- We enter tsubst_requires_expr with noisy+unsat from
diagnose_constraints
+  and from diagnose_atomic_constraint.
+- We enter tsubst_requires_expr with

[PATCH] c-family: Avoid ICE on va_arg [PR99324]

2021-03-02 Thread Jakub Jelinek via Gcc-patches
Hi!

build_va_arg calls the middle-end mark_addressable, which e.g. requires that
cfun is non-NULL.  The following patch calls instead 
c_common_mark_addressable_vec
which is the c-family variant similarly to the FE c_mark_addressable and
cxx_mark_addressable, except that it doesn't error on addresses of register
variables.  As the taking of the address is artificial for the .VA_ARG
ifn and when that is lowered goes away, it is similar case to the vector
subscripting for which c_common_mark_addressable_vec has been added.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-02  Jakub Jelinek  

PR c/99324
* c-common.c (build_va_arg): Call c_common_mark_addressable_vec
instead of mark_addressable.  Fix a comment typo -
neutrallly -> neutrally.

* gcc.c-torture/compile/pr99324.c: New test.

--- gcc/c-family/c-common.c.jj  2021-02-10 19:27:33.981140185 +0100
+++ gcc/c-family/c-common.c 2021-03-02 12:04:01.982340959 +0100
@@ -4553,7 +4553,7 @@ build_va_arg (location_t loc, tree expr,
   if (canon_va_type == NULL_TREE)
error_at (loc, "first argument to % not of type %");
 
-  /* Let's handle things neutrallly, if expr:
+  /* Let's handle things neutrally, if expr:
 - has undeclared type, or
 - is not an va_list type.  */
   return build_va_arg_1 (loc, type, error_mark_node);
@@ -4565,7 +4565,7 @@ build_va_arg (location_t loc, tree expr,
 
   /* Take the address, to get '&ap'.  Note that &ap is not a va_list
 type.  */
-  mark_addressable (expr);
+  c_common_mark_addressable_vec (expr);
   expr = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (expr)), expr);
 
   return build_va_arg_1 (loc, type, expr);
@@ -4627,7 +4627,7 @@ build_va_arg (location_t loc, tree expr,
 
   /* Take the address, to get '&ap'.  Make sure it's a pointer to array
 elem type.  */
-  mark_addressable (expr);
+  c_common_mark_addressable_vec (expr);
   expr = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (canon_va_type)),
 expr);
 
--- gcc/testsuite/gcc.c-torture/compile/pr99324.c.jj2021-03-02 
12:27:31.025761333 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr99324.c   2021-03-02 
12:27:36.785697682 +0100
@@ -0,0 +1,19 @@
+/* PR c/99324 */
+
+#include 
+
+int
+foo (int x, ...)
+{
+  va_list a;
+  va_start (a, x);
+  int b[6] = {};
+  int bar (c)
+int c[1][va_arg (a, int)];
+  {
+return sizeof c[0];
+  }
+  int r = bar (b);
+  va_end (a);
+  return r;
+}

Jakub



[PATCH] c++: Fix -fstrong-eval-order for operator &&, || and , [PR82959]

2021-03-02 Thread Jakub Jelinek via Gcc-patches
Hi!

P0145R3 added
"However, the operands are sequenced in the order prescribed for the built-in
operator" rule for overloaded operator calls when using the operator syntax.
op_is_ordered follows that, but added just the overloaded operators
added in that paper.  &&, || and comma operators had rules that
lhs is sequenced before rhs already in C++98.
The following patch adds those cases to op_is_ordered.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-02  Jakub Jelinek  

PR c++/82959
* call.c (op_is_ordered): Handle TRUTH_ANDIF_EXPR, TRUTH_ORIF_EXPR
and COMPOUND_EXPR.

* g++.dg/cpp1z/eval-order10.C: New test.

--- gcc/cp/call.c.jj2021-02-19 12:14:10.965113839 +0100
+++ gcc/cp/call.c   2021-03-02 13:32:08.423924095 +0100
@@ -6083,6 +6083,15 @@ op_is_ordered (tree_code code)
 case LSHIFT_EXPR:
   // 8. a >> b
 case RSHIFT_EXPR:
+  // a && b
+  // Predates P0145R3.
+case TRUTH_ANDIF_EXPR:
+  // a || b
+  // Predates P0145R3.
+case TRUTH_ORIF_EXPR:
+  // a , b
+  // Predates P0145R3.
+case COMPOUND_EXPR:
   return (flag_strong_eval_order ? 1 : 0);
 
 default:
--- gcc/testsuite/g++.dg/cpp1z/eval-order10.C.jj2021-03-02 
14:15:21.735278240 +0100
+++ gcc/testsuite/g++.dg/cpp1z/eval-order10.C   2021-03-02 14:24:36.984143267 
+0100
@@ -0,0 +1,27 @@
+// PR c++/82959
+// { dg-do run }
+// { dg-additional-options -fstrong-eval-order }
+
+struct A {};
+
+void operator && (const A x, const A) {}
+void operator || (const A x, const A) {}
+void operator , (const A x, const A) {}
+
+int i;
+
+A f () { if (i != 0) __builtin_abort (); i = 1; return A (); }
+A g () { if (i != 1) __builtin_abort (); i = 2; return A (); }
+
+int
+main ()
+{
+  f () && g ();
+  if (i != 2) __builtin_abort ();
+  i = 0;
+  f () || g ();
+  if (i != 2) __builtin_abort ();
+  i = 0;
+  f (), g ();
+  if (i != 2) __builtin_abort ();
+}

Jakub



Re: [PATCH] constrain writing one too many bytes" warning (PR 97631)

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/18/21 3:30 PM, Martin Sebor via Gcc-patches wrote:
> The "writing one too many bytes" form of -Wstringop-overflow is
> designed to trigger for strcpy(d, s) calls into allocated destinations
> whose size is the result of (or depends on) strlen(s).  But the warning
> is in code that's also called from handlers for bounded functions like
> memcpy and strncpy, and the code doesn't differentiate between the two
> kinds of callers, causing false positives.
>
> The attached patch corrects both the warning routine and its callers
> to properly distinguish these two classes of callers.  In addition,
> it corrects a mistake where -Wstringop-overflow is being issued for
> destinations of unknown size instead of the more appropriate
> -Wstringop-truncation.
>
> Tested on x86_64-linux.
>
> The bug is a P2 10/11 regression and I'm looking to commit this change
> both into the trunk and 10-branch.
>
> Martin
>
>
> gcc-97631.diff
>
> PR middle-end/97631 - bogus "writing one too many bytes" warning for memcpy 
> with strlen argument
>
> gcc/ChangeLog:
>
>   PR middle-end/97631
>   * tree-ssa-strlen.c (maybe_warn_overflow): Test rawmem.
>   (handle_builtin_stxncpy_strncat): Rename locals.  Determine
>   destination size from allocation calls.  Issue a more appropriate
>   kind of warning.
>   (handle_builtin_memcpy): Pass true as rawmem to maybe_warn_overflow.
>   (handle_builtin_memset): Same.
>
> gcc/testsuite/ChangeLog:
>
>   PR middle-end/97631
>   * c-c++-common/Wstringop-overflow.c: Remove unexpected warnings.
>   Add an xfail.
>   * c-c++-common/Wstringop-truncation.c: Add expected warnings.
>   * gcc.dg/Wstringop-overflow-10.c: Also enable -Wstringop-truncation.
>   * gcc.dg/Wstringop-overflow-65.c: New test.
OK
jeff



Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread brian.sobulefsky via Gcc-patches
I have been kicking these sorts of ideas around ever since I came to understand 
that
the second "UNKNOWN" in the for loop that originally started this was due to 
the state
merge as we loop. For now, and I don't mean this disrespectfully because it is 
very
hard to get right, the whole issue of merging has basically been punted, given 
some
of the simple examples we found that will merge as an unknown svalue. As you 
think
about this issue, "scope creep" becomes a concern quickly. It  quickly turns 
into
a halting problem of sorts, you have to decide how much of you want the 
analyzer to
be able to "understand" a program. For example, any human can follow this:

sum = 0;
for (idx = 1; idx <= 10; idx++) sum += idx;
__analyzer_eval (sum == 55);

but from an analyzer perspective it opens up all sorts of questions and
becomes a bit of a PhD thesis as to where you draw the line. The biggest concern
with the analyzer seems to be vulnerabilities, so I doubt it is useful to get 
the
analyzer to produce the correct answer for the above code, although it might be
interesting to do so from an academic perspective.

The example you provided gives a concrete reason that overflow should not be a
complete "punt" and I like it. In the interest of fighting scope creep and 
keeping
things manageable, I would question whether you want to actually track the 
overflow /
no overflow cases separately or just raise any possible overflow as an error 
immediately.
I am not disputing your idea, I would prefer to track the overflow and get
a correct result (in this case, an under allocation of memory). I guess I would 
want
to know how much work you think that will be. You still know the codebase a lot 
better
than I do.

My devil's advocate position would be if the analyzer raises exception on
any possible overflow, will that overwhelm the user with false positives? I
am not sure of the answer here, because a piece of me feels that overflow is not
something that production code should be relying on in any serious application,
and so should be non existent, but I am not sure if that is reflective of
reality.

The simplest way to handle your example would be the following:

struct foo * make_arr (size_t n)
{
  if (MAX_INT / sizeof (struct foo) >= n)
return NULL;
  //continue with what you wrote
}

This should add a constraint that downstream of the initial guard, n is small
enough to prevent overflow (I would have to check, but the current analyzer
should be close to doing this if not already correct). Therefore, all we would 
need
would be the check at the definition of sz as to whether it overflows, and that
check should come back negative in my example, unknown in yours. Assuming we 
agree
that the purpose of the analyzer is to prevent vulnerabilities, and not to 
provide
an academic exercise in seeing how close we get to solving the halting problem,
we could just treat any possible overflow as an error.

As this was fundamentally your project, I don't want to tell you how to do it, 
and
the nerd in me wants to build an analyzer that can answer all the silly puzzle 
code
I can think to feed it, but from a utilitarian perspective and given everyone's
limited time, I thought I would offer this as a path you can try.


Re: [PATCH] MIPS: Fix PR target/98491 (ChangeLog)

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/17/21 5:02 AM, Richard Sandiford wrote:
> Xi Ruoyao via Gcc-patches  writes:
>> I can't understand the comment either.  To me it looks like it's
>> possible to
>> remove this "if (MSA_SUPPORTED_P (mode)) return 0;"
> I think the point is that the MSA loads and stores only have a 10-bit
> offset field instead of the usual 16-bit offset field and so the usual
> approaches to handling symbolic addresses won't work.
Ah.

>
>> CC Robert to get some help.
> Happy new lunar year folks.
>
> I found a newer email address of Robert.  Hope it is still being used.
>
> Could someone update MAINTAINERS file by the way?
 If you have an updated email address, I can reach out to Robert and see
 if he wants his entry updated or removed.
>>>  His latest reply in gcc mail lists used robert.sucha...@mips.com.  But 
>>> when I
>>> sent mail to it, the mail was just rejected with "access denied".  Google 
>>> told
>>> me Office 365 mail service (used by mips.com) rejects mail to deleted 
>>> accounts
>>> with "access denied". So I'm not sure if this email address is invalid 
>>> again,
>>> or
>>> Office 365 just dislikes me...
>> Hi Jeff,
>>
>> I think it's better to just fix the out-of-bound array access now by special
>> casing MAX_MACHINE_MODE, if we can't figure out if this entry should be 
>> removed.
>> Either in MSA_SUPPORTED_P or in mips_symbol_insns.
>>
>> It's really "irrational" to leave such a obvious programming error in new 
>> GCC 11
>> release...  And I've built a Linux system (in Linux From Scratch way, X11 was
>> built and it runs correctly now) on the Loongson 3A4000 with patched 
>> GCC-10.2.0,
>> and "-O3 -mmsa" in CFLAGS for most packages so the change should be OK.
> Yeah, agreed.  I think the mips_symbol_insns patch is the right one,
> so I've pushed to it trunk.  I think it's also worth backporting
> to release branches, but let me know how far back you've tested it.
Seems reasonable to me.  THanks for lending a hand on this Richard.

jeff



[PATCH] i386: Fix a peephole2 for -mavx512vl -mno-avx512bw [PR99321]

2021-03-02 Thread Jakub Jelinek via Gcc-patches
Hi!

As the testcase shows, the
(define_peephole2
  [(set (match_operand 0 "sse_reg_operand")
(match_operand 1 "sse_reg_operand"))
   (set (match_dup 0)
(match_operator 3 "commutative_operator"
  [(match_dup 0)
   (match_operand 2 "memory_operand")]))]
peephole2 can for AVX512VL without AVX512BW (I guess it is a hyphothetical
CPU, but unfortunately they are separate CPUID bits and we have separate
options for them) turn something that is valid without that peephole2
into something that is invalid (and in this case ICEs).
The problem is that the vpadd[bw], vpmullw, vpmin[su][bw] and vpmax[su][bw]
instructions require both AVX512BW and AVX512VL when they have
16-byte or 32-byte operands.  If operands[0] is %[xy]mm0 .. %[xy]mm15
but operands[1] is %[xy]mm16 .. %[xy]mm31, then before we have
a vector move which uses vmovdqa{32,64} and doesn't need AVX512BW,
AVX512VL is I think implied from HARD_REGNO_MODE_OK only supporting
V{16Q,32Q,8H,16H}imode in EXT_REX_SSE_REGNO_P regs with AVX512VL, and then
we have a commutative operation with that %[xy]mm0 .. %[xy]mm15 destination
and one source and a memory operand, so VEX encoded operation.
And, the peephole2 wants to replace it with a load into the destination
register from memory (ok) and then the commutative arith instruction.
But that needs EVEX encoding because of the high register and so requires
AVX512BW which might not be enabled.
The exception is and/ior/xor, because the hw doesn't have
vp{and,or,xor}{b,w} instructions at all, it uses vp{and,or,xor}d instead
and that of course doesn't need AVX512BW.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

BTW, there are other bugs I need to look at, while the vp{min,max}ub with
16-byte operands instruction properly requires avx512bw for v constraints
and otherwise uses x, e.g. the vpadd[bw] etc. instructions don't.
I'll try to handle that incrementally later this week.

2021-03-02  Jakub Jelinek  

PR target/99321
* config/i386/i386.md (mov + mem using comm arith peephole2):
Punt if operands[1] is EXT_REX_SSE_REGNO_P, AVX512BW is not enabled
and the inner mode is [QH]Imode.

* gcc.target/i386/pr99321.c: New test.

--- gcc/config/i386/i386.md.jj  2021-02-18 09:16:21.026841364 +0100
+++ gcc/config/i386/i386.md 2021-03-02 17:57:55.878158963 +0100
@@ -19843,7 +19843,18 @@ (define_peephole2
(match_operator 3 "commutative_operator"
  [(match_dup 0)
   (match_operand 2 "memory_operand")]))]
-  "REGNO (operands[0]) != REGNO (operands[1])"
+  "REGNO (operands[0]) != REGNO (operands[1])
+   /* Punt if operands[1] is %[xy]mm16+ and AVX512BW is not enabled,
+  as EVEX encoded vpadd[bw], vpmullw, vpmin[su][bw] and vpmax[su][bw]
+  instructions require AVX512BW and AVX512VL, but with the original
+  instructions it might require just AVX512VL.
+  AVX512VL is implied from TARGET_HARD_REGNO_MODE_OK.  */
+   && (!EXT_REX_SSE_REGNO_P (REGNO (operands[1]))
+   || TARGET_AVX512BW
+   || GET_MODE_SIZE (GET_MODE_INNER (GET_MODE (operands[0]))) > 2
+   || GET_CODE (operands[3]) == AND
+   || GET_CODE (operands[3]) == IOR
+   || GET_CODE (operands[3]) == XOR)"
   [(set (match_dup 0) (match_dup 2))
(set (match_dup 0)
(match_op_dup 3 [(match_dup 0) (match_dup 1)]))])
--- gcc/testsuite/gcc.target/i386/pr99321.c.jj  2021-03-02 18:05:02.561528747 
+0100
+++ gcc/testsuite/gcc.target/i386/pr99321.c 2021-03-02 18:05:51.121001799 
+0100
@@ -0,0 +1,41 @@
+/* PR target/99321 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=btver2 -fno-tree-dce -mavx512vl -mno-avx512bw" } */
+
+typedef unsigned __attribute__((__vector_size__ (8))) A;
+typedef unsigned int __attribute__((__vector_size__ (8))) B;
+typedef unsigned char __attribute__((__vector_size__ (16))) C;
+typedef unsigned __attribute__((__vector_size__ (16))) D;
+typedef unsigned int __attribute__((__vector_size__ (16))) E;
+typedef unsigned __attribute__((__vector_size__ (16))) F;
+typedef unsigned __attribute__((__vector_size__ (32))) G;
+typedef int __attribute__((__vector_size__ (32))) H;
+typedef unsigned int __attribute__((__vector_size__ (32))) I;
+typedef char __attribute__((__vector_size__ (64))) J;
+typedef unsigned int __attribute__((__vector_size__ (64))) K;
+typedef unsigned long long __attribute__((__vector_size__ (64))) L;
+unsigned char a;
+unsigned b, c;
+H d;
+E e, f;
+D g;
+L h;
+
+A
+foo0 (A i, C j, G k, B l, K m, B n, I o)
+{
+  J p, q = a != p;
+  F r = b << f;
+  int s = a * 15;
+  C t = (1 << ((C) ((C) { 80 } >=j) & sizeof (0)) | (j ^ (C) { 5 }) << (j & 
sizeof (0))) != 0;
+  L u = h;
+  H v = d - 40;
+  u ^= -(long long) n;
+  D w = (char) s > g;
+  o ^= c / o;
+  J x = p + q + (J) m + (J) u + (J) u;
+  G y = ((union { J a; G b;}) x).b + ((union { J a; G b[2];}) x).b[1] + k + v 
+ o;
+  C z = ((union { G a; C b;}) y).b + ((union { G a; C b;}) y).b + j + t + (C) 
g + (C) w + (C) e + (C) f + (C) r

Re: [PATCH] cfgrtl: Fix up fixup_partitions caused ICE [PR99085]

2021-03-02 Thread Jeff Law via Gcc-patches



On 2/16/21 1:13 AM, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> fixup_partitions sometimes changes some basic blocks from hot partition to
> cold partition, in particular if after unreachable block removal or other
> optimizations a hot partition block is dominated by cold partition block(s).
> It fixes up the edges and jumps on those edges, but when after reorder
> blocks and in rtl (non-cfglayout) mode that is clearly not enough, because
> it keeps the block order the same and so we can end up with more than
> 1 hot/cold section transition in the same function.
>
> So, this patch fixes that up too.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-02-15  Jakub Jelinek  
>
>   PR target/99085
>   * cfgrtl.c (fixup_partitions): When changing some bbs from hot to cold
>   partitions, if in non-layout mode after reorder_blocks also move
>   affected blocks to ensure a single partition transition.
>
>   * gcc.dg/graphite/pr99085.c: New test.
OK
jeff



[PATCH] Fix TLS thread pointer function names on AIX

2021-03-02 Thread David Edelsohn via Gcc-patches
This patch adds missing periods to the symbol name for TLS helper functions.

Bootstrapped on powerpc-ibm-aix7.2.2.0.

Thanks, David

* config/rs6000/rs6000.md (tls_get_tpointer_internal): Prepend
period to symbol name.
(tls_get_addr_internal): Same.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5ae65fb9f96..a1904b35f7f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -10013,7 +10013,7 @@ (define_insn "tls_get_tpointer_internal"
(unspec:SI [(const_int 0)] UNSPEC_TLSTLS))
(clobber (reg:SI LR_REGNO))]
   "TARGET_XCOFF && HAVE_AS_TLS"
-  "bla __get_tpointer")
+  "bla .__get_tpointer")

 (define_expand "tls_get_addr"
   [(set (match_operand:P 0 "gpc_reg_operand")
@@ -10038,7 +10038,7 @@ (define_insn "tls_get_addr_internal"
(clobber (reg:CC CR0_REGNO))
(clobber (reg:P LR_REGNO))]
   "TARGET_XCOFF && HAVE_AS_TLS"
-  "bla __tls_get_addr")
+  "bla .__tls_get_addr")
 ^L
 ;; Next come insns related to the calling sequence.
 ;;


[PATCH PING^3] Add input_modes parameter to TARGET_MD_ASM_ADJUST hook

2021-03-02 Thread Ilya Leoshkevich via Gcc-patches
Hello,

I would like to ping the following patch:

Add input_modes parameter to TARGET_MD_ASM_ADJUST hook
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/562898.html

It is needed for the following regression fix:

IBM Z: Fix usage of "f" constraint with long doubles
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564380.html


Jakub, who would be the right person to review this change?  I've
decided to ask you, since `git shortlog -ns gcc/cfgexpand.c` shows that
you deal with this code a lot.

Best regards,
Ilya




If TARGET_MD_ASM_ADJUST changes a mode of an input operand (which
should be ok as long as the hook itself as well as after_md_seq make up
for it), input_mode will contain stale information.

It might be tempting to fix this by removing input_mode altogether and
just using GET_MODE (), but this will not work correctly with constants.
So add input_modes parameter and document that it should be updated
whenever inputs parameter is updated.

gcc/ChangeLog:

2021-01-05  Ilya Leoshkevich  

* cfgexpand.c (expand_asm_loc): Pass new parameter.
(expand_asm_stmt): Likewise.
* config/arm/aarch-common-protos.h (arm_md_asm_adjust): Add new
parameter.
* config/arm/aarch-common.c (arm_md_asm_adjust): Likewise.
* config/arm/arm.c (thumb1_md_asm_adjust): Likewise.
* config/cris/cris.c (cris_md_asm_adjust): Likewise.
* config/i386/i386.c (ix86_md_asm_adjust): Likewise.
* config/mn10300/mn10300.c (mn10300_md_asm_adjust): Likewise.
* config/nds32/nds32.c (nds32_md_asm_adjust): Likewise.
* config/pdp11/pdp11.c (pdp11_md_asm_adjust): Likewise.
* config/rs6000/rs6000.c (rs6000_md_asm_adjust): Likewise.
* config/vax/vax.c (vax_md_asm_adjust): Likewise.
* config/visium/visium.c (visium_md_asm_adjust): Likewise.
* target.def (md_asm_adjust): Likewise.
---
 gcc/cfgexpand.c  | 16 
 gcc/config/arm/aarch-common-protos.h |  8 
 gcc/config/arm/aarch-common.c|  7 ---
 gcc/config/arm/arm.c | 14 --
 gcc/config/cris/cris.c   |  7 ---
 gcc/config/i386/i386.c   |  7 ---
 gcc/config/mn10300/mn10300.c |  7 ---
 gcc/config/nds32/nds32.c |  1 +
 gcc/config/pdp11/pdp11.c |  9 +
 gcc/config/rs6000/rs6000.c   |  7 ---
 gcc/config/vax/vax.c |  3 ++-
 gcc/config/visium/visium.c   | 12 +++-
 gcc/doc/tm.texi  | 10 ++
 gcc/target.def   | 13 -
 14 files changed, 69 insertions(+), 52 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index aef9e916fcd..a6b48d3e48f 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -2880,6 +2880,7 @@ expand_asm_loc (tree string, int vol, location_t locus)
   rtx asm_op, clob;
   unsigned i, nclobbers;
   auto_vec input_rvec, output_rvec;
+  auto_vec input_mode;
   auto_vec constraints;
   auto_vec clobber_rvec;
   HARD_REG_SET clobbered_regs;
@@ -2889,9 +2890,8 @@ expand_asm_loc (tree string, int vol, location_t locus)
   clobber_rvec.safe_push (clob);
 
   if (targetm.md_asm_adjust)
-   targetm.md_asm_adjust (output_rvec, input_rvec,
-  constraints, clobber_rvec,
-  clobbered_regs);
+   targetm.md_asm_adjust (output_rvec, input_rvec, input_mode,
+  constraints, clobber_rvec, clobbered_regs);
 
   asm_op = body;
   nclobbers = clobber_rvec.length ();
@@ -3068,8 +3068,8 @@ expand_asm_stmt (gasm *stmt)
   return;
 }
 
-  /* There are some legacy diagnostics in here, and also avoids a
- sixth parameger to targetm.md_asm_adjust.  */
+  /* There are some legacy diagnostics in here, and also avoids an extra
+ parameter to targetm.md_asm_adjust.  */
   save_input_location s_i_l(locus);
 
   unsigned noutputs = gimple_asm_noutputs (stmt);
@@ -3420,9 +3420,9 @@ expand_asm_stmt (gasm *stmt)
  the flags register.  */
   rtx_insn *after_md_seq = NULL;
   if (targetm.md_asm_adjust)
-after_md_seq = targetm.md_asm_adjust (output_rvec, input_rvec,
- constraints, clobber_rvec,
- clobbered_regs);
+after_md_seq
+   = targetm.md_asm_adjust (output_rvec, input_rvec, input_mode,
+constraints, clobber_rvec, clobbered_regs);
 
   /* Do not allow the hook to change the output and input count,
  lest it mess up the operand numbering.  */
diff --git a/gcc/config/arm/aarch-common-protos.h 
b/gcc/config/arm/aarch-common-protos.h
index 7a9cf3d324c..b6171e8668d 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -144,9 +144,9 @@ struct cpu_cost_table
   const struct vector_cost_table vect;
 };
 
-rtx_insn *
-arm_md_as

Re: [PATCH, rs6000] Add Power10 scheduling description

2021-03-02 Thread Segher Boessenkool
On Tue, Nov 17, 2020 at 10:31:42PM -0600, will schmidt wrote:
> > 2020-11-13  Pat Haugen  
> > 
> > gcc/
> > * config/rs6000/rs6000.c (struct processor_costs): New.
> 
> Should that add/reference the "power10_cost" structure itself?

Yup.

> +;; Copyright (C) 2020-2020 Free Software Foundation, Inc.
> 
> Nit: Can probably just be a single 2020 reference.  :-)

Not anymore :-/


Segher


Re: [PATCH, rs6000] Add Power10 scheduling description

2021-03-02 Thread Segher Boessenkool
On Wed, Nov 18, 2020 at 03:10:21PM -0600, Pat Haugen wrote:
> On 11/17/20 10:31 PM, will schmidt wrote:
> >> +(define_cpu_unit "fp_div0_power10,fp_div1_power10,fp_div2_power10,
> >> +fp_div3_power10" "power10div")
> > 
> > The spacing catches my eye, I'd want to add spaces around those commas,
> > etc.   But.. this appears to be consistent with behavior
> > as seen in the
> > existing power9.md ; power9.md ; etc. 
> > So it's either this way per necessity, or this way per history.
> > Either way, no change requested here given that precedence.
> > (If this and
> > the older stuff also needs to be cosmetically tweaked, that can be
> > handled later on..)
> 
> Yeah, just historical.

Yes, any whitespace is allowed before and after every entry (and you do
use that for the newlines already :-) )

But it is traditional to not have spaces around the commas here.  Feel
free to change your habits if you want :-)

> >> +(define_insn_reservation "power10-prefixed-load" 4
> >> +  (and (eq_attr "type" "load")
> >> +   (eq_attr "update" "no")
> >> +   (eq_attr "size" "!128")
> >> +   (eq_attr "prefixed" "!no")
> > 
> > I'm sure there is reason, but remind me..  "!no" versus "yes" ?
> 
> >From my prior patch, the prefixed attribute can now have values 
> >no/yes/always, so '!no' means 'yes' || 'always'.

(eq_attr "bla" "!smth")  means  (ne_attr "bla" "smth")  (but that latter
syntax is not recognised).


Segher


Re: [PATCH] c-family: Avoid ICE on va_arg [PR99324]

2021-03-02 Thread Joseph Myers
On Wed, 3 Mar 2021, Jakub Jelinek via Gcc-patches wrote:

> Hi!
> 
> build_va_arg calls the middle-end mark_addressable, which e.g. requires that
> cfun is non-NULL.  The following patch calls instead 
> c_common_mark_addressable_vec
> which is the c-family variant similarly to the FE c_mark_addressable and
> cxx_mark_addressable, except that it doesn't error on addresses of register
> variables.  As the taking of the address is artificial for the .VA_ARG
> ifn and when that is lowered goes away, it is similar case to the vector
> subscripting for which c_common_mark_addressable_vec has been added.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Ping: [PATCH v2] rs6000: Convert the vector element register to SImode [PR98914]

2021-03-02 Thread Xionghu Luo via Gcc-patches


On 2021/2/25 14:33, Xionghu Luo via Gcc-patches wrote:
> 
> 
> On 2021/2/25 00:57, Segher Boessenkool wrote:
>> Hi!
>>
>> On Wed, Feb 24, 2021 at 09:06:24AM +0800, Xionghu Luo wrote:
>>> vec_insert defines the element argument type to be signed int by ELFv2
>>> ABI, When expanding a vector with a variable rtx, convert the rtx type
>>> SImode.
>>
>> But that is true for the intrinsics, not for all other callers of
>> rs6000_expand_vector_init.  See
>>  as well?
>>
>> So I don't think you do this in the right place.  You can convince me
>> with good arguments of course :-)
> 
> Thanks for pointing out, it seems we should convert the type to DImode in
> rs6000_expand_vector_set_var_p9 and rs6000_expand_vector_set_var_p8
> to support both usage?
> 
> 
> PS: for "vec_insert (i, u, n)" usage when n is long, what should the front-end
> do in altivec_resolve_overloaded_builtin to follow the ELFv2 rule?  Currently,
> no warning/error message or conversion there, INTEGRAL_TYPE_P range is much 
> larger
> than signed int.

long to int should follow implicit transformation, so no need change here.
Ping the patch, thanks.


BR,
Xionghu

> 
> 
> 
> Updated the back-end patch as below.
> 
> 
> 0001-rs6000-Convert-the-vector-set-variable-idx-to-DImode.patch
> 
> 
> vec_insert defines the element argument type to be signed int by ELFv2
> ABI.  When expanding a vector with a variable rtx, convert the rtx type
> to DImode to support both intrinsic usage and other callers from
> rs6000_expand_vector_init produced by v[k] = val when k is long type.
> 
> gcc/ChangeLog:
> 
> 2021-02-25  Xionghu Luo  
> 
>   PR target/98914
>   * config/rs6000/rs6000.c (rs6000_expand_vector_set_var_p9):
>   Convert idx to DImode.
>   (rs6000_expand_vector_set_var_p8): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-02-25  Xionghu Luo  
> 
>   PR target/98914
>   * gcc.target/powerpc/pr98914.c: New test.
> ---
>   gcc/config/rs6000/rs6000.c | 33 +-
>   gcc/testsuite/gcc.target/powerpc/pr98914.c | 11 
>   2 files changed, 30 insertions(+), 14 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/powerpc/pr98914.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index ec068c58aa5..48eb91132a9 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -7000,11 +7000,15 @@ rs6000_expand_vector_set_var_p9 (rtx target, rtx val, 
> rtx idx)
> 
> gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
> 
> -  gcc_assert (GET_MODE (idx) == E_SImode);
> -
> machine_mode inner_mode = GET_MODE (val);
> 
> -  rtx tmp = gen_reg_rtx (GET_MODE (idx));
> +  machine_mode idx_mode = GET_MODE (idx);
> +  rtx tmp = gen_reg_rtx (DImode);
> +  if (idx_mode != DImode)
> +tmp = convert_modes (DImode, idx_mode, idx, 0);
> +  else
> +tmp = idx;
> +
> int width = GET_MODE_SIZE (inner_mode);
> 
> gcc_assert (width >= 1 && width <= 8);
> @@ -7012,9 +7016,7 @@ rs6000_expand_vector_set_var_p9 (rtx target, rtx val, 
> rtx idx)
> int shift = exact_log2 (width);
> /* Generate the IDX for permute shift, width is the vector element size.
>idx = idx * width.  */
> -  emit_insn (gen_ashlsi3 (tmp, idx, GEN_INT (shift)));
> -
> -  tmp = convert_modes (DImode, SImode, tmp, 1);
> +  emit_insn (gen_ashldi3 (tmp, tmp, GEN_INT (shift)));
> 
> /*  lvsrv1,0,idx.  */
> rtx pcvr = gen_reg_rtx (V16QImode);
> @@ -7047,27 +7049,31 @@ rs6000_expand_vector_set_var_p8 (rtx target, rtx val, 
> rtx idx)
> 
> gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
> 
> -  gcc_assert (GET_MODE (idx) == E_SImode);
> -
> machine_mode inner_mode = GET_MODE (val);
> HOST_WIDE_INT mode_mask = GET_MODE_MASK (inner_mode);
> 
> -  rtx tmp = gen_reg_rtx (GET_MODE (idx));
> int width = GET_MODE_SIZE (inner_mode);
> 
> +  machine_mode idx_mode = GET_MODE (idx);
> +  rtx tmp = gen_reg_rtx (DImode);
> +  if (idx_mode != DImode)
> +tmp = convert_modes (DImode, idx_mode, idx, 0);
> +  else
> +tmp = idx;
> +
> gcc_assert (width >= 1 && width <= 4);
> 
> if (!BYTES_BIG_ENDIAN)
>   {
> /*  idx = idx * width.  */
> -  emit_insn (gen_mulsi3 (tmp, idx, GEN_INT (width)));
> +  emit_insn (gen_muldi3 (tmp, tmp, GEN_INT (width)));
> /*  idx = idx + 8.  */
> -  emit_insn (gen_addsi3 (tmp, tmp, GEN_INT (8)));
> +  emit_insn (gen_adddi3 (tmp, tmp, GEN_INT (8)));
>   }
> else
>   {
> -  emit_insn (gen_mulsi3 (tmp, idx, GEN_INT (width)));
> -  emit_insn (gen_subsi3 (tmp, GEN_INT (24 - width), tmp));
> +  emit_insn (gen_muldi3 (tmp, idx, GEN_INT (width)));
> +  emit_insn (gen_subdi3 (tmp, GEN_INT (24 - width), tmp));
>   }
> 
> /*  lxv vs33, mask.
> @@ -7118,7 +7124,6 @@ rs6000_expand_vector_set_var_p8 (rtx target, rtx val, 
> rtx idx)
> emit_insn (gen_rtx_SET (val_v16qi, sub_val));
> 
>

Re: [PATCH, rs6000] Add Power10 scheduling description

2021-03-02 Thread Segher Boessenkool
Hi!

On Fri, Nov 13, 2020 at 04:04:30PM -0600, Pat Haugen wrote:
> gcc/
>   * config/rs6000/rs6000.c (struct processor_costs): New.

Will already commented on this; please consider his comments :-)

>   (rs6000_option_override_internal): Set Power10 costs.
>   (rs6000_issue_rate): Set Power10 issue rate.
>   * config/rs6000/power10.md: Rewrite for Power10.

> +(define_automaton "power10dsp,power10issue,power10div")

Write "dispatch" in full?  It is only used here and in the du*_power10
definition.  Or does it show up in debug dumps too much, or similar?

> +(define_insn_reservation "power10-load-update" 4
> +  (and (eq_attr "type" "load")
> +   (eq_attr "update" "yes")
> +   (eq_attr "cpu" "power10"))
> +  "DU_even_power10,LU_power10+SXU_power10")

Maybe make a bypass (or however you can model this) for the updated
register?  That is only 2 cycles, right?

> +(define_insn_reservation "power10-mul" 5

This makes the power10_costs entries somewhat questionable.

> +(define_insn_reservation "power10-div" 12

And this (and the other division latencies).

Please check Will's comments too, and (build) test on both BE and LE.
With that, okay for trunk, and thank you!


Segher


Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread David Malcolm via Gcc-patches
On Tue, 2021-03-02 at 23:14 +, brian.sobulefsky wrote:
> I have been kicking these sorts of ideas around ever since I came to
> understand that
> the second "UNKNOWN" in the for loop that originally started this was
> due to the state
> merge as we loop. For now, and I don't mean this disrespectfully
> because it is very
> hard to get right, the whole issue of merging has basically been
> punted, given some
> of the simple examples we found that will merge as an unknown svalue.
> As you think
> about this issue, "scope creep" becomes a concern quickly. It 
> quickly turns into
> a halting problem of sorts, you have to decide how much of you want
> the analyzer to
> be able to "understand" a program. For example, any human can follow
> this:
> 
> sum = 0;
> for (idx = 1; idx <= 10; idx++) sum += idx;
> __analyzer_eval (sum == 55);
> 
> but from an analyzer perspective it opens up all sorts of questions
> and
> becomes a bit of a PhD thesis as to where you draw the line. 

Challenge accepted!  FWIW with suitable options, the analyzer can
actually "figure this out":

$ cat ../../src/t.c
extern void __analyzer_eval (int);

void test (void)
{
 int sum = 0;
 for (int idx = 1; idx <= 10; idx++)
   sum += idx;
 __analyzer_eval (sum == 55);
}

$ ./xgcc -B. -S -fanalyzer ../../src/t.c \
-Wanalyzer-too-complex \
-fno-analyzer-state-merge \
--param analyzer-max-enodes-per-program-point=11
../../src/t.c: In function ‘test’:
../../src/t.c:8:2: warning: TRUE
8 |  __analyzer_eval (sum == 55);
  |  ^~~

i.e. with -fno-analyzer-state-merge to disable state merging, 
and increasing the enode limit so that the analyzer effectively fully
unrolls the loop when exploring the exploded_graph.

But obviously this isn't particularly useful, except as a demo of the
internals of the analyzer.


> The biggest concern
> with the analyzer seems to be vulnerabilities, so I doubt it is
> useful to get the
> analyzer to produce the correct answer for the above code, although
> it might be
> interesting to do so from an academic perspective.

Indeed - security vulnerabilities are my highest priority (making it
easier to avoid them as code is written/patched, and to find them in
existing code).

> The example you provided gives a concrete reason that overflow should
> not be a
> complete "punt" and I like it. In the interest of fighting scope
> creep and keeping
> things manageable, I would question whether you want to actually
> track the overflow /
> no overflow cases separately or just raise any possible overflow as
> an error immediately.
> I am not disputing your idea, I would prefer to track the overflow
> and get
> a correct result (in this case, an under allocation of memory). I
> guess I would want
> to know how much work you think that will be. You still know the
> codebase a lot better
> than I do.

Brainstorming somewhat, another idea I have for handling overflow (as
opposed to the || idea you mentioned) might be to bifurcate state at
each point where overflow could occur, splitting the path into "didn't
overflow" and "did overflow" outcomes, adding conditions to each
successor state accordingly.

But maybe that would lead to a combinatorial explosion of nodes (unless
it can be tamed by merging?)

(Unfortunately, we can currently only split states at CFG splits, not
at arbitrary statements; see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99260 )

> My devil's advocate position would be if the analyzer raises
> exception on
> any possible overflow, will that overwhelm the user with false
> positives?

Presumably by "raise exception" you mean "issue a diagnostic and stop
analyzing this path", right?

I think the point is to detect where numerical overflow can lead to
e.g. a buffer overflow, rather than complain about numerical overflow
in its own right, like in the make_arr example I gave earlier.

>  I
> am not sure of the answer here, because a piece of me feels that
> overflow is not
> something that production code should be relying on in any serious
> application,
> and so should be non existent, but I am not sure if that is
> reflective of
> reality.

My belief is that production code is full of overflows, but only some
of them are security-sensitive.  Consider e.g. hashing algorithms that
sum some values and beningly assume overflow for wraparound as opposed
to the "calculate the size of the buffer to be allocated" example
(where the overflow is a classic security pitfall).

> The simplest way to handle your example would be the following:
> 
> struct foo * make_arr (size_t n)
> {
>   if (MAX_INT / sizeof (struct foo) >= n)
>     return NULL;
>   //continue with what you wrote
> }

Ideally we'd emit a fix-it hint suggesting adding such a clause (I'm
kidding, but only partly).

> This should add a constraint that downstream of the initial guard, n
> is small
> enough to prevent overflow (I would have to check, but the current
> analyzer
> should be close to doing this if not alr

Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread Jeff Law via Gcc-patches



On 3/2/21 6:40 PM, David Malcolm via Gcc-patches wrote:
>
>> My devil's advocate position would be if the analyzer raises
>> exception on
>> any possible overflow, will that overwhelm the user with false
>> positives?
> Presumably by "raise exception" you mean "issue a diagnostic and stop
> analyzing this path", right?
>
> I think the point is to detect where numerical overflow can lead to
> e.g. a buffer overflow, rather than complain about numerical overflow
> in its own right, like in the make_arr example I gave earlier.
WRT overflow, IMHO, the most valuable case to detect overflows is when
they feed an allocation via malloc/alloca.  If an attacker can arrange
to overflow the size computation, then they can cause an
under-allocation which in turn opens the ability to over-write the stack
or heap data structures, which in turn are great attack vectors.

And in case you think that's contrived, it isn't :-)

http://phrack.org/issues/67/9.html

>>  I
>> am not sure of the answer here, because a piece of me feels that
>> overflow is not
>> something that production code should be relying on in any serious
>> application,
>> and so should be non existent, but I am not sure if that is
>> reflective of
>> reality.
> My belief is that production code is full of overflows, but only some
> of them are security-sensitive.  Consider e.g. hashing algorithms that
> sum some values and beningly assume overflow for wraparound as opposed
> to the "calculate the size of the buffer to be allocated" example
> (where the overflow is a classic security pitfall).
Agreed.


Jeff


libbacktrace patch committed: Don't special case file 0

2021-03-02 Thread Ian Lance Taylor via Gcc-patches
This libbacktrace patch stops special casing file 0.  It's no longer
necessary as for DWARF 5 support we now set up filename 0 in all
cases.  Bootstrapped and ran libbacktrace and Go tests on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian

* dwarf.c (read_line_program): Don't special case file 0.
(read_function_entry): Likewise.
b3176ab8787a7f988a931e26bce9227edd2e6d1a
diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 546b4b26a32..e6b1f238cd3 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -2857,20 +2857,15 @@ read_line_program (struct backtrace_state *state, 
struct dwarf_data *ddata,
uint64_t fileno;
 
fileno = read_uleb128 (line_buf);
-   if (fileno == 0)
- filename = "";
-   else
+   if (fileno >= hdr->filenames_count)
  {
-   if (fileno >= hdr->filenames_count)
- {
-   dwarf_buf_error (line_buf,
-("invalid file number in "
- "line number program"),
-0);
-   return 0;
- }
-   filename = hdr->filenames[fileno];
+   dwarf_buf_error (line_buf,
+("invalid file number in "
+ "line number program"),
+0);
+   return 0;
  }
+   filename = hdr->filenames[fileno];
  }
  break;
case DW_LNS_set_column:
@@ -3298,21 +3293,15 @@ read_function_entry (struct backtrace_state *state, 
struct dwarf_data *ddata,
case DW_AT_call_file:
  if (val.encoding == ATTR_VAL_UINT)
{
- if (val.u.uint == 0)
-   function->caller_filename = "";
- else
+ if (val.u.uint >= lhdr->filenames_count)
{
- if (val.u.uint >= lhdr->filenames_count)
-   {
- dwarf_buf_error (unit_buf,
-  ("invalid file number in "
-   "DW_AT_call_file attribute"),
-  0);
- return 0;
-   }
- function->caller_filename =
-   lhdr->filenames[val.u.uint];
+ dwarf_buf_error (unit_buf,
+  ("invalid file number in "
+   "DW_AT_call_file attribute"),
+  0);
+ return 0;
}
+ function->caller_filename = lhdr->filenames[val.u.uint];
}
  break;
 


Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread brian.sobulefsky via Gcc-patches
Wow! I wasn't expecting that to work. Obviously we know that  there is currently
no handler for binop_svalue in the constraints so I would have to watch it run 
with
state merging disabled to see how it is managing the unroll. The fact that 
merging
breaks it is indicative of what we are saying though, that the constraint and
merge model is currently insufficient.

Hash algorithms may provide a counterexample for legitimate use of overflow. 
Anyway,
I would prefer tracking the split too, but it is a big change. One way is state
split, like you say, but that is a pretty invasive change to the way the graph 
works.
The other way is to handle "or constraints" as I have said. This is an invasive
change to the constraint model, but arguably the concept of "or" cannot be 
ignored
forever.

Thinking about it, I guess currently the concept of "and" is handled by the
constraints (all constraints in a model exist as a big "and") and the concept of
"or" is handled by the graph. This could be acceptable but we cannot split the
graph arbitrarily, so there is currently no way to handle even a basic

if (i == 1 || i == 10)

what should be a very simple conditional. Handling a hash algorithm, like
you say, is good to keep in mind, because we don't want an explosion of
possibilities. If done right, the analyzer should understand for hashing that
anything + anything => anything, and we see no explosion of state.

By "raise an exception" I did mean issue an analyzer warning, yes. Perhaps the
simple answer is to just track the svalue as a possible overflow in the state
machine and report the warning for certain uses, like the alloc family, as you
said. Regardless, proper overflow handling renders my naive binop handler 
unusable,
because all it does is fold the condition and recur. There is basically no logic
to it, and once you reenter eval_condition it is not possible to know how you 
got
there.



Re: PR analyzer/94362 Partial Fix

2021-03-02 Thread brian.sobulefsky via Gcc-patches
Agreed too. Generic "error on overflow" is not an answer, and ignoring overflow
is not an answer either because flagging faulty memory allocations is an
important feature.

Brian


Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Tuesday, March 2, 2021 6:09 PM, Jeff Law  wrote:

> On 3/2/21 6:40 PM, David Malcolm via Gcc-patches wrote:
>
> > > My devil's advocate position would be if the analyzer raises
> > > exception on
> > > any possible overflow, will that overwhelm the user with false
> > > positives?
> > > Presumably by "raise exception" you mean "issue a diagnostic and stop
> > > analyzing this path", right?
> >
> > I think the point is to detect where numerical overflow can lead to
> > e.g. a buffer overflow, rather than complain about numerical overflow
> > in its own right, like in the make_arr example I gave earlier.
>
> WRT overflow, IMHO, the most valuable case to detect overflows is when
> they feed an allocation via malloc/alloca.  If an attacker can arrange
> to overflow the size computation, then they can cause an
> under-allocation which in turn opens the ability to over-write the stack
> or heap data structures, which in turn are great attack vectors.
>
> And in case you think that's contrived, it isn't :-)
>
> http://phrack.org/issues/67/9.html
>
> > > I
> > > am not sure of the answer here, because a piece of me feels that
> > > overflow is not
> > > something that production code should be relying on in any serious
> > > application,
> > > and so should be non existent, but I am not sure if that is
> > > reflective of
> > > reality.
> > > My belief is that production code is full of overflows, but only some
> > > of them are security-sensitive. Consider e.g. hashing algorithms that
> > > sum some values and beningly assume overflow for wraparound as opposed
> > > to the "calculate the size of the buffer to be allocated" example
> > > (where the overflow is a classic security pitfall).
>
> Agreed.
>
> Jeff




Re: [PATCH] IBM Z: Run mul-signed-overflow-*.c only on z14+

2021-03-02 Thread Andreas Krebbel via Gcc-patches
On 3/2/21 11:59 PM, Ilya Leoshkevich wrote:
> mul-signed-overflow-*.c execution tests fail on z13, because they
> contain z14-specific instructions.  Fix by requiring s390_z14_hw
> target.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/mul-signed-overflow-1.c: Run only on z14+.
>   * gcc.target/s390/mul-signed-overflow-2.c: Likewise.

I did that change yesterday already.

Andreas

> ---
>  gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c | 3 ++-
>  gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c 
> b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
> index fdf56d6e695..e8b1938dab7 100644
> --- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
> +++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
> @@ -1,4 +1,5 @@
> -/* { dg-do run } */
> +/* { dg-do compile } */
> +/* { dg-do run { target { s390_z14_hw } } } */
>  /* z14 only because we need msrkc, msc, msgrkc, msgc  */
>  /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
>  
> diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c 
> b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
> index d0088188aa2..01328e1d286 100644
> --- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
> +++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
> @@ -1,4 +1,5 @@
> -/* { dg-do run } */
> +/* { dg-do compile } */
> +/* { dg-do run { target { s390_z14_hw } } } */
>  /* z14 only because we need msrkc, msc, msgrkc, msgc  */
>  /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
>  
> 



Re: [PATCH] i386: Fix a peephole2 for -mavx512vl -mno-avx512bw [PR99321]

2021-03-02 Thread Uros Bizjak via Gcc-patches
On Wed, Mar 3, 2021 at 12:29 AM Jakub Jelinek  wrote:
>
> Hi!
>
> As the testcase shows, the
> (define_peephole2
>   [(set (match_operand 0 "sse_reg_operand")
> (match_operand 1 "sse_reg_operand"))
>(set (match_dup 0)
> (match_operator 3 "commutative_operator"
>   [(match_dup 0)
>(match_operand 2 "memory_operand")]))]
> peephole2 can for AVX512VL without AVX512BW (I guess it is a hyphothetical
> CPU, but unfortunately they are separate CPUID bits and we have separate
> options for them) turn something that is valid without that peephole2
> into something that is invalid (and in this case ICEs).
> The problem is that the vpadd[bw], vpmullw, vpmin[su][bw] and vpmax[su][bw]
> instructions require both AVX512BW and AVX512VL when they have
> 16-byte or 32-byte operands.  If operands[0] is %[xy]mm0 .. %[xy]mm15
> but operands[1] is %[xy]mm16 .. %[xy]mm31, then before we have
> a vector move which uses vmovdqa{32,64} and doesn't need AVX512BW,
> AVX512VL is I think implied from HARD_REGNO_MODE_OK only supporting
> V{16Q,32Q,8H,16H}imode in EXT_REX_SSE_REGNO_P regs with AVX512VL, and then
> we have a commutative operation with that %[xy]mm0 .. %[xy]mm15 destination
> and one source and a memory operand, so VEX encoded operation.
> And, the peephole2 wants to replace it with a load into the destination
> register from memory (ok) and then the commutative arith instruction.
> But that needs EVEX encoding because of the high register and so requires
> AVX512BW which might not be enabled.
> The exception is and/ior/xor, because the hw doesn't have
> vp{and,or,xor}{b,w} instructions at all, it uses vp{and,or,xor}d instead
> and that of course doesn't need AVX512BW.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> BTW, there are other bugs I need to look at, while the vp{min,max}ub with
> 16-byte operands instruction properly requires avx512bw for v constraints
> and otherwise uses x, e.g. the vpadd[bw] etc. instructions don't.
> I'll try to handle that incrementally later this week.
>
> 2021-03-02  Jakub Jelinek  
>
> PR target/99321
> * config/i386/i386.md (mov + mem using comm arith peephole2):
> Punt if operands[1] is EXT_REX_SSE_REGNO_P, AVX512BW is not enabled
> and the inner mode is [QH]Imode.
>
> * gcc.target/i386/pr99321.c: New test.

OK, a comment inline.

Thanks,
Uros.

> --- gcc/config/i386/i386.md.jj  2021-02-18 09:16:21.026841364 +0100
> +++ gcc/config/i386/i386.md 2021-03-02 17:57:55.878158963 +0100
> @@ -19843,7 +19843,18 @@ (define_peephole2
> (match_operator 3 "commutative_operator"
>   [(match_dup 0)
>(match_operand 2 "memory_operand")]))]
> -  "REGNO (operands[0]) != REGNO (operands[1])"
> +  "REGNO (operands[0]) != REGNO (operands[1])
> +   /* Punt if operands[1] is %[xy]mm16+ and AVX512BW is not enabled,
> +  as EVEX encoded vpadd[bw], vpmullw, vpmin[su][bw] and vpmax[su][bw]
> +  instructions require AVX512BW and AVX512VL, but with the original
> +  instructions it might require just AVX512VL.
> +  AVX512VL is implied from TARGET_HARD_REGNO_MODE_OK.  */
> +   && (!EXT_REX_SSE_REGNO_P (REGNO (operands[1]))
> +   || TARGET_AVX512BW
> +   || GET_MODE_SIZE (GET_MODE_INNER (GET_MODE (operands[0]))) > 2
> +   || GET_CODE (operands[3]) == AND
> +   || GET_CODE (operands[3]) == IOR
> +   || GET_CODE (operands[3]) == XOR)"

Can you please introduce the "logic_operator" predicate, similar to
plusminuslogic_operator and use it here?

>[(set (match_dup 0) (match_dup 2))
> (set (match_dup 0)
> (match_op_dup 3 [(match_dup 0) (match_dup 1)]))])
> --- gcc/testsuite/gcc.target/i386/pr99321.c.jj  2021-03-02 18:05:02.561528747 
> +0100
> +++ gcc/testsuite/gcc.target/i386/pr99321.c 2021-03-02 18:05:51.121001799 
> +0100
> @@ -0,0 +1,41 @@
> +/* PR target/99321 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mtune=btver2 -fno-tree-dce -mavx512vl -mno-avx512bw" } 
> */
> +
> +typedef unsigned __attribute__((__vector_size__ (8))) A;
> +typedef unsigned int __attribute__((__vector_size__ (8))) B;
> +typedef unsigned char __attribute__((__vector_size__ (16))) C;
> +typedef unsigned __attribute__((__vector_size__ (16))) D;
> +typedef unsigned int __attribute__((__vector_size__ (16))) E;
> +typedef unsigned __attribute__((__vector_size__ (16))) F;
> +typedef unsigned __attribute__((__vector_size__ (32))) G;
> +typedef int __attribute__((__vector_size__ (32))) H;
> +typedef unsigned int __attribute__((__vector_size__ (32))) I;
> +typedef char __attribute__((__vector_size__ (64))) J;
> +typedef unsigned int __attribute__((__vector_size__ (64))) K;
> +typedef unsigned long long __attribute__((__vector_size__ (64))) L;
> +unsigned char a;
> +unsigned b, c;
> +H d;
> +E e, f;
> +D g;
> +L h;
> +
> +A
> +foo0 (A i, C j, G k, B l, K m, B n, I o)
> +{
> +  J p, q = a != p;
> +  F r = b << f;
> +  int s = a * 15;
> +  C t = (1 << ((C

Re: [PATCH] middle-end/97855 - fix diagnostic with default pretty printer

2021-03-02 Thread Richard Biener
On Tue, 2 Mar 2021, Martin Sebor wrote:

> On 3/2/21 9:52 AM, Jeff Law via Gcc-patches wrote:
> > 
> > 
> > On 3/1/21 1:39 AM, Richard Biener wrote:
> >> The default diagnostic tree printer relies on dump_generic_node which
> >> for some reason manages to clobber the diagnostic pretty-printer state
> >> so we see garbled diagnostics like
> >>
> >> /home/rguenther/src/trunk/gcc/calls.c: In function 'expand_call':
> >> D.6750.coeffs[0]'/home/rguenther/src/trunk/gcc/dojump.c:118:28: warning:
> >> may be used uninitialized in this function [-Wmaybe-uninitialized]
> >>
> >> when the diagnostic is emitted by the LTO fronted.  The following
> >> approach using a temporary pretty-printer for the dump_generic_node
> >> call fixes this for some unknown reason and we issue
> >>
> >> /home/rguenther/src/trunk/gcc/calls.c: In function 'expand_call':
> >> /home/rguenther/src/trunk/gcc/dojump.c:118:28: warning: 'MEM[(struct
> >> poly_int *)&save].D.6750.coeffs[0]' may be used uninitialized in this
> >> function [-Wmaybe-uninitialized]
> >>
> >> [LTO] Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?
> >>
> >> Thanks,
> >> Richard.
> >>
> >> 2021-02-26  Richard Biener  
> >>
> >>  PR middle-end/97855
> >>  * tree-diagnostic.c (default_tree_printer): Use a temporary
> >>  pretty-printer when formatting a tree via dump_generic_node.
> > It'd be good to know why this helps, but I trust your judgment that this
> > is an improvement.
> 
> I don't know if it's related but pr98492 tracks a problem in the C++
> front end caused by reinitializing the pretty printer in a number of
> functions in cp/error.c.  When one of these functions is called while
> the pretty printer is formatting something, the effect of
> the reinitialization is to drop the already formatted contents
> of the printer's buffer.
> 
> IIRC, I tripped over this when working on the MEM_REF formatting
> improvement for -Wuninitialized.

I've poked quite a bit with breakpoints on suspicious pretty-printer
functions and watch points on the pp state but found nothing in the
case I was looking at (curiously also -Wuninitialized).  But I also
wasn't able to understand why the caller should work at all.  And
yes, the C/C++ tree printers also simply format to the passed
pretty-printer...

Hoping that David could shed some light on how this should play
together.  Most specifically

  pp_format (context->printer, &diagnostic->message);

^^^ this is the path affected by the patch

  (*diagnostic_starter (context)) (context, diagnostic);

^^^ this somehow messes things up, it does pp_set_prefix on
context->printer but also does some formatting

  pp_output_formatted_text (context->printer);

and now we expect this to magically output the composed pieces.

Note swapping the first two lines didn't have any effect (I don't
remember if it changed anything so details might have changed but
it was definitely still broken).

That said, the only hint I have is that the diagnostic plus prefix
is quite long, but any problem in the generic code should eventually
affect non-LTO as well but the PR is reported for LTO only
(bogus diagnostics shown during LTO bootstrap).  The patch fixes
all bogus diagnostics during LTO bootstrap.

I originally thought there's maybe a pp_flush too much but maybe
there's a pp_flush missing ...

I'll wait for Davids feedback but will eventually install the
patch to close the bug.

Richard.