[PATCH] [ICE] Check another epilog variable peeling case in vectorizable_nonlinear_induction.

2022-09-13 Thread liuhongt via Gcc-patches
In vectorizable_nonlinear_induction, r13-2503-gc13223b790bbc5 prevent variable 
peeling by
only checking LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo). But when
"!vect_use_loop_mask_for_alignment_p (loop_vinfo) &&
LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0", vectorizer will
still do variable peeling for epilog, and it hits gcc_assert in
vect_peel_nonlinear_iv_init.


Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
The patch also fix ICE of the testcase in the PR for ia64-linux-gnu(verified by 
cross-compile).

Ok for trunk?

gcc/ChangeLog:

PR tree-optimization/106905
* tree-vect-loop.cc (vectorizable_nonlinear_induction): Return
false when !vect_use_loop_mask_for_alignment_p (loop_vinfo) &&
LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr106905.c: New test.
* gcc.target/ia64/pr106905.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr106905.c | 14 ++
 gcc/testsuite/gcc.target/ia64/pr106905.c | 20 
 gcc/tree-vect-loop.cc|  6 --
 3 files changed, 38 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106905.c
 create mode 100644 gcc/testsuite/gcc.target/ia64/pr106905.c

diff --git a/gcc/testsuite/gcc.target/i386/pr106905.c 
b/gcc/testsuite/gcc.target/i386/pr106905.c
new file mode 100644
index 000..a190a1c84e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106905.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=silvermont -O2 -fvect-cost-model=dynamic" } */
+
+void
+foo_mul_peel (int *a, int b)
+{
+  int i;
+
+  for (i = 0; i < 7; ++i)
+{
+  b *= 2;
+  a[i] = b;
+}
+}
diff --git a/gcc/testsuite/gcc.target/ia64/pr106905.c 
b/gcc/testsuite/gcc.target/ia64/pr106905.c
new file mode 100644
index 000..1b9656e1203
--- /dev/null
+++ b/gcc/testsuite/gcc.target/ia64/pr106905.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c99 -O3 -fPIC" } */
+long ZDICT_fillNoise_p, ZDICT_trainFromBuffer_legacy_result;
+unsigned ZDICT_fillNoise_acc;
+int ZDICT_totalSampleSize_nbFiles;
+static void ZDICT_fillNoise(void *buffer, long length) {
+  unsigned prime2 = 9;
+  for (ZDICT_fillNoise_p = 0; ZDICT_fillNoise_p < length; ZDICT_fillNoise_p++)
+ZDICT_fillNoise_acc *= ((char *)buffer)[ZDICT_fillNoise_p] = prime2;
+}
+long ZDICT_trainFromBuffer_legacy() {
+  void *newBuff;
+  long total = 0;
+  for (; ZDICT_totalSampleSize_nbFiles;)
+total += 0;
+  long sBuffSize = total;
+  newBuff = 0;
+  ZDICT_fillNoise(newBuff + sBuffSize, 32);
+  return ZDICT_trainFromBuffer_legacy_result;
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 8f88f1755be..9c434b66c5b 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -8646,8 +8646,10 @@ vectorizable_nonlinear_induction (loop_vec_info 
loop_vinfo,
   /* Also doens't support peel for neg when niter is variable.
  ??? generate something like niter_expr & 1 ? init_expr : -init_expr?  */
   niters_skip = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo);
-  if (niters_skip != NULL_TREE
-  && TREE_CODE (niters_skip) != INTEGER_CST)
+  if ((niters_skip != NULL_TREE
+   && TREE_CODE (niters_skip) != INTEGER_CST)
+  || (!vect_use_loop_mask_for_alignment_p (loop_vinfo)
+ && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0))
 {
   if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-- 
2.27.0



[r13-2650 Regression] FAIL: g++.dg/modules/xtreme-header-2_a.H -std=c++2b (test for excess errors) on Linux/x86_64

2022-09-13 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

5d84a4418aa962a715dc74998fea2a7de9d4042c is the first bad commit
commit 5d84a4418aa962a715dc74998fea2a7de9d4042c
Author: Patrick Palka 
Date:   Tue Sep 13 11:18:14 2022 -0400

libstdc++: Implement ranges::chunk_view from P2442R1

caused

FAIL: g++.dg/modules/xtreme-header-2_a.H module-cmi  
(gcm.cache/$srcdir/g++.dg/modules/xtreme-header-2_a.H.gcm)
FAIL: g++.dg/modules/xtreme-header-2_a.H -std=c++2b (internal compiler error: 
tree check: expected record_type or union_type or qual_union_type, have 
typename_type in friend_from_decl_list, at cp/module.cc:4737)
FAIL: g++.dg/modules/xtreme-header-2_a.H -std=c++2b (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-2650/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-2_a.H 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-2_a.H 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-2_a.H 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-2_a.H 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at haochen dot jiang at intel.com)


[PATCH] xtensa: gcc: implement MI thunk generation for call0 ABI

2022-09-13 Thread Max Filippov via Gcc-patches
Suwa-san, could you please take a look?

This change fixes the fowllowing testsuite failures when building for
call0 ABI:

g++.dg/ipa/pr60640-4.C
g++.dg/ipa/pr83549.C
g++.dg/ipa/pr83667.C
g++.dg/torture/pr81812.C

gcc/
* config/xtensa/xtensa.cc (xtensa_can_output_mi_thunk)
(xtensa_output_mi_thunk): New functions.
(TARGET_ASM_CAN_OUTPUT_MI_THUNK)
(TARGET_ASM_OUTPUT_MI_THUNK): New macro definitions.
(xtensa_prepare_expand_call): Use fixed register a8 as temporary
when called with reload_completed set to 1.
---
 gcc/config/xtensa/xtensa.cc | 116 +++-
 1 file changed, 115 insertions(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index f1b3331ea558..e5abd356a745 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -191,6 +191,15 @@ static bool xtensa_can_eliminate (const int from 
ATTRIBUTE_UNUSED,
 static HOST_WIDE_INT xtensa_starting_frame_offset (void);
 static unsigned HOST_WIDE_INT xtensa_asan_shadow_offset (void);
 static bool xtensa_function_ok_for_sibcall (tree, tree);
+static bool xtensa_can_output_mi_thunk (const_tree thunk_fndecl 
ATTRIBUTE_UNUSED,
+   HOST_WIDE_INT delta ATTRIBUTE_UNUSED,
+   HOST_WIDE_INT vcall_offset 
ATTRIBUTE_UNUSED,
+   const_tree function ATTRIBUTE_UNUSED);
+static void xtensa_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED,
+   HOST_WIDE_INT delta,
+   HOST_WIDE_INT vcall_offset,
+   tree function);
+
 static rtx xtensa_delegitimize_address (rtx);
 
 
@@ -351,6 +360,12 @@ static rtx xtensa_delegitimize_address (rtx);
 #undef TARGET_FUNCTION_OK_FOR_SIBCALL
 #define TARGET_FUNCTION_OK_FOR_SIBCALL xtensa_function_ok_for_sibcall
 
+#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
+#define TARGET_ASM_CAN_OUTPUT_MI_THUNK xtensa_can_output_mi_thunk
+
+#undef TARGET_ASM_OUTPUT_MI_THUNK
+#define TARGET_ASM_OUTPUT_MI_THUNK xtensa_output_mi_thunk
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 
@@ -2173,7 +2188,16 @@ xtensa_prepare_expand_call (int callop, rtx *operands)
 addr = gen_sym_PLT (addr);
 
   if (!call_insn_operand (addr, VOIDmode))
-XEXP (operands[callop], 0) = copy_to_mode_reg (Pmode, addr);
+{
+  /* This may be called while generating MI thunk when we pretend
+that reload is over.  Use a8 as a temporary register in that case.  */
+  rtx reg = can_create_pseudo_p ()
+   ? copy_to_mode_reg (Pmode, addr)
+   : copy_to_suggested_reg (addr,
+gen_rtx_REG (Pmode, A8_REG),
+Pmode);
+  XEXP (operands[callop], 0) = reg;
+}
 }
 
 
@@ -4983,6 +5007,96 @@ xtensa_function_ok_for_sibcall (tree decl 
ATTRIBUTE_UNUSED, tree exp ATTRIBUTE_U
   return true;
 }
 
+static bool
+xtensa_can_output_mi_thunk (const_tree thunk_fndecl ATTRIBUTE_UNUSED,
+   HOST_WIDE_INT delta ATTRIBUTE_UNUSED,
+   HOST_WIDE_INT vcall_offset ATTRIBUTE_UNUSED,
+   const_tree function ATTRIBUTE_UNUSED)
+{
+  if (TARGET_WINDOWED_ABI)
+return false;
+
+  return true;
+}
+
+/* Output code to add DELTA to the first argument, and then jump
+   to FUNCTION.  Used for C++ multiple inheritance.  */
+static void
+xtensa_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED,
+   HOST_WIDE_INT delta,
+   HOST_WIDE_INT vcall_offset,
+   tree function)
+{
+  rtx this_rtx;
+  rtx funexp;
+  rtx_insn *insn;
+  int this_reg_no;
+  rtx temp0 = gen_rtx_REG (Pmode, A9_REG);
+  const char *fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk));
+
+  reload_completed = 1;
+
+  if (aggregate_value_p (TREE_TYPE (TREE_TYPE (function)), function))
+this_reg_no = 3;
+  else
+this_reg_no = 2;
+
+  this_rtx = gen_rtx_REG (Pmode, A0_REG + this_reg_no);
+
+  if (delta)
+{
+  if (xtensa_simm8 (delta))
+   emit_insn (gen_addsi3 (this_rtx, this_rtx, GEN_INT (delta)));
+  else
+   {
+ emit_move_insn (temp0, GEN_INT (delta));
+ emit_insn (gen_addsi3 (this_rtx, this_rtx, temp0));
+   }
+}
+
+  if (vcall_offset)
+{
+  rtx temp1 = gen_rtx_REG (Pmode, A0_REG + 10);
+  rtx addr = temp1;
+
+  emit_move_insn (temp0, gen_rtx_MEM (Pmode, this_rtx));
+  if (xtensa_uimm8x4 (vcall_offset))
+   addr = plus_constant (Pmode, temp0, vcall_offset);
+  else if (xtensa_simm8 (vcall_offset))
+   emit_insn (gen_addsi3 (temp1, temp0, GEN_INT (vcall_offset)));
+  else
+   {
+ emit_move_insn (temp1, GEN_INT (vcall_offset));
+ emit_insn (gen_addsi3 (temp1, temp0, temp1));
+   }
+  emit_move_insn (temp1, gen_rtx_MEM (Pmode, addr));
+  emit_insn (gen_add2_insn 

[PATCH v3 10/11] Use OMP_ARRAY_SECTION instead of TREE_LIST in C++ FE

2022-09-13 Thread Julian Brown
This patch changes the representation of OMP array sections in the
C++ front end to use the new OMP_ARRAY_SECTION tree code instead of a
TREE_LIST.  This is important for "declare mapper" support, because the
array section representation may stick around longer (in "declare mapper"
definitions), and special-case handling TREE_LIST becomes necessary in
more places, which starts to become unwieldy.

Including this patch to support the surrounding unfinished/FYI patches.

2022-02-18  Julian Brown  

gcc/c-family/
* c-omp.cc (c_omp_split_clauses): Support OMP_ARRAY_SECTION.

gcc/cp/
* parser.cc (cp_parser_omp_var_list_no_open): Use OMP_ARRAY_SECTION
code instead of TREE_LIST to represent OpenMP array sections.
* pt.cc (tsubst_copy, tsubst_omp_clause_decl, tsubst_copy_and_build):
Add OMP_ARRAY_SECTION support.
* semantics.cc (handle_omp_array_sections_1, handle_omp_array_sections,
cp_oacc_check_attachments, finish_omp_clauses): Use OMP_ARRAY_SECTION
instead of TREE_LIST where appropriate.
* gimplify.cc (gimplify_expr): Ensure OMP_ARRAY_SECTION has been
processed out before gimplification.
---
 gcc/c-family/c-omp.cc | 14 ++
 gcc/cp/parser.cc  | 15 ---
 gcc/cp/pt.cc  | 52 
 gcc/cp/semantics.cc   | 62 ++-
 gcc/gimplify.cc   |  3 +++
 5 files changed, 112 insertions(+), 34 deletions(-)

diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc
index e3cda716532..2c25b0072bc 100644
--- a/gcc/c-family/c-omp.cc
+++ b/gcc/c-family/c-omp.cc
@@ -2671,6 +2671,9 @@ c_omp_split_clauses (location_t loc, enum tree_code code,
}
  else if (TREE_CODE (OMP_CLAUSE_DECL (c)) == TREE_LIST)
{
+ /* TODO: This can go away once we transition all uses of
+TREE_LIST for representing OMP array sections to
+OMP_ARRAY_SECTION.  */
  tree t;
  for (t = OMP_CLAUSE_DECL (c);
   TREE_CODE (t) == TREE_LIST; t = TREE_CHAIN (t))
@@ -2679,6 +2682,17 @@ c_omp_split_clauses (location_t loc, enum tree_code code,
bitmap_clear_bit (_head, DECL_UID (t));
  break;
}
+ else if (TREE_CODE (OMP_CLAUSE_DECL (c)) == OMP_ARRAY_SECTION)
+   {
+ tree t;
+ for (t = OMP_CLAUSE_DECL (c);
+  TREE_CODE (t) == OMP_ARRAY_SECTION;
+  t = TREE_OPERAND (t, 0))
+   ;
+ if (DECL_P (t))
+   bitmap_clear_bit (_head, DECL_UID (t));
+ break;
+   }
  /* FALLTHRU */
case OMP_CLAUSE_PRIVATE:
case OMP_CLAUSE_FIRSTPRIVATE:
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 5468502cd26..80d33c3b674 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -36911,11 +36911,14 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, 
enum omp_clause_code kind,
decl = TREE_OPERAND (decl, 0);
 
  for (int i = dims.length () - 1; i >= 0; i--)
-   decl = tree_cons (dims[i].low_bound, dims[i].length, decl);
+   decl = build3_loc (input_location, OMP_ARRAY_SECTION,
+  TREE_TYPE (decl), decl, dims[i].low_bound,
+  dims[i].length);
}
  else if (TREE_CODE (decl) == INDIRECT_REF)
{
  bool ref_p = REFERENCE_REF_P (decl);
+ tree type = TREE_TYPE (decl);
 
  /* Turn *foo into the representation previously used for
 foo[0].  */
@@ -36925,7 +36928,8 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum 
omp_clause_code kind,
  /* ...but don't add the [0:1] representation for references
 (because they have special handling elsewhere).  */
  if (!ref_p)
-   decl = tree_cons (integer_zero_node, integer_one_node, decl);
+   decl = build3_loc (input_location, OMP_ARRAY_SECTION, type,
+  decl, integer_zero_node, integer_one_node);
}
  else if (TREE_CODE (decl) == ARRAY_REF)
{
@@ -36934,7 +36938,8 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum 
omp_clause_code kind,
  decl = TREE_OPERAND (decl, 0);
  STRIP_NOPS (decl);
 
- decl = tree_cons (idx, integer_one_node, decl);
+ decl = build3_loc (input_location, OMP_ARRAY_SECTION,
+TREE_TYPE (decl), decl, idx, integer_one_node);
}
  else if (TREE_CODE (decl) == NON_LVALUE_EXPR
   || CONVERT_EXPR_P (decl))
@@ -37119,7 

[PATCH v3 11/11] FYI/unfinished: OpenMP 5.0 "declare mapper" support for C++

2022-09-13 Thread Julian Brown
This patch implements OpenMP 5.0 "declare mapper" support for C++.
This hasn't been fully revised yet following previous review comments,
but I am including it in this series to demonstrate the new approach to
gimplifying map clauses after "declare mapper" instantiation.

The "gimplify_scan_omp_clauses" function is still called twice: firstly
(before scanning the offload region's body) with SUPPRESS_GIMPLIFICATION
set to true.  This sets up variables in the splay tree, etc. but does
not gimplify anything.

Then, implicit "declare mappers" are instantiated after scanning the
region's body, then "gimplify_scan_omp_clauses" is called again, and
does the rest of its previous tasks -- builds struct sibling lists,
and gimplifies clauses. Then gimplify_adjust_omp_clauses is called,
and compilation continues.

Does this approach seem OK?

Thanks,

Julian
---
 gcc/cp/cp-gimplify.cc |   6 +
 gcc/cp/cp-objcp-common.h  |   2 +
 gcc/cp/cp-tree.h  |  10 +
 gcc/cp/decl.cc|  18 +-
 gcc/cp/mangle.cc  |   5 +-
 gcc/cp/name-lookup.cc |   3 +-
 gcc/cp/parser.cc  | 393 +-
 gcc/cp/pt.cc  |  92 +++-
 gcc/cp/semantics.cc   | 495 +-
 gcc/fortran/parse.cc  |   3 +
 gcc/gimplify.cc   | 379 --
 gcc/langhooks-def.h   |   3 +
 gcc/langhooks.cc  |   9 +
 gcc/langhooks.h   |   4 +
 gcc/omp-general.h |  52 ++
 gcc/testsuite/c-c++-common/gomp/map-6.c   |   6 +-
 gcc/testsuite/g++.dg/gomp/declare-mapper-1.C  |  58 ++
 gcc/testsuite/g++.dg/gomp/declare-mapper-2.C  |  30 ++
 gcc/testsuite/g++.dg/gomp/declare-mapper-3.C  |  27 +
 gcc/testsuite/g++.dg/gomp/declare-mapper-4.C  |  74 +++
 gcc/testsuite/g++.dg/gomp/ind-base-3.C|   1 -
 gcc/testsuite/g++.dg/gomp/member-array-2.C|   1 -
 gcc/tree-core.h   |   4 +
 gcc/tree-pretty-print.cc  |  42 ++
 gcc/tree.cc   |   2 +
 gcc/tree.def  |   7 +
 gcc/tree.h|  21 +
 include/gomp-constants.h  |   8 +-
 .../testsuite/libgomp.c++/declare-mapper-1.C  |  87 +++
 .../testsuite/libgomp.c++/declare-mapper-2.C  |  55 ++
 .../testsuite/libgomp.c++/declare-mapper-3.C  |  63 +++
 .../testsuite/libgomp.c++/declare-mapper-4.C  |  63 +++
 .../testsuite/libgomp.c++/declare-mapper-5.C  |  52 ++
 .../testsuite/libgomp.c++/declare-mapper-6.C  |  37 ++
 .../testsuite/libgomp.c++/declare-mapper-7.C  |  48 ++
 .../testsuite/libgomp.c++/declare-mapper-8.C  |  61 +++
 36 files changed, 2161 insertions(+), 60 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/gomp/declare-mapper-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/declare-mapper-2.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/declare-mapper-3.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/declare-mapper-4.C
 create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-1.C
 create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-2.C
 create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-3.C
 create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-4.C
 create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-5.C
 create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-6.C
 create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-7.C
 create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-8.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index c05be833357..050049f3c1a 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -2282,6 +2282,12 @@ cxx_omp_finish_clause (tree c, gimple_seq *, bool /* 
openacc */)
 }
 }
 
+tree
+cxx_omp_finish_mapper_clauses (tree clauses)
+{
+  return finish_omp_clauses (clauses, C_ORT_OMP);
+}
+
 /* Return true if DECL's DECL_VALUE_EXPR (if any) should be
disregarded in OpenMP construct, because it is going to be
remapped during OpenMP lowering.  SHARED is true if DECL
diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h
index 1a67f14d9b3..b6f72e004d3 100644
--- a/gcc/cp/cp-objcp-common.h
+++ b/gcc/cp/cp-objcp-common.h
@@ -185,6 +185,8 @@ extern tree cxx_simulate_record_decl (location_t, const 
char *,
 #define LANG_HOOKS_OMP_CLAUSE_DTOR cxx_omp_clause_dtor
 #undef LANG_HOOKS_OMP_FINISH_CLAUSE
 #define LANG_HOOKS_OMP_FINISH_CLAUSE cxx_omp_finish_clause
+#undef LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES
+#define LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES cxx_omp_finish_mapper_clauses
 #undef LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE
 #define LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE cxx_omp_privatize_by_reference
 #undef 

[PATCH v3 06/11] OpenMP: Pointers and member mappings

2022-09-13 Thread Julian Brown
This patch was previously posted as part of the series supporting
"declare mapper" for Fortran, here:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596041.html

Implementing the "omp declare mapper" functionality, I noticed some
cases where handling of derived type members that are pointers doesn't
seem to be quite right. At present, a type such as this:

  type T
  integer, pointer, dimension(:) :: arrptr
  end type T

  type(T) :: tvar
  [...]
  !$omp target map(tofrom: tvar%arrptr)

will be mapped using three mapping nodes:

  GOMP_MAP_TO tvar%arrptr   (the descriptor)
  GOMP_MAP_TOFROM *tvar%arrptr%data (the actual array data)
  GOMP_MAP_ALWAYS_POINTER tvar%arrptr%data  (a pointer to the array data)

This follows OMP 5.0, 2.19.7.1 "map Clause":

  "If a list item in a map clause is an associated pointer and the
   pointer is not the base pointer of another list item in a map clause
   on the same construct, then it is treated as if its pointer target
   is implicitly mapped in the same clause. For the purposes of the map
   clause, the mapped pointer target is treated as if its base pointer
   is the associated pointer."

However, we can also write this:

  map(to: tvar%arrptr) map(tofrom: tvar%arrptr(3:8))

and then instead we should follow:

  "If the structure sibling list item is a pointer then it is treated
   as if its association status is undefined, unless it appears as
   the base pointer of another list item in a map clause on the same
   construct."

But, that's not implemented quite right at the moment (and completely
breaks once we introduce declare mappers), because we still map the "to:
tvar%arrptr" as the descriptor and the entire array, then we map the
"tvar%arrptr(3:8)" part using the descriptor (again!) and the array slice.

The solution is to detect when we're mapping a smaller part of the array
(or a subcomponent) on the same directive, and only map the descriptor
in that case. So we get mappings like this instead:

  map(to: tvar%arrptr)   -->
  GOMP_MAP_ALLOC  tvar%arrptr  (the descriptor)

  map(tofrom: tvar%arrptr(3:8)   -->
  GOMP_MAP_TOFROM tvar%arrptr%data(3) (size 8-3+1, etc.)
  GOMP_MAP_ALWAYS_POINTER tvar%arrptr%data (bias 3, etc.)

2022-09-13  Julian Brown  

gcc/fortran/
* trans-openmp.cc (dependency.h): Include.
(gfc_trans_omp_array_section): Do not map descriptors here for OpenMP.
(gfc_trans_omp_clauses): Check subcomponent and subarray/element
accesses elsewhere in the clause list for pointers to derived types or
array descriptors, and map just the pointer/descriptor if we have any.

libgomp/
* testsuite/libgomp.fortran/map-subarray.f90: New test.
* testsuite/libgomp.fortran/map-subcomponents.f90: New test.
* testsuite/libgomp.fortran/struct-elem-map-1.f90: Adjust for
descriptor-mapping changes.
---
 gcc/fortran/trans-openmp.cc   | 104 --
 .../libgomp.fortran/map-subarray.f90  |  33 ++
 .../libgomp.fortran/map-subcomponents.f90 |  32 ++
 .../libgomp.fortran/struct-elem-map-1.f90 |  10 +-
 4 files changed, 163 insertions(+), 16 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.fortran/map-subarray.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/map-subcomponents.f90

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 4949f001fed..a5411df4ad7 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "omp-general.h"
 #include "omp-low.h"
 #include "memmodel.h"  /* For MEMMODEL_ enums.  */
+#include "dependency.h"
 
 #undef GCC_DIAG_STYLE
 #define GCC_DIAG_STYLE __gcc_tdiag__
@@ -2470,22 +2471,18 @@ gfc_trans_omp_array_section (stmtblock_t *block, 
gfc_omp_namelist *n,
 }
   if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))
 {
-  tree desc_node;
   tree type = TREE_TYPE (decl);
   ptr2 = gfc_conv_descriptor_data_get (decl);
-  desc_node = build_omp_clause (input_location, OMP_CLAUSE_MAP);
-  OMP_CLAUSE_DECL (desc_node) = decl;
-  OMP_CLAUSE_SIZE (desc_node) = TYPE_SIZE_UNIT (type);
   if (ptr_kind == GOMP_MAP_ALWAYS_POINTER)
{
- OMP_CLAUSE_SET_MAP_KIND (desc_node, GOMP_MAP_TO);
- node2 = node;
- node = desc_node;  /* Needs to come first.  */
-   }
-  else
-   {
- OMP_CLAUSE_SET_MAP_KIND (desc_node, GOMP_MAP_TO_PSET);
- node2 = desc_node;
+ /* For OpenMP, the descriptor must be mapped with its own explicit
+map clause (e.g. both "map(foo%arr)" and "map(foo%arr(:))" must
+be present in the clause list if "foo%arr" is a pointer to an
+array).  So, we don't create a GOMP_MAP_TO_PSET node here.  */
+ node2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
+ OMP_CLAUSE_SET_MAP_KIND (node2, GOMP_MAP_TO_PSET);
+ 

[PATCH v3 09/11] FYI/unfinished: OpenMP: lvalue parsing for map clauses (C++)

2022-09-13 Thread Julian Brown
This patch changes parsing for OpenMP map clauses in C++ to use the
generic expression parser, hence adds support for parsing general
lvalues (as required by OpenMP 5.0+).

This patch hasn't been fully revised following previous review comments
yet, but I'm including it in support of the following (declare mapper)
patch, and to demonstrate the cases in baseptrs-4.C that work with the
lvalue-parsing support enabled.
---
 gcc/c-family/c-omp.cc |   1 +
 gcc/cp/error.cc   |   9 +
 gcc/cp/parser.cc  | 141 +--
 gcc/cp/parser.h   |   3 +
 gcc/cp/semantics.cc   |   7 +-
 gcc/gimplify.cc   |  39 -
 gcc/testsuite/c-c++-common/gomp/map-6.c   |   4 +-
 gcc/testsuite/g++.dg/gomp/ind-base-3.C|  38 
 gcc/testsuite/g++.dg/gomp/map-assignment-1.C  |  12 ++
 gcc/testsuite/g++.dg/gomp/map-inc-1.C |  10 ++
 gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C  |  19 ++
 gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C  |  37 
 gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C  |  40 +
 .../g++.dg/gomp/map-static-cast-lvalue-1.C|  17 ++
 gcc/testsuite/g++.dg/gomp/map-ternary-1.C |  20 +++
 gcc/testsuite/g++.dg/gomp/member-array-2.C|  92 ++
 gcc/testsuite/g++.dg/gomp/pr67522.C   |   2 +-
 gcc/tree-pretty-print.cc  |  14 ++
 gcc/tree.def  |   3 +
 libgomp/testsuite/libgomp.c++/baseptrs-4.C|  26 ++-
 libgomp/testsuite/libgomp.c++/ind-base-1.C| 162 ++
 libgomp/testsuite/libgomp.c++/ind-base-2.C|  49 ++
 libgomp/testsuite/libgomp.c++/map-comma-1.C   |  15 ++
 .../testsuite/libgomp.c++/map-rvalue-ref-1.C  |  22 +++
 libgomp/testsuite/libgomp.c++/struct-ref-1.C  |  97 +++
 .../libgomp.c-c++-common/array-field-1.c  |  35 
 .../libgomp.c-c++-common/array-of-struct-1.c  |  65 +++
 .../libgomp.c-c++-common/array-of-struct-2.c  |  65 +++
 28 files changed, 1009 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/gomp/ind-base-3.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/map-assignment-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/map-inc-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/map-static-cast-lvalue-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/map-ternary-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/member-array-2.C
 create mode 100644 libgomp/testsuite/libgomp.c++/ind-base-1.C
 create mode 100644 libgomp/testsuite/libgomp.c++/ind-base-2.C
 create mode 100644 libgomp/testsuite/libgomp.c++/map-comma-1.C
 create mode 100644 libgomp/testsuite/libgomp.c++/map-rvalue-ref-1.C
 create mode 100644 libgomp/testsuite/libgomp.c++/struct-ref-1.C
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/array-field-1.c
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/array-of-struct-1.c
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/array-of-struct-2.c

diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc
index a2373fda537..e3cda716532 100644
--- a/gcc/c-family/c-omp.cc
+++ b/gcc/c-family/c-omp.cc
@@ -3220,6 +3220,7 @@ c_omp_address_inspector::map_supported_p ()
 || TREE_CODE (t) == SAVE_EXPR
 || TREE_CODE (t) == POINTER_PLUS_EXPR
 || TREE_CODE (t) == NON_LVALUE_EXPR
+|| TREE_CODE (t) == OMP_ARRAY_SECTION
 || TREE_CODE (t) == NOP_EXPR)
 if (TREE_CODE (t) == COMPOUND_EXPR)
   t = TREE_OPERAND (t, 1);
diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
index 94181e76d0e..5a7f41c1ca0 100644
--- a/gcc/cp/error.cc
+++ b/gcc/cp/error.cc
@@ -2471,6 +2471,15 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
   pp_cxx_right_bracket (pp);
   break;
 
+case OMP_ARRAY_SECTION:
+  dump_expr (pp, TREE_OPERAND (t, 0), flags);
+  pp_cxx_left_bracket (pp);
+  dump_expr (pp, TREE_OPERAND (t, 1), flags);
+  pp_colon (pp);
+  dump_expr (pp, TREE_OPERAND (t, 2), flags);
+  pp_cxx_right_bracket (pp);
+  break;
+
 case UNARY_PLUS_EXPR:
   dump_unary_op (pp, "+", t, flags);
   break;
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 68fc78e7c5c..5468502cd26 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -4320,6 +4320,9 @@ cp_parser_new (cp_lexer *lexer)
   parser->omp_declare_simd = NULL;
   parser->oacc_routine = NULL;
 
+  /* Allow array slice in expression.  */
+  parser->omp_array_section_p = false;
+
   /* Not declaring an implicit function template.  */
   parser->auto_is_implicit_function_template_parm_p = false;
   parser->fully_implicit_function_template_p = false;
@@ -8100,6 +8103,7 @@ cp_parser_postfix_open_square_expression (cp_parser 
*parser,
   releasing_vec 

[PATCH v3 07/11] OpenMP/OpenACC: Reindent TO/FROM/_CACHE_ stanza in {c_}finish_omp_clause

2022-09-13 Thread Julian Brown
This patch trivially adds braces and reindents the
OMP_CLAUSE_TO/OMP_CLAUSE_FROM/OMP_CLAUSE__CACHE_ stanza in
c_finish_omp_clause and finish_omp_clause, in preparation for the
following patch (to clarify the diff a little).

2022-09-13  Julian Brown  

gcc/c/
* c-typeck.cc (c_finish_omp_clauses): Add braces and reindent
OMP_CLAUSE_TO/OMP_CLAUSE_FROM/OMP_CLAUSE__CACHE_ stanza.

gcc/cp/
* semantics.cc (finish_omp_clause): Add braces and reindent
OMP_CLAUSE_TO/OMP_CLAUSE_FROM/OMP_CLAUSE__CACHE_ stanza.
---
 gcc/c/c-typeck.cc   | 615 +-
 gcc/cp/semantics.cc | 778 ++--
 2 files changed, 702 insertions(+), 691 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 7da8d70b3bd..5ebeeb8d956 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -14999,321 +14999,326 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
case OMP_CLAUSE_TO:
case OMP_CLAUSE_FROM:
case OMP_CLAUSE__CACHE_:
- t = OMP_CLAUSE_DECL (c);
- if (TREE_CODE (t) == TREE_LIST)
-   {
- grp_start_p = pc;
- grp_sentinel = OMP_CLAUSE_CHAIN (c);
+ {
+   t = OMP_CLAUSE_DECL (c);
+   if (TREE_CODE (t) == TREE_LIST)
+ {
+   grp_start_p = pc;
+   grp_sentinel = OMP_CLAUSE_CHAIN (c);
 
- if (handle_omp_array_sections (c, ort))
-   remove = true;
- else
-   {
- t = OMP_CLAUSE_DECL (c);
- if (!omp_mappable_type (TREE_TYPE (t)))
-   {
- error_at (OMP_CLAUSE_LOCATION (c),
-   "array section does not have mappable type "
-   "in %qs clause",
-   omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
- remove = true;
-   }
- else if (TYPE_ATOMIC (TREE_TYPE (t)))
-   {
- error_at (OMP_CLAUSE_LOCATION (c),
-   "%<_Atomic%> %qE in %qs clause", t,
-   omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
- remove = true;
-   }
- while (TREE_CODE (t) == ARRAY_REF)
-   t = TREE_OPERAND (t, 0);
- if (TREE_CODE (t) == COMPONENT_REF
- && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
-   {
- do
-   {
- t = TREE_OPERAND (t, 0);
- if (TREE_CODE (t) == MEM_REF
- || TREE_CODE (t) == INDIRECT_REF)
-   {
- t = TREE_OPERAND (t, 0);
- STRIP_NOPS (t);
- if (TREE_CODE (t) == POINTER_PLUS_EXPR)
-   t = TREE_OPERAND (t, 0);
-   }
-   }
- while (TREE_CODE (t) == COMPONENT_REF
-|| TREE_CODE (t) == ARRAY_REF);
-
- if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
- && OMP_CLAUSE_MAP_IMPLICIT (c)
- && (bitmap_bit_p (_head, DECL_UID (t))
- || bitmap_bit_p (_field_head, DECL_UID (t))
- || bitmap_bit_p (_firstprivate_head,
-  DECL_UID (t
-   {
- remove = true;
- break;
-   }
- if (bitmap_bit_p (_field_head, DECL_UID (t)))
-   break;
- if (bitmap_bit_p (_head, DECL_UID (t)))
-   {
- if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
-   error_at (OMP_CLAUSE_LOCATION (c),
- "%qD appears more than once in motion "
- "clauses", t);
- else if (ort == C_ORT_ACC)
-   error_at (OMP_CLAUSE_LOCATION (c),
- "%qD appears more than once in data "
- "clauses", t);
- else
-   error_at (OMP_CLAUSE_LOCATION (c),
- "%qD appears more than once in map "
- "clauses", t);
- remove = true;
-   }
- else
-   {
- bitmap_set_bit (_head, DECL_UID (t));
- bitmap_set_bit (_field_head, DECL_UID (t));
-   }
-   }
-   }
- if 

[PATCH v3 05/11] OpenMP: push attaches to end of clause list in "target" regions

2022-09-13 Thread Julian Brown
This patch moves GOMP_MAP_ATTACH{_ZERO_LENGTH_ARRAY_SECTION} nodes to
the end of the clause list, for offload regions.  This ensures that when
we do the attach operation, both the "attachment point" and the target
region have both already been mapped on the target.  This avoids a
pathological case that can otherwise happen with struct sibling-list
handling.

2022-09-13  Julian Brown  

gcc/
* gimplify.cc (omp_segregate_mapping_groups): Update comment.
(omp_push_attaches_to_end): New function.
(gimplify_scan_omp_clauses): Use omp_push_attaches_to_end for offloaded
regions.
---
 gcc/gimplify.cc | 66 -
 gcc/testsuite/g++.dg/gomp/target-lambda-1.C |  2 +-
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index c7998c2ccbd..bc7848843b3 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -9641,7 +9641,9 @@ omp_tsort_mapping_groups (vec *groups,
 /* Split INLIST into two parts, moving groups corresponding to
ALLOC/RELEASE/DELETE mappings to one list, and other mappings to another.
The former list is then appended to the latter.  Each sub-list retains the
-   order of the original list.  */
+   order of the original list.
+   See also omp_push_attaches_to_end below -- we call that later after scanning
+   omp clauses.  */
 
 static omp_mapping_group *
 omp_segregate_mapping_groups (omp_mapping_group *inlist)
@@ -9681,6 +9683,55 @@ omp_segregate_mapping_groups (omp_mapping_group *inlist)
   return tf_groups;
 }
 
+/* This function moves GOMP_MAP_ATTACH{_ZERO_LENGTH_ARRAY_SECTION} nodes to the
+   end of the clause list, for offload regions.  This ensures that when we do
+   the attach, both the "attachment point" and the target region have both
+   already been mapped on the target.  This avoids a pathological case that can
+   otherwise happen with struct sibling-list handling.
+
+   Do not call this for non-offload regions, e.g. for "enter data" or
+   "exit data" directives.
+
+   The order of attach nodes and of non-attach nodes is otherwise retained.  */
+
+static tree
+omp_push_attaches_to_end (tree list)
+{
+  tree nonattach_list = NULL_TREE, attach_list = NULL_TREE;
+  tree *nonattach_tail = _list, *attach_tail = _list;
+
+  for (tree w = list; w;)
+{
+  tree next = OMP_CLAUSE_CHAIN (w);
+
+  if (OMP_CLAUSE_CODE (w) != OMP_CLAUSE_MAP)
+   goto nonattach;
+
+  switch (OMP_CLAUSE_MAP_KIND (w))
+   {
+   case GOMP_MAP_ATTACH:
+   case GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION:
+ *attach_tail = w;
+ OMP_CLAUSE_CHAIN (w) = NULL_TREE;
+ attach_tail = _CLAUSE_CHAIN (w);
+ break;
+
+   default:
+   nonattach:
+ *nonattach_tail = w;
+ OMP_CLAUSE_CHAIN (w) = NULL_TREE;
+ nonattach_tail = _CLAUSE_CHAIN (w);
+   }
+
+  w = next;
+}
+
+  /* Splice lists together.  */
+  *nonattach_tail = attach_list;
+
+  return nonattach_list;
+}
+
 /* Given a list LIST_P containing groups of mappings given by GROUPS, reorder
those groups based on the output list of omp_tsort_mapping_groups --
singly-linked, threaded through each element's NEXT pointer starting at
@@ -11950,7 +12001,18 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
list_p = _CLAUSE_CHAIN (c);
 }
 
-  ctx->clauses = *orig_list_p;
+  if ((region_type & ORT_TARGET) != 0)
+/* If we have a target region, we can push all the attaches to the end of
+   the list (we may have standalone "attach" operations synthesized for
+   GOMP_MAP_STRUCT nodes that must be processed after the attachment point
+   AND the pointed-to block have been mapped).  */
+ctx->clauses = omp_push_attaches_to_end (*orig_list_p);
+  else
+/* ...but if we have something else, e.g. "enter data", we need to keep
+   "attach" nodes together with the previous node they attach to so that
+   separate "exit data" operations work properly (see libgomp/target.c).  
*/
+ctx->clauses = *orig_list_p;
+
   gimplify_omp_ctxp = ctx;
 }
 
diff --git a/gcc/testsuite/g++.dg/gomp/target-lambda-1.C 
b/gcc/testsuite/g++.dg/gomp/target-lambda-1.C
index bff7fa7c669..5ce8ceadb19 100644
--- a/gcc/testsuite/g++.dg/gomp/target-lambda-1.C
+++ b/gcc/testsuite/g++.dg/gomp/target-lambda-1.C
@@ -87,7 +87,7 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump {#pragma omp target num_teams.* 
firstprivate\(b\) map\(alloc:MEM.* \[len: 0\]\) map\(firstprivate:iptr 
\[pointer assign, bias: 0\]\) map\(alloc:MEM.* \[len: 0\]\) 
map\(firstprivate:this \[pointer assign, bias: 0\]\) map\(to:\*__closure \[len: 
[0-9]+\]\) map\(firstprivate:__closure \[pointer assign, bias: 0\]\) 
map\(tofrom:\*_[0-9]+ \[len: [0-9]+\]\) map\(always_pointer:__closure->__this 
\[pointer assign, bias: 0\]\) map\(from:mapped \[len: [0-9]+\]\) 
map\(alloc:\*_[0-9]+ \[len: 0\]\) 
map\(attach_zero_length_array_section:__closure->__iptr 

[PATCH v3 04/11] OpenMP/OpenACC: mapping group list-handling improvements

2022-09-13 Thread Julian Brown
This patch adjusts OpenMP/OpenACC clause list handling in a couple of
places, in preparation for the following mapping-clause expansion rework
patch. Firstly mapping groups are removed as a whole in the C and C++
front-ends when an error is detected, which avoids leaving "badly-formed"
mapping clause groups in the list.

Secondly, reindexing of the omp_mapping_group hashmap (during
omp_build_struct_sibling_lists) has been reimplemented, fixing some
tricky corner-cases where mapping groups are removed from a list at the
same time as it is being reordered.

Thirdly, code to check if a different clause on the same directive maps
the whole of a struct that we have a component mapping for (for example)
has been outlined, removing a bit of code duplication.

(These changes could be split into different patches if necessary.)

2022-09-13  Julian Brown  

gcc/
* gimplify.cc (omp_group_last): Allow GOMP_MAP_ATTACH_DETACH after
GOMP_MAP_STRUCT (for reindexing).
(omp_gather_mapping_groups): Reimplement using...
(omp_gather_mapping_groups_1): This new function.  Stop processing at
GATHER_SENTINEL.
(omp_group_base): Allow GOMP_MAP_TO_PSET without any following node.
(omp_index_mapping_groups): Reimplement using...
(omp_index_mapping_groups_1): This new function.  Handle
REINDEX_SENTINEL.
(omp_reindex_mapping_groups, omp_mapped_by_containing_struct): New
functions.
(omp_tsort_mapping_groups_1): Adjust handling of base group being the
same as current group.  Use omp_mapped_by_containing_struct.
(omp_build_struct_sibling_lists): Use omp_mapped_by_containing_struct
and omp_reindex_mapping_groups.  Robustify group deletion for reordered
lists.
(gimplify_scan_omp_clauses): Update calls to
omp_build_struct_sibling_lists.

gcc/c/
* c-typeck.cc (c_finish_omp_clauses): Remove whole mapping node group
on error.

gcc/cp/
* semantics.cc (finish_omp_clauses): Likewise.
---
 gcc/c/c-typeck.cc   |  24 -
 gcc/cp/semantics.cc |  26 -
 gcc/gimplify.cc | 227 +++-
 3 files changed, 209 insertions(+), 68 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index ee891ee33c2..7da8d70b3bd 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -14229,12 +14229,19 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
  break;
}
 
+  tree *grp_start_p = NULL, grp_sentinel = NULL_TREE;
+
   for (pc = , c = clauses; c ; c = *pc)
 {
   bool remove = false;
   bool need_complete = false;
   bool need_implicitly_determined = false;
 
+  /* We've reached the end of a list of expanded nodes.  Reset the group
+start pointer.  */
+  if (c == grp_sentinel)
+   grp_start_p = NULL;
+
   switch (OMP_CLAUSE_CODE (c))
{
case OMP_CLAUSE_SHARED:
@@ -14995,6 +15002,9 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
  t = OMP_CLAUSE_DECL (c);
  if (TREE_CODE (t) == TREE_LIST)
{
+ grp_start_p = pc;
+ grp_sentinel = OMP_CLAUSE_CHAIN (c);
+
  if (handle_omp_array_sections (c, ort))
remove = true;
  else
@@ -15638,7 +15648,19 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
}
 
   if (remove)
-   *pc = OMP_CLAUSE_CHAIN (c);
+   {
+ if (grp_start_p)
+   {
+ /* If we found a clause to remove, we want to remove the whole
+expanded group, otherwise gimplify
+(omp_resolve_clause_dependencies) can get confused.  */
+ *grp_start_p = grp_sentinel;
+ pc = grp_start_p;
+ grp_start_p = NULL;
+   }
+ else
+   *pc = OMP_CLAUSE_CHAIN (c);
+   }
   else
pc = _CLAUSE_CHAIN (c);
 }
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index ae7c8ea7b1f..7302d21fc54 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -6755,11 +6755,18 @@ finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
  break;
}
 
+  tree *grp_start_p = NULL, grp_sentinel = NULL_TREE;
+
   for (pc = , c = clauses; c ; c = *pc)
 {
   bool remove = false;
   bool field_ok = false;
 
+  /* We've reached the end of a list of expanded nodes.  Reset the group
+start pointer.  */
+  if (c == grp_sentinel)
+   grp_start_p = NULL;
+
   switch (OMP_CLAUSE_CODE (c))
{
case OMP_CLAUSE_SHARED:
@@ -7985,6 +7992,9 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type 
ort)
  t = OMP_CLAUSE_DECL (c);
  if (TREE_CODE (t) == TREE_LIST)
{
+ grp_start_p = pc;
+ grp_sentinel = OMP_CLAUSE_CHAIN (c);
+
  if (handle_omp_array_sections (c, ort))
remove = true;
   

[PATCH v3 03/11] OpenMP/OpenACC struct sibling list gimplification extension and rework

2022-09-13 Thread Julian Brown
This patch refactors struct sibling-list processing in gimplify.cc, and
adjusts some related mapping-clause processing in the Fortran FE and
omp-low.cc accordingly.  This version addresses some previous review
comments, mostly with regards to the naming of local helper functions.

(Previous version was approved already, pending those changes.)

2022-09-13  Julian Brown  

gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Don't create
GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER
mappings for POINTER_TYPE_P decls.

gcc/
* gimplify.c (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS.
(GOMP_FIRSTPRIVATE_IMPLICIT): Renumber.
(insert_struct_comp_map): Refactor function into...
(build_struct_comp_nodes): This new function.  Remove list handling
and improve self-documentation.
(extract_base_bit_offset): Remove BASE_REF, OFFSETP parameters.  Move
code to strip outer parts of address out of function, but strip no-op
conversions.
(omp_mapping_group): Add DELETED field for use during reindexing.
(omp_strip_components_and_deref, omp_strip_indirections): New functions.
(omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling.
(omp_gather_mapping_groups): Initialise DELETED field for new groups.
(omp_index_mapping_groups): Notice DELETED groups when (re)indexing.
(omp_siblist_insert_node_after, omp_siblist_move_node_after,
omp_siblist_move_nodes_after, omp_siblist_move_concat_nodes_after): New
helper functions.
(omp_accumulate_sibling_list): New function to build up GOMP_MAP_STRUCT
node groups for sibling lists. Outlined from gimplify_scan_omp_clauses.
(omp_build_struct_sibling_lists): New function.
(gimplify_scan_omp_clauses): Remove struct_map_to_clause,
struct_seen_clause, struct_deref_set.  Call
omp_build_struct_sibling_lists as pre-pass instead of handling sibling
lists in the function's main processing loop.
(gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS
handling, unused now.
* omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect
struct references, and references to pointers to structs also.

gcc/testsuite/
* g++.dg/goacc/member-array-acc.C: New test.
* g++.dg/gomp/member-array-omp.C: New test.
* g++.dg/gomp/target-3.C: Update expected output.
* g++.dg/gomp/target-lambda-1.C: Likewise.
* g++.dg/gomp/target-this-2.C: Likewise.
* c-c++-common/goacc/deep-copy-arrayofstruct.c: Move test from here.
* c-c++-common/gomp/target-50.c: New test.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test.
* testsuite/libgomp.oacc-c++/deep-copy-17.C: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-arrayofstruct.c: Move
test to here, make "run" test.
---
 gcc/fortran/trans-openmp.cc   |   20 +-
 gcc/gimplify.cc   | 1519 ++---
 gcc/omp-low.cc|   16 +-
 gcc/testsuite/c-c++-common/gomp/target-50.c   |   23 +
 gcc/testsuite/g++.dg/goacc/member-array-acc.C |   13 +
 gcc/testsuite/g++.dg/gomp/member-array-omp.C  |   13 +
 gcc/testsuite/g++.dg/gomp/target-3.C  |4 +-
 gcc/testsuite/g++.dg/gomp/target-lambda-1.C   |3 +-
 gcc/testsuite/g++.dg/gomp/target-this-2.C |2 +-
 .../testsuite/libgomp.oacc-c++/deep-copy-17.C |  101 ++
 .../libgomp.oacc-c-c++-common/deep-copy-15.c  |   68 +
 .../libgomp.oacc-c-c++-common/deep-copy-16.c  |  231 +++
 .../deep-copy-arrayofstruct.c |2 +-
 13 files changed, 1362 insertions(+), 653 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/target-50.c
 create mode 100644 gcc/testsuite/g++.dg/goacc/member-array-acc.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/member-array-omp.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c
 rename {gcc/testsuite/c-c++-common/goacc => 
libgomp/testsuite/libgomp.oacc-c-c++-common}/deep-copy-arrayofstruct.c (98%)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index de27ed52c02..4949f001fed 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -3117,30 +3117,16 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
gfc_omp_clauses *clauses,
  tree present = gfc_omp_check_optional_argument (decl, true);
  if (openacc && n->sym->ts.type == BT_CLASS)
{
- tree type = TREE_TYPE (decl);
  if (n->sym->attr.optional)
sorry 

[PATCH v3 01/11] OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer)

2022-09-13 Thread Julian Brown
This patch reimplements the omp_target_reorder_clauses function in
anticipation of supporting "deeper" struct mappings (that is, with
several structure dereference operators, or similar).

The idea is that in place of the (possibly quadratic) algorithm in
omp_target_reorder_clauses that greedily moves clauses containing
addresses that are subexpressions of other addresses before those other
addresses, we employ a topological sort algorithm to calculate a proper
order for map clauses. This should run in linear time, and hopefully
handles degenerate cases where multiple "levels" of indirect accesses
are present on a given directive.

The new method also takes care to keep clause groups together, addressing
the concerns raised in:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570501.html

To figure out if some given clause depends on a base pointer in another
clause, we strip off the outer layers of the address expression, and check
(via a tree_operand_hash hash table we have built) if the result is a
"base pointer" as defined in OpenMP 5.0 (1.2.6 Data Terminology). There
are some subtleties involved, however:

 - We must treat MEM_REF with zero offset the same as INDIRECT_REF.
   This should probably be fixed in the front ends instead so we always
   use a canonical form (probably INDIRECT_REF). The following patch
   shows one instance of the problem, but there may be others:

   https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571382.html

 - Mapping a whole struct implies mapping each of that struct's
   elements, which may be base pointers. Because those base pointers
   aren't necessarily explicitly referenced in the directive in question,
   we treat the whole-struct mapping as a dependency instead.

This version of the patch has been rebased and incorporates several
previous review comments: one thing *not* addressed here is the following,
which is addressed by the C++ "declare mapper" patch:

On Tue, 24 May 2022 15:03:07 +0200
Jakub Jelinek wrote:

> I think big question is if we do want to do this map clause reordering
> before processing the  omp target etc. clauses, or after (during
> gimplify_adjust_omp_clauses, when clauses from the implicit mappings
> are added too and especially with the declare mapper expansions),
> or both before and after.

A WIP version of that patch demonstrating a new approach to clause
gimplification is included at the end of this series.

2022-09-13  Julian Brown  

gcc/
* gimplify.c (is_or_contains_p, omp_target_reorder_clauses): Delete
functions.
(omp_tsort_mark): Add enum.
(omp_mapping_group): Add struct.
(debug_mapping_group, omp_get_base_pointer, omp_get_attachment,
omp_group_last, omp_gather_mapping_groups, omp_group_base,
omp_index_mapping_groups, omp_containing_struct,
omp_tsort_mapping_groups_1, omp_tsort_mapping_groups,
omp_segregate_mapping_groups, omp_reorder_mapping_groups): New
functions.
(gimplify_scan_omp_clauses): Call above functions instead of
omp_target_reorder_clauses, unless we've seen an error.
* omp-low.c (scan_sharing_clauses): Avoid strict test if we haven't
sorted mapping groups.

gcc/testsuite/
* g++.dg/gomp/target-lambda-1.C: Adjust expected output.
* g++.dg/gomp/target-this-3.C: Likewise.
* g++.dg/gomp/target-this-4.C: Likewise.
---
 gcc/gimplify.cc | 766 +++-
 gcc/omp-low.cc  |   7 +-
 gcc/testsuite/g++.dg/gomp/target-lambda-1.C |   7 +-
 gcc/testsuite/g++.dg/gomp/target-this-3.C   |   4 +-
 gcc/testsuite/g++.dg/gomp/target-this-4.C   |   4 +-
 5 files changed, 774 insertions(+), 14 deletions(-)

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index b828ddbfe6c..f3cd0df663e 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -8948,6 +8948,7 @@ extract_base_bit_offset (tree base, tree *base_ref, 
poly_int64 *bitposp,
   return base;
 }
 
+#if 0
 /* Returns true if EXPR is or contains (as a sub-component) BASE_PTR.  */
 
 static bool
@@ -9150,6 +9151,743 @@ omp_target_reorder_clauses (tree *list_p)
}
   }
 }
+#endif
+
+/* Used for topological sorting of mapping groups.  UNVISITED means we haven't
+   started processing the group yet.  The TEMPORARY mark is used when we first
+   encounter a group on a depth-first traversal, and the PERMANENT mark is used
+   when we have processed all the group's children (i.e. all the base pointers
+   referred to by the group's mapping nodes, recursively).  */
+
+enum omp_tsort_mark {
+  UNVISITED,
+  TEMPORARY,
+  PERMANENT
+};
+
+/* A group of OMP_CLAUSE_MAP nodes that correspond to a single "map"
+   clause.  */
+
+struct omp_mapping_group {
+  tree *grp_start;
+  tree grp_end;
+  omp_tsort_mark mark;
+  struct omp_mapping_group *sibling;
+  struct omp_mapping_group *next;
+};
+
+DEBUG_FUNCTION void
+debug_mapping_group (omp_mapping_group *grp)
+{
+  tree tmp = 

[PATCH v3 02/11] Remove omp_target_reorder_clauses

2022-09-13 Thread Julian Brown
This patch has been split out from the previous one to avoid a
confusingly-interleaved diff.  The two patches will be committed squashed
together.

2022-09-13  Julian Brown  

gcc/
* gimplify.c (omp_target_reorder_clauses): Delete.
---
 gcc/gimplify.cc | 205 
 1 file changed, 205 deletions(-)

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index f3cd0df663e..2ba50d7d3ca 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -8948,211 +8948,6 @@ extract_base_bit_offset (tree base, tree *base_ref, 
poly_int64 *bitposp,
   return base;
 }
 
-#if 0
-/* Returns true if EXPR is or contains (as a sub-component) BASE_PTR.  */
-
-static bool
-is_or_contains_p (tree expr, tree base_ptr)
-{
-  if ((TREE_CODE (expr) == INDIRECT_REF && TREE_CODE (base_ptr) == MEM_REF)
-  || (TREE_CODE (expr) == MEM_REF && TREE_CODE (base_ptr) == INDIRECT_REF))
-return operand_equal_p (TREE_OPERAND (expr, 0),
-   TREE_OPERAND (base_ptr, 0));
-  while (!operand_equal_p (expr, base_ptr))
-{
-  if (TREE_CODE (base_ptr) == COMPOUND_EXPR)
-   base_ptr = TREE_OPERAND (base_ptr, 1);
-  if (TREE_CODE (base_ptr) == COMPONENT_REF
- || TREE_CODE (base_ptr) == POINTER_PLUS_EXPR
- || TREE_CODE (base_ptr) == SAVE_EXPR)
-   base_ptr = TREE_OPERAND (base_ptr, 0);
-  else
-   break;
-}
-  return operand_equal_p (expr, base_ptr);
-}
-
-/* Implement OpenMP 5.x map ordering rules for target directives. There are
-   several rules, and with some level of ambiguity, hopefully we can at least
-   collect the complexity here in one place.  */
-
-static void
-omp_target_reorder_clauses (tree *list_p)
-{
-  /* Collect refs to alloc/release/delete maps.  */
-  auto_vec ard;
-  tree *cp = list_p;
-  while (*cp != NULL_TREE)
-if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP
-   && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALLOC
-   || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_RELEASE
-   || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_DELETE))
-  {
-   /* Unlink cp and push to ard.  */
-   tree c = *cp;
-   tree nc = OMP_CLAUSE_CHAIN (c);
-   *cp = nc;
-   ard.safe_push (c);
-
-   /* Any associated pointer type maps should also move along.  */
-   while (*cp != NULL_TREE
-  && OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP
-  && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_REFERENCE
-  || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_POINTER
-  || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ATTACH_DETACH
-  || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_POINTER
-  || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALWAYS_POINTER
-  || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_TO_PSET))
- {
-   c = *cp;
-   nc = OMP_CLAUSE_CHAIN (c);
-   *cp = nc;
-   ard.safe_push (c);
- }
-  }
-else
-  cp = _CLAUSE_CHAIN (*cp);
-
-  /* Link alloc/release/delete maps to the end of list.  */
-  for (unsigned int i = 0; i < ard.length (); i++)
-{
-  *cp = ard[i];
-  cp = _CLAUSE_CHAIN (ard[i]);
-}
-  *cp = NULL_TREE;
-
-  /* OpenMP 5.0 requires that pointer variables are mapped before
- its use as a base-pointer.  */
-  auto_vec atf;
-  for (tree *cp = list_p; *cp; cp = _CLAUSE_CHAIN (*cp))
-if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP)
-  {
-   /* Collect alloc, to, from, to/from clause tree pointers.  */
-   gomp_map_kind k = OMP_CLAUSE_MAP_KIND (*cp);
-   if (k == GOMP_MAP_ALLOC
-   || k == GOMP_MAP_TO
-   || k == GOMP_MAP_FROM
-   || k == GOMP_MAP_TOFROM
-   || k == GOMP_MAP_ALWAYS_TO
-   || k == GOMP_MAP_ALWAYS_FROM
-   || k == GOMP_MAP_ALWAYS_TOFROM)
- atf.safe_push (cp);
-  }
-
-  for (unsigned int i = 0; i < atf.length (); i++)
-if (atf[i])
-  {
-   tree *cp = atf[i];
-   tree decl = OMP_CLAUSE_DECL (*cp);
-   if (TREE_CODE (decl) == INDIRECT_REF || TREE_CODE (decl) == MEM_REF)
- {
-   tree base_ptr = TREE_OPERAND (decl, 0);
-   STRIP_TYPE_NOPS (base_ptr);
-   for (unsigned int j = i + 1; j < atf.length (); j++)
- if (atf[j])
-   {
- tree *cp2 = atf[j];
- tree decl2 = OMP_CLAUSE_DECL (*cp2);
-
- decl2 = OMP_CLAUSE_DECL (*cp2);
- if (is_or_contains_p (decl2, base_ptr))
-   {
- /* Move *cp2 to before *cp.  */
- tree c = *cp2;
- *cp2 = OMP_CLAUSE_CHAIN (c);
- OMP_CLAUSE_CHAIN (c) = *cp;
- *cp = c;
-
- if (*cp2 != NULL_TREE
- && OMP_CLAUSE_CODE (*cp2) == OMP_CLAUSE_MAP
- && OMP_CLAUSE_MAP_KIND (*cp2) == 
GOMP_MAP_ALWAYS_POINTER)
-   {

[PATCH v3 00/11] OpenMP 5.0: Struct & mapping clause expansion rework

2022-09-13 Thread Julian Brown
This is a new version of the first few patches of the series (up to
"Handle reference-typed struct members"):

  https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591973.html

Relative to the previously-posted series, this version addresses
several review comments, but more significantly it largely reworks how
OpenMP/OpenACC mapping clause expansion works in the C and C++ front
ends and in gimplify.cc, following discovery of some quite serious
shortcomings in the existing & previous support for same. (For the new
"baseptrs-4.C" test, approximately 135 out of 274 tests failed.)

Revisions/rework have been completed up to the 8th patch in the series.
The remaining patches address some previous review comments and are
included to support the direction of the first eight patches, but are
not ready for full re-review yet.

Individual patches bootstrapped (except the "trivial" ones) and regression
tested with offloading to NVPTX (up to "OpenMP/OpenACC: Rework clause
expansion and nested struct handling").

Further commentary on individual patches.

Julian Brown (11):
  OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by
base pointer)
  Remove omp_target_reorder_clauses
  OpenMP/OpenACC struct sibling list gimplification extension and rework
  OpenMP/OpenACC: mapping group list-handling improvements
  OpenMP: push attaches to end of clause list in "target" regions
  OpenMP: Pointers and member mappings
  OpenMP/OpenACC: Reindent TO/FROM/_CACHE_ stanza in
{c_}finish_omp_clause
  OpenMP/OpenACC: Rework clause expansion and nested struct handling
  FYI/unfinished: OpenMP: lvalue parsing for map clauses (C++)
  Use OMP_ARRAY_SECTION instead of TREE_LIST in C++ FE
  FYI/unfinished: OpenMP 5.0 "declare mapper" support for C++

 gcc/c-family/c-common.h   |   68 +
 gcc/c-family/c-omp.cc |  780 +++-
 gcc/c/c-typeck.cc |  750 ++--
 gcc/cp/cp-gimplify.cc |6 +
 gcc/cp/cp-objcp-common.h  |2 +
 gcc/cp/cp-tree.h  |   10 +
 gcc/cp/decl.cc|   18 +-
 gcc/cp/error.cc   |9 +
 gcc/cp/mangle.cc  |5 +-
 gcc/cp/name-lookup.cc |3 +-
 gcc/cp/parser.cc  |  543 ++-
 gcc/cp/parser.h   |3 +
 gcc/cp/pt.cc  |  144 +-
 gcc/cp/semantics.cc   | 1570 ---
 gcc/fortran/parse.cc  |3 +
 gcc/fortran/trans-openmp.cc   |  160 +-
 gcc/gimplify.cc   | 3702 +
 gcc/langhooks-def.h   |3 +
 gcc/langhooks.cc  |9 +
 gcc/langhooks.h   |4 +
 gcc/omp-general.cc|  426 ++
 gcc/omp-general.h |  109 +
 gcc/omp-low.cc|   26 +-
 gcc/testsuite/c-c++-common/goacc/mdc-2.c  |2 +
 gcc/testsuite/c-c++-common/gomp/clauses-2.c   |2 +-
 gcc/testsuite/c-c++-common/gomp/map-6.c   |   10 +-
 gcc/testsuite/c-c++-common/gomp/target-50.c   |   23 +
 .../c-c++-common/gomp/target-implicit-map-2.c |2 +-
 gcc/testsuite/g++.dg/goacc/mdc.C  |2 +
 gcc/testsuite/g++.dg/goacc/member-array-acc.C |   13 +
 gcc/testsuite/g++.dg/gomp/declare-mapper-1.C  |   58 +
 gcc/testsuite/g++.dg/gomp/declare-mapper-2.C  |   30 +
 gcc/testsuite/g++.dg/gomp/declare-mapper-3.C  |   27 +
 gcc/testsuite/g++.dg/gomp/declare-mapper-4.C  |   74 +
 gcc/testsuite/g++.dg/gomp/ind-base-3.C|   37 +
 gcc/testsuite/g++.dg/gomp/map-assignment-1.C  |   12 +
 gcc/testsuite/g++.dg/gomp/map-inc-1.C |   10 +
 gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C  |   19 +
 gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C  |   37 +
 gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C  |   40 +
 .../g++.dg/gomp/map-static-cast-lvalue-1.C|   17 +
 gcc/testsuite/g++.dg/gomp/map-ternary-1.C |   20 +
 gcc/testsuite/g++.dg/gomp/member-array-2.C|   91 +
 gcc/testsuite/g++.dg/gomp/member-array-omp.C  |   13 +
 gcc/testsuite/g++.dg/gomp/pr67522.C   |2 +-
 .../g++.dg/gomp/static-component-1.C  |   23 +
 gcc/testsuite/g++.dg/gomp/target-3.C  |4 +-
 gcc/testsuite/g++.dg/gomp/target-lambda-1.C   |6 +-
 gcc/testsuite/g++.dg/gomp/target-this-2.C |2 +-
 gcc/testsuite/g++.dg/gomp/target-this-3.C |4 +-
 gcc/testsuite/g++.dg/gomp/target-this-4.C |4 +-
 gcc/testsuite/gcc.dg/gomp/target-3.c  |2 +-
 gcc/tree-core.h   |4 +
 gcc/tree-pretty-print.cc  |   56 +
 gcc/tree.cc   |2 +
 gcc/tree.def  |   10 +
 gcc/tree.h   

Re: libgo patch committed: Make runtime.Version return a useful value

2022-09-13 Thread Ian Lance Taylor via Gcc-patches
On Tue, Jun 28, 2022 at 10:20 AM Ian Lance Taylor  wrote:
>
> This libgo patch makes runtime.Version return a meaningful string.
> This also means that "go version" will print something useful, e.g.,
>
> go version go1.18 gccgo (GCC) 12.0.1 20220216 (experimental) linux/amd64
>
> This fixes https://go.dev/issue/51850.
>
> Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
> to mainline.

I've committed this to the GCC 12 branch, for PR 106747.

Ian


[PATCH] Optimize (X<

2022-09-13 Thread Roger Sayle

This patch tweaks the match.pd transformation previously added to fold
(X<

gcc/ChangeLog
* match.pd (op (lshift @0 @1) (lshift @2 @1)): Optimize the
expression (X

[PATCH] PR tree-optimization/71343: Value number X<<2 as X*4.

2022-09-13 Thread Roger Sayle

This patch is the second part of a fix for PR tree-optimization/71343,
that implements Richard Biener's suggestion of using tree-ssa's value
numbering instead of match.pd.  The change is that when assigning a
value number for the expression X<

gcc/ChangeLog
PR tree-optimization/71343
* tree-ssa-sccvn.cc (visit_nary_op) : Make
the value number of the expression X << C the same as the value
number for the multiplication X * (1< -Original Message-
> From: Richard Biener 
> Sent: 08 August 2022 12:42
> To: Roger Sayle 
> Cc: GCC Patches 
> Subject: Re: [PATCH] PR tree-optimization/71343: Optimize (X< (X)< 
> On Mon, Aug 8, 2022 at 10:07 AM Roger Sayle
>  wrote:
> >
> > This patch resolves PR tree-optimization/71343, a missed-optimization
> > enhancement request where GCC fails to see that (a<<2)+(b<<2) == a*4+b*4.
> > This requires two related (sets of) optimizations to be added to match.pd.
> >
> > The first is that (X< > for many binary operators, including AND, IOR, XOR, and (if overflow
> > isn't an issue) PLUS and MINUS.  Likewise, the right shifts (both
> > logical and arithmetic) and bit-wise logical operators can be
> > simplified in a similar fashion.  These all reduce the number of
> > GIMPLE binary operations from 3 to 2, by combining/eliminating a shift
> operation.
> >
> > The second optimization reflects that the middle-end doesn't impose a
> > canonical form on multiplications by powers of two, vs. left shifts,
> > instead leaving these operations as specified by the programmer unless
> > there's a good reason to change them.  Hence, GIMPLE code may contain
> > the expressions "X * 8" and "X << 3" even though these represent the
> > same value/computation.  The tweak to match.pd is that comparison
> > operations whose operands are equivalent non-canonical expressions can
> > be taught their equivalence.  Hence "(X * 8) == (X << 3)" will always
> > evaluate to true, and "(X<<2) > 4*X" will always evaluate to false.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32},
> > with no new failures.  Ok for mainline?
> 
> +/* Shifts by constants distribute over several binary operations,
> +   hence (X << C) + (Y << C) can be simplified to (X + Y) << C.  */
> +(for op (plus minus)
> +  (simplify
> +(op (lshift:s @0 INTEGER_CST@1) (lshift:s @2 INTEGER_CST@1))
> +(if (INTEGRAL_TYPE_P (type)
> +&& TYPE_OVERFLOW_WRAPS (type)
> +&& !TYPE_SATURATING (type)
> +&& tree_fits_shwi_p (@1)
> +&& tree_to_shwi (@1) > 0
> +&& tree_to_shwi (@1) < TYPE_PRECISION (type))
> 
> I do wonder why we need to restrict this to shifts by constants?
> Any out-of-bound shift was already there, no?
> 
> +/* Some tree expressions are intentionally non-canonical.
> +   We handle the comparison of the equivalent forms here.  */ (for cmp
> +(eq le ge)
> +  (simplify
> +(cmp:c (lshift @0 INTEGER_CST@1) (mult @0 integer_pow2p@2))
> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +&& tree_fits_shwi_p (@1)
> +&& tree_to_shwi (@1) > 0
> +&& tree_to_shwi (@1) < TYPE_PRECISION  (TREE_TYPE (@0))
> +&& wi::to_wide (@1) == wi::exact_log2 (wi::to_wide (@2)))
> +  { constant_boolean_node (true, type); })))
> +
> +(for cmp (ne lt gt)
> +  (simplify
> +(cmp:c (lshift @0 INTEGER_CST@1) (mult @0 integer_pow2p@2))
> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +&& tree_fits_shwi_p (@1)
> +&& tree_to_shwi (@1) > 0
> +&& tree_to_shwi (@1) < TYPE_PRECISION  (TREE_TYPE (@0))
> +&& wi::to_wide (@1) == wi::exact_log2 (wi::to_wide (@2)))
> +  { constant_boolean_node (false, type); })))
> 
> hmm.  I wonder if it makes more sense to handle this in value-numbering.
> tree-ssa-sccvn.cc:visit_nary_op handles some cases that are not exactly
> canonicalization issues but the shift vs mult could be handled there by just
> performing the alternate lookup.  That would also enable CSE and by means of
> that of course the comparisons you do above.
> 
> Thanks,
> Richard.
> 
> >
> > 2022-08-08  Roger Sayle  
> >
> > gcc/ChangeLog
> > PR tree-optimization/71343
> > * match.pd (op (lshift @0 @1) (lshift @2 @1)): Optimize the
> > expression (X< > (op (rshift @0 @1) (rshift @2 @1)): Likwise, simplify (X>>C)^(Y>>C)
> > to (X^Y)>>C for binary logical operators, AND, IOR and XOR.
> > (cmp:c (lshift @0) (mult @1)): Optimize comparisons between
> > shifts by integer constants and multiplications by powers of 2.
> >
> > gcc/testsuite/ChangeLog
> > PR tree-optimization/71343
> > * gcc.dg/pr71343-1.c: New test case.
> > * gcc.dg/pr71343-2.c: Likewise.
> >
> >
> > Thanks in advance,
> > Roger
> > --

diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 74b8d8d..266 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ 

Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-13 Thread Joseph Myers
On Mon, 12 Sep 2022, Jakub Jelinek via Gcc-patches wrote:

> > We don't claim in glibc to support old snapshots from master, so checking 
> > for __GNUC_PREREQ (13, 0) and failing for such older GCC 13 versions is 
> > fine.
> 
> Ok, makes sense, especially if it is applied on the glibc side a few months
> after it is changed on the GCC side.  If it is applied immediately, there

Leaving the build of glibc C++ tests broken for months with mainline GCC 
doesn't seem like a good idea; it's a recipe for missing other 
regressions.  I think a fix should be applied to glibc immediately after 
the GCC change.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: libgo patch committed: Ignore __morestack in runtime.Callers

2022-09-13 Thread Ian Lance Taylor via Gcc-patches
On Tue, Sep 6, 2022 at 6:40 PM Ian Lance Taylor  wrote:
>
> This libgo patch ignores the __morestack function in runtime.Callers.
> We were ignoring all functions starting with "__morestack_", but not
> the function "__morestack" itself.  Without this change, some tests
> such as recover.go started failing recently, though I'm not sure
> exactly what changed.  Bootstrapped and ran Go testsuite on
> x86_64-pc-linux-gnu.  Committed to mainline.

I've also committed this patch to GCC 12 branch.

Ian


[committed] libgomp: Appease some static analyzers [PR106906]

2022-09-13 Thread Jakub Jelinek via Gcc-patches
Hi!

While icv_addr[1] = false; assignments where icv_addr has void *
element type is correct and matches how it is used (in those cases
the void * pointer is then cast to bool and used that way), there is no
reason not to add explicit (void *) casts there which are there already
for (void *) true.  And, there is in fact even no point in actually
doing those stores at all because we set that pointer to NULL a few
lines earlier.  So, this patch adds the explicit casts and then
comments those out to show intent.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2022-09-13  Jakub Jelinek  

PR libgomp/106906
* env.c (get_icv_member_addr): Cast false to void * before assigning
it to icv_addr[1], and comment the whole assignment out.

--- libgomp/env.c.jj2022-09-12 10:32:00.935086858 +0200
+++ libgomp/env.c   2022-09-12 13:27:22.893571697 +0200
@@ -1892,14 +1892,14 @@ get_icv_member_addr (struct gomp_initial
 {
 case GOMP_ICV_NTEAMS:
   icv_addr[0] = >nteams_var;
-  icv_addr[1] = false;
+  /* icv_addr[1] = (void *) false; */
   break;
 case GOMP_ICV_DYNAMIC:
   icv_addr[0] = &(*icvs).dyn_var;
   break;
 case GOMP_ICV_TEAMS_THREAD_LIMIT:
   icv_addr[0] = >teams_thread_limit_var;
-  icv_addr[1] = false;
+  /* icv_addr[1] = (void *) false; */
   break;
 case GOMP_ICV_SCHEDULE:
   icv_addr[0] = >run_sched_var;
@@ -1907,7 +1907,7 @@ get_icv_member_addr (struct gomp_initial
   break;
 case GOMP_ICV_THREAD_LIMIT:
   icv_addr[0] = >thread_limit_var;
-  icv_addr[1] = false;
+  /* icv_addr[1] = (void *) false; */
   icv_addr[2] = (void *) UINT_MAX;
   break;
 case GOMP_ICV_NTHREADS:

Jakub



[PATCH] c++: Implement C++23 P1169R4 - static operator() [PR106651]

2022-09-13 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch attempts to implement C++23 P1169R4 - static operator()
paper's compiler side (there is some small library side too not implemented
yet).  This allows static members as user operator() declarations and
static specifier on lambdas without lambda capture.  As decl specifier
parsing doesn't track about the presence and locations of all specifiers,
the patch unfortunately replaces the diagnostics about duplicate mutable
with diagnostics about conflicting specifiers because the information
whether it was mutable mutable, mutable static, static mutable or static
static is lost.  Beyond this, the synthetized conversion operator changes
for static lambdas as it can just return the operator() static method
address, doesn't need to create a thunk for it.
The change I'm least sure about is the call.cc (joust) change, one thing
is that we ICEd because we assumed that len could be different only if
both candidates are direct calls but it can be one direct and one indirect
call, and then I'm trying to implement my understanding of the
[over.match.best.general]/1 and [over.best.ics.general] changes from
the paper (implemented both for C++23 and when the static member function
is operator() which we accept with pedwarn in earlier standards too).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-09-13  Jakub Jelinek  

PR c++/106651
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_static_call_operator=202207L for C++23.
gcc/cp/
* cp-tree.h (LAMBDA_EXPR_STATIC_P): Implement C++23
P1169R4 - static operator().  Define.
* parser.cc (CP_PARSER_FLAGS_ONLY_MUTABLE_OR_CONSTEXPR): Document
that it also allows static.
(cp_parser_lambda_declarator_opt): Handle static lambda specifier.
(cp_parser_decl_specifier_seq): Allow RID_STATIC for
CP_PARSER_FLAGS_ONLY_MUTABLE_OR_CONSTEXPR.
* decl.cc (grok_op_properties): If operator() isn't a method,
use a different error wording, if it is static member function,
allow it (for C++20 and older with a pedwarn unless it is
a lambda function or template instantiation).
* call.cc (joust): Don't ICE if one candidate is static member
function and the other is an indirect call.  For C++23 or if
the static member is operator() and the parameter conversion on
the other candidate is user defined conversion, ellipsis or bad
conversion, make static member function candidate a winner for
that parameter. 
* lambda.cc (maybe_add_lambda_conv_op): Handle static lambdas.
* error.cc (dump_lambda_function): Print static for static lambdas.
gcc/testsuite/
* g++.dg/template/error30.C: Adjust expected diagnostics.
* g++.dg/cpp1z/constexpr-lambda13.C: Likewise.
* g++.dg/cpp23/feat-cxx2b.C: Test __cpp_static_call_operator.
* g++.dg/cpp23/static-operator-call1.C: New test.
* g++.dg/cpp23/static-operator-call2.C: New test.
* g++.old-deja/g++.jason/operator.C: Adjust expected diagnostics.

--- gcc/cp/cp-tree.h.jj 2022-09-13 09:21:28.052541628 +0200
+++ gcc/cp/cp-tree.h2022-09-13 12:14:31.674733861 +0200
@@ -504,6 +504,7 @@ extern GTY(()) tree cp_global_trees[CPTI
   OVL_NESTED_P (in OVERLOAD)
   DECL_MODULE_EXPORT_P (in _DECL)
   PACK_EXPANSION_FORCE_EXTRA_ARGS_P (in *_PACK_EXPANSION)
+  LAMBDA_EXPR_STATIC_P (in LAMBDA_EXPR)
4: IDENTIFIER_MARKED (IDENTIFIER_NODEs)
   TREE_HAS_CONSTRUCTOR (in INDIRECT_REF, SAVE_EXPR, CONSTRUCTOR,
  CALL_EXPR, or FIELD_DECL).
@@ -1488,6 +1489,10 @@ enum cp_lambda_default_capture_mode_type
 #define LAMBDA_EXPR_CAPTURE_OPTIMIZED(NODE) \
   TREE_LANG_FLAG_2 (LAMBDA_EXPR_CHECK (NODE))
 
+/* Predicate tracking whether the lambda was declared 'static'.  */
+#define LAMBDA_EXPR_STATIC_P(NODE) \
+  TREE_LANG_FLAG_3 (LAMBDA_EXPR_CHECK (NODE))
+
 /* True if this TREE_LIST in LAMBDA_EXPR_CAPTURE_LIST is for an explicit
capture.  */
 #define LAMBDA_CAPTURE_EXPLICIT_P(NODE) \
--- gcc/cp/parser.cc.jj 2022-09-13 09:21:01.276920558 +0200
+++ gcc/cp/parser.cc2022-09-13 12:14:31.683733738 +0200
@@ -1994,7 +1994,7 @@ enum
  constexpr.  */
   CP_PARSER_FLAGS_ONLY_TYPE_OR_CONSTEXPR = 0x8,
   /* When parsing a decl-specifier-seq, only allow mutable, constexpr or
- for C++20 consteval.  */
+ for C++20 consteval or for C++23 static.  */
   CP_PARSER_FLAGS_ONLY_MUTABLE_OR_CONSTEXPR = 0x10,
   /* When parsing a decl-specifier-seq, allow missing typename.  */
   CP_PARSER_FLAGS_TYPENAME_OPTIONAL = 0x20,
@@ -11714,13 +11714,26 @@ cp_parser_lambda_declarator_opt (cp_pars
   omitted_parms_loc = UNKNOWN_LOCATION;
 }
 
-  if (lambda_specs.storage_class == sc_mutable)
+  if (lambda_specs.storage_class == sc_mutable
+  || lambda_specs.storage_class == sc_static)
 {
-  LAMBDA_EXPR_MUTABLE_P (lambda_expr) = 1;
-  quals = TYPE_UNQUALIFIED;
+  if 

[PATCH] LoongArch: Prepare static PIE support

2022-09-13 Thread Xi Ruoyao via Gcc-patches
Static PIE allows us to extend the ASLR to cover static executables and
it's not too difficult to support it.  On GCC side, we just pass a group
of options to the linker, like other ports with static PIE support.

The real implementation of static PIE (rcrt1.o) will be added into Glibc
later.

gcc/ChangeLog:

* config/loongarch/gnu-user.h (GNU_USER_TARGET_LINK_SPEC): For
-static-pie, pass -static -pie --no-dynamic-linker -z text to
the linker, and do not pass --dynamic-linker.
---
 gcc/config/loongarch/gnu-user.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/gnu-user.h b/gcc/config/loongarch/gnu-user.h
index 664dc9206ad..c5b1afe530d 100644
--- a/gcc/config/loongarch/gnu-user.h
+++ b/gcc/config/loongarch/gnu-user.h
@@ -40,8 +40,10 @@ along with GCC; see the file COPYING3.  If not see
 #undef GNU_USER_TARGET_LINK_SPEC
 #define GNU_USER_TARGET_LINK_SPEC \
   "%{G*} %{shared} -m " GNU_USER_LINK_EMULATION \
-  "%{!shared: %{static} %{!static: %{rdynamic:-export-dynamic} " \
-  "-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}}"
+  "%{!shared: %{static} " \
+  "%{!static: %{!static-pie: %{rdynamic:-export-dynamic} " \
+  "-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}} " \
+  "%{static-pie: -static -pie --no-dynamic-linker -z text}}"
 
 
 /* Similar to standard Linux, but adding -ffast-math support.  */
-- 
2.37.3



[committed] c++: remove single-parameter version of mark_used

2022-09-13 Thread Patrick Palka via Gcc-patches
Tested on x86_64-pc-linux-gnu, committed to trunk as obvious.

gcc/cp/ChangeLog:

* cp-tree.h (mark_used): Remove single-parameter overload.  Add
default argument to the two-parameter overload.
* decl2.cc (mark_used): Likewise.
---
 gcc/cp/cp-tree.h | 4 ++--
 gcc/cp/decl2.cc  | 8 +---
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 1eb176d4a50..f19ecafc266 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6945,8 +6945,8 @@ extern bool decl_constant_var_p   (tree);
 extern bool decl_maybe_constant_var_p  (tree);
 extern void no_linkage_error   (tree);
 extern void check_default_args (tree);
-extern bool mark_used  (tree);
-extern bool mark_used  (tree, tsubst_flags_t);
+extern bool mark_used  (tree,
+tsubst_flags_t = 
tf_warning_or_error);
 extern bool mark_single_function   (tree, tsubst_flags_t);
 extern void finish_static_data_member_decl (tree, tree, bool, tree, int);
 extern tree cp_build_parm_decl (tree, tree, tree);
diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 684a2d06dde..9f18466192f 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -5575,7 +5575,7 @@ mark_single_function (tree expr, tsubst_flags_t complain)
wrong, true otherwise.  */
 
 bool
-mark_used (tree decl, tsubst_flags_t complain)
+mark_used (tree decl, tsubst_flags_t complain /* = tf_warning_or_error */)
 {
   /* If we're just testing conversions or resolving overloads, we
  don't want any permanent effects like forcing functions to be
@@ -5817,12 +5817,6 @@ mark_used (tree decl, tsubst_flags_t complain)
   return true;
 }
 
-bool
-mark_used (tree decl)
-{
-  return mark_used (decl, tf_warning_or_error);
-}
-
 tree
 vtv_start_verification_constructor_init_function (void)
 {
-- 
2.37.3.542.gdd3f6c4cae



Re: [PATCH, libgomp] Fix chunk_size<1 for dynamic schedule

2022-09-13 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 04, 2022 at 09:17:09PM +0800, Chung-Lin Tang wrote:
> On 2022/6/28 10:06 PM, Jakub Jelinek wrote:
> > On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:
> > > with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:
> > > 
> > > (1) chunk_size <= -1: wraps into large unsigned value, seems to work 
> > > though.
> > > (2) chunk_size == 0:  infinite loop
> > > 
> > > The (2) behavior is obviously not desired. This patch fixes this by 
> > > changing
> > 
> > Why?  It is a user error, undefined behavior, we shouldn't slow down valid
> > code for users who don't bother reading the standard.
> 
> This is loop init code, not per-iteration. The overhead really isn't that 
> much.

But still, we slow down valid code for the sake of garbage.
That is a task for sanitizers, not production libraries.
> 
> The question should be, if GCC having infinite loop behavior is reasonable,
> even if it is undefined in the spec.

Sure, it is perfectly valid implementation of the undefined behavior.
UB can leads to tons of surprising behavior, hangs etc. and this is exactly
the same category.

On Thu, Aug 04, 2022 at 01:31:50PM +, Koning, Paul via Gcc-patches wrote:
> I wouldn't think so.  The way I see "undefined code" is that you can't 
> complain
> about "wrong code" produced by the compiler.  But for the compiler to 
> malfunction
> on wrong input is an entirely differerent matter.  For one thing, it's hard 
> to fix
> your code if the compiler fails.  How would you locate the offending source 
> line?

The compiler isn't malfunctioning here.  It is similar to calling a library
function with bogus arguments, say memcpy with NULL source or destination or
some invalid pointer not pointing anywhere valid, etc.
The spec clearly says zero or negative chunk size is not valid, you use it,
you get what you ask for.  Furthermore, it is easy to find out when it hangs
on which construct it is and check if that construct is valid.

I'm strongly against slowing valid code for this.

If you want to implement -fsanitize=openmp and either in that case perform
checks on the generated code side or link with an instrumented version of
libgomp that explains users what errors they do, that is fine.

Jakub



Re: [PATCH] c++: some missing-SFINAE fixes

2022-09-13 Thread Jason Merrill via Gcc-patches

On 9/13/22 09:46, Patrick Palka wrote:

On Tue, 13 Sep 2022, Jason Merrill wrote:


On 9/13/22 07:45, Patrick Palka wrote:

It looks like we aren't respecting SFINAE for:

* an invalid/non-constant conditional explicit-specifier
* a non-constant conditional noexcept-specifier
* a non-constant argument to __integer_pack

This patch fixes these issues in the usual way, by passing complain
and propagating error_mark_node appropriately.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* decl.cc (build_explicit_specifier): Pass complain to
cxx_constant_value.
* except.cc (build_noexcept_spec): Likewise.
* pt.cc (expand_integer_pack): Likewise.
(tsubst_function_decl): Propagate error_mark_node returned
from build_explicit_specifier.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/noexcept-type26.C: New test.
* g++.dg/cpp2a/explicit19.C: New test.
* g++.dg/ext/integer-pack6.C: New test.
---
   gcc/cp/decl.cc   |  2 +-
   gcc/cp/except.cc |  2 +-
   gcc/cp/pt.cc |  6 --
   gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C | 12 
   gcc/testsuite/g++.dg/cpp2a/explicit19.C  | 12 
   gcc/testsuite/g++.dg/ext/integer-pack6.C | 13 +
   6 files changed, 43 insertions(+), 4 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit19.C
   create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack6.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 936f1cf0197..5404d7e084c 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t
complain)
   expr = build_converted_constant_bool_expr (expr, complain);
 expr = instantiate_non_dependent_expr (expr, complain);
-  expr = cxx_constant_value (expr);
+  expr = cxx_constant_value (expr, NULL_TREE, complain);


The patch is OK, but perhaps we want another overload that omits the object
parameter?


Thanks -- sounds good, like the following incremental patch?


OK.


-- >8 --

Subject: [PATCH] c++: two-parameter version of cxx_constant_value

Since some callers need the complain parameter but not the object
parameter, let's introduce and use an overload of cxx_constant_value
that omits the latter.

gcc/cp/ChangeLog:

* cp-tree.h (cxx_constant_value): Define two-parameter version
that omits the object parameter.
* decl.cc (build_explicit_specifier): Omit NULL_TREE object
argument to cxx_constant_value.
* except.cc (build_noexcept_spec): Likewise.
* pt.cc (expand_integer_pack): Likewise.
(fold_targs_r): Likewise.
* semantics.cc (finish_if_stmt_cond): Likewise.
---
  gcc/cp/cp-tree.h| 2 ++
  gcc/cp/decl.cc  | 2 +-
  gcc/cp/except.cc| 2 +-
  gcc/cp/pt.cc| 4 ++--
  gcc/cp/semantics.cc | 2 +-
  5 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 2ba44e80e20..1eb176d4a50 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8414,6 +8414,8 @@ extern bool require_rvalue_constant_expression (tree);
  extern bool require_potential_rvalue_constant_expression (tree);
  extern tree cxx_constant_value(tree, tree = NULL_TREE,
 tsubst_flags_t = tf_error);
+inline tree cxx_constant_value (tree t, tsubst_flags_t complain)
+{ return cxx_constant_value (t, NULL_TREE, complain); }
  extern void cxx_constant_dtor (tree, tree);
  extern tree cxx_constant_init (tree, tree = NULL_TREE);
  extern tree maybe_constant_value  (tree, tree = NULL_TREE, bool = 
false);
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 5404d7e084c..006e9affcba 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t 
complain)
  
expr = build_converted_constant_bool_expr (expr, complain);

expr = instantiate_non_dependent_expr (expr, complain);
-  expr = cxx_constant_value (expr, NULL_TREE, complain);
+  expr = cxx_constant_value (expr, complain);
return expr;
  }
  
diff --git a/gcc/cp/except.cc b/gcc/cp/except.cc

index 4d7f0ce102d..048612de400 100644
--- a/gcc/cp/except.cc
+++ b/gcc/cp/except.cc
@@ -1257,7 +1257,7 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain)
  {
expr = build_converted_constant_bool_expr (expr, complain);
expr = instantiate_non_dependent_expr (expr, complain);
-  expr = cxx_constant_value (expr, NULL_TREE, complain);
+  expr = cxx_constant_value (expr, complain);
  }
if (TREE_CODE (expr) == INTEGER_CST)
  {
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 074179288b6..db4e808adec 100644
--- a/gcc/cp/pt.cc
+++ 

Re: [PATCH] c++: some missing-SFINAE fixes

2022-09-13 Thread Patrick Palka via Gcc-patches
On Tue, 13 Sep 2022, Jason Merrill wrote:

> On 9/13/22 07:45, Patrick Palka wrote:
> > It looks like we aren't respecting SFINAE for:
> > 
> >* an invalid/non-constant conditional explicit-specifier
> >* a non-constant conditional noexcept-specifier
> >* a non-constant argument to __integer_pack
> > 
> > This patch fixes these issues in the usual way, by passing complain
> > and propagating error_mark_node appropriately.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > gcc/cp/ChangeLog:
> > 
> > * decl.cc (build_explicit_specifier): Pass complain to
> > cxx_constant_value.
> > * except.cc (build_noexcept_spec): Likewise.
> > * pt.cc (expand_integer_pack): Likewise.
> > (tsubst_function_decl): Propagate error_mark_node returned
> > from build_explicit_specifier.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp1z/noexcept-type26.C: New test.
> > * g++.dg/cpp2a/explicit19.C: New test.
> > * g++.dg/ext/integer-pack6.C: New test.
> > ---
> >   gcc/cp/decl.cc   |  2 +-
> >   gcc/cp/except.cc |  2 +-
> >   gcc/cp/pt.cc |  6 --
> >   gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C | 12 
> >   gcc/testsuite/g++.dg/cpp2a/explicit19.C  | 12 
> >   gcc/testsuite/g++.dg/ext/integer-pack6.C | 13 +
> >   6 files changed, 43 insertions(+), 4 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit19.C
> >   create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack6.C
> > 
> > diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
> > index 936f1cf0197..5404d7e084c 100644
> > --- a/gcc/cp/decl.cc
> > +++ b/gcc/cp/decl.cc
> > @@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t
> > complain)
> >   expr = build_converted_constant_bool_expr (expr, complain);
> > expr = instantiate_non_dependent_expr (expr, complain);
> > -  expr = cxx_constant_value (expr);
> > +  expr = cxx_constant_value (expr, NULL_TREE, complain);
> 
> The patch is OK, but perhaps we want another overload that omits the object
> parameter?

Thanks -- sounds good, like the following incremental patch?

-- >8 --

Subject: [PATCH] c++: two-parameter version of cxx_constant_value

Since some callers need the complain parameter but not the object
parameter, let's introduce and use an overload of cxx_constant_value
that omits the latter.

gcc/cp/ChangeLog:

* cp-tree.h (cxx_constant_value): Define two-parameter version
that omits the object parameter.
* decl.cc (build_explicit_specifier): Omit NULL_TREE object
argument to cxx_constant_value.
* except.cc (build_noexcept_spec): Likewise.
* pt.cc (expand_integer_pack): Likewise.
(fold_targs_r): Likewise.
* semantics.cc (finish_if_stmt_cond): Likewise.
---
 gcc/cp/cp-tree.h| 2 ++
 gcc/cp/decl.cc  | 2 +-
 gcc/cp/except.cc| 2 +-
 gcc/cp/pt.cc| 4 ++--
 gcc/cp/semantics.cc | 2 +-
 5 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 2ba44e80e20..1eb176d4a50 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8414,6 +8414,8 @@ extern bool require_rvalue_constant_expression (tree);
 extern bool require_potential_rvalue_constant_expression (tree);
 extern tree cxx_constant_value (tree, tree = NULL_TREE,
 tsubst_flags_t = tf_error);
+inline tree cxx_constant_value (tree t, tsubst_flags_t complain)
+{ return cxx_constant_value (t, NULL_TREE, complain); }
 extern void cxx_constant_dtor  (tree, tree);
 extern tree cxx_constant_init  (tree, tree = NULL_TREE);
 extern tree maybe_constant_value   (tree, tree = NULL_TREE, bool = 
false);
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 5404d7e084c..006e9affcba 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t 
complain)
 
   expr = build_converted_constant_bool_expr (expr, complain);
   expr = instantiate_non_dependent_expr (expr, complain);
-  expr = cxx_constant_value (expr, NULL_TREE, complain);
+  expr = cxx_constant_value (expr, complain);
   return expr;
 }
 
diff --git a/gcc/cp/except.cc b/gcc/cp/except.cc
index 4d7f0ce102d..048612de400 100644
--- a/gcc/cp/except.cc
+++ b/gcc/cp/except.cc
@@ -1257,7 +1257,7 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain)
 {
   expr = build_converted_constant_bool_expr (expr, complain);
   expr = instantiate_non_dependent_expr (expr, complain);
-  expr = cxx_constant_value (expr, NULL_TREE, complain);
+  expr = cxx_constant_value (expr, complain);
 }
   if (TREE_CODE (expr) == INTEGER_CST)
 {
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 

Re: [PATCH] c++: some missing-SFINAE fixes

2022-09-13 Thread Jason Merrill via Gcc-patches

On 9/13/22 07:45, Patrick Palka wrote:

It looks like we aren't respecting SFINAE for:

   * an invalid/non-constant conditional explicit-specifier
   * a non-constant conditional noexcept-specifier
   * a non-constant argument to __integer_pack

This patch fixes these issues in the usual way, by passing complain
and propagating error_mark_node appropriately.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* decl.cc (build_explicit_specifier): Pass complain to
cxx_constant_value.
* except.cc (build_noexcept_spec): Likewise.
* pt.cc (expand_integer_pack): Likewise.
(tsubst_function_decl): Propagate error_mark_node returned
from build_explicit_specifier.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/noexcept-type26.C: New test.
* g++.dg/cpp2a/explicit19.C: New test.
* g++.dg/ext/integer-pack6.C: New test.
---
  gcc/cp/decl.cc   |  2 +-
  gcc/cp/except.cc |  2 +-
  gcc/cp/pt.cc |  6 --
  gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C | 12 
  gcc/testsuite/g++.dg/cpp2a/explicit19.C  | 12 
  gcc/testsuite/g++.dg/ext/integer-pack6.C | 13 +
  6 files changed, 43 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit19.C
  create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack6.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 936f1cf0197..5404d7e084c 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t 
complain)
  
expr = build_converted_constant_bool_expr (expr, complain);

expr = instantiate_non_dependent_expr (expr, complain);
-  expr = cxx_constant_value (expr);
+  expr = cxx_constant_value (expr, NULL_TREE, complain);


The patch is OK, but perhaps we want another overload that omits the 
object parameter?



return expr;
  }
  
diff --git a/gcc/cp/except.cc b/gcc/cp/except.cc

index 7fdbc747c22..4d7f0ce102d 100644
--- a/gcc/cp/except.cc
+++ b/gcc/cp/except.cc
@@ -1257,7 +1257,7 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain)
  {
expr = build_converted_constant_bool_expr (expr, complain);
expr = instantiate_non_dependent_expr (expr, complain);
-  expr = cxx_constant_value (expr);
+  expr = cxx_constant_value (expr, NULL_TREE, complain);
  }
if (TREE_CODE (expr) == INTEGER_CST)
  {
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1c6f4b84612..f16f5ffca20 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -3869,7 +3869,7 @@ expand_integer_pack (tree call, tree args, tsubst_flags_t 
complain,
else
  {
hi = instantiate_non_dependent_expr (hi, complain);
-  hi = cxx_constant_value (hi);
+  hi = cxx_constant_value (hi, NULL_TREE, complain);
int len = valid_constant_size_p (hi) ? tree_to_shwi (hi) : -1;
  
/* Calculate the largest value of len that won't make the size of the vec

@@ -14312,7 +14312,9 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t 
complain,
/*function_p=*/false,
/*i_c_e_p=*/true);
spec = build_explicit_specifier (spec, complain);
-  if (instantiation_dependent_expression_p (spec))
+  if (spec == error_mark_node)
+   return error_mark_node;
+  else if (instantiation_dependent_expression_p (spec))
store_explicit_specifier (r, spec);
else
{
diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C 
b/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C
new file mode 100644
index 000..73058395f0a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C
@@ -0,0 +1,12 @@
+// Verify a non-constant conditional noexcept-specifier of a function type
+// is a SFINAE error.
+// { dg-do compile { target c++17 } }
+
+template void f(void() noexcept(T::value)) = delete;
+template void f(...);
+
+struct B { static bool value; };
+
+int main() {
+  f(nullptr);
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/explicit19.C 
b/gcc/testsuite/g++.dg/cpp2a/explicit19.C
new file mode 100644
index 000..47903813680
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/explicit19.C
@@ -0,0 +1,12 @@
+// Verify a conditional explicit-specifier is a SFINAE context.
+// { dg-do compile { target c++20 } }
+
+struct A {
+  template explicit(T::value) A(T) = delete;
+  A(...);
+};
+
+struct B { static bool value; };
+
+A x(0);
+A y(B{});
diff --git a/gcc/testsuite/g++.dg/ext/integer-pack6.C 
b/gcc/testsuite/g++.dg/ext/integer-pack6.C
new file mode 100644
index 000..7e16bc44b40
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/integer-pack6.C
@@ -0,0 +1,13 @@
+// Verify a non-constant argument to __integer_pack is a SFINAE error.
+// { dg-do compile { target c++11 } }
+

[PATCH] c++: some missing-SFINAE fixes

2022-09-13 Thread Patrick Palka via Gcc-patches
It looks like we aren't respecting SFINAE for:

  * an invalid/non-constant conditional explicit-specifier
  * a non-constant conditional noexcept-specifier
  * a non-constant argument to __integer_pack

This patch fixes these issues in the usual way, by passing complain
and propagating error_mark_node appropriately.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* decl.cc (build_explicit_specifier): Pass complain to
cxx_constant_value.
* except.cc (build_noexcept_spec): Likewise.
* pt.cc (expand_integer_pack): Likewise.
(tsubst_function_decl): Propagate error_mark_node returned
from build_explicit_specifier.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/noexcept-type26.C: New test.
* g++.dg/cpp2a/explicit19.C: New test.
* g++.dg/ext/integer-pack6.C: New test.
---
 gcc/cp/decl.cc   |  2 +-
 gcc/cp/except.cc |  2 +-
 gcc/cp/pt.cc |  6 --
 gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C | 12 
 gcc/testsuite/g++.dg/cpp2a/explicit19.C  | 12 
 gcc/testsuite/g++.dg/ext/integer-pack6.C | 13 +
 6 files changed, 43 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit19.C
 create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack6.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 936f1cf0197..5404d7e084c 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t 
complain)
 
   expr = build_converted_constant_bool_expr (expr, complain);
   expr = instantiate_non_dependent_expr (expr, complain);
-  expr = cxx_constant_value (expr);
+  expr = cxx_constant_value (expr, NULL_TREE, complain);
   return expr;
 }
 
diff --git a/gcc/cp/except.cc b/gcc/cp/except.cc
index 7fdbc747c22..4d7f0ce102d 100644
--- a/gcc/cp/except.cc
+++ b/gcc/cp/except.cc
@@ -1257,7 +1257,7 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain)
 {
   expr = build_converted_constant_bool_expr (expr, complain);
   expr = instantiate_non_dependent_expr (expr, complain);
-  expr = cxx_constant_value (expr);
+  expr = cxx_constant_value (expr, NULL_TREE, complain);
 }
   if (TREE_CODE (expr) == INTEGER_CST)
 {
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1c6f4b84612..f16f5ffca20 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -3869,7 +3869,7 @@ expand_integer_pack (tree call, tree args, tsubst_flags_t 
complain,
   else
 {
   hi = instantiate_non_dependent_expr (hi, complain);
-  hi = cxx_constant_value (hi);
+  hi = cxx_constant_value (hi, NULL_TREE, complain);
   int len = valid_constant_size_p (hi) ? tree_to_shwi (hi) : -1;
 
   /* Calculate the largest value of len that won't make the size of the vec
@@ -14312,7 +14312,9 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t 
complain,
/*function_p=*/false,
/*i_c_e_p=*/true);
   spec = build_explicit_specifier (spec, complain);
-  if (instantiation_dependent_expression_p (spec))
+  if (spec == error_mark_node)
+   return error_mark_node;
+  else if (instantiation_dependent_expression_p (spec))
store_explicit_specifier (r, spec);
   else
{
diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C 
b/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C
new file mode 100644
index 000..73058395f0a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C
@@ -0,0 +1,12 @@
+// Verify a non-constant conditional noexcept-specifier of a function type
+// is a SFINAE error.
+// { dg-do compile { target c++17 } }
+
+template void f(void() noexcept(T::value)) = delete;
+template void f(...);
+
+struct B { static bool value; };
+
+int main() {
+  f(nullptr);
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/explicit19.C 
b/gcc/testsuite/g++.dg/cpp2a/explicit19.C
new file mode 100644
index 000..47903813680
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/explicit19.C
@@ -0,0 +1,12 @@
+// Verify a conditional explicit-specifier is a SFINAE context.
+// { dg-do compile { target c++20 } }
+
+struct A {
+  template explicit(T::value) A(T) = delete;
+  A(...);
+};
+
+struct B { static bool value; };
+
+A x(0);
+A y(B{});
diff --git a/gcc/testsuite/g++.dg/ext/integer-pack6.C 
b/gcc/testsuite/g++.dg/ext/integer-pack6.C
new file mode 100644
index 000..7e16bc44b40
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/integer-pack6.C
@@ -0,0 +1,13 @@
+// Verify a non-constant argument to __integer_pack is a SFINAE error.
+// { dg-do compile { target c++11 } }
+
+template struct A { };
+
+template auto f(int) -> A<__integer_pack(T::value)...> = delete;
+template void f(...);
+
+struct B { static int value; };
+
+int main() {
+  f(0);

Re: [OG12][PATCH] openmp: Fix handling of target constructs in static member

2022-09-13 Thread Andrew Stubbs

On 13/09/2022 12:03, Paul-Antoine Arras wrote:

Hello,

This patch intends to backport e90af965e5c by Jakub Jelinek to 
devel/omp/gcc-12.
The original patch was described here: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601189.html


I've merged and committed it for you.

Andrew



Re: [PATCH 2/2] xtensa: Implement new target hook: TARGET_CONSTANT_OK_FOR_CPROP_P

2022-09-13 Thread Max Filippov via Gcc-patches
On Mon, Sep 12, 2022 at 8:00 PM Takayuki 'January June' Suwa
 wrote:
> On 2022/09/13 4:34, Max Filippov wrote:
> > On Sun, Sep 11, 2022 at 1:50 PM Takayuki 'January June' Suwa
> >  wrote:
> >>
> >> This patch implements new target hook TARGET_CONSTANT_OK_FOR_CPROP_P in
> >> order to exclude CONST_INTs that cannot fit into a MOVI machine instruction
> >> from cprop.
> >>
> >> gcc/ChangeLog:
> >>
> >> * config/xtensa/xtensa.c (TARGET_CONSTANT_OK_FOR_CPROP_P):
> >> New macro definition.
> >> (xtensa_constant_ok_for_cprop_p):
> >> Implement the hook as mentioned above.
> >> ---
> >>  gcc/config/xtensa/xtensa.cc | 20 +---
> >>  1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > Regtested for target=xtensa-linux-uclibc, no new regressions.
> > Committed to master.
> >
>
> Oops, sorry, this patch doesn't have the prerequisite patch merged in, so 
> please revert (that target hook isn't working yet).

OK, reverted.

-- 
Thanks.
-- Max


Re: [PATCH 4/4] libstdc++: Implement ranges::slide_view from P2442R1

2022-09-13 Thread Jonathan Wakely via Gcc-patches
On Mon, 12 Sept 2022 at 17:48, Patrick Palka via Libstdc++
 wrote:
>
> This also implements the LWG 3711 and 3712 changes to slide_view.

OK, thanks.



Re: [PATCH 3/4] libstdc++: Implement ranges::chunk_view from P2442R1

2022-09-13 Thread Jonathan Wakely via Gcc-patches
On Mon, 12 Sept 2022 at 17:48, Patrick Palka via Libstdc++
 wrote:
>
> This also implements the LWG 3707, 3710 and 3712 changes to chunk_view.


> +
> +  template
> +requires input_range<_Vp>
> +  class chunk_view<_Vp>::_OuterIter
> +  {
> +chunk_view* _M_parent;
> +
> +constexpr explicit
> +_OuterIter(chunk_view& __parent)

This can be noexcept.

> +  : _M_parent(std::__addressof(__parent))
> +{ }
> +
> +friend chunk_view;
> +
> +  public:
> +using iterator_concept = input_iterator_tag;
> +using difference_type = range_difference_t<_Vp>;
> +
> +struct value_type;
> +
> +_OuterIter(_OuterIter&&) = default;
> +_OuterIter& operator=(_OuterIter&&) = default;
> +
> +constexpr value_type
> +operator*() const
> +{
> +  __glibcxx_assert(*this != default_sentinel);
> +  return value_type(*_M_parent);
> +}
> +
> +constexpr _OuterIter&
> +operator++()
> +{
> +  __glibcxx_assert(*this != default_sentinel);
> +  ranges::advance(*_M_parent->_M_current, _M_parent->_M_remainder,
> + ranges::end(_M_parent->_M_base));
> +  _M_parent->_M_remainder = _M_parent->_M_n;
> +  return *this;
> +}
> +
> +constexpr void
> +operator++(int)
> +{ ++*this; }
> +
> +friend constexpr bool
> +operator==(const _OuterIter& __x, default_sentinel_t)
> +{
> +  return *__x._M_parent->_M_current == 
> ranges::end(__x._M_parent->_M_base)
> +   && __x._M_parent->_M_remainder != 0;
> +}
> +
> +friend constexpr difference_type
> +operator-(default_sentinel_t, const _OuterIter& __x)
> +requires sized_sentinel_for, iterator_t<_Vp>>
> +{
> +  const auto __dist = ranges::end(__x._M_parent->_M_base) - 
> *__x._M_parent->_M_current;
> +
> +  if (__dist < __x._M_parent->_M_remainder)
> +   return __dist == 0 ? 0 : 1;
> +
> +  return 1 + __detail::__div_ceil(__dist - __x._M_parent->_M_remainder,
> + __x._M_parent->_M_n);
> +}
> +
> +friend constexpr difference_type
> +operator-(const _OuterIter& __x, default_sentinel_t __y)
> +requires sized_sentinel_for, iterator_t<_Vp>>
> +{ return -(__y - __x); }
> +  };
> +
> +  template
> +requires input_range<_Vp>
> +  struct chunk_view<_Vp>::_OuterIter::value_type : view_interface
> +  {
> +  private:
> +chunk_view* _M_parent;
> +
> +constexpr explicit
> +value_type(chunk_view& __parent)

And this.

> +: _M_parent(std::__addressof(__parent))
> +{ }
> +
> +friend _OuterIter;
> +



> +  template
> +requires input_range<_Vp>
> +  class chunk_view<_Vp>::_InnerIter
> +  {
> +chunk_view* _M_parent;
> +
> +constexpr explicit
> +_InnerIter(chunk_view& __parent) noexcept

And this already is, so that's nice.


> +: _M_parent(std::__addressof(__parent))
> +{ }
> +
> +friend _OuterIter::value_type;
> +
> +  public:
> +using iterator_concept = input_iterator_tag;
> +using difference_type = range_difference_t<_Vp>;
> +using value_type = range_value_t<_Vp>;
> +
> +_InnerIter(_InnerIter&&) = default;
> +_InnerIter& operator=(_InnerIter&&) = default;
> +
> +constexpr const iterator_t<_Vp>&
> +base() const &
> +{ return *_M_parent->_M_current; }
> +
> +constexpr range_reference_t<_Vp>
> +operator*() const
> +{
> +  __glibcxx_assert(*this != default_sentinel);
> +  return **_M_parent->_M_current;
> +}
> +
> +constexpr _InnerIter&
> +operator++()
> +{
> +  __glibcxx_assert(*this != default_sentinel);
> +  ++*_M_parent->_M_current;
> +  if (*_M_parent->_M_current == ranges::end(_M_parent->_M_base))
> +   _M_parent->_M_remainder = 0;
> +  else
> +   --_M_parent->_M_remainder;
> +  return *this;
> +}
> +
> +constexpr void
> +operator++(int)
> +{ ++*this; }
> +
> +friend constexpr bool
> +operator==(const _InnerIter& __x, default_sentinel_t)

noexcept

OK with those tweaks.



[OG12][PATCH] openmp: Fix handling of target constructs in static member

2022-09-13 Thread Paul-Antoine Arras

Hello,

This patch intends to backport e90af965e5c by Jakub Jelinek to 
devel/omp/gcc-12.
The original patch was described here: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601189.html


Thanks,
--
Paul-Antoine ArrasFrom c1fb6ff897d0b929807d52cf52d4894e252e7d96 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek 
Date: Wed, 7 Sep 2022 08:54:13 +0200
Subject: [PATCH] openmp: Fix handling of target constructs in static member
 functions [PR106829]

Just calling current_nonlambda_class_type in static member functions returns
non-NULL, but something that isn't *this and if unlucky can match part of the
IL and can be added to target clauses.
  if (DECL_NONSTATIC_MEMBER_P (decl)
  && current_class_ptr)
is a guard used elsewhere (in check_accessibility_of_qualified_id).

2022-09-07  Jakub Jelinek  

PR c++/106829
* semantics.cc (finish_omp_target_clauses): If current_function_decl
isn't a nonstatic member function, don't set data.current_object to
non-NULL.

* g++.dg/gomp/pr106829.C: New test.

(cherry picked from commit e90af965e5c858ba02c0cdfbac35d0a19da1c2f6)
---
 gcc/cp/ChangeLog.omp | 10 ++
 gcc/cp/semantics.cc  | 17 -
 gcc/testsuite/ChangeLog.omp  |  8 
 gcc/testsuite/g++.dg/gomp/pr106829.C | 15 +++
 4 files changed, 41 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/gomp/pr106829.C

diff --git gcc/cp/ChangeLog.omp gcc/cp/ChangeLog.omp
index 35504f4c92b..c3214f31252 100644
--- gcc/cp/ChangeLog.omp
+++ gcc/cp/ChangeLog.omp
@@ -1,3 +1,13 @@
+2022-09-09  Paul-Antoine Arras  
+
+   Backport from mainline:
+   2022-09-07  Jakub Jelinek  
+
+   PR c++/106829
+   * semantics.cc (finish_omp_target_clauses): If current_function_decl
+   isn't a nonstatic member function, don't set data.current_object to
+   non-NULL.
+
 2022-09-07  Tobias Burnus  
 
Backport from mainline:
diff --git gcc/cp/semantics.cc gcc/cp/semantics.cc
index 7717a820f0d..dad04dc4757 100644
--- gcc/cp/semantics.cc
+++ gcc/cp/semantics.cc
@@ -9727,16 +9727,15 @@ finish_omp_target_clauses (location_t loc, tree body, 
tree *clauses_ptr)
 {
   omp_target_walk_data data;
   data.this_expr_accessed = false;
+  data.current_object = NULL_TREE;
 
-  tree ct = current_nonlambda_class_type ();
-  if (ct)
-{
-  tree object = maybe_dummy_object (ct, NULL);
-  object = maybe_resolve_dummy (object, true);
-  data.current_object = object;
-}
-  else
-data.current_object = NULL_TREE;
+  if (DECL_NONSTATIC_MEMBER_P (current_function_decl) && current_class_ptr)
+if (tree ct = current_nonlambda_class_type ())
+  {
+   tree object = maybe_dummy_object (ct, NULL);
+   object = maybe_resolve_dummy (object, true);
+   data.current_object = object;
+  }
 
   if (DECL_LAMBDA_FUNCTION_P (current_function_decl))
 {
diff --git gcc/testsuite/ChangeLog.omp gcc/testsuite/ChangeLog.omp
index ed42b20b099..9ca4d936ff5 100644
--- gcc/testsuite/ChangeLog.omp
+++ gcc/testsuite/ChangeLog.omp
@@ -1,3 +1,11 @@
+2022-09-09  Paul-Antoine Arras  
+
+   Backport from mainline:
+   2022-09-07  Jakub Jelinek  
+
+   PR c++/106829
+   * g++.dg/gomp/pr106829.C: New test.
+
 2022-09-07  Tobias Burnus  
 
Backport from mainline:
diff --git gcc/testsuite/g++.dg/gomp/pr106829.C 
gcc/testsuite/g++.dg/gomp/pr106829.C
new file mode 100644
index 000..0295efb88ee
--- /dev/null
+++ gcc/testsuite/g++.dg/gomp/pr106829.C
@@ -0,0 +1,15 @@
+// PR c++/106829
+
+namespace std
+{
+  template  class complex;
+  template <> struct complex { complex (double); _Complex double d; };
+}
+struct S { void static foo (); };
+
+void
+S::foo ()
+{
+#pragma omp target
+  std::complex c = 0.0;
+}
-- 
2.31.1



Re: [PATCH 2/4] libstdc++: Implement LWG 3569 changes to join_view::_Iterator

2022-09-13 Thread Jonathan Wakely via Gcc-patches
On Mon, 12 Sept 2022 at 17:46, Patrick Palka via Libstdc++
 wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk only?

I briefly wondered whether we could just use a union there and provide
the special members to init/copy/destroy it properly, but we already
use  in  so it's probably not worth it.

OK for trunk.

>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (join_view::_Iterator::_M_satisfy):
> Adjust resetting _M_inner as per LWG 3569.
> (join_view::_Iterator::_M_inner): Wrap in std::optional
> as per LWG 3569.
> (join_view::_Iterator::_Iterator): Relax constraints as
> per LWG 3569.
> (join_view::_Iterator::operator*): Adjust as per LWG 3569.
> (join_view::_Iterator::operator->): Likewise.
> (join_view::_Iterator::operator++): Likewise.
> (join_view::_Iterator::operator--): Likewise.
> (join_view::_Iterator::iter_move): Likewise.
> (join_view::_Iterator::iter_swap): Likewise.
> * testsuite/std/ranges/adaptor/join.cc (test14): New test.
> ---
>  libstdc++-v3/include/std/ranges   | 28 +--
>  .../testsuite/std/ranges/adaptors/join.cc | 17 +++
>  2 files changed, 30 insertions(+), 15 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index 20eb4e82ac8..6297ce7cee3 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -2746,7 +2746,7 @@ namespace views::__adaptor
>   }
>
> if constexpr (_S_ref_is_glvalue)
> - _M_inner = _Inner_iter();
> + _M_inner.reset();
>   }
>
>   static constexpr auto
> @@ -2769,7 +2769,7 @@ namespace views::__adaptor
>   using _Inner_iter = join_view::_Inner_iter<_Const>;
>
>   _Outer_iter _M_outer = _Outer_iter();
> - _Inner_iter _M_inner = _Inner_iter();
> + optional<_Inner_iter> _M_inner;
>   _Parent* _M_parent = nullptr;
>
> public:
> @@ -2780,9 +2780,7 @@ namespace views::__adaptor
> = common_type_t,
> range_difference_t>>;
>
> - _Iterator() requires (default_initializable<_Outer_iter>
> -   && default_initializable<_Inner_iter>)
> -   = default;
> + _Iterator() requires default_initializable<_Outer_iter> = default;
>
>   constexpr
>   _Iterator(_Parent* __parent, _Outer_iter __outer)
> @@ -2801,7 +2799,7 @@ namespace views::__adaptor
>
>   constexpr decltype(auto)
>   operator*() const
> - { return *_M_inner; }
> + { return **_M_inner; }
>
>   // _GLIBCXX_RESOLVE_LIB_DEFECTS
>   // 3500. join_view::iterator::operator->() is bogus
> @@ -2809,7 +2807,7 @@ namespace views::__adaptor
>   operator->() const
> requires __detail::__has_arrow<_Inner_iter>
>   && copyable<_Inner_iter>
> - { return _M_inner; }
> + { return *_M_inner; }
>
>   constexpr _Iterator&
>   operator++()
> @@ -2820,7 +2818,7 @@ namespace views::__adaptor
>   else
> return *_M_parent->_M_inner;
> }();
> -   if (++_M_inner == ranges::end(__inner_range))
> +   if (++*_M_inner == ranges::end(__inner_range))
>   {
> ++_M_outer;
> _M_satisfy();
> @@ -2850,9 +2848,9 @@ namespace views::__adaptor
>   {
> if (_M_outer == ranges::end(_M_parent->_M_base))
>   _M_inner = ranges::end(*--_M_outer);
> -   while (_M_inner == ranges::begin(*_M_outer))
> - _M_inner = ranges::end(*--_M_outer);
> -   --_M_inner;
> +   while (*_M_inner == ranges::begin(*_M_outer))
> + *_M_inner = ranges::end(*--_M_outer);
> +   --*_M_inner;
> return *this;
>   }
>
> @@ -2879,14 +2877,14 @@ namespace views::__adaptor
>
>   friend constexpr decltype(auto)
>   iter_move(const _Iterator& __i)
> - noexcept(noexcept(ranges::iter_move(__i._M_inner)))
> - { return ranges::iter_move(__i._M_inner); }
> + noexcept(noexcept(ranges::iter_move(*__i._M_inner)))
> + { return ranges::iter_move(*__i._M_inner); }
>
>   friend constexpr void
>   iter_swap(const _Iterator& __x, const _Iterator& __y)
> -   noexcept(noexcept(ranges::iter_swap(__x._M_inner, __y._M_inner)))
> +   noexcept(noexcept(ranges::iter_swap(*__x._M_inner, 
> *__y._M_inner)))
> requires indirectly_swappable<_Inner_iter>
> - { return ranges::iter_swap(__x._M_inner, __y._M_inner); }
> + { return ranges::iter_swap(*__x._M_inner, *__y._M_inner); }
>
>   friend _Iterator;
>   template friend struct _Sentinel;
> diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc 
> 

Re: [PATCH] libstdc++: Squelch -Wparentheses warning with debug iterators

2022-09-13 Thread Jonathan Wakely via Gcc-patches
On Mon, 12 Sept 2022 at 18:40, Patrick Palka via Libstdc++
 wrote:
>
> I noticed compiling e.g. std/ranges/adaptors/join.cc with
> -D_GLIBCXX_DEBUG -Wsystem-headers -Wall gives the warning:
>
>   gcc/libstdc++-v3/include/debug/safe_iterator.h:477:9: warning: suggest 
> parentheses around ‘&&’ within ‘||’ [-Wparentheses]
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

OK, thanks.


>
> libstdc++-v3/ChangeLog:
>
> * include/debug/safe_iterator.h (_GLIBCXX_DEBUG_VERIFY_OPERANDS):
> Add parentheses to squelch -Wparentheses.
> ---
>  libstdc++-v3/include/debug/safe_iterator.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/debug/safe_iterator.h 
> b/libstdc++-v3/include/debug/safe_iterator.h
> index 33f7a86478a..117dc93de60 100644
> --- a/libstdc++-v3/include/debug/safe_iterator.h
> +++ b/libstdc++-v3/include/debug/safe_iterator.h
> @@ -40,7 +40,7 @@
>  #endif
>
>  #define _GLIBCXX_DEBUG_VERIFY_OPERANDS(_Lhs, _Rhs, _BadMsgId, _DiffMsgId) \
> -  _GLIBCXX_DEBUG_VERIFY(!_Lhs._M_singular() && !_Rhs._M_singular() \
> +  _GLIBCXX_DEBUG_VERIFY((!_Lhs._M_singular() && !_Rhs._M_singular())   \
> || (_Lhs._M_value_initialized() \
> && _Rhs._M_value_initialized()),\
> _M_message(_BadMsgId)   \
> --
> 2.37.3.542.gdd3f6c4cae
>



Re: [PATCH] PR target/106877: Robustify reg-stack to malformed asm.

2022-09-13 Thread Uros Bizjak via Gcc-patches
On Tue, Sep 13, 2022 at 11:57 AM Roger Sayle  wrote:
>
>
> This patch resolves PR target/106877 an ICE-on-invalid inline-asm
> regression.  An innocent upstream change means that the test case
> from PR inline-asm/84683 now hits a different assert in reg-stack.cc's
> move_for_stack_reg.  Fixed by duplicating Jakub's solution to PR 84683
> https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495193.html
> at this second (similar) gcc_assert.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
>
>
> 2022-09-13  Roger Sayle  
>
> gcc/ChangeLog
> PR target/106877
> * reg-stack.cc (move_for_stack_reg): Check for any_malformed_asm
> in gcc_assert.
>
> gcc/testsuite/ChangeLog
> PR target/106877
> * g++.dg/ext/pr106877.C: New test case.

OK.

Thanks,
Uros.


[PATCH] PR target/106877: Robustify reg-stack to malformed asm.

2022-09-13 Thread Roger Sayle

This patch resolves PR target/106877 an ICE-on-invalid inline-asm
regression.  An innocent upstream change means that the test case
from PR inline-asm/84683 now hits a different assert in reg-stack.cc's
move_for_stack_reg.  Fixed by duplicating Jakub's solution to PR 84683
https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495193.html
at this second (similar) gcc_assert.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-09-13  Roger Sayle  

gcc/ChangeLog
PR target/106877
* reg-stack.cc (move_for_stack_reg): Check for any_malformed_asm
in gcc_assert.

gcc/testsuite/ChangeLog
PR target/106877
* g++.dg/ext/pr106877.C: New test case.


Thanks,
Roger
--

diff --git a/gcc/reg-stack.cc b/gcc/reg-stack.cc
index fd03250..95e0e61 100644
--- a/gcc/reg-stack.cc
+++ b/gcc/reg-stack.cc
@@ -1073,7 +1073,8 @@ move_for_stack_reg (rtx_insn *insn, stack_ptr regstack, 
rtx pat)
  break;
 
  /* The destination must be dead, or life analysis is borked.  */
- gcc_assert (get_hard_regnum (regstack, dest) < FIRST_STACK_REG);
+ gcc_assert (get_hard_regnum (regstack, dest) < FIRST_STACK_REG
+ || any_malformed_asm);
 
  /* If the source is not live, this is yet another case of
 uninitialized variables.  Load up a NaN instead.  */
diff --git a/gcc/testsuite/g++.dg/ext/pr106877.C 
b/gcc/testsuite/g++.dg/ext/pr106877.C
new file mode 100644
index 000..6bffed9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/pr106877.C
@@ -0,0 +1,13 @@
+// PR target/106877
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-O1 -m16 -mtune=sandybridge -flive-range-shrinkage -fno-dce" }
+
+void
+foo (float b, double c)
+{
+  for (int e = 0; e < 2; e++)
+{
+  asm volatile ("" : "+f" (c));// { dg-error "must specify a single 
register" }
+  asm ("" : "+rm" (c = b));
+}
+}


[V2 PATCH] RISC-V:Add '-m[no]-csr-check' option in gcc.

2022-09-13 Thread jiawei
From: Jiawei 

Add -m[no]-csr-check option in gcc part, when enable -mcsr-check option,
it will add csr-check in .option section and pass this to assembler.

V2: Add assembler support check info for -mcsr-check. Thanks for Kito's
suggestions.

gcc/ChangeLog:

* config.in: New def.
* config/riscv/riscv.cc (riscv_file_start): New .option.
* config/riscv/riscv.opt: New options.
* configure.ac: New check.
* doc/invoke.texi: New def.

---
 gcc/config.in  | 6 ++
 gcc/config/riscv/riscv.cc  | 5 +
 gcc/config/riscv/riscv.opt | 6 ++
 gcc/configure.ac   | 5 +
 gcc/doc/invoke.texi| 6 ++
 5 files changed, 28 insertions(+)

diff --git a/gcc/config.in b/gcc/config.in
index 9c53319b544..a4c39e1384d 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -616,6 +616,12 @@
 #endif
 
 
+/* Define if your assembler supports -mcsr-check. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_MCSR_CHECK
+#endif
+
+
 /* Define if your Mac OS X assembler supports -mllvm -x86-pad-for-align=false.
*/
 #ifndef USED_FOR_TARGET
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 675d92c0961..e98e6b1f561 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5135,6 +5135,11 @@ riscv_file_start (void)
   if (! riscv_mrelax)
 fprintf (asm_out_file, "\t.option norelax\n");
 
+  /* If the user specifies "-mcsr-check" on the command line then enable csr
+ check in the assembler.  */
+  if (riscv_mcsr_check)
+fprintf (asm_out_file, "\t.option csr-check\n");
+
   if (riscv_emit_attribute_p)
 riscv_emit_attribute ();
 }
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index fbca91b956c..3a12dd47310 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -132,6 +132,12 @@ Target Bool Var(riscv_mrelax) Init(1)
 Take advantage of linker relaxations to reduce the number of instructions
 required to materialize symbol addresses.
 
+mcsr-check
+Target Bool Var(riscv_mcsr_check) Init(1)
+Enable the CSR checking for the ISA-dependent CRS and the read-only CSR.
+The ISA-dependent CSR are only valid when the specific ISA is set.  The
+read-only CSR can not be written by the CSR instructions.
+
 Mask(64BIT)
 
 Mask(MUL)
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 50bb61c1b61..1a9288ee659 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -5269,6 +5269,11 @@ configured with --enable-newlib-nano-formatted-io.])
   [-march=rv32i_zifencei2p0],,,
   [AC_DEFINE(HAVE_AS_MARCH_ZIFENCEI, 1,
 [Define if the assembler understands -march=rv*_zifencei.])])
+gcc_GAS_CHECK_FEATURE([-mcsr-check],
+  gcc_cv_as_riscv_csr_check,
+  [-mcsr-check],,,
+  [AC_DEFINE(HAVE_AS_MCSR_CHECK, 1,
+[Define if the assembler understands -mcsr-check.])])
 ;;
 loongarch*-*-*)
 gcc_GAS_CHECK_FEATURE([.dtprelword support],
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index dd3302fcd15..7caade26b94 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1224,6 +1224,7 @@ See RS/6000 and PowerPC Options.
 -mbig-endian  -mlittle-endian @gol
 -mstack-protector-guard=@var{guard}  -mstack-protector-guard-reg=@var{reg} @gol
 -mstack-protector-guard-offset=@var{offset}}
+-mcsr-check -mno-csr-check @gol
 
 @emph{RL78 Options}
 @gccoptlist{-msim  -mmul=none  -mmul=g13  -mmul=g14  -mallregs @gol
@@ -28551,6 +28552,11 @@ linker relaxations.
 Emit (do not emit) RISC-V attribute to record extra information into ELF
 objects.  This feature requires at least binutils 2.32.
 
+@item -mcsr-check
+@itemx -mno-csr-check
+@opindex mcsr-check
+Enables or disables the CSR checking.
+
 @item -malign-data=@var{type}
 @opindex malign-data
 Control how GCC aligns variables and constants of array, structure, or union
-- 
2.34.1



[PATCH] middle-end/106909 - CTRL altering flag after folding

2022-09-13 Thread Richard Biener via Gcc-patches
The following makes sure to clear the CTRL altering flag when
folding emits a __builitin_unreachable in place of a virtual call
which now might become a trap.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR middle-end/106909
* gimple-fold.cc (gimple_fold_call): Clear the ctrl-altering
flag of a unreachable call.
---
 gcc/gimple-fold.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index a1704784bc9..9055cd8982d 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -5512,6 +5512,7 @@ gimple_fold_call (gimple_stmt_iterator *gsi, bool inplace)
{
  location_t loc = gimple_location (stmt);
  gimple *new_stmt = gimple_build_builtin_unreachable (loc);
+ gimple_call_set_ctrl_altering (new_stmt, false);
  /* If the call had a SSA name as lhs morph that into
 an uninitialized value.  */
  if (lhs && TREE_CODE (lhs) == SSA_NAME)
-- 
2.35.3


[PATCH] tree-optimization/106913 - ICE with -da and -Wuninitialized

2022-09-13 Thread Richard Biener via Gcc-patches
The following avoids setting and not clearing an auto_bb_flag
on EXIT_BLOCK which we don't verify for such stale flags but
dump_bb_info still asserts on them.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/106913
* tree-ssa-uninit.cc (warn_uninitialized_vars): Do not set
ft_reachable on EXIT_BLOCK.
---
 gcc/tree-ssa-uninit.cc | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-ssa-uninit.cc b/gcc/tree-ssa-uninit.cc
index 4a1c333d9cb..eae29f88f9d 100644
--- a/gcc/tree-ssa-uninit.cc
+++ b/gcc/tree-ssa-uninit.cc
@@ -1013,11 +1013,9 @@ warn_uninitialized_vars (bool wmaybe_uninit)
   if (ee)
bb = ee->dest;
   else
-   {
- bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb);
- if (!bb || bb->index == EXIT_BLOCK)
-   break;
-   }
+   bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb);
+  if (!bb || bb->index == EXIT_BLOCK)
+   break;
 }
 
   FOR_EACH_BB_FN (bb, cfun)
-- 
2.35.3


[pushed] aarch64: Vector move fixes for +nosimd

2022-09-13 Thread Richard Sandiford via Gcc-patches
This patch fixes various issues around the handling of vectors
and (particularly) vector structures with +nosimd.  Previously,
passing and returning structures would trigger an ICE, since:

* we didn't allow the structure modes to be stored in FPRs

* we didn't provide +nosimd move patterns

* splitting the moves into word-sized pieces (the default
  strategy without move patterns) doesn't work because the
  registers are doubleword sized.

The patch is a bit of a hodge-podge since a lot of the handling of
moves, register costs, and register legitimacy is so interconnected.
It didn't seem feasible to split things further.

Some notes:

* The patch recognises vector and tuple modes based on TARGET_FLOAT
  rather than TARGET_SIMD, and instead adds TARGET_SIMD to places
  that really do need the vector ISA.  This is necessary for the
  modes to be handled correctly in register arguments and returns.

* The 64-bit (DREG) STP peephole required TARGET_SIMD but the
  LDP peephole didn't.  I think the LDP one is right, since
  DREG moves could involve GPRs as well as FPRs.

* The patch keeps the existing choices of instructions for
  TARGET_SIMD, just in case they happen to be better than FMOV
  on some uarches.

* Before the patch, +nosimd Q<->Q moves of 128-bit scalars went via
  a GPR, thanks to a secondary reload pattern.  This approach might
  not be ideal, but there's no reason that 128-bit vectors should
  behave differently from 128-bit scalars.  The patch therefore
  extends the current scalar approach to vectors.

* Multi-vector LD1 and ST1 require TARGET_SIMD, so the TARGET_FLOAT
  structure moves need to use LDP/STP and LDR/STR combinations
  instead.  That's also what we do for big-endian even with
  TARGET_SIMD, so most of the code was already there.  The patterns
  for structures of 64-bit vectors are identical, but the patterns
  for structures of 128-bit vectors need to cope with the lack of
  128-bit Q<->Q moves.

  It isn't feasible to move multi-vector tuples via GPRs, so the
  patch moves them via memory instead.  This contaminates the port
  with its first secondary memory reload.

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/

* config/aarch64/aarch64.cc (aarch64_classify_vector_mode): Use
TARGET_FLOAT instead of TARGET_SIMD.
(aarch64_vectorize_related_mode): Restrict ADVSIMD handling to
TARGET_SIMD.
(aarch64_hard_regno_mode_ok): Don't allow tuples of 2 64-bit vectors
in GPRs.
(aarch64_classify_address): Treat little-endian structure moves
like big-endian for TARGET_FLOAT && !TARGET_SIMD.
(aarch64_secondary_memory_needed): New function.
(aarch64_secondary_reload): Handle 128-bit Advanced SIMD vectors
in the same way as TF, TI and TD.
(aarch64_rtx_mult_cost): Restrict ADVSIMD handling to TARGET_SIMD.
(aarch64_rtx_costs): Likewise.
(aarch64_register_move_cost): Treat a pair of 64-bit vectors
separately from a single 128-bit vector.  Handle the cost implied
by aarch64_secondary_memory_needed.
(aarch64_simd_valid_immediate): Restrict ADVSIMD handling to
TARGET_SIMD.
(aarch64_expand_vec_perm_const_1): Likewise.
(TARGET_SECONDARY_MEMORY_NEEDED): New macro.
* config/aarch64/iterators.md (VTX): New iterator.
* config/aarch64/aarch64.md (arches): Add fp_q as a synonym of simd.
(arch_enabled): Adjust accordingly.
(@aarch64_reload_mov): Extend to...
(@aarch64_reload_mov): ...this.
* config/aarch64/aarch64-simd.md (mov): Require TARGET_FLOAT
rather than TARGET_SIMD.
(movmisalign): Likewise.
(load_pair): Likewise.
(vec_store_pair): Likewise.
(load_pair): Likewise.
(vec_store_pair): Likewise.
(@aarch64_split_simd_mov): Likewise.
(aarch64_get_low): Likewise.
(aarch64_get_high): Likewise.
(aarch64_get_half): Likewise.  Canonicalize to a move for
lowpart extracts.
(*aarch64_simd_mov): Require TARGET_FLOAT rather than
TARGET_SIMD.  Use different w<-w and r<-w instructions for
!TARGET_SIMD.  Disable immediate moves for !TARGET_SIMD but
add an alternative specifically for w<-Z.
(*aarch64_simd_mov): Require TARGET_FLOAT rather than
TARGET_SIMD.  Likewise for the associated define_splits.  Disable
FPR moves and immediate moves for !TARGET_SIMD but add an alternative
specifically for w<-Z.
(aarch64_simd_mov_from_high): Require TARGET_FLOAT rather than
TARGET_SIMD.  Restrict the existing alternatives to TARGET_SIMD
but add a new r<-w one for !TARGET_SIMD.
(*aarch64_get_high): New pattern.
(load_pair_lanes): Require TARGET_FLOAT rather than TARGET_SIMD.
(store_pair_lanes): Likewise.
(*aarch64_combine_internal): Likewise.  Restrict existing
w<-w, w<-r and w<-m alternatives to TARGET_SIMD but 

[pushed] aarch64: Disassociate ls64 from simd

2022-09-13 Thread Richard Sandiford via Gcc-patches
The ls64-related move expanders and splits required TARGET_SIMD.
That isn't necessary, since the 64-byte values are stored entirely
in GPRs.  (The associated define_insn was already correct.)

I wondered about moving the patterns to aarch64.md, but it wasn't
clear-cut.

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/
* config/aarch64/aarch64-simd.md (movv8di): Remove TARGET_SIMD
condition.  Likewise for the related define_split.  Tweak formatting.

gcc/testsuite/
* gcc.target/aarch64/acle/ls64_asm_2.c: New test.
---
 gcc/config/aarch64/aarch64-simd.md | 18 +-
 .../gcc.target/aarch64/acle/ls64_asm_2.c   |  9 +
 2 files changed, 18 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/ls64_asm_2.c

diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 587a45d7772..d4662c76a58 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -7087,7 +7087,7 @@ (define_expand "mov"
 (define_expand "movv8di"
   [(set (match_operand:V8DI 0 "nonimmediate_operand")
(match_operand:V8DI 1 "general_operand"))]
-  "TARGET_SIMD"
+  ""
 {
   if (can_create_pseudo_p () && MEM_P (operands[0]))
 operands[1] = force_reg (V8DImode, operands[1]);
@@ -7479,7 +7479,7 @@ (define_split
 (define_split
   [(set (match_operand:V8DI 0 "nonimmediate_operand")
 (match_operand:V8DI 1 "general_operand"))]
-  "TARGET_SIMD && reload_completed"
+  "reload_completed"
   [(const_int 0)]
 {
   if (register_operand (operands[0], V8DImode)
@@ -7489,15 +7489,15 @@ (define_split
   DONE;
 }
   else if ((register_operand (operands[0], V8DImode)
-&& memory_operand (operands[1], V8DImode))
-   || (memory_operand (operands[0], V8DImode)
-&& register_operand (operands[1], V8DImode)))
+   && memory_operand (operands[1], V8DImode))
+  || (memory_operand (operands[0], V8DImode)
+  && register_operand (operands[1], V8DImode)))
 {
   for (int offset = 0; offset < 64; offset += 16)
-emit_move_insn (simplify_gen_subreg (TImode, operands[0],
- V8DImode, offset),
-simplify_gen_subreg (TImode, operands[1],
- V8DImode, offset));
+   emit_move_insn (simplify_gen_subreg (TImode, operands[0],
+V8DImode, offset),
+   simplify_gen_subreg (TImode, operands[1],
+V8DImode, offset));
   DONE;
 }
   else
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/ls64_asm_2.c 
b/gcc/testsuite/gcc.target/aarch64/acle/ls64_asm_2.c
new file mode 100644
index 000..1b4277180a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/ls64_asm_2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+#pragma GCC target "+ls64+nofp"
+
+#include "ls64_asm.c"
+
+/* { dg-final { scan-assembler-times {\tldp\t} 12 } } */
+/* { dg-final { scan-assembler-times {\tstp\t} 4 } } */
-- 
2.25.1



RE: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]

2022-09-13 Thread Kong, Lingling via Gcc-patches
Hi Richard,

Thanks you so much for reviewing this patch.  I really appreciate it. For these 
review comments, I have made some changes.

> That's a single-stmt match, you shouldn't use match.pd matching for this.
> Instead just do
> 
>   if (is_gimple_assign (stmt)
>   && ((code = gimple_assign_rhs_code (stmt)), true)
>   && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code ==
> BIT_XOR_EXPR))

Yes, I fixed it and dropped modification for match.pd.

> and pick gimple_assign_rhs{1,2} (stmt) as the operands.  The :c in bit_op:c is
> redundant btw. - while the name suggests "with invariant" you don't actually
> check for that.  But again, given canonicalization rules the invariant will 
> be rhs2
> so above add
> 
> && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST

For " with invariant", this needed op1 is invariant, and I used 
`expr_invariant_in_loop_p (loop, match_op[0])` for check.
And op2 just be PHI is ok. If op2 is INTEGER_CST, existing gcc can be directly 
optimized and do not need modification.

> you probably need dg-require-effective-target longlong, but is it necessary to
> use long long for the testcases in the first place?
> The IV seems to be unused, if it should match the variables bit size use 
> sizeof
> (type) * 8

Yes, It is not necessary to use long long for the testcases. I changed type to 
unsigned int.

> > +  inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge
> > + (loop));  return fold_build2 (code1, type, inv, match_op[0]); }
> 
> The } goes to the next line.

Sorry, It might be something wrong with my use of gcc send-email format.

> > +  tree bitinv_def;
> > +  if ((bitinv_def
> 
> please use else if here

Sorry, If use the else if here, there is no corresponding above if. I'm not 
sure if you mean change bitwise induction expression if to else if.

Do you agree with these changes?  Thanks again for taking a look.

Thanks,
Lingling

> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, August 23, 2022 3:27 PM
> To: Kong, Lingling 
> Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle bitop
> with an invariant induction.[PR105735]
> 
> On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches  patc...@gcc.gnu.org> wrote:
> >
> > Hi,
> >
> > This patch is for pr105735/pr101991. It will enable below optimization:
> > {
> > -  long unsigned int bit;
> > -
> > -   [local count: 32534376]:
> > -
> > -   [local count: 1041207449]:
> > -  # tmp_10 = PHI 
> > -  # bit_12 = PHI 
> > -  tmp_7 = bit2_6(D) & tmp_10;
> > -  bit_8 = bit_12 + 1;
> > -  if (bit_8 != 32)
> > -goto ; [96.97%]
> > -  else
> > -goto ; [3.03%]
> > -
> > -   [local count: 1009658865]:
> > -  goto ; [100.00%]
> > -
> > -   [local count: 32534376]:
> > -  # tmp_11 = PHI 
> > -  return tmp_11;
> > +  tmp_11 = tmp_4(D) & bit2_6(D);
> > +  return tmp_11;
> >
> > }
> >
> > Ok for master ?
> >
> > gcc/ChangeLog:
> >
> > PR middle-end/105735
> > * match.pd (bitop_with_inv_p): New match.
> > * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare.
> > (analyze_and_compute_bitop_with_inv_effect): New function.
> > (final_value_replacement_loop): Enhanced to handle bitop
> > with inv induction.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/pr105735-1.c: New test.
> > * gcc.target/i386/pr105735-2.c: New test.
> > ---
> >  gcc/match.pd   |  4 +
> >  gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 ++
> gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++
> >  gcc/tree-scalar-evolution.cc   | 59 +++
> >  4 files changed, 179 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd index
> > 562138a8034..cfe593ebb02 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -8050,6 +8050,10 @@ and,
> >   (bit_not
> >(nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 @2))
> > @3
> >
> > +(for bit_op (bit_and bit_ior bit_xor)  (match (bitop_with_inv_p @0
> > +@1)
> > +  (bit_op:c @0 @1)))
> > +
> 
> That's a single-stmt match, you shouldn't use match.pd matching for this.
> Instead just do
> 
>   if (is_gimple_assign (stmt)
>   && ((code = gimple_assign_rhs_code (stmt)), true)
>   && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code ==
> BIT_XOR_EXPR))
>..
> 
> and pick gimple_assign_rhs{1,2} (stmt) as the operands.  The :c in bit_op:c is
> redundant btw. - while the name suggests "with invariant" you don't actually
> check for that.  But again, given canonicalization rules the invariant will 
> be rhs2
> so above add
> 
> && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST
> 
> for example.
> 
> >  /* n - (((n > C1) ? n : C1) & -C2) ->  n & C1 for unsigned case.
> 

[committed] libgomp.texi: move item from gcn to nvptx

2022-09-13 Thread Tobias Burnus

Placed the reverse-offload -march=sm_30 vs. >=sm_35 remark accidentally
to the 'gcn' section. The surrounding wording it very similar, hence, I
did not spot it.

Committed as r13-2636.

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
commit eec36f27c3ca85d3b6b469e7161b63b69a5823ac
Author: Tobias Burnus 
Date:   Tue Sep 13 09:08:57 2022 +0200

libgomp.texi: move item from gcn to nvptx

I misplaced one remark into 'gcn' instead of 'nvptx' in
commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2

libgomp/ChangeLog:

* libgomp.texi (gcn): Move misplaced -march=sm_30 remark to ...
(nvptx): ... here.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 1f402d6df79..5aef987f00a 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -4386,9 +4386,6 @@ The implementation remark:
 @item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
   using the C library @code{printf} functions and the Fortran
   @code{print}/@code{write} statements.
-@item Compilation OpenMP code that contains @code{requires reverse_offload}
-  requires at least @code{-march=sm_35}, compiling for @code{-march=sm_30}
-  is not supported.
 @end itemize
 
 
@@ -4435,6 +4432,9 @@ The implementation remark:
 @item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
   using the C library @code{printf} functions. Note that the Fortran
   @code{print}/@code{write} statements are not supported, yet.
+@item Compilation OpenMP code that contains @code{requires reverse_offload}
+  requires at least @code{-march=sm_35}, compiling for @code{-march=sm_30}
+  is not supported.
 @end itemize
 
 


Re: [PATCH v2, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-09-13 Thread Iain Sandoe via Gcc-patches
Hi


> On 13 Sep 2022, at 03:34, HAO CHEN GUI via Gcc-patches 
>  wrote:

> On 7/9/2022 上午 1:19, Segher Boessenkool wrote:
>> make -k -j60 check 
>> RUNTESTFLAGS="--target_board=unix'{-m64,-m32,-m32/-mpowerpc64}'"
>> 
>> It is fine to not test -m32/-mpowerpc64 so often, and certaionly not
>> something I will ask everyone to always do :-)
> 
> IMO, if we add "-mpowerpc64" into dg-options, the "-m32/-mpowerpc64" will be 
> tested
> automatically. It will increase the test coverage. So the concern is it 
> increases test
> time?

Do you mean for the powerpc-target tests, or for the entire suite?
 - 50% increase is pretty significant to the entire suite.  It would probably 
only be acceptable on
the newer powerful machines.

The stategy would need to be a little more subtle than adding it everywhere  - 
it is not applicable
to 32b targets running on 32b hardware - however, it *is* potentially 
applicable to 32b targets
running on 64b hardware (I have one of those).



There are tricks you can play with dejagnu to automate this for your setup… by 
pointing the 
DEJAGNU environment to a script (or by using .dejagnurc in your home directory).

You can add customisation in this way, even to set different parameters per 
host.  This saves
me from remembering to put the RUNTESTFLAGS=… extras for the target on each 
test line.

— so here’s an example as I use it..

I have an environment/script setup like this (which I use to control what is 
done across a range of
machines with capabilities that vary by at least a factor of 10):

DEJAGNU=/path/to/gcc-runtest-site.exp

gcc-runtest-site.exp:

global target_list
global tool_timeout

case "$target_triplet" in {
   { "*-*-darwin[5-8]*" } {
set target_list { "unix" }
set tool_timeout 240
puts "tool timeout 3mins"
   }
   { "*-*-darwin9*" } {
set target_list { "unix{-m32,-m64}" }
set tool_timeout 240
   }
   { "i?86-*-darwin10*" } {
set target_list { "unix" }
   }
   { "powerpc-*-darwin10*" } {
set target_list { "unix" }
   }
   { "*-*-darwin1[01234567]*" } {
set target_list { "unix{-m64,-m32}" }
set tool_timeout 120
puts "gcc-runtest-site : tool timeout 2 mins"
   }
   { "*-*-darwin1[89]*" } {
set target_list { "unix" }
set tool_timeout 60
puts "gcc-runtest-site : tool timeout 1 min"
   }
   { "x86_64-*-linux-gnu" } {
set target_list { "unix{-m32,-m64}" }
set tool_timeout 60
   }



   default {
# Works for most unixy things and for darwin < 9.
   set target_list { "unix" }
   }
}

- sometimes I override this with specific values (especially for timeout) in 
~/..dejagnurc on
  very slow boxes.



0.02GBP only,

cheers
Iain



Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

2022-09-13 Thread Tobias Burnus

@Alexander/@Tom – Can you comment on both libgomp/config/nvptx + 
libgomp/plugin/plugin-nvptx.c ? (Comments on the rest are welcome, too)

(Updated patch enclosed)

Because Jakub asked:

I'm afraid you need Alexander or Tom here, I don't feel I can review it;
I could rubber stamp it if they are ok with it.


Regarding:

How did the standardization process for this feature look like, how did it pass
if it's not efficiently implementable for the major offloading targets?

It doesn't have to be implementable on all major offloading targets, it is 
enough when it can work on some. As one needs to request the reverse offloading 
through a declarative directive, it is always possible in that case to just 
pretend devices that don't support it don't exist.

First, I think it is better to provide a feature even if it is slow – than not 
providing it at all. Secondly, as Jakub mentioned, it is not required that all 
devices support this feature well. It is sufficient some do.

I believe on of the main uses is debugging and for this use, the performance is 
not critical. This patch attempts to have no overhead if the feature is not 
used (no 'omp requires reverse_offload' and no actual reverse offload).

Additionally, for GCN, it can be implemented with almost no overhead by using 
the feature used for I/O. (CUDA implements 'printf' internally – but does not 
permit to piggyback on this feature.)

* * *

I think in the future, we could additionally pass information to 
GOMP_target_ext whether a target region is known not to do reverse offload – 
both by checking what's in the region and by utilizing an 'absent(target)' 
assumption places in the outer target regsion on an '(begin) assume(s)' 
directive. That should at least help with the common case of having no reverse 
offload – even if it does not for some large kernel which does use reverse 
offload for non-debugging purpose (e.g. to trigger file I/O or inter-node 
communication).

* * *

Regarding the implementation: I left in 'usleep(1)' for now – 1µs seems to be 
not too bad and I have no idea what's better.

I also don't have any idea what's the overhead for accessing concurrently 
accessible managed memory from the host (is it on the host until requested from 
the device – or is it always on the device and needs to be copied/migrated to 
the host for every access). Likewise, I don't know how much overhead it is to 
D2H copy the memory via the second CUDA stream.

Suggestions are welcome. But as this code is strictly confined to a single 
function, it can easily be modified later.

Documentation: I have not mentioned caveats in 
https://gcc.gnu.org/onlinedocs/libgomp/nvptx.html as the reverse offload is not 
yet enabled, even with this patch.

On 26.08.22 11:07, Tobias Burnus wrote:

PRE-REMARK

As nvptx (and all other plugins) returns <= 0 for
GOMP_OFFLOAD_get_num_devices if GOMP_REQUIRES_REVERSE_OFFLOAD is
set. This patch is currently still a no op.

The patch is almost stand alone, except that it either needs a
 void *rev_fn_table = NULL;
in GOMP_OFFLOAD_load_image or the following patch:
 [Patch][2/3] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
 https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600348.html
(which in turn needs the '[1/3]' patch).

Not required to be compilable, but the patch is based on the ideas/code from
the reverse-offload ME patch; the latter adds calls to
 GOMP_target_ext (omp_initial_device,
which is for host fallback code processed by the normal target_ext and for
device code by the target_ext of this patch.
→ "[Patch] OpenMP: Support reverse offload (middle end part)"
 https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598662.html

* * *

This patch adds initial offloading support for nvptx.
When the nvptx's device GOMP_target_ext is called - it creates a lock,
fills a struct with the argument pointers (addr, kinds, sizes), its
device number and the set the function pointer address.

On the host side, the last address is checked - if fn_addr != NULL,
it passes all arguments on to the generic (target.c) gomp_target_rev
to do the actual offloading.

CUDA does lockup when trying to copy data from the currently running
stream; hence, a new stream is generated to do the memory copying.
Just having managed memory is not enough - it needs to be concurrently
accessible - otherwise, it will segfault on the host when migrated to
the device.

OK for mainline?

* * *

Future work for nvptx:
* Adjust 'sleep', possibly using different values with and without USM and
 to do shorter sleeps than usleep(1)?
* Set a flag whether there is any offload function at all, avoiding to run
 the more expensive check if there is 'requires reverse_offload' without
 actual reverse-offloading functions present.
 (Recall that the '2/3' patch, mentioned above, only has fn != NULL for
 reverse-offload functions.)
* Document → libgomp.texi that reverse offload may cause some performance
 overhead for all target regions. + That reverse