date:20240531

[gcc r15-965] [to-be-committed] [RISC-V] Use Zbkb for general 64 bit constants when profitable

2024-05-31 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:c0ded050cd29cc73f78cb4ab23674c7bc024969e

commit r15-965-gc0ded050cd29cc73f78cb4ab23674c7bc024969e
Author: Jeff Law 
Date:   Fri May 31 21:45:01 2024 -0600

[to-be-committed] [RISC-V] Use Zbkb for general 64 bit constants when 
profitable

Basically this adds the ability to generate two independent constants during
synthesis, then bring them together with a pack instruction. Thus we never 
need
to go out to the constant pool when zbkb is enabled. The worst sequence we 
ever
generate is

lui+addi+lui+addi+pack

Obviously if either half can be synthesized with just a lui or just an addi,
then we'll DTRT automagically.   So for example:

unsigned long foo_0xf857f2def857f2de(void) {
return 0x14252800;
}

The high and low halves are just a lui.  So the final synthesis is:

> li  a5,671088640# 15[c=4 l=4]  *movdi_64bit/1
> li  a0,337969152# 16[c=4 l=4]  *movdi_64bit/1
> packa0,a5,a0# 17[c=12 l=4]  riscv_xpack_di_si_2

On the implementation side, I think the bits I've put in here likely can be
used to handle the repeating constant case for !zbkb.  I think it likely 
could
be used to help capture cases where the upper half can be derived from the
lower half (say by turning a bit on or off, shifting or something similar).
The key in both of these cases is we need a temporary register holding an
intermediate value.

Ventana's internal tester enables zbkb, but I don't think any of the other
testers currently exercise zbkb.  We'll probably want to change that at some
point, but I don't think it's super-critical yet.

While I can envision a few more cases where we could improve constant
synthesis,   No immediate plans to work in this space, but if someone is
interested, some thoughts are recorded here:

> 
https://wiki.riseproject.dev/display/HOME/CT_00_031+--+Additional+Constant+Synthesis+Improvements

gcc/
* config/riscv/riscv.cc (riscv_integer_op): Add new field.
(riscv_build_integer_1): Initialize the new field.
(riscv_built_integer): Recognize more cases where Zbkb's
pack instruction is profitable.
(riscv_move_integer): Loop over all the codes.  If requested,
save the current constant into a temporary.  Generate pack
for more cases using the saved constant.

gcc/testsuite

* gcc.target/riscv/synthesis-10.c: New test.

Diff:
---
 gcc/config/riscv/riscv.cc | 108 ++
 gcc/testsuite/gcc.target/riscv/synthesis-10.c |  18 +
 2 files changed, 110 insertions(+), 16 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 91fefacee80..10af38a5a81 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -250,6 +250,7 @@ struct riscv_arg_info {
and each VALUE[i] is a constant integer.  CODE[0] is undefined.  */
 struct riscv_integer_op {
   bool use_uw;
+  bool save_temporary;
   enum rtx_code code;
   unsigned HOST_WIDE_INT value;
 };
@@ -759,6 +760,7 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
   codes[0].code = UNKNOWN;
   codes[0].value = value;
   codes[0].use_uw = false;
+  codes[0].save_temporary = false;
   return 1;
 }
   if (TARGET_ZBS && SINGLE_BIT_MASK_OPERAND (value))
@@ -767,6 +769,7 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
   codes[0].code = UNKNOWN;
   codes[0].value = value;
   codes[0].use_uw = false;
+  codes[0].save_temporary = false;
 
   /* RISC-V sign-extends all 32bit values that live in a 32bit
 register.  To avoid paradoxes, we thus need to use the
@@ -796,6 +799,7 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
  alt_codes[alt_cost-1].code = PLUS;
  alt_codes[alt_cost-1].value = low_part;
  alt_codes[alt_cost-1].use_uw = false;
+ alt_codes[alt_cost-1].save_temporary = false;
  memcpy (codes, alt_codes, sizeof (alt_codes));
  cost = alt_cost;
}
@@ -810,6 +814,7 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
  alt_codes[alt_cost-1].code = XOR;
  alt_codes[alt_cost-1].value = low_part;
  alt_codes[alt_cost-1].use_uw = false;
+ alt_codes[alt_cost-1].save_temporary = false;
  memcpy (codes, alt_codes, sizeof (alt_codes));
  cost = alt_cost;
}
@@ -852,6 +857,7 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
  alt_codes[alt_cost-1].code = ASHIFT;
  alt_codes[alt_cost-1].value = shift;
  alt_codes[alt_cost-1].use_uw = use_uw;
+ alt_codes[alt_cost-1].save_temporary = fal

[gcc r15-964] c++/modules: Fix revealing with using-decls [PR114867]

2024-05-31 Thread Nathaniel Shead via Gcc-cvs

https://gcc.gnu.org/g:85f15ea65a97686ad39af0c14b7dd9a9372e3a19

commit r15-964-g85f15ea65a97686ad39af0c14b7dd9a9372e3a19
Author: Nathaniel Shead 
Date:   Sat Jun 1 01:14:44 2024 +1000

c++/modules: Fix revealing with using-decls [PR114867]

This patch fixes a couple issues with the current handling of revealing
declarations with using-decls.

Firstly, doing 'remove_node' when handling function overload sets is not
safe, because it not only mutates the OVERLOAD we're walking over but
potentially any other references to this OVERLOAD that are cached from
phase-1 template lookup.  This causes the attached using-17 testcase to
fail because the overload set in 'X::test()' no longer contains the
'ns::f(T)' template once instantiated at the end of the file.

This patch works around this by simply not removing the old declaration.
This does make the overload list potentially longer than it otherwise
would have been, but only when re-exporting the same set of functions in
a using-decl.  Additionally, because 'ovl_insert' always prepends these
newly inserted overloads, repeated exported using-decls won't continue
to add declarations, as the first exported using-decl will be found
before the original (unexported) declaration.

Another, related, issue is that using-decls of GMF entities currently
doesn't mark them as reachable unless they are also exported, and thus
they may not be available in e.g. module implementation units.  We solve
this with a new flag on OVERLOADs set when they are declared within the
module purview.  This starts to run into the more general issue of
handling using-decls of non-functions (see e.g. PR114863) but by just
marking such GMF entities as purview we can work around this for now.

This also allows us to get rid of the special-casing of exported
using-decls in 'add_binding_entity', which was incorrect anyway: a
non-exported using-decl still needs to be emitted anyway if it lives in
the module purview, even if referring to a non-purview item.

PR c++/114867

gcc/cp/ChangeLog:

* cp-tree.h (OVL_PURVIEW_P): New.
(ovl_iterator::purview_p): New.
* module.cc (depset::hash::add_binding_entity): Only ignore
entities not within module purview. Set OVL_PURVIEW_P on new
OVERLOADs for emitted declarations.
(module_state::read_cluster): Imported using-decls are always
in purview, mark as OVL_PURVIEW_P.
* name-lookup.h (enum WMB_Flags): New WMB_Purview flag.
* name-lookup.cc (walk_module_binding): Set WMB_Purview as
needed.
(do_nonmember_using_decl): Don't remove from existing OVERLOADs.
Also reveal non-exported decls. Also reveal 'extern "C"' decls.
Add workaround to reveal non-function decls.
* tree.cc (ovl_insert): Adjust to also set OVL_PURVIEW_P when
needed.

gcc/testsuite/ChangeLog:

* g++.dg/modules/using-17_a.C: New test.
* g++.dg/modules/using-17_b.C: New test.
* g++.dg/modules/using-18_a.C: New test.
* g++.dg/modules/using-18_b.C: New test.

Signed-off-by: Nathaniel Shead 

Diff:
---
 gcc/cp/cp-tree.h  |   7 ++
 gcc/cp/module.cc  |   8 ++-
 gcc/cp/name-lookup.cc | 106 --
 gcc/cp/name-lookup.h  |   1 +
 gcc/cp/tree.cc|  12 ++--
 gcc/testsuite/g++.dg/modules/using-17_a.C |  31 +
 gcc/testsuite/g++.dg/modules/using-17_b.C |  13 
 gcc/testsuite/g++.dg/modules/using-18_a.C |  29 
 gcc/testsuite/g++.dg/modules/using-18_b.C |  11 
 9 files changed, 161 insertions(+), 57 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 6206482c602..565e4a9290e 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -813,6 +813,8 @@ typedef struct ptrmem_cst * ptrmem_cst_t;
 #define OVL_LOOKUP_P(NODE) TREE_LANG_FLAG_4 (OVERLOAD_CHECK (NODE))
 /* If set, this OVL_USING_P overload is exported.  */
 #define OVL_EXPORT_P(NODE) TREE_LANG_FLAG_5 (OVERLOAD_CHECK (NODE))
+/* If set, this OVL_USING_P overload is in the module purview.  */
+#define OVL_PURVIEW_P(NODE)(OVERLOAD_CHECK (NODE)->base.public_flag)
 /* If set, this overload includes name-independent declarations.  */
 #define OVL_NAME_INDEPENDENT_DECL_P(NODE) \
   TREE_LANG_FLAG_6 (OVERLOAD_CHECK (NODE))
@@ -887,6 +889,11 @@ class ovl_iterator {
 return (TREE_CODE (ovl) == USING_DECL
|| (TREE_CODE (ovl) == OVERLOAD && OVL_USING_P (ovl)));
   }
+  /* Whether this using is in the module purview.  */
+  bool purview_p () const
+  {
+return OVL_PURVIEW_P (get_using ());
+  }
   /* Whether this using is being exported.  */
   bool exporting_p () const
   {
dif

[gcc r15-963] vect: Bind input vectype to lane-reducing operation

2024-05-31 Thread Feng Xue via Gcc-cvs

https://gcc.gnu.org/g:d53f555edb95248dbf81347ba5e4136e9a491eca

commit r15-963-gd53f555edb95248dbf81347ba5e4136e9a491eca
Author: Feng Xue 
Date:   Wed May 29 16:41:57 2024 +0800

vect: Bind input vectype to lane-reducing operation

The input vectype is an attribute of lane-reducing operation, instead of
reduction PHI that it is associated to, since there might be more than one
lane-reducing operations with different type in a loop reduction chain. So
bind each lane-reducing operation with its own input type.

2024-05-29 Feng Xue 

gcc/
* tree-vect-loop.cc (vect_is_emulated_mixed_dot_prod): Remove 
parameter
loop_vinfo. Get input vectype from stmt_info instead of reduction 
PHI.
(vect_model_reduction_cost): Remove loop_vinfo argument of call to
vect_is_emulated_mixed_dot_prod.
(vect_transform_reduction): Likewise.
(vectorizable_reduction): Likewise, and bind input vectype to
lane-reducing operation.

Diff:
---
 gcc/tree-vect-loop.cc | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 7a6a6b6161d..5b85cffb37f 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -5270,8 +5270,7 @@ have_whole_vector_shift (machine_mode mode)
See vect_emulate_mixed_dot_prod for the actual sequence used.  */
 
 static bool
-vect_is_emulated_mixed_dot_prod (loop_vec_info loop_vinfo,
-stmt_vec_info stmt_info)
+vect_is_emulated_mixed_dot_prod (stmt_vec_info stmt_info)
 {
   gassign *assign = dyn_cast (stmt_info->stmt);
   if (!assign || gimple_assign_rhs_code (assign) != DOT_PROD_EXPR)
@@ -5282,10 +5281,9 @@ vect_is_emulated_mixed_dot_prod (loop_vec_info 
loop_vinfo,
   if (TYPE_SIGN (TREE_TYPE (rhs1)) == TYPE_SIGN (TREE_TYPE (rhs2)))
 return false;
 
-  stmt_vec_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
-  gcc_assert (reduc_info->is_reduc_info);
+  gcc_assert (STMT_VINFO_REDUC_VECTYPE_IN (stmt_info));
   return !directly_supported_p (DOT_PROD_EXPR,
-   STMT_VINFO_REDUC_VECTYPE_IN (reduc_info),
+   STMT_VINFO_REDUC_VECTYPE_IN (stmt_info),
optab_vector_mixed_sign);
 }
 
@@ -5324,8 +5322,8 @@ vect_model_reduction_cost (loop_vec_info loop_vinfo,
   if (!gimple_extract_op (orig_stmt_info->stmt, &op))
 gcc_unreachable ();
 
-  bool emulated_mixed_dot_prod
-= vect_is_emulated_mixed_dot_prod (loop_vinfo, stmt_info);
+  bool emulated_mixed_dot_prod = vect_is_emulated_mixed_dot_prod (stmt_info);
+
   if (reduction_type == EXTRACT_LAST_REDUCTION)
 /* No extra instructions are needed in the prologue.  The loop body
operations are costed in vectorizable_condition.  */
@@ -7837,6 +7835,11 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
 vectype_in = STMT_VINFO_VECTYPE (phi_info);
   STMT_VINFO_REDUC_VECTYPE_IN (reduc_info) = vectype_in;
 
+  /* Each lane-reducing operation has its own input vectype, while reduction
+ PHI records the input vectype with least lanes.  */
+  if (lane_reducing)
+STMT_VINFO_REDUC_VECTYPE_IN (stmt_info) = vectype_in;
+
   enum vect_reduction_type v_reduc_type = STMT_VINFO_REDUC_TYPE (phi_info);
   STMT_VINFO_REDUC_TYPE (reduc_info) = v_reduc_type;
   /* If we have a condition reduction, see if we can simplify it further.  */
@@ -8363,7 +8366,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   if (single_defuse_cycle || lane_reducing)
 {
   int factor = 1;
-  if (vect_is_emulated_mixed_dot_prod (loop_vinfo, stmt_info))
+  if (vect_is_emulated_mixed_dot_prod (stmt_info))
/* Three dot-products and a subtraction.  */
factor = 4;
   record_stmt_cost (cost_vec, ncopies * factor, vector_stmt,
@@ -8615,8 +8618,8 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
: &vec_oprnds2));
 }
 
-  bool emulated_mixed_dot_prod
-= vect_is_emulated_mixed_dot_prod (loop_vinfo, stmt_info);
+  bool emulated_mixed_dot_prod = vect_is_emulated_mixed_dot_prod (stmt_info);
+
   FOR_EACH_VEC_ELT (vec_oprnds0, i, def0)
 {
   gimple *new_stmt;

[gcc r15-962] vect: Split out partial vect checking for reduction into a function

2024-05-31 Thread Feng Xue via Gcc-cvs

https://gcc.gnu.org/g:79c3547b8adfdfdb2a167c1b9c9428902510adab

commit r15-962-g79c3547b8adfdfdb2a167c1b9c9428902510adab
Author: Feng Xue 
Date:   Wed May 29 13:45:09 2024 +0800

vect: Split out partial vect checking for reduction into a function

Partial vectorization checking for vectorizable_reduction is a piece of
relatively isolated code, which may be reused by other places. Move the
code into a new function for sharing.

2024-05-29 Feng Xue 

gcc/
* tree-vect-loop.cc (vect_reduction_update_partial_vector_usage): 
New
function.
(vectorizable_reduction): Move partial vectorization checking code 
to
vect_reduction_update_partial_vector_usage.

Diff:
---
 gcc/tree-vect-loop.cc | 137 --
 1 file changed, 77 insertions(+), 60 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a42d79c7cbf..7a6a6b6161d 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -7391,6 +7391,79 @@ build_vect_cond_expr (code_helper code, tree vop[3], 
tree mask,
 }
 }
 
+/* Given an operation with CODE in loop reduction path whose reduction PHI is
+   specified by REDUC_INFO, the operation has TYPE of scalar result, and its
+   input vectype is represented by VECTYPE_IN. The vectype of vectorized result
+   may be different from VECTYPE_IN, either in base type or vectype lanes,
+   lane-reducing operation is the case.  This function check if it is possible,
+   and how to perform partial vectorization on the operation in the context
+   of LOOP_VINFO.  */
+
+static void
+vect_reduction_update_partial_vector_usage (loop_vec_info loop_vinfo,
+   stmt_vec_info reduc_info,
+   slp_tree slp_node,
+   code_helper code, tree type,
+   tree vectype_in)
+{
+  enum vect_reduction_type reduc_type = STMT_VINFO_REDUC_TYPE (reduc_info);
+  internal_fn reduc_fn = STMT_VINFO_REDUC_FN (reduc_info);
+  internal_fn cond_fn = get_conditional_internal_fn (code, type);
+
+  if (reduc_type != FOLD_LEFT_REDUCTION
+  && !use_mask_by_cond_expr_p (code, cond_fn, vectype_in)
+  && (cond_fn == IFN_LAST
+ || !direct_internal_fn_supported_p (cond_fn, vectype_in,
+ OPTIMIZE_FOR_SPEED)))
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"can't operate on partial vectors because"
+" no conditional operation is available.\n");
+  LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
+}
+  else if (reduc_type == FOLD_LEFT_REDUCTION
+  && reduc_fn == IFN_LAST
+  && !expand_vec_cond_expr_p (vectype_in, truth_type_for (vectype_in),
+  SSA_NAME))
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+   "can't operate on partial vectors because"
+   " no conditional operation is available.\n");
+  LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
+}
+  else if (reduc_type == FOLD_LEFT_REDUCTION
+  && internal_fn_mask_index (reduc_fn) == -1
+  && FLOAT_TYPE_P (vectype_in)
+  && HONOR_SIGN_DEPENDENT_ROUNDING (vectype_in))
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"can't operate on partial vectors because"
+" signed zeros cannot be preserved.\n");
+  LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
+}
+  else
+{
+  internal_fn mask_reduc_fn
+   = get_masked_reduction_fn (reduc_fn, vectype_in);
+  vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
+  vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);
+  unsigned nvectors;
+
+  if (slp_node)
+   nvectors = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
+  else
+   nvectors = vect_get_num_copies (loop_vinfo, vectype_in);
+
+  if (mask_reduc_fn == IFN_MASK_LEN_FOLD_LEFT_PLUS)
+   vect_record_loop_len (loop_vinfo, lens, nvectors, vectype_in, 1);
+  else
+   vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype_in, NULL);
+}
+}
+
 /* Function vectorizable_reduction.
 
Check if STMT_INFO performs a reduction operation that can be vectorized.
@@ -7456,7 +7529,6 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   bool single_defuse_cycle = false;
   bool nested_cycle = false;
   bool double_reduc = false;
-  int vec_num;
   tree cr_index_scalar_type = NULL_TREE, cr_index_vector_type = NULL_TREE;
   tree cond_reduc_val = NULL_TREE;
 
@@ -8283,11 +8355,6 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
  return false;
}
 
-

[gcc r15-961] vect: Add a function to check lane-reducing code

2024-05-31 Thread Feng Xue via Gcc-cvs

https://gcc.gnu.org/g:c0f31701556c4162463f28bc0f03007f40a6176e

commit r15-961-gc0f31701556c4162463f28bc0f03007f40a6176e
Author: Feng Xue 
Date:   Wed May 29 13:12:12 2024 +0800

vect: Add a function to check lane-reducing code

Check if an operation is lane-reducing requires comparison of code against
three kinds (DOT_PROD_EXPR/WIDEN_SUM_EXPR/SAD_EXPR).  Add an utility
function to make source coding for the check handy and concise.

2024-05-29 Feng Xue 

gcc/
* tree-vectorizer.h (lane_reducing_op_p): New function.
* tree-vect-slp.cc (vect_analyze_slp): Use new function
lane_reducing_op_p to check statement code.
* tree-vect-loop.cc (vect_transform_reduction): Likewise.
(vectorizable_reduction): Likewise, and change name of a local
variable that holds the result flag.

Diff:
---
 gcc/tree-vect-loop.cc | 29 -
 gcc/tree-vect-slp.cc  |  4 +---
 gcc/tree-vectorizer.h |  6 ++
 3 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 04a9ac64df7..a42d79c7cbf 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -7650,9 +7650,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   gimple_match_op op;
   if (!gimple_extract_op (stmt_info->stmt, &op))
 gcc_unreachable ();
-  bool lane_reduc_code_p = (op.code == DOT_PROD_EXPR
-   || op.code == WIDEN_SUM_EXPR
-   || op.code == SAD_EXPR);
+  bool lane_reducing = lane_reducing_op_p (op.code);
 
   if (!POINTER_TYPE_P (op.type) && !INTEGRAL_TYPE_P (op.type)
   && !SCALAR_FLOAT_TYPE_P (op.type))
@@ -7664,7 +7662,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
 
   /* For lane-reducing ops we're reducing the number of reduction PHIs
  which means the only use of that may be in the lane-reducing operation.  
*/
-  if (lane_reduc_code_p
+  if (lane_reducing
   && reduc_chain_length != 1
   && !only_slp_reduc_chain)
 {
@@ -7678,7 +7676,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
  since we'll mix lanes belonging to different reductions.  But it's
  OK to use them in a reduction chain or when the reduction group
  has just one element.  */
-  if (lane_reduc_code_p
+  if (lane_reducing
   && slp_node
   && !REDUC_GROUP_FIRST_ELEMENT (stmt_info)
   && SLP_TREE_LANES (slp_node) > 1)
@@ -7738,7 +7736,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   /* To properly compute ncopies we are interested in the widest
 non-reduction input type in case we're looking at a widening
 accumulation that we later handle in vect_transform_reduction.  */
-  if (lane_reduc_code_p
+  if (lane_reducing
  && vectype_op[i]
  && (!vectype_in
  || (GET_MODE_SIZE (SCALAR_TYPE_MODE (TREE_TYPE (vectype_in)))
@@ -8211,7 +8209,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   && loop_vinfo->suggested_unroll_factor == 1)
 single_defuse_cycle = true;
 
-  if (single_defuse_cycle || lane_reduc_code_p)
+  if (single_defuse_cycle || lane_reducing)
 {
   gcc_assert (op.code != COND_EXPR);
 
@@ -8227,7 +8225,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
 mixed-sign dot-products can be implemented using signed
 dot-products.  */
   machine_mode vec_mode = TYPE_MODE (vectype_in);
-  if (!lane_reduc_code_p
+  if (!lane_reducing
  && !directly_supported_p (op.code, vectype_in, optab_vector))
 {
   if (dump_enabled_p ())
@@ -8252,7 +8250,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
  For the other cases try without the single cycle optimization.  */
   if (!ok)
{
- if (lane_reduc_code_p)
+ if (lane_reducing)
return false;
  else
single_defuse_cycle = false;
@@ -8263,7 +8261,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   /* If the reduction stmt is one of the patterns that have lane
  reduction embedded we cannot handle the case of ! single_defuse_cycle.  */
   if ((ncopies > 1 && ! single_defuse_cycle)
-  && lane_reduc_code_p)
+  && lane_reducing)
 {
   if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -8274,7 +8272,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
 
   if (slp_node
   && !(!single_defuse_cycle
-  && !lane_reduc_code_p
+  && !lane_reducing
   && reduction_type != FOLD_LEFT_REDUCTION))
 for (i = 0; i < (int) op.num_ops; i++)
   if (!vect_maybe_update_slp_op_vectype (slp_op[i], vectype_op[i]))
@@ -8295,7 +8293,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
   /* Cost the reduction op inside the loop if transformed via
  vect_transform_reduction.  Otherwise this is costed by the
  separate vectorizable_* routines.  */
-  if (single_de

[gcc r15-959] xtensa: Prepend "(use A0_REG)" to sibling call CALL_INSN_FUNCTION_USAGE instead of emitting it as in

2024-05-31 Thread Max Filippov via Gcc-cvs

https://gcc.gnu.org/g:be9b3f4375e74b6f10dd15fc563c93f803e91db5

commit r15-959-gbe9b3f4375e74b6f10dd15fc563c93f803e91db5
Author: Takayuki 'January June' Suwa 
Date:   Fri May 31 19:24:48 2024 +0900

xtensa: Prepend "(use A0_REG)" to sibling call CALL_INSN_FUNCTION_USAGE 
instead of emitting it as insn at the end of epilogue

No functional changes.

gcc/ChangeLog:

* config/xtensa/xtensa-protos.h (xtensa_expand_call):
Add the third argument as boolean.
(xtensa_expand_epilogue): Remove the first argument.
* config/xtensa/xtensa.cc (xtensa_expand_call):
Add the third argument "sibcall_p", and modify in order to prepend
"(use A0_REG)" to CALL_INSN_FUNCTION_USAGE if the argument is true.
(xtensa_expand_epilogue): Remove the first argument "sibcall_p" and
its conditional clause.
* config/xtensa/xtensa.md (call, call_value, sibcall, 
sibcall_value):
Append a boolean value to the argument of xtensa_expand_call()
indicating whether it is sibling call or not.
(epilogue): Remove the boolean argument from 
xtensa_expand_epilogue(),
and then append emitting "(return)".
(sibcall_epilogue): Remove the boolean argument from
xtensa_expand_epilogue().

Diff:
---
 gcc/config/xtensa/xtensa-protos.h |  4 ++--
 gcc/config/xtensa/xtensa.cc   | 16 ++--
 gcc/config/xtensa/xtensa.md   | 13 +++--
 3 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/gcc/config/xtensa/xtensa-protos.h 
b/gcc/config/xtensa/xtensa-protos.h
index 834f15e0e48..77553b0453f 100644
--- a/gcc/config/xtensa/xtensa-protos.h
+++ b/gcc/config/xtensa/xtensa-protos.h
@@ -53,7 +53,7 @@ extern void xtensa_expand_atomic (enum rtx_code, rtx, rtx, 
rtx, bool);
 extern void xtensa_emit_loop_end (rtx_insn *, rtx *);
 extern char *xtensa_emit_branch (bool, rtx *);
 extern char *xtensa_emit_movcc (bool, bool, bool, rtx *);
-extern void xtensa_expand_call (int, rtx *);
+extern void xtensa_expand_call (int, rtx *, bool);
 extern char *xtensa_emit_call (int, rtx *);
 extern char *xtensa_emit_sibcall (int, rtx *);
 extern bool xtensa_tls_referenced_p (rtx);
@@ -76,7 +76,7 @@ extern void xtensa_setup_frame_addresses (void);
 extern int xtensa_debugger_regno (int);
 extern long compute_frame_size (poly_int64);
 extern void xtensa_expand_prologue (void);
-extern void xtensa_expand_epilogue (bool);
+extern void xtensa_expand_epilogue (void);
 extern void xtensa_adjust_reg_alloc_order (void);
 extern enum reg_class xtensa_regno_to_class (int regno);
 extern HOST_WIDE_INT xtensa_initial_elimination_offset (int from, int to);
diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index 301448c65e2..45dc1be3ff5 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -2239,7 +2239,7 @@ xtensa_emit_movcc (bool inverted, bool isfp, bool isbool, 
rtx *operands)
 
 
 void
-xtensa_expand_call (int callop, rtx *operands)
+xtensa_expand_call (int callop, rtx *operands, bool sibcall_p)
 {
   rtx call;
   rtx_insn *call_insn;
@@ -2281,6 +2281,14 @@ xtensa_expand_call (int callop, rtx *operands)
   CALL_INSN_FUNCTION_USAGE (call_insn) =
gen_rtx_EXPR_LIST (Pmode, clob, CALL_INSN_FUNCTION_USAGE (call_insn));
 }
+  else if (sibcall_p)
+{
+  /* Sibling call requires a return address to the caller, similar to
+"return" insn.  */
+  rtx use = gen_rtx_USE (VOIDmode, gen_rtx_REG (SImode, A0_REG));
+  CALL_INSN_FUNCTION_USAGE (call_insn) =
+   gen_rtx_EXPR_LIST (Pmode, use, CALL_INSN_FUNCTION_USAGE (call_insn));
+}
 }
 
 
@@ -3671,7 +3679,7 @@ xtensa_expand_prologue (void)
 }
 
 void
-xtensa_expand_epilogue (bool sibcall_p)
+xtensa_expand_epilogue (void)
 {
   if (!TARGET_WINDOWED_ABI)
 {
@@ -3736,10 +3744,6 @@ xtensa_expand_epilogue (bool sibcall_p)
  stack_pointer_rtx,
  EH_RETURN_STACKADJ_RTX));
 }
-  if (sibcall_p)
-emit_use (gen_rtx_REG (SImode, A0_REG));
-  else
-emit_jump_insn (gen_return ());
 }
 
 void
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 03e816b8a12..ef826b63e44 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -2553,7 +2553,7 @@
 (match_operand 1 "" ""))]
   ""
 {
-  xtensa_expand_call (0, operands);
+  xtensa_expand_call (0, operands, false);
   DONE;
 })
 
@@ -2574,7 +2574,7 @@
  (match_operand 2 "" "")))]
   ""
 {
-  xtensa_expand_call (1, operands);
+  xtensa_expand_call (1, operands, false);
   DONE;
 })
 
@@ -2595,7 +2595,7 @@
 (match_operand 1 "" ""))]
   "!TARGET_WINDOWED_ABI"
 {
-  xtensa_expand_call (0, operands);
+  xtensa_expand_call (0, operands, true);
   DONE;
 })
 
@@ -2616,7 +2616,7 @@
  (match_operand 2 "" "")))]
   "!TARGET_WINDOWED_ABI"
 {
-  xtensa_expand_call (1, o

[gcc r15-958] xtensa: Simplify several MD templates

2024-05-31 Thread Max Filippov via Gcc-cvs

https://gcc.gnu.org/g:68cda24d3ac12292a599ff8f9b58fdbc95baba4e

commit r15-958-g68cda24d3ac12292a599ff8f9b58fdbc95baba4e
Author: Takayuki 'January June' Suwa 
Date:   Fri May 31 19:23:13 2024 +0900

xtensa: Simplify several MD templates

No functional changes.

gcc/ChangeLog:

* config/xtensa/predicates.md
(subreg_HQI_lowpart_operator, xtensa_sminmax_operator):
New operator predicates.
* config/xtensa/xtensa-protos.h (xtensa_match_CLAMPS_imms_p):
Remove.
* config/xtensa/xtensa.cc (xtensa_match_CLAMPS_imms_p): Ditto.
* config/xtensa/xtensa.md
(*addsubx, *extzvsi-1bit_ashlsi3, *extzvsi-1bit_addsubx):
Revise the output statements by conditional ternary operator rather
than switch-case clause in order to avoid using gcc_unreachable().
(xtensa_clamps): Reduce to a single pattern definition using the
predicate added above.
(Some split patterns to assist *masktrue_const_bitcmpl): Ditto.

Diff:
---
 gcc/config/xtensa/predicates.md   |  23 
 gcc/config/xtensa/xtensa-protos.h |   1 -
 gcc/config/xtensa/xtensa.cc   |  10 
 gcc/config/xtensa/xtensa.md   | 109 +++---
 4 files changed, 43 insertions(+), 100 deletions(-)

diff --git a/gcc/config/xtensa/predicates.md b/gcc/config/xtensa/predicates.md
index a296c7ecc99..19b9f4cd7ef 100644
--- a/gcc/config/xtensa/predicates.md
+++ b/gcc/config/xtensa/predicates.md
@@ -200,6 +200,29 @@
 (define_predicate "xtensa_bit_join_operator"
   (match_code "plus,ior"))
 
+(define_predicate "subreg_HQI_lowpart_operator"
+  (match_code "subreg")
+{
+  int ofs = SUBREG_BYTE (op), pos = 0;
+  switch (GET_MODE (op))
+{
+case QImode:
+  if (BYTES_BIG_ENDIAN)
+   pos = 3;
+  break;
+case HImode:
+  if (BYTES_BIG_ENDIAN)
+   pos = 2;
+  break;
+default:
+  return false;
+}
+  return ofs == pos;
+})
+
+(define_predicate "xtensa_sminmax_operator"
+  (match_code "smin,smax"))
+
 (define_predicate "tls_symbol_operand"
   (and (match_code "symbol_ref")
(match_test "SYMBOL_REF_TLS_MODEL (op) != 0")))
diff --git a/gcc/config/xtensa/xtensa-protos.h 
b/gcc/config/xtensa/xtensa-protos.h
index b87b3e8ac48..834f15e0e48 100644
--- a/gcc/config/xtensa/xtensa-protos.h
+++ b/gcc/config/xtensa/xtensa-protos.h
@@ -60,7 +60,6 @@ extern bool xtensa_tls_referenced_p (rtx);
 extern enum rtx_code xtensa_shlrd_which_direction (rtx, rtx);
 extern bool xtensa_split1_finished_p (void);
 extern void xtensa_split_DI_reg_imm (rtx *);
-extern bool xtensa_match_CLAMPS_imms_p (rtx, rtx);
 
 #ifdef TREE_CODE
 extern void init_cumulative_args (CUMULATIVE_ARGS *, int);
diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index 84268db5c9d..301448c65e2 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -2666,16 +2666,6 @@ xtensa_emit_add_imm (rtx dst, rtx src, HOST_WIDE_INT 
imm, rtx scratch,
 }
 
 
-/* Return true if the constants used in the application of smin() following
-   smax() meet the specifications of the CLAMPS machine instruction.  */
-bool
-xtensa_match_CLAMPS_imms_p (rtx cst_max, rtx cst_min)
-{
-  return IN_RANGE (exact_log2 (-INTVAL (cst_max)), 7, 22)
-&& (INTVAL (cst_max) + INTVAL (cst_min)) == -1;
-}
-
-
 /* Implement TARGET_CANNOT_FORCE_CONST_MEM.  */
 
 static bool
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 6061a86ee13..03e816b8a12 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -180,15 +180,8 @@
   "TARGET_ADDX"
 {
   operands[3] = GEN_INT (1 << INTVAL (operands[3]));
-  switch (GET_CODE (operands[4]))
-{
-case PLUS:
-  return "addx%3\t%0, %1, %2";
-case MINUS:
-  return "subx%3\t%0, %1, %2";
-default:
-  gcc_unreachable ();
-}
+  return GET_CODE (operands[4]) == PLUS
+ ? "addx%3\t%0, %1, %2" : "subx%3\t%0, %1, %2";
 }
   [(set_attr "type""arith")
(set_attr "mode""SI")
@@ -535,34 +528,23 @@
 
 ;; Signed clamp.
 
-(define_insn_and_split "*xtensa_clamps"
-  [(set (match_operand:SI 0 "register_operand" "=a")
-   (smax:SI (smin:SI (match_operand:SI 1 "register_operand" "r")
- (match_operand:SI 2 "const_int_operand" "i"))
-(match_operand:SI 3 "const_int_operand" "i")))]
-  "TARGET_MINMAX && TARGET_CLAMPS
-   && xtensa_match_CLAMPS_imms_p (operands[3], operands[2])"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-   (smin:SI (smax:SI (match_dup 1)
- (match_dup 3))
-(match_dup 2)))]
-  ""
-  [(set_attr "type""arith")
-   (set_attr "mode""SI")
-   (set_attr "length"  "3")])
-
 (define_insn "*xtensa_clamps"
   [(set (match_operand:SI 0 "register_operand" "=a")
-   (smin:SI (smax:SI (match_operand:SI 1 "register_operand" "r")
- (match_operand:SI 2 "const

[gcc r15-957] RISC-V: Remove dead perm series code and document.

2024-05-31 Thread Robin Dapp via Gcc-cvs

https://gcc.gnu.org/g:30cfdd6ff56972d9d1b9dbdd43a8333c85618775

commit r15-957-g30cfdd6ff56972d9d1b9dbdd43a8333c85618775
Author: Robin Dapp 
Date:   Fri May 17 12:48:52 2024 +0200

RISC-V: Remove dead perm series code and document.

With the introduction of shuffle_series_patterns the explicit handler
code for a perm series is dead.  This patch removes it and also adds
a function-level comment to shuffle_series_patterns.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Document.
(shuffle_extract_and_slide1up_patterns): Remove.

Diff:
---
 gcc/config/riscv/riscv-v.cc | 26 --
 1 file changed, 4 insertions(+), 22 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 9428beca268..948aaf7d8dd 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1485,28 +1485,6 @@ expand_const_vector (rtx target, rtx src)
  emit_vlmax_insn (code_for_pred_merge (mode), MERGE_OP, ops);
}
}
-  else if (npatterns == 1 && nelts_per_pattern == 3)
-   {
- /* Generate the following CONST_VECTOR:
-{ base0, base1, base1 + step, base1 + step * 2, ... }  */
- rtx base0 = builder.elt (0);
- rtx base1 = builder.elt (1);
- rtx base2 = builder.elt (2);
-
- rtx step = simplify_binary_operation (MINUS, builder.inner_mode (),
-   base2, base1);
-
- /* Step 1 - { base1, base1 + step, base1 + step * 2, ... }  */
- rtx tmp = gen_reg_rtx (mode);
- expand_vec_series (tmp, base1, step);
- /* Step 2 - { base0, base1, base1 + step, base1 + step * 2, ... }  */
- if (!rtx_equal_p (base0, const0_rtx))
-   base0 = force_reg (builder.inner_mode (), base0);
-
- insn_code icode = optab_handler (vec_shl_insert_optab, mode);
- gcc_assert (icode != CODE_FOR_nothing);
- emit_insn (GEN_FCN (icode) (target, tmp, base0));
-   }
   else
/* TODO: We will enable more variable-length vector in the future.  */
gcc_unreachable ();
@@ -3580,6 +3558,10 @@ shuffle_extract_and_slide1up_patterns (struct 
expand_vec_perm_d *d)
   return true;
 }
 
+/* This looks for a series pattern in the provided vector permute structure D.
+   If successful it emits a series insn as well as a gather to implement it.
+   Return true if successful, false otherwise.  */
+
 static bool
 shuffle_series_patterns (struct expand_vec_perm_d *d)
 {

[gcc r15-956] RISC-V: Add vector popcount, clz, ctz.

2024-05-31 Thread Robin Dapp via Gcc-cvs

https://gcc.gnu.org/g:6fa4b0135439d64c0ea1816594d7dc830e836376

commit r15-956-g6fa4b0135439d64c0ea1816594d7dc830e836376
Author: Robin Dapp 
Date:   Wed May 15 17:41:07 2024 +0200

RISC-V: Add vector popcount, clz, ctz.

This patch adds the zvbb vcpop, vclz and vctz to the autovec machinery
as well as tests for them.

gcc/ChangeLog:

* config/riscv/autovec.md (ctz2): New expander.
(clz2): Ditto.
* config/riscv/generic-vector-ooo.md: Add bitmanip ops to insn
reservation.
* config/riscv/vector-crypto.md: Add VLS modes to insns.
* config/riscv/vector.md: Add bitmanip ops to mode_idx and other
attributes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/popcount-1.c: Adjust check
for zvbb.
* gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/popcount-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/popcount-3.c: New test.
* gcc.target/riscv/rvv/autovec/unop/popcount-template.h: New test.
* gcc.target/riscv/rvv/autovec/unop/clz-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/clz-run.c: New test.
* gcc.target/riscv/rvv/autovec/unop/clz-template.h: New test.
* gcc.target/riscv/rvv/autovec/unop/ctz-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/ctz-run.c: New test.
* gcc.target/riscv/rvv/autovec/unop/ctz-template.h: New test.

Diff:
---
 gcc/config/riscv/autovec.md|  30 -
 gcc/config/riscv/generic-vector-ooo.md |   2 +-
 gcc/config/riscv/vector-crypto.md  | 137 +++--
 gcc/config/riscv/vector.md |  14 +--
 .../gcc.target/riscv/rvv/autovec/unop/clz-1.c  |   8 ++
 .../gcc.target/riscv/rvv/autovec/unop/clz-run.c|  36 ++
 .../riscv/rvv/autovec/unop/clz-template.h  |  21 
 .../gcc.target/riscv/rvv/autovec/unop/ctz-1.c  |   8 ++
 .../gcc.target/riscv/rvv/autovec/unop/ctz-run.c|  36 ++
 .../riscv/rvv/autovec/unop/ctz-template.h  |  21 
 .../gcc.target/riscv/rvv/autovec/unop/popcount-1.c |   4 +-
 .../gcc.target/riscv/rvv/autovec/unop/popcount-2.c |   4 +-
 .../gcc.target/riscv/rvv/autovec/unop/popcount-3.c |   8 ++
 .../riscv/rvv/autovec/unop/popcount-run-1.c|   3 +-
 .../riscv/rvv/autovec/unop/popcount-template.h |  21 
 15 files changed, 272 insertions(+), 81 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 87d4171bc89..15db26d52c6 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1566,7 +1566,7 @@
 })
 
 ;; 
---
-;; - [INT] POPCOUNT.
+;; - [INT] POPCOUNT, CTZ and CLZ.
 ;; 
---
 
 (define_expand "popcount2"
@@ -1574,10 +1574,36 @@
(match_operand:V_VLSI 1 "register_operand")]
   "TARGET_VECTOR"
 {
-  riscv_vector::expand_popcount (operands);
+  if (!TARGET_ZVBB)
+riscv_vector::expand_popcount (operands);
+  else
+{
+  riscv_vector::emit_vlmax_insn (code_for_pred_v (POPCOUNT, mode),
+riscv_vector::CPOP_OP, operands);
+}
   DONE;
 })
 
+(define_expand "ctz2"
+  [(match_operand:V_VLSI 0 "register_operand")
+   (match_operand:V_VLSI 1 "register_operand")]
+  "TARGET_ZVBB"
+  {
+riscv_vector::emit_vlmax_insn (code_for_pred_v (CTZ, mode),
+  riscv_vector::CPOP_OP, operands);
+DONE;
+})
+
+(define_expand "clz2"
+  [(match_operand:V_VLSI 0 "register_operand")
+   (match_operand:V_VLSI 1 "register_operand")]
+  "TARGET_ZVBB"
+  {
+riscv_vector::emit_vlmax_insn (code_for_pred_v (CLZ, mode),
+  riscv_vector::CPOP_OP, operands);
+DONE;
+})
+
 
 ;; -
 ;;  [INT] Highpart multiplication
diff --git a/gcc/config/riscv/generic-vector-ooo.md 
b/gcc/config/riscv/generic-vector-ooo.md
index 96cb1a0be29..5e933c83841 100644
--- a/gcc/config/riscv/generic-vector-ooo.md
+++ b/gcc/config/riscv/generic-vector-ooo.md
@@ -74,7 +74,7 @@
 
 ;; Vector crypto, assumed to be a generic operation for now.
 (define_insn_reservation "vec_crypto" 4
-  (eq_attr "type" "crypto")
+  (eq_attr "type" "crypto,vclz,vctz,vcpop")
   "vxu_ooo_issue,vxu_ooo_alu")
 
 ;; Vector crypto, AES
diff --git a/gcc/config/riscv/vector-crypto.md 
b/gcc/config/riscv/vector-crypto.md
index 0ddc2f3f3c6..17432b15815 100755
--- a/gcc/config/riscv/vector-crypto.md
+++ b/gcc/config/riscv/vector-crypto.md
@@ -99,42 +99,43 @@
 ;; vror.vv vror.vx vror.vi
 ;; vwsll.vv vwsll.vx vwsll.vi
 (define_insn "@pred_vandn"
-  [(set (match_operand:VI 0 "register_operand" "=vd, vr, vd, v

[gcc r15-955] RISC-V: Add vandn combine helper.

2024-05-31 Thread Robin Dapp via Gcc-cvs

https://gcc.gnu.org/g:f48448276f29a3823827292c72b7fc8e9cd39e1e

commit r15-955-gf48448276f29a3823827292c72b7fc8e9cd39e1e
Author: Robin Dapp 
Date:   Wed May 15 15:01:35 2024 +0200

RISC-V: Add vandn combine helper.

This patch adds a combine pattern for vandn as well as tests for it.

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*vandn_): New pattern.
* config/riscv/vector.md: Add vandn to mode_idx.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vandn-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vandn-run.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vandn-template.h: New test.

Diff:
---
 gcc/config/riscv/autovec-opt.md| 18 
 gcc/config/riscv/vector.md |  2 +-
 .../gcc.target/riscv/rvv/autovec/binop/vandn-1.c   |  8 
 .../gcc.target/riscv/rvv/autovec/binop/vandn-run.c | 54 ++
 .../riscv/rvv/autovec/binop/vandn-template.h   | 38 +++
 5 files changed, 119 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index bc6af042bcf..6a2eabbd854 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1591,3 +1591,21 @@
 DONE;
   }
   [(set_attr "type" "vwsll")])
+
+;; vnot + vand = vandn.
+(define_insn_and_split "*vandn_"
+ [(set (match_operand:V_VLSI 0 "register_operand" "=vr")
+   (and:V_VLSI
+(not:V_VLSI
+  (match_operand:V_VLSI  2 "register_operand"  "vr"))
+(match_operand:V_VLSI1 "register_operand"  "vr")))]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+insn_code icode = code_for_pred_vandn (mode);
+riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+DONE;
+  }
+  [(set_attr "type" "vandn")])
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 69423be6917..c15af17ec62 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -743,7 +743,7 @@
vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,\

vfcvtitof,vfncvtitof,vfncvtftoi,vfncvtftof,vmalu,vmiota,vmidx,\

vimovxv,vfmovfv,vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,\
-   vgather,vcompress,vmov,vnclip,vnshift")
+   vgather,vcompress,vmov,vnclip,vnshift,vandn")
   (const_int 0)
 
   (eq_attr "type" "vimovvx,vfmovvf")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vandn-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vandn-1.c
new file mode 100644
index 000..3bb5bf8dd5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vandn-1.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-add-options "riscv_v" } */
+/* { dg-add-options "riscv_zvbb" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */
+
+#include "vandn-template.h"
+
+/* { dg-final { scan-assembler-times {\tvandn\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vandn-run.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vandn-run.c
new file mode 100644
index 000..243c5975068
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vandn-run.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target "riscv_zvbb_ok" } */
+/* { dg-add-options "riscv_v" } */
+/* { dg-add-options "riscv_zvbb" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */
+
+#include "vandn-template.h"
+
+#include 
+
+#define SZ 512
+
+#define RUN(TYPE, VAL) 
\
+  TYPE a##TYPE[SZ];
\
+  TYPE b##TYPE[SZ];
\
+  for (int i = 0; i < SZ; i++) 
\
+{  
\
+  a##TYPE[i] = 123;
\
+  b##TYPE[i] = VAL;
\
+}  
\
+  vandn_##TYPE (a##TYPE, a##TYPE, b##TYPE, SZ);
\
+  for (int i = 0; i < SZ; i++) 
\
+assert (a##TYPE[i] == (TYPE) (123 & ~VAL));
+
+#define RUN2(TYPE, VAL)
\
+  TYPE as##TYPE[SZ];   
\
+  for (int i = 0; i < SZ; i++) 
\
+as##TYPE[i] = 123; 
\
+  vandns_##TYPE (as##TYPE, as##TYPE, VAL, SZ);

[gcc r15-954] RISC-V: Use widening shift for scatter/gather if applicable.

2024-05-31 Thread Robin Dapp via Gcc-cvs

https://gcc.gnu.org/g:309ee005aa871286c8daccbce7586f82be347440

commit r15-954-g309ee005aa871286c8daccbce7586f82be347440
Author: Robin Dapp 
Date:   Fri May 10 13:37:03 2024 +0200

RISC-V: Use widening shift for scatter/gather if applicable.

With the zvbb extension we can emit a widening shift for scatter/gather
index preparation in case we need to multiply by 2 and zero extend.

The patch also adds vwsll to the mode_idx attribute and removes the
mode from shift-count operand of the insn pattern.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_gather_scatter): Use vwsll if
applicable.
* config/riscv/vector-crypto.md: Remove mode from vwsll shift
count operator.
* config/riscv/vector.md: Add vwsll to mode iterator.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add zvbb.
* 
gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-12-zvbb.c: New test.

Diff:
---
 gcc/config/riscv/riscv-v.cc|  42 +---
 gcc/config/riscv/vector-crypto.md  |   4 +-
 gcc/config/riscv/vector.md |   4 +-
 .../gather-scatter/gather_load_64-12-zvbb.c| 113 +
 gcc/testsuite/lib/target-supports.exp  |  48 -
 5 files changed, 193 insertions(+), 18 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index f105f470495..9428beca268 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4016,7 +4016,7 @@ expand_gather_scatter (rtx *ops, bool is_load)
 {
   rtx ptr, vec_offset, vec_reg;
   bool zero_extend_p;
-  int scale_log2;
+  int shift;
   rtx mask = ops[5];
   rtx len = ops[6];
   if (is_load)
@@ -4025,7 +4025,7 @@ expand_gather_scatter (rtx *ops, bool is_load)
   ptr = ops[1];
   vec_offset = ops[2];
   zero_extend_p = INTVAL (ops[3]);
-  scale_log2 = exact_log2 (INTVAL (ops[4]));
+  shift = exact_log2 (INTVAL (ops[4]));
 }
   else
 {
@@ -4033,7 +4033,7 @@ expand_gather_scatter (rtx *ops, bool is_load)
   ptr = ops[0];
   vec_offset = ops[1];
   zero_extend_p = INTVAL (ops[2]);
-  scale_log2 = exact_log2 (INTVAL (ops[3]));
+  shift = exact_log2 (INTVAL (ops[3]));
 }
 
   machine_mode vec_mode = GET_MODE (vec_reg);
@@ -4043,9 +4043,12 @@ expand_gather_scatter (rtx *ops, bool is_load)
   poly_int64 nunits = GET_MODE_NUNITS (vec_mode);
   bool is_vlmax = is_vlmax_len_p (vec_mode, len);
 
+  bool use_widening_shift = false;
+
   /* Extend the offset element to address width.  */
   if (inner_offsize < BITS_PER_WORD)
 {
+  use_widening_shift = TARGET_ZVBB && zero_extend_p && shift == 1;
   /* 7.2. Vector Load/Store Addressing Modes.
 If the vector offset elements are narrower than XLEN, they are
 zero-extended to XLEN before adding to the ptr effective address. If
@@ -4054,8 +4057,8 @@ expand_gather_scatter (rtx *ops, bool is_load)
 raise an illegal instruction exception if the EEW is not supported for
 offset elements.
 
-RVV spec only refers to the scale_log == 0 case.  */
-  if (!zero_extend_p || scale_log2 != 0)
+RVV spec only refers to the shift == 0 case.  */
+  if (!zero_extend_p || shift)
{
  if (zero_extend_p)
inner_idx_mode
@@ -4064,19 +4067,32 @@ expand_gather_scatter (rtx *ops, bool is_load)
inner_idx_mode = int_mode_for_size (BITS_PER_WORD, 0).require ();
  machine_mode new_idx_mode
= get_vector_mode (inner_idx_mode, nunits).require ();
- rtx tmp = gen_reg_rtx (new_idx_mode);
- emit_insn (gen_extend_insn (tmp, vec_offset, new_idx_mode, idx_mode,
- zero_extend_p ? true : false));
- vec_offset = tmp;
+ if (!use_widening_shift)
+   {
+ rtx tmp = gen_reg_rtx (new_idx_mode);
+ emit_insn (gen_extend_insn (tmp, vec_offset, new_idx_mode, 
idx_mode,
+ zero_extend_p ? true : false));
+ vec_offset = tmp;
+   }
  idx_mode = new_idx_mode;
}
 }
 
-  if (scale_log2 != 0)
+  if (shift)
 {
-  rtx tmp = expand_binop (idx_mode, ashl_optab, vec_offset,
- gen_int_mode (scale_log2, Pmode), NULL_RTX, 0,
- OPTAB_DIRECT);
+  rtx tmp;
+  if (!use_widening_shift)
+   tmp = expand_binop (idx_mode, ashl_optab, vec_offset,
+   gen_int_mode (shift, Pmode), NULL_RTX, 0,
+   OPTAB_DIRECT);
+  else
+   {
+ tmp = gen_reg_rtx (idx_mode);
+ insn_code icode = code_for_pred_vwsll_scalar (idx_mode);
+ rtx ops[] = {tmp, vec_offset, const1_rtx};
+ emit_vlmax_insn (icode, BINARY_OP, ops);
+   }
+
   vec_offset = tmp;
 }
 
diff -

[gcc r15-953] RISC-V: Add vwsll combine helpers.

2024-05-31 Thread Robin Dapp via Gcc-cvs

https://gcc.gnu.org/g:af4bf422a699de0e7af5a26e02997d313e7301a6

commit r15-953-gaf4bf422a699de0e7af5a26e02997d313e7301a6
Author: Robin Dapp 
Date:   Mon May 13 22:09:35 2024 +0200

RISC-V: Add vwsll combine helpers.

This patch enables the usage of vwsll in autovec context by adding the
necessary combine patterns and tests.

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*vwsll_zext1_): New
pattern.
(*vwsll_zext2_): Ditto.
(*vwsll_zext1_scalar_): Ditto.
(*vwsll_zext1_trunc_): Ditto.
(*vwsll_zext2_trunc_): Ditto.
(*vwsll_zext1_trunc_scalar_): Ditto.
* config/riscv/vector-crypto.md: Make pattern similar to other
narrowing/widening patterns.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vwsll-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vwsll-run.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vwsll-template.h: New test.

Diff:
---
 gcc/config/riscv/autovec-opt.md| 126 -
 gcc/config/riscv/vector-crypto.md  |   2 +-
 .../gcc.target/riscv/rvv/autovec/binop/vwsll-1.c   |  10 ++
 .../gcc.target/riscv/rvv/autovec/binop/vwsll-run.c |  67 +++
 .../riscv/rvv/autovec/binop/vwsll-template.h   |  49 
 5 files changed, 251 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 04f85d8e455..bc6af042bcf 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1467,5 +1467,127 @@
operands, operands[4]);
 DONE;
   }
-  [(set_attr "type" "vector")]
-)
+  [(set_attr "type" "vector")])
+
+;; vzext.vf2 + vsll = vwsll.
+(define_insn_and_split "*vwsll_zext1_"
+  [(set (match_operand:VWEXTI 0"register_operand" "=vr 
")
+  (ashift:VWEXTI
+   (zero_extend:VWEXTI
+ (match_operand: 1 "register_operand" " vr "))
+ (match_operand: 2 "vector_shift_operand" "vrvk")))]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+insn_code icode = code_for_pred_vwsll (mode);
+riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+DONE;
+  }
+  [(set_attr "type" "vwsll")])
+
+(define_insn_and_split "*vwsll_zext2_"
+  [(set (match_operand:VWEXTI 0"register_operand" "=vr 
")
+  (ashift:VWEXTI
+   (zero_extend:VWEXTI
+ (match_operand: 1 "register_operand" " vr "))
+   (zero_extend:VWEXTI
+ (match_operand: 2 "vector_shift_operand" "vrvk"]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+insn_code icode = code_for_pred_vwsll (mode);
+riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+DONE;
+  }
+  [(set_attr "type" "vwsll")])
+
+
+(define_insn_and_split "*vwsll_zext1_scalar_"
+  [(set (match_operand:VWEXTI 0"register_operand"  
  "=vr")
+  (ashift:VWEXTI
+   (zero_extend:VWEXTI
+ (match_operand: 1 "register_operand"" 
vr"))
+ (match_operand:2 "vector_scalar_shift_operand" " 
rK")))]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+if (GET_CODE (operands[2]) == SUBREG)
+  operands[2] = SUBREG_REG (operands[2]);
+insn_code icode = code_for_pred_vwsll_scalar (mode);
+riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+DONE;
+  }
+  [(set_attr "type" "vwsll")])
+
+;; For
+;;   uint16_t dst;
+;;   uint8_t a, b;
+;;   dst = vwsll (a, b)
+;; we seem to create
+;;   aa = (int) a;
+;;   bb = (int) b;
+;;   dst = (short) vwsll (aa, bb);
+;; The following patterns help to combine this idiom into one vwsll.
+
+(define_insn_and_split "*vwsll_zext1_trunc_"
+  [(set (match_operand: 0   "register_operand""=vr ")
+(truncate:
+  (ashift:VQEXTI
+   (zero_extend:VQEXTI
+ (match_operand: 1   "register_operand" " vr "))
+   (match_operand:VQEXTI   2   "vector_shift_operand" "vrvk"]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+insn_code icode = code_for_pred_vwsll (mode);
+riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+DONE;
+  }
+  [(set_attr "type" "vwsll")])
+
+(define_insn_and_split "*vwsll_zext2_trunc_"
+  [(set (match_operand: 0   "register_operand""=vr ")
+(truncate:
+  (ashift:VQEXTI
+   (zero_extend:VQEXTI
+ (match_operand: 1   "register_operand" " vr "))
+   (zero_extend:VQEXTI
+ (match_operand: 2   "vector_shift_operand" "vrvk")]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+insn_code icode = code_for_pred_vwsll (mode);
+riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY

[gcc r15-952] RISC-V: Split vwadd.wx and vwsub.wx and add helpers.

2024-05-31 Thread Robin Dapp via Gcc-cvs

https://gcc.gnu.org/g:9781885a624f3e29634d95c14cd10940cefb1a5a

commit r15-952-g9781885a624f3e29634d95c14cd10940cefb1a5a
Author: Robin Dapp 
Date:   Thu May 16 12:43:43 2024 +0200

RISC-V: Split vwadd.wx and vwsub.wx and add helpers.

vwadd.wx and vwsub.wx have the same problem vfwadd.wf had.  This patch
splits the insn pattern in the same way vfwadd.wf was split.

It also adds two patterns to recognize extended scalars.  In practice
those do not provide a lot of improvement over what we already have but
in some instances we can get rid of redundant extensions.

gcc/ChangeLog:

* config/riscv/vector.md: Split vwadd.wx/vwsub.wx pattern and
add extended_scalar patterns.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr115068.c: Add vwadd.wx/vwsub.wx
tests.
* gcc.target/riscv/rvv/base/pr115068-run.c: Include pr115068.c.
* gcc.target/riscv/rvv/base/vwaddsub-1.c: New test.

Diff:
---
 gcc/config/riscv/vector.md | 62 ++
 .../gcc.target/riscv/rvv/base/pr115068-run.c   | 24 +
 gcc/testsuite/gcc.target/riscv/rvv/base/pr115068.c | 26 +
 .../gcc.target/riscv/rvv/base/vwaddsub-1.c | 48 +
 4 files changed, 128 insertions(+), 32 deletions(-)

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 92bbb8ce6ae..dccf76f0003 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -3877,27 +3877,71 @@
(set_attr "mode" "")])
 
 (define_insn 
"@pred_single_widen__scalar"
-  [(set (match_operand:VWEXTI 0 "register_operand"   "=vr,   
vr")
+  [(set (match_operand:VWEXTI 0 "register_operand" "=vd,vd, 
vr, vr")
(if_then_else:VWEXTI
  (unspec:
-   [(match_operand: 1 "vector_mask_operand"   
"vmWc1,vmWc1")
-(match_operand 5 "vector_length_operand"  "   rK,   
rK")
-(match_operand 6 "const_int_operand"  "i,
i")
-(match_operand 7 "const_int_operand"  "i,
i")
-(match_operand 8 "const_int_operand"  "i,
i")
+   [(match_operand: 1 "vector_mask_operand"   " 
vm,vm,Wc1,Wc1")
+(match_operand 5 "vector_length_operand"  " rK,rK, rK, 
rK")
+(match_operand 6 "const_int_operand"  "  i, i,  i, 
 i")
+(match_operand 7 "const_int_operand"  "  i, i,  i, 
 i")
+(match_operand 8 "const_int_operand"  "  i, i,  i, 
 i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
  (plus_minus:VWEXTI
-   (match_operand:VWEXTI 3 "register_operand" "   vr,   
vr")
+   (match_operand:VWEXTI 3 "register_operand" " vr,vr, vr, 
vr")
(any_extend:VWEXTI
  (vec_duplicate:
-   (match_operand: 4 "reg_or_0_operand"   "   rJ,   
rJ"
- (match_operand:VWEXTI 2 "vector_merge_operand"   "   vu,
0")))]
+   (match_operand: 4 "reg_or_0_operand"   " rJ,rJ, rJ, 
rJ"
+ (match_operand:VWEXTI 2 "vector_merge_operand"   " vu, 0, vu, 
 0")))]
   "TARGET_VECTOR"
   "vw.wx\t%0,%3,%z4%p1"
   [(set_attr "type" "vi")
(set_attr "mode" "")])
 
+(define_insn "@pred_single_widen_add_extended_scalar"
+  [(set (match_operand:VWEXTI 0 "register_operand" "=vd,vd, 
vr, vr")
+   (if_then_else:VWEXTI
+ (unspec:
+   [(match_operand: 1 "vector_mask_operand"   " 
vm,vm,Wc1,Wc1")
+(match_operand 5 "vector_length_operand"  " rK,rK, rK, 
rK")
+(match_operand 6 "const_int_operand"  "  i, i,  i, 
 i")
+(match_operand 7 "const_int_operand"  "  i, i,  i, 
 i")
+(match_operand 8 "const_int_operand"  "  i, i,  i, 
 i")
+(reg:SI VL_REGNUM)
+(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+ (plus:VWEXTI
+   (vec_duplicate:VWEXTI
+ (any_extend:
+   (match_operand: 4 "reg_or_0_operand"   " rJ,rJ, rJ, 
rJ")))
+   (match_operand:VWEXTI 3 "register_operand" " vr,vr, vr, 
vr"))
+ (match_operand:VWEXTI 2 "vector_merge_operand"   " vu, 0, vu, 
 0")))]
+  "TARGET_VECTOR"
+  "vwadd.wx\t%0,%3,%z4%p1"
+  [(set_attr "type" "viwalu")
+   (set_attr "mode" "")])
+
+(define_insn "@pred_single_widen_sub_extended_scalar"
+  [(set (match_operand:VWEXTI 0 "register_operand" "=vd,vd, 
vr, vr")
+   (if_then_else:VWEXTI
+ (unspec:
+   [(match_operand: 1 "vector_mask_operand"   " 
vm,vm,Wc1,Wc1")
+(match_operand 5 "vector_length_operand"  " rK,rK, rK, 
rK")
+(

[gcc r15-951] RISC-V: Do not allow v0 as dest when merging [PR115068].

2024-05-31 Thread Robin Dapp via Gcc-cvs

https://gcc.gnu.org/g:a2fd0812a54cf51520f15e900df4cfb5874b75ed

commit r15-951-ga2fd0812a54cf51520f15e900df4cfb5874b75ed
Author: Robin Dapp 
Date:   Mon May 13 13:49:57 2024 +0200

RISC-V: Do not allow v0 as dest when merging [PR115068].

This patch splits the vfw...wf pattern so we do not emit e.g. vfwadd.wf
v0,v8,fa5,v0.t anymore.

gcc/ChangeLog:

PR target/115068

* config/riscv/vector.md:  Split vfw.wf pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr115068-run.c: New test.
* gcc.target/riscv/rvv/base/pr115068.c: New test.

Diff:
---
 gcc/config/riscv/vector.md | 20 +++
 .../gcc.target/riscv/rvv/base/pr115068-run.c   | 28 +
 gcc/testsuite/gcc.target/riscv/rvv/base/pr115068.c | 29 ++
 3 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index c8c9667eaa2..92bbb8ce6ae 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7178,24 +7178,24 @@
(symbol_ref "riscv_vector::get_frm_mode (operands[9])"))])
 
 (define_insn "@pred_single_widen__scalar"
-  [(set (match_operand:VWEXTF 0 "register_operand"   "=vr,   
vr")
+  [(set (match_operand:VWEXTF 0 "register_operand""=vd, vd, 
vr, vr")
(if_then_else:VWEXTF
  (unspec:
-   [(match_operand: 1 "vector_mask_operand"   
"vmWc1,vmWc1")
-(match_operand 5 "vector_length_operand"  "   rK,   
rK")
-(match_operand 6 "const_int_operand"  "i,
i")
-(match_operand 7 "const_int_operand"  "i,
i")
-(match_operand 8 "const_int_operand"  "i,
i")
-(match_operand 9 "const_int_operand"  "i,
i")
+   [(match_operand: 1 "vector_mask_operand"  " vm, 
vm,Wc1,Wc1")
+(match_operand 5 "vector_length_operand" " rK, rK, rK, 
rK")
+(match_operand 6 "const_int_operand" "  i,  i,  i, 
 i")
+(match_operand 7 "const_int_operand" "  i,  i,  i, 
 i")
+(match_operand 8 "const_int_operand" "  i,  i,  i, 
 i")
+(match_operand 9 "const_int_operand" "  i,  i,  i, 
 i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)
 (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
  (plus_minus:VWEXTF
-   (match_operand:VWEXTF 3 "register_operand" "   vr,   
vr")
+   (match_operand:VWEXTF 3 "register_operand"" vr, vr, vr, 
vr")
(float_extend:VWEXTF
  (vec_duplicate:
-   (match_operand: 4 "register_operand"   "f,
f"
- (match_operand:VWEXTF 2 "vector_merge_operand"   "   vu,
0")))]
+   (match_operand: 4 "register_operand"  "  f,  f,  f, 
 f"
+ (match_operand:VWEXTF 2 "vector_merge_operand"  " vu,  0, vu, 
 0")))]
   "TARGET_VECTOR"
   "vfw.wf\t%0,%3,%4%p1"
   [(set_attr "type" "vf")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068-run.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068-run.c
new file mode 100644
index 000..95ec8e06021
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068-run.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-require-effective-target riscv_v_ok } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-std=gnu99" } */
+
+#include 
+#include 
+
+vfloat64m8_t
+test_vfwadd_wf_f64m8_m (vbool8_t vm, vfloat64m8_t vs2, float rs1, size_t vl)
+{
+  return __riscv_vfwadd_wf_f64m8_m (vm, vs2, rs1, vl);
+}
+
+char global_memory[1024];
+void *fake_memory = (void *) global_memory;
+
+int
+main ()
+{
+  asm volatile ("fence" ::: "memory");
+  vfloat64m8_t vfwadd_wf_f64m8_m_vd = test_vfwadd_wf_f64m8_m (
+__riscv_vreinterpret_v_i8m1_b8 (__riscv_vundefined_i8m1 ()),
+__riscv_vundefined_f64m8 (), 1.0, __riscv_vsetvlmax_e64m8 ());
+  asm volatile ("" ::"vr"(vfwadd_wf_f64m8_m_vd) : "memory");
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068.c
new file mode 100644
index 000..6d680037aa1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-std=gnu99" } */
+
+#include 
+#include 
+
+vfloat64m8_t
+test_vfwadd_wf_f64m8_m (vbool8_t vm, vfloat64m8_t vs2, float rs1, size_t vl)
+{
+  return __riscv_vfwadd_wf_f64m8_m (vm, vs2, rs1, vl);
+}
+
+char global_memory[1024];
+void *fake_memory = (void *) global_memory;
+
+int
+main ()
+{
+  asm volatile ("fence" ::: "memory");
+  vfloat64m8_t vfwadd_wf_f64m8_m_vd = test_vfwadd_wf

[gcc r15-950] aarch64: testsuite: Explicitly add -mlittle-endian to vget_low_2.c

2024-05-31 Thread Pengxuan Zheng via Gcc-cvs

https://gcc.gnu.org/g:7fb62627cfb3e03811bb667fa7159bbc7f972f00

commit r15-950-g7fb62627cfb3e03811bb667fa7159bbc7f972f00
Author: Pengxuan Zheng 
Date:   Wed May 22 17:38:43 2024 -0700

aarch64: testsuite: Explicitly add -mlittle-endian to vget_low_2.c

vget_low_2.c is a test case for little-endian, but we missed the 
-mlittle-endian
flag in r15-697-ga2e4fe5a53cf75.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vget_low_2.c: Add -mlittle-endian.

Signed-off-by: Pengxuan Zheng 

Diff:
---
 gcc/testsuite/gcc.target/aarch64/vget_low_2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/vget_low_2.c 
b/gcc/testsuite/gcc.target/aarch64/vget_low_2.c
index 44414e1c043..93e9e664ee9 100644
--- a/gcc/testsuite/gcc.target/aarch64/vget_low_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/vget_low_2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fdump-tree-optimized" } */
+/* { dg-options "-O3 -fdump-tree-optimized -mlittle-endian" } */
 
 #include

[gcc r15-949] MAINTAINERS: Add myself to Write After Approval and DCO

2024-05-31 Thread Pengxuan Zheng via Gcc-cvs

https://gcc.gnu.org/g:96ec186d1dbeaa87453c3703e25fae7ce3ddbbb7

commit r15-949-g96ec186d1dbeaa87453c3703e25fae7ce3ddbbb7
Author: Pengxuan Zheng 
Date:   Fri May 31 11:07:05 2024 -0700

MAINTAINERS: Add myself to Write After Approval and DCO

ChangeLog:

* MAINTAINERS: Add myself to Write After Approval and DCO.

Signed-off-by: Pengxuan Zheng 

Diff:
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e2870eef2ef..6444e6ea2f1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -743,6 +743,7 @@ Dennis Zhang

 Yufeng Zhang   
 Qing Zhao  
 Shujing Zhao   
+Pengxuan Zheng 
 Jon Ziegler
 Roman Zippel   
 Josef Zlomek   
@@ -789,3 +790,4 @@ Martin Uecker   

 Jonathan Wakely
 Alexander Westbrooks   
 Chung-Ju Wu
+Pengxuan Zheng

[gcc r15-947] Use the .ACCESS_WITH_SIZE in bound sanitizer.

2024-05-31 Thread Qing Zhao via Gcc-cvs

https://gcc.gnu.org/g:3d94fee616d6132075f3292a6eafdcb7b1d3f5a5

commit r15-947-g3d94fee616d6132075f3292a6eafdcb7b1d3f5a5
Author: Qing Zhao 
Date:   Tue May 28 18:37:14 2024 +

Use the .ACCESS_WITH_SIZE in bound sanitizer.

gcc/c-family/ChangeLog:

* c-ubsan.cc (get_bound_from_access_with_size): New function.
(ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-4.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.

Diff:
---
 gcc/c-family/c-ubsan.cc| 42 
 .../gcc.dg/ubsan/flex-array-counted-by-bounds-2.c  | 45 +
 .../gcc.dg/ubsan/flex-array-counted-by-bounds-3.c  | 34 
 .../gcc.dg/ubsan/flex-array-counted-by-bounds-4.c  | 34 
 .../gcc.dg/ubsan/flex-array-counted-by-bounds.c| 46 ++
 5 files changed, 201 insertions(+)

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 940982819dd..7cd3c6aa5b8 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc)
   return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, data));
 }
 
+/* Get the tree that represented the number of counted_by, i.e, the maximum
+   number of the elements of the object that the call to .ACCESS_WITH_SIZE
+   points to, this number will be the bound of the corresponding array.  */
+static tree
+get_bound_from_access_with_size (tree call)
+{
+  if (!is_access_with_size_p (call))
+return NULL_TREE;
+
+  tree ref_to_size = CALL_EXPR_ARG (call, 1);
+  unsigned int class_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2));
+  tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3));
+  tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size),
+  build_int_cst (ptr_type_node, 0));
+  /* If size is negative value, treat it as zero.  */
+  if (!TYPE_UNSIGNED (type))
+  {
+tree cond = fold_build2 (LT_EXPR, boolean_type_node,
+unshare_expr (size), build_zero_cst (type));
+size = fold_build3 (COND_EXPR, type, cond,
+   build_zero_cst (type), size);
+  }
+
+  /* Only when class_of_size is 1, i.e, the number of the elements of
+ the object type, return the size.  */
+  if (class_of_size != 1)
+return NULL_TREE;
+  else
+size = fold_convert (sizetype, size);
+
+  return size;
+}
+
+
 /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
that gets expanded in the sanopt pass, and make an array dimension
of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
  && COMPLETE_TYPE_P (type)
  && integer_zerop (TYPE_SIZE (type)))
bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+  else if (INDIRECT_REF_P (array)
+  && is_access_with_size_p ((TREE_OPERAND (array, 0
+   {
+ bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0)));
+ bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound),
+  bound,
+  build_int_cst (TREE_TYPE (bound), 1));
+   }
   else
return NULL_TREE;
 }
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c 
b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index 000..b503320628d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,45 @@
+/* Test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-output "index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+#include 
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+   int n;
+   int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
+   f->n = m;
+   f->p[m][n-1]=1;
+   return;
+}
+
+void __attribute__((__noinline__)) setup_and_test_vla_1 (int n1, int n2, int m)
+{
+  struct foo {
+int n;
+int p[][n2][n1] __attribute__((counted_by(n)));
+  } *f;
+
+  f = (struct foo *) malloc

[gcc r15-948] Add the 6th argument to .ACCESS_WITH_SIZE

2024-05-31 Thread Qing Zhao via Gcc-cvs

https://gcc.gnu.org/g:4c5bea7def13613fba166edb23289bab446b0b48

commit r15-948-g4c5bea7def13613fba166edb23289bab446b0b48
Author: Qing Zhao 
Date:   Tue May 28 18:39:31 2024 +

Add the 6th argument to .ACCESS_WITH_SIZE

to carry the TYPE of the flexible array.

Such information is needed during tree-object-size.cc.

We cannot use the result type or the type of the 1st argument
of the routine .ACCESS_WITH_SIZE to decide the element type
of the original array due to possible type casting in the
source code.

gcc/c/ChangeLog:

* c-typeck.cc (build_access_with_size_for_counted_by): Add the 6th
argument to .ACCESS_WITH_SIZE.

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): Use the type
of the 6th argument for the type of the element.
* internal-fn.cc (expand_ACCESS_WITH_SIZE): Update the comment with
the 6th argument.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-6.c: New test.

Diff:
---
 gcc/c/c-typeck.cc  | 11 --
 gcc/internal-fn.cc |  2 ++
 gcc/testsuite/gcc.dg/flex-array-counted-by-6.c | 46 ++
 gcc/tree-object-size.cc| 16 +
 4 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 268c3ddbe14..a0e7dbe1b48 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2714,7 +2714,8 @@ build_counted_by_ref (tree datum, tree subdatum, tree 
*counted_by_type)
 
to:
 
-   (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1))
+   (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1,
+   (TYPE_OF_ARRAY *)0))
 
NOTE: The return type of this function is the POINTER type pointing
to the original flexible array type.
@@ -2726,6 +2727,9 @@ build_counted_by_ref (tree datum, tree subdatum, tree 
*counted_by_type)
The 4th argument of the call is a constant 0 with the TYPE of the
object pointed by COUNTED_BY_REF.
 
+   The 6th argument of the call is a constant 0 with the pointer TYPE
+   to the original flexible array type.
+
   */
 static tree
 build_access_with_size_for_counted_by (location_t loc, tree ref,
@@ -2738,12 +2742,13 @@ build_access_with_size_for_counted_by (location_t loc, 
tree ref,
 
   tree call
 = build_call_expr_internal_loc (loc, IFN_ACCESS_WITH_SIZE,
-   result_type, 5,
+   result_type, 6,
array_to_pointer_conversion (loc, ref),
counted_by_ref,
build_int_cst (integer_type_node, 1),
build_int_cst (counted_by_type, 0),
-   build_int_cst (integer_type_node, -1));
+   build_int_cst (integer_type_node, -1),
+   build_int_cst (result_type, 0));
   /* Wrap the call with an INDIRECT_REF with the flexible array type.  */
   call = build1 (INDIRECT_REF, TREE_TYPE (ref), call);
   SET_EXPR_LOCATION (call, loc);
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index eb2c4cd5904..0d27f17b283 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3456,6 +3456,8 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
  1: read_only
  2: write_only
  3: read_write
+   6th argument: A constant 0 with the pointer TYPE to the original flexible
+ array type.
 
Both the return type and the type of the first argument of this
function have been converted from the incomplete array type to
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c
new file mode 100644
index 000..65fa01443d9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c
@@ -0,0 +1,46 @@
+/* Test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size: when the type of the flexible array member
+ * is casting to another type.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+typedef unsigned short u16;
+
+struct info {
+   u16 data_len;
+   char data[] __attribute__((counted_by(data_len)));
+};
+
+struct foo {
+   int a;
+   int b;
+};
+
+static __attribute__((__noinline__))
+struct info *setup ()
+{
+ struct info *p;
+ size_t bytes = 3 * sizeof(struct foo);
+
+ p = (struct info *)malloc (sizeof (struct info) + bytes);
+ p->data_len = bytes;
+
+ return p;
+}
+
+static void
+__attribute__((__noinline__)) report (struct info *p)
+{
+ struct foo *bar = (struct foo *)p->data;
+ EXPECT(__builtin_dynamic_object_size((char *)(bar + 1), 1), 16);
+ EXPECT(__builtin_dynamic_object_size((char *)(bar + 2), 1), 8);
+}
+
+int main(int argc, char *argv[])
+{
+ stru

[gcc r15-946] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-05-31 Thread Qing Zhao via Gcc-cvs

https://gcc.gnu.org/g:6f17933548fc34ee269e90546a590df8269cee60

commit r15-946-g6f17933548fc34ee269e90546a590df8269cee60
Author: Qing Zhao 
Date:   Tue May 28 18:36:00 2024 +

Use the .ACCESS_WITH_SIZE in builtin object size.

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): New function.
(call_object_size): Call the new function.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT.
* gcc.dg/flex-array-counted-by-3.c: New test.
* gcc.dg/flex-array-counted-by-4.c: New test.
* gcc.dg/flex-array-counted-by-5.c: New test.

Diff:
---
 gcc/testsuite/gcc.dg/builtin-object-size-common.h |  11 ++
 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c|  63 
 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c| 178 ++
 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c|  48 ++
 gcc/tree-object-size.cc   |  60 
 5 files changed, 360 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h 
b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
index 66ff7cdd953..b677067c6e6 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
@@ -30,3 +30,14 @@ unsigned nfails = 0;
   __builtin_abort ();\
 return 0;\
   } while (0)
+
+#define EXPECT(p, _v) do {   \
+  size_t v = _v; \
+  if (p == v)\
+__builtin_printf ("ok:  %s == %zd\n", #p, p);\
+  else   \
+{\
+  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v);\
+  FAIL ();   \
+}\
+} while (0);
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index 000..78f50230e89
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,63 @@
+/* Test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */ 
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+union {
+  int b;
+  float f; 
+};
+int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+= (struct flex *)malloc (sizeof (struct flex)
++ normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+= (struct annotated *)malloc (sizeof (struct annotated)
+ + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+= (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
++ attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1),
+  array_annotated->b * sizeof (int));
+EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+  array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);   
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
new file mode 100644
index 000..20103d58ef5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
@@ -0,0 +1,178 @@
+/* Test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the
+allocation size mismatched with the value of counted_by attribute?
+We should always use the latest value that is hold by the counted_by
+field.  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  char others;
+  char array[] __attribute__((counted_by (foo)));
+};
+
+#define noinline __attribute__((__noinline__))
+#define SIZE_BUMP 10 
+#define MAX(a, b) ((a) > (b)

[gcc r15-945] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

2024-05-31 Thread Qing Zhao via Gcc-cvs

https://gcc.gnu.org/g:bb49b6e4f55891d0d8b596845118f40df6ae72a5

commit r15-945-gbb49b6e4f55891d0d8b596845118f40df6ae72a5
Author: Qing Zhao 
Date:   Tue May 28 18:34:09 2024 +

Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

Including the following changes:
* The definition of the new internal function .ACCESS_WITH_SIZE
  in internal-fn.def.
* C FE converts every reference to a FAM with a "counted_by" attribute
  to a call to the internal function .ACCESS_WITH_SIZE.
  (build_component_ref in c_typeck.cc)

  This includes the case when the object is statically allocated and
  initialized.
  In order to make this working, the routine digest_init in c-typeck.cc
  is updated to fold calls to .ACCESS_WITH_SIZE to its first argument
  when require_constant is TRUE.

  However, for the reference inside "offsetof", the "counted_by" attribute 
is
  ignored since it's not useful at all.
  (c_parser_postfix_expression in c/c-parser.cc)

  In addtion to "offsetof", for the reference inside operator "typeof" and
  "alignof", we ignore counted_by attribute too.

  When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE,
  replace the call with its first argument.

* Convert every call to .ACCESS_WITH_SIZE to its first argument.
  (expand_ACCESS_WITH_SIZE in internal-fn.cc)
* Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
  get the reference from the call to .ACCESS_WITH_SIZE.
  (is_access_with_size_p and get_ref_from_access_with_size in tree.cc)

gcc/c/ChangeLog:

* c-parser.cc (c_parser_postfix_expression): Ignore the counted-by
attribute when build_component_ref inside offsetof operator.
* c-tree.h (build_component_ref): Add one more parameter.
* c-typeck.cc (build_counted_by_ref): New function.
(build_access_with_size_for_counted_by): New function.
(build_component_ref): Check the counted-by attribute and build
call to .ACCESS_WITH_SIZE.
(build_unary_op): When building ADDR_EXPR for
.ACCESS_WITH_SIZE, use its first argument.
(lvalue_p): Accept call to .ACCESS_WITH_SIZE.
(digest_init): Fold call to .ACCESS_WITH_SIZE to its first
argument when require_constant is TRUE.

gcc/ChangeLog:

* internal-fn.cc (expand_ACCESS_WITH_SIZE): New function.
* internal-fn.def (ACCESS_WITH_SIZE): New internal function.
* tree.cc (is_access_with_size_p): New function.
(get_ref_from_access_with_size): New function.
* tree.h (is_access_with_size_p): New prototype.
(get_ref_from_access_with_size): New prototype.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-2.c: New test.

Diff:
---
 gcc/c/c-parser.cc  |  10 +-
 gcc/c/c-tree.h |   2 +-
 gcc/c/c-typeck.cc  | 142 -
 gcc/internal-fn.cc |  34 ++
 gcc/internal-fn.def|   5 +
 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c | 112 +++
 gcc/tree.cc|  22 
 gcc/tree.h |   8 ++
 8 files changed, 328 insertions(+), 7 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 00f8bf4376e..2d9e9c0969f 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -10848,9 +10848,12 @@ c_parser_postfix_expression (c_parser *parser)
if (c_parser_next_token_is (parser, CPP_NAME))
  {
c_token *comp_tok = c_parser_peek_token (parser);
+   /* Ignore the counted_by attribute for reference inside
+  offsetof since the information is not useful at all.  */
offsetof_ref
  = build_component_ref (loc, offsetof_ref, comp_tok->value,
-comp_tok->location, UNKNOWN_LOCATION);
+comp_tok->location, UNKNOWN_LOCATION,
+false);
c_parser_consume_token (parser);
while (c_parser_next_token_is (parser, CPP_DOT)
   || c_parser_next_token_is (parser,
@@ -10877,11 +10880,14 @@ c_parser_postfix_expression (c_parser *parser)
break;
  }
c_token *comp_tok = c_parser_peek_token (parser);
+   /* Ignore the counted_by attribute for reference inside
+  offsetof since the information is not useful.  */
offsetof_ref
  = build_component_ref (loc, offsetof_ref,

[gcc r15-944] Provide counted_by attribute to flexible array member field

2024-05-31 Thread Qing Zhao via Gcc-cvs

https://gcc.gnu.org/g:f824acd0e807546a733c122ab6340f18cef88766

commit r15-944-gf824acd0e807546a733c122ab6340f18cef88766
Author: Qing Zhao 
Date:   Tue May 28 18:30:05 2024 +

Provide counted_by attribute to flexible array member field

'counted_by (COUNT)'
 The 'counted_by' attribute may be attached to the C99 flexible
 array member of a structure.  It indicates that the number of the
 elements of the array is given by the field "COUNT" in the
 same structure as the flexible array member.
 GCC may use this information to improve detection of object size 
information
 for such structures and provide better results in compile-time 
diagnostics
 and runtime features like the array bound sanitizer and
 the '__builtin_dynamic_object_size'.

 For instance, the following code:

  struct P {
size_t count;
char other;
char array[] __attribute__ ((counted_by (count)));
  } *p;

 specifies that the 'array' is a flexible array member whose number
 of elements is given by the field 'count' in the same structure.

 The field that represents the number of the elements should have an
 integer type.  Otherwise, the compiler reports an error and
 ignores the attribute.

 When the field that represents the number of the elements is assigned a
 negative integer value, the compiler treats the value as zero.

 An explicit 'counted_by' annotation defines a relationship between
 two objects, 'p->array' and 'p->count', and there are the following
 requirementthat on the relationship between this pair:

* 'p->count' must be initialized before the first reference to
  'p->array';

* 'p->array' has _at least_ 'p->count' number of elements
  available all the time.  This relationship must hold even
  after any of these related objects are updated during the
  program.

 It's the user's responsibility to make sure the above requirements
 to be kept all the time.  Otherwise the compiler reports
 warnings, at the same time, the results of the array bound
 sanitizer and the '__builtin_dynamic_object_size' is undefined.

 One important feature of the attribute is, a reference to the
 flexible array member field uses the latest value assigned to
 the field that represents the number of the elements before that
 reference.  For example,

p->count = val1;
p->array[20] = 0;  // ref1 to p->array
p->count = val2;
p->array[30] = 0;  // ref2 to p->array

 in the above, 'ref1' uses 'val1' as the number of the elements
 in 'p->array', and 'ref2' uses 'val2' as the number of elements
 in 'p->array'.

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_counted_by_attribute): New function.
(attribute_takes_identifier_p): Add counted_by attribute to the 
list.
* c-common.cc (c_flexible_array_member_type_p): ...To this.
* c-common.h (c_flexible_array_member_type_p): New prototype.

gcc/c/ChangeLog:

* c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
(add_flexible_array_elts_to_size): Use renamed function.
(is_flexible_array_member_p): Use renamed function.
(verify_counted_by_attribute): New function.
(finish_struct): Use renamed function and verify counted_by
attribute.
* c-tree.h (lookup_field): New prototype.
* c-typeck.cc (lookup_field): Expose as extern function.
(tagged_types_tu_compatible_p): Check counted_by attribute for
structure type.

gcc/ChangeLog:

* doc/extend.texi: Document attribute counted_by.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by.c: New test.
* gcc.dg/flex-array-counted-by-7.c: New test.
* gcc.dg/flex-array-counted-by-8.c: New test.

Diff:
---
 gcc/c-family/c-attribs.cc  |  68 -
 gcc/c-family/c-common.cc   |  13 +++
 gcc/c-family/c-common.h|   1 +
 gcc/c/c-decl.cc|  80 
 gcc/c/c-tree.h |   1 +
 gcc/c/c-typeck.cc  |  37 ++-
 gcc/doc/extend.texi|  68 +
 gcc/testsuite/gcc.dg/flex-array-counted-by-7.c |   8 ++
 gcc/testsuite/gcc.dg/flex-array-counted-by-8.c | 127 +
 gcc/testsuite/gcc.dg/flex-array-counted-by.c   |  62 
 10 files changed, 444 insertions(+), 21 deletions(-)

diff --git a/g

[gcc(refs/users/aoliva/heads/testme)] [libstdc++] add _GLIBCXX_CLANG to workaround predefined clang

2024-05-31 Thread Alexandre Oliva via Gcc-cvs

https://gcc.gnu.org/g:1472684e8baaf4137c27fc725e6f2b1e852b

commit 1472684e8baaf4137c27fc725e6f2b1e852b
Author: Alexandre Oliva 
Date:   Fri May 31 09:10:55 2024 -0300

[libstdc++] add _GLIBCXX_CLANG to workaround predefined __clang__

A proprietary embedded operating system that uses clang as its primary
compiler ships headers that require __clang__ to be defined.  Defining
that macro causes libstdc++ to adopt workarounds that work for clang
but that break for GCC.

So, introduce a _GLIBCXX_CLANG macro, and a convention to test for it
rather than for __clang__, so that a GCC variant that adds -D__clang__
to satisfy system headers can also -D_GLIBCXX_CLANG=0 to avoid
workarounds that are not meant for GCC.

I've left fast_float and ryu files alone, their tests for __clang__
don't seem to be harmful for GCC, they don't include bits/c++config,
and patching such third-party files would just make trouble for
updating them without visible benefit.


for  libstdc++-v3/ChangeLog

* include/bits/c++config (_GLIBCXX_CLANG): Define or undefine.
* include/bits/locale_facets_nonio.tcc: Test for it.
* include/bits/stl_bvector.h: Likewise.
* include/c_compatibility/stdatomic.h: Likewise.
* include/experimental/bits/simd.h: Likewise.
* include/experimental/bits/simd_builtin.h: Likewise.
* include/experimental/bits/simd_detail.h: Likewise.
* include/experimental/bits/simd_x86.h: Likewise.
* include/experimental/simd: Likewise.
* include/std/complex: Likewise.
* include/std/ranges: Likewise.
* include/std/variant: Likewise.
* include/pstl/pstl_config.h: Likewise, when defined(__GLIBCXX__).

Diff:
---
 libstdc++-v3/include/bits/c++config   | 13 -
 libstdc++-v3/include/bits/locale_facets_nonio.tcc |  2 +-
 libstdc++-v3/include/bits/stl_bvector.h   |  2 +-
 libstdc++-v3/include/c_compatibility/stdatomic.h  |  2 +-
 libstdc++-v3/include/experimental/bits/simd.h | 14 +++---
 libstdc++-v3/include/experimental/bits/simd_builtin.h |  4 ++--
 libstdc++-v3/include/experimental/bits/simd_detail.h  |  8 
 libstdc++-v3/include/experimental/bits/simd_x86.h | 12 ++--
 libstdc++-v3/include/experimental/simd|  2 +-
 libstdc++-v3/include/pstl/pstl_config.h   |  4 ++--
 libstdc++-v3/include/std/complex  |  4 ++--
 libstdc++-v3/include/std/ranges   |  8 
 libstdc++-v3/include/std/variant  |  2 +-
 13 files changed, 44 insertions(+), 33 deletions(-)

diff --git a/libstdc++-v3/include/bits/c++config 
b/libstdc++-v3/include/bits/c++config
index b57e3f338e9..6dca2d9467a 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -481,9 +481,20 @@ _GLIBCXX_END_NAMESPACE_VERSION
 // Define if compatibility should be provided for -mlong-double-64.
 #undef _GLIBCXX_LONG_DOUBLE_COMPAT
 
+// Use an alternate macro to test for clang, so as to provide an easy
+// workaround for systems (such as vxworks) whose headers require
+// __clang__ to be defined, even when compiling with GCC.
+#if !defined _GLIBCXX_CLANG && defined __clang__
+# define _GLIBCXX_CLANG __clang__
+// Turn -D_GLIBCXX_CLANG=0 into -U_GLIBCXX_CLANG, so that
+// _GLIBCXX_CLANG can be tested as defined, just like __clang__.
+#elif !_GLIBCXX_CLANG
+# undef _GLIBCXX_CLANG
+#endif
+
 // Define if compatibility should be provided for alternative 128-bit long
 // double formats. Not possible for Clang until __ibm128 is supported.
-#ifndef __clang__
+#ifndef _GLIBCXX_CLANG
 #undef _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT
 #endif
 
diff --git a/libstdc++-v3/include/bits/locale_facets_nonio.tcc 
b/libstdc++-v3/include/bits/locale_facets_nonio.tcc
index 8f67be5a614..72136f42f08 100644
--- a/libstdc++-v3/include/bits/locale_facets_nonio.tcc
+++ b/libstdc++-v3/include/bits/locale_facets_nonio.tcc
@@ -1465,7 +1465,7 @@ _GLIBCXX_END_NAMESPACE_LDBL_OR_CXX11
   ctype<_CharT> const& __ctype = use_facet >(__loc);
   __err = ios_base::goodbit;
   bool __use_state = false;
-#if __GNUC__ >= 5 && !defined(__clang__)
+#if __GNUC__ >= 5 && !defined(_GLIBCXX_CLANG)
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wpmf-conversions"
   // Nasty hack.  The C++ standard mandates that get invokes the do_get
diff --git a/libstdc++-v3/include/bits/stl_bvector.h 
b/libstdc++-v3/include/bits/stl_bvector.h
index d567e26f4e4..52153cadf8f 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -185,7 +185,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 void
 _M_assume_normalized() const
 {
-#if __has_attribute(__assume__) && !defined(__clang__)
+#if __has_attribute(__assume__) && !defined(_GLIBCXX_CLAN

[gcc/aoliva/heads/testme] [libstdc++] add _GLIBCXX_CLANG to workaround predefined __c

2024-05-31 Thread Alexandre Oliva via Gcc-cvs

The branch 'aoliva/heads/testme' was updated to point to:

 1472684e8ba... [libstdc++] add _GLIBCXX_CLANG to workaround predefined __c

It previously pointed to:

 68673b6784a... [libstdc++] add _GLIBCXX_CLANG to workaround predefined __c

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  68673b6... [libstdc++] add _GLIBCXX_CLANG to workaround predefined __c


Summary of changes (added commits):
---

  1472684... [libstdc++] add _GLIBCXX_CLANG to workaround predefined __c

[gcc r13-8813] vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-05-31 Thread Richard Sandiford via Gcc-cvs

https://gcc.gnu.org/g:2602b71103d5ef2ef86000cac832b31dad3dfe2b

commit r13-8813-g2602b71103d5ef2ef86000cac832b31dad3dfe2b
Author: Richard Sandiford 
Date:   Fri May 31 15:56:05 2024 +0100

vect: Tighten vect_determine_precisions_from_range [PR113281]

This was another PR caused by the way that
vect_determine_precisions_from_range handles shifts.  We tried to
narrow 32768 >> x to a 16-bit shift based on range information for
the inputs and outputs, with vect_recog_over_widening_pattern
(after PR110828) adjusting the shift amount.  But this doesn't
work for the case where x is in [16, 31], since then 32-bit
32768 >> x is a well-defined zero, whereas no well-defined
16-bit 32768 >> y will produce 0.

We could perhaps generate x < 16 ? 32768 >> x : 0 instead,
but since vect_determine_precisions_from_range was never really
supposed to rely on fix-ups, it seems better to fix that instead.

The patch also makes the code more selective about which codes
can be narrowed based on input and output ranges.  This showed
that vect_truncatable_operation_p was missing cases for
BIT_NOT_EXPR (equivalent to BIT_XOR_EXPR of -1) and NEGATE_EXPR
(equivalent to BIT_NOT_EXPR followed by a PLUS_EXPR of 1).

pr113281-1.c is the original testcase.  pr113281-[23].c failed
before the patch due to overly optimistic narrowing.  pr113281-[45].c
previously passed and are meant to protect against accidental
optimisation regressions.

gcc/
PR target/113281
* tree-vect-patterns.cc (vect_recog_over_widening_pattern): Remove
workaround for right shifts.
(vect_truncatable_operation_p): Handle NEGATE_EXPR and BIT_NOT_EXPR.
(vect_determine_precisions_from_range): Be more selective about
which codes can be narrowed based on their input and output ranges.
For shifts, require at least one more bit of precision than the
maximum shift amount.

gcc/testsuite/
PR target/113281
* gcc.dg/vect/pr113281-1.c: New test.
* gcc.dg/vect/pr113281-2.c: Likewise.
* gcc.dg/vect/pr113281-3.c: Likewise.
* gcc.dg/vect/pr113281-4.c: Likewise.
* gcc.dg/vect/pr113281-5.c: Likewise.

(cherry picked from commit 1a8261e047f7a2c2b0afb95716f7615cba718cd1)

Diff:
---
 gcc/testsuite/gcc.dg/vect/pr113281-1.c |  17 ++
 gcc/testsuite/gcc.dg/vect/pr113281-2.c |  50 +++
 gcc/testsuite/gcc.dg/vect/pr113281-3.c |  39 
 gcc/testsuite/gcc.dg/vect/pr113281-4.c |  55 +
 gcc/testsuite/gcc.dg/vect/pr113281-5.c |  66 
 gcc/tree-vect-patterns.cc  | 107 -
 6 files changed, 305 insertions(+), 29 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr113281-1.c 
b/gcc/testsuite/gcc.dg/vect/pr113281-1.c
new file mode 100644
index 000..6df4231cb5f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr113281-1.c
@@ -0,0 +1,17 @@
+#include "tree-vect.h"
+
+unsigned char a;
+
+int main() {
+  check_vect ();
+
+  short b = a = 0;
+  for (; a != 19; a++)
+if (a)
+  b = 32872 >> a;
+
+  if (b == 0)
+return 0;
+  else
+return 1;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/pr113281-2.c 
b/gcc/testsuite/gcc.dg/vect/pr113281-2.c
new file mode 100644
index 000..3a1170c28b6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr113281-2.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+
+#define N 128
+
+short x[N];
+short y[N];
+
+void
+f1 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] >>= y[i];
+}
+
+void
+f2 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] >>= (y[i] < 32 ? y[i] : 32);
+}
+
+void
+f3 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] >>= (y[i] < 31 ? y[i] : 31);
+}
+
+void
+f4 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] >>= (y[i] & 31);
+}
+
+void
+f5 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] >>= 0x8000 >> y[i];
+}
+
+void
+f6 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] >>= 0x8000 >> (y[i] & 31);
+}
+
+/* { dg-final { scan-tree-dump-not {can narrow[^\n]+>>} "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr113281-3.c 
b/gcc/testsuite/gcc.dg/vect/pr113281-3.c
new file mode 100644
index 000..5982dd2d16f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr113281-3.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+
+#define N 128
+
+short x[N];
+short y[N];
+
+void
+f1 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] >>= (y[i] < 30 ? y[i] : 30);
+}
+
+void
+f2 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] >>= ((y[i] & 15) + 2);
+}
+
+void
+f3 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] >>= (y[i] < 16 ? y[i] : 16);
+}
+
+void
+f4 (void)
+{
+  for (int i = 0; i < N; ++i)
+x[i] = 32768 >> ((y[i] & 15) + 3);
+}
+
+/* { dg-final { scan-tree-dump {can narrow to signed:31 without loss [^\n]+>>} 
"vect" } } */
+/* { dg-final { scan-tree-dump {can n

[gcc r13-8812] vect: Fix access size alignment assumption [PR115192]

2024-05-31 Thread Richard Sandiford via Gcc-cvs

https://gcc.gnu.org/g:0836216693749f3b0b383d015bd36c004754f1da

commit r13-8812-g0836216693749f3b0b383d015bd36c004754f1da
Author: Richard Sandiford 
Date:   Fri May 31 15:56:04 2024 +0100

vect: Fix access size alignment assumption [PR115192]

create_intersect_range_checks checks whether two access ranges
a and b are alias-free using something equivalent to:

  end_a <= start_b || end_b <= start_a

It has two ways of doing this: a "vanilla" way that calculates
the exact exclusive end pointers, and another way that uses the
last inclusive aligned pointers (and changes the comparisons
accordingly).  The comment for the latter is:

  /* Calculate the minimum alignment shared by all four pointers,
 then arrange for this alignment to be subtracted from the
 exclusive maximum values to get inclusive maximum values.
 This "- min_align" is cumulative with a "+ access_size"
 in the calculation of the maximum values.  In the best
 (and common) case, the two cancel each other out, leaving
 us with an inclusive bound based only on seg_len.  In the
 worst case we're simply adding a smaller number than before.

The problem is that the associated code implicitly assumed that the
access size was a multiple of the pointer alignment, and so the
alignment could be carried over to the exclusive end pointer.

The testcase started failing after g:9fa5b473b5b8e289b6542
because that commit improved the alignment information for
the accesses.

gcc/
PR tree-optimization/115192
* tree-data-ref.cc (create_intersect_range_checks): Take the
alignment of the access sizes into account.

gcc/testsuite/
PR tree-optimization/115192
* gcc.dg/vect/pr115192.c: New test.

(cherry picked from commit a0fe4fb1c8d7804515845dd5d2a814b3c7a1ccba)

Diff:
---
 gcc/testsuite/gcc.dg/vect/pr115192.c | 28 
 gcc/tree-data-ref.cc |  5 -
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr115192.c 
b/gcc/testsuite/gcc.dg/vect/pr115192.c
new file mode 100644
index 000..923d377c1bb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr115192.c
@@ -0,0 +1,28 @@
+#include "tree-vect.h"
+
+int data[4 * 16 * 16] __attribute__((aligned(16)));
+
+__attribute__((noipa)) void
+foo (__SIZE_TYPE__ n)
+{
+  for (__SIZE_TYPE__ i = 1; i < n; ++i)
+{
+  data[i * n * 4] = data[(i - 1) * n * 4] + 1;
+  data[i * n * 4 + 1] = data[(i - 1) * n * 4 + 1] + 2;
+}
+}
+
+int
+main ()
+{
+  check_vect ();
+
+  data[0] = 10;
+  data[1] = 20;
+
+  foo (3);
+
+  if (data[24] != 12 || data[25] != 24)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-data-ref.cc b/gcc/tree-data-ref.cc
index 6cd5f7aa3cf..96934addff1 100644
--- a/gcc/tree-data-ref.cc
+++ b/gcc/tree-data-ref.cc
@@ -73,6 +73,7 @@ along with GCC; see the file COPYING3.  If not see
 
 */
 
+#define INCLUDE_ALGORITHM
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -2629,7 +2630,9 @@ create_intersect_range_checks (class loop *loop, tree 
*cond_expr,
 Because the maximum values are inclusive, there is an alias
 if the maximum value of one segment is equal to the minimum
 value of the other.  */
-  min_align = MIN (dr_a.align, dr_b.align);
+  min_align = std::min (dr_a.align, dr_b.align);
+  min_align = std::min (min_align, known_alignment (dr_a.access_size));
+  min_align = std::min (min_align, known_alignment (dr_b.access_size));
   cmp_code = LT_EXPR;
 }

[gcc r14-10264] alpha: Fix invalid RTX in divmodsi insn patterns [PR115297]

2024-05-31 Thread Uros Bizjak via Gcc-cvs

https://gcc.gnu.org/g:ec92744de552303a1424085203e1311bd9146f21

commit r14-10264-gec92744de552303a1424085203e1311bd9146f21
Author: Uros Bizjak 
Date:   Fri May 31 15:52:03 2024 +0200

alpha: Fix invalid RTX in divmodsi insn patterns [PR115297]

any_divmod instructions are modelled with invalid RTX:

  [(set (match_operand:DI 0 "register_operand" "=c")
(sign_extend:DI (match_operator:SI 3 "divmod_operator"
[(match_operand:DI 1 "register_operand" "a")
 (match_operand:DI 2 "register_operand" "b")])))
   (clobber (reg:DI 23))
   (clobber (reg:DI 28))]

where SImode divmod_operator (div,mod,udiv,umod) has DImode operands.

Wrap input operand with truncate:SI to make machine modes consistent.

PR target/115297

gcc/ChangeLog:

* config/alpha/alpha.md (si3): Wrap DImode
operands 3 and 4 with truncate:SI RTX.
(*divmodsi_internal_er): Ditto for operands 1 and 2.
(*divmodsi_internal_er_1): Ditto.
(*divmodsi_internal): Ditto.
* config/alpha/constraints.md ("b"): Correct register
number in the description.

gcc/testsuite/ChangeLog:

* gcc.target/alpha/pr115297.c: New test.

(cherry picked from commit 0ac802064c2a018cf166c37841697e867de65a95)

Diff:
---
 gcc/config/alpha/alpha.md | 21 -
 gcc/config/alpha/constraints.md   |  2 +-
 gcc/testsuite/gcc.target/alpha/pr115297.c | 13 +
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md
index 79f12c53c16..1e2de5a4d15 100644
--- a/gcc/config/alpha/alpha.md
+++ b/gcc/config/alpha/alpha.md
@@ -725,7 +725,8 @@
(sign_extend:DI (match_operand:SI 2 "nonimmediate_operand")))
(parallel [(set (match_dup 5)
   (sign_extend:DI
-   (any_divmod:SI (match_dup 3) (match_dup 4
+   (any_divmod:SI (truncate:SI (match_dup 3))
+  (truncate:SI (match_dup 4)
  (clobber (reg:DI 23))
  (clobber (reg:DI 28))])
(set (match_operand:SI 0 "nonimmediate_operand")
@@ -751,9 +752,10 @@
 
 (define_insn_and_split "*divmodsi_internal_er"
   [(set (match_operand:DI 0 "register_operand" "=c")
-   (sign_extend:DI (match_operator:SI 3 "divmod_operator"
-   [(match_operand:DI 1 "register_operand" "a")
-(match_operand:DI 2 "register_operand" "b")])))
+   (sign_extend:DI
+(match_operator:SI 3 "divmod_operator"
+ [(truncate:SI (match_operand:DI 1 "register_operand" "a"))
+  (truncate:SI (match_operand:DI 2 "register_operand" "b"))])))
(clobber (reg:DI 23))
(clobber (reg:DI 28))]
   "TARGET_EXPLICIT_RELOCS && TARGET_ABI_OSF"
@@ -795,8 +797,8 @@
 (define_insn "*divmodsi_internal_er_1"
   [(set (match_operand:DI 0 "register_operand" "=c")
(sign_extend:DI (match_operator:SI 3 "divmod_operator"
-[(match_operand:DI 1 "register_operand" "a")
- (match_operand:DI 2 "register_operand" "b")])))
+[(truncate:SI (match_operand:DI 1 "register_operand" "a"))
+ (truncate:SI (match_operand:DI 2 "register_operand" "b"))])))
(use (match_operand:DI 4 "register_operand" "c"))
(use (match_operand 5 "const_int_operand"))
(clobber (reg:DI 23))
@@ -808,9 +810,10 @@
 
 (define_insn "*divmodsi_internal"
   [(set (match_operand:DI 0 "register_operand" "=c")
-   (sign_extend:DI (match_operator:SI 3 "divmod_operator"
-   [(match_operand:DI 1 "register_operand" "a")
-(match_operand:DI 2 "register_operand" "b")])))
+   (sign_extend:DI
+(match_operator:SI 3 "divmod_operator"
+ [(truncate:SI (match_operand:DI 1 "register_operand" "a"))
+  (truncate:SI (match_operand:DI 2 "register_operand" "b"))])))
(clobber (reg:DI 23))
(clobber (reg:DI 28))]
   "TARGET_ABI_OSF"
diff --git a/gcc/config/alpha/constraints.md b/gcc/config/alpha/constraints.md
index 0d001ba26f1..4383f1fa895 100644
--- a/gcc/config/alpha/constraints.md
+++ b/gcc/config/alpha/constraints.md
@@ -27,7 +27,7 @@
  "General register 24, input to division routine")
 
 (define_register_constraint "b" "R25_REG"
- "General register 24, input to division routine")
+ "General register 25, input to division routine")
 
 (define_register_constraint "c" "R27_REG"
  "General register 27, function call address")
diff --git a/gcc/testsuite/gcc.target/alpha/pr115297.c 
b/gcc/testsuite/gcc.target/alpha/pr115297.c
new file mode 100644
index 000..4d5890ec8d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/alpha/pr115297.c
@@ -0,0 +1,13 @@
+/* PR target/115297 */
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+enum { BPF_F_USER_BUILD_ID } __bpf_get_stack_s

[gcc r15-943] alpha: Fix invalid RTX in divmodsi insn patterns [PR115297]

2024-05-31 Thread Uros Bizjak via Gcc-cvs

https://gcc.gnu.org/g:0ac802064c2a018cf166c37841697e867de65a95

commit r15-943-g0ac802064c2a018cf166c37841697e867de65a95
Author: Uros Bizjak 
Date:   Fri May 31 15:52:03 2024 +0200

alpha: Fix invalid RTX in divmodsi insn patterns [PR115297]

any_divmod instructions are modelled with invalid RTX:

  [(set (match_operand:DI 0 "register_operand" "=c")
(sign_extend:DI (match_operator:SI 3 "divmod_operator"
[(match_operand:DI 1 "register_operand" "a")
 (match_operand:DI 2 "register_operand" "b")])))
   (clobber (reg:DI 23))
   (clobber (reg:DI 28))]

where SImode divmod_operator (div,mod,udiv,umod) has DImode operands.

Wrap input operand with truncate:SI to make machine modes consistent.

PR target/115297

gcc/ChangeLog:

* config/alpha/alpha.md (si3): Wrap DImode
operands 3 and 4 with truncate:SI RTX.
(*divmodsi_internal_er): Ditto for operands 1 and 2.
(*divmodsi_internal_er_1): Ditto.
(*divmodsi_internal): Ditto.
* config/alpha/constraints.md ("b"): Correct register
number in the description.

gcc/testsuite/ChangeLog:

* gcc.target/alpha/pr115297.c: New test.

Diff:
---
 gcc/config/alpha/alpha.md | 21 -
 gcc/config/alpha/constraints.md   |  2 +-
 gcc/testsuite/gcc.target/alpha/pr115297.c | 13 +
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md
index 79f12c53c16..1e2de5a4d15 100644
--- a/gcc/config/alpha/alpha.md
+++ b/gcc/config/alpha/alpha.md
@@ -725,7 +725,8 @@
(sign_extend:DI (match_operand:SI 2 "nonimmediate_operand")))
(parallel [(set (match_dup 5)
   (sign_extend:DI
-   (any_divmod:SI (match_dup 3) (match_dup 4
+   (any_divmod:SI (truncate:SI (match_dup 3))
+  (truncate:SI (match_dup 4)
  (clobber (reg:DI 23))
  (clobber (reg:DI 28))])
(set (match_operand:SI 0 "nonimmediate_operand")
@@ -751,9 +752,10 @@
 
 (define_insn_and_split "*divmodsi_internal_er"
   [(set (match_operand:DI 0 "register_operand" "=c")
-   (sign_extend:DI (match_operator:SI 3 "divmod_operator"
-   [(match_operand:DI 1 "register_operand" "a")
-(match_operand:DI 2 "register_operand" "b")])))
+   (sign_extend:DI
+(match_operator:SI 3 "divmod_operator"
+ [(truncate:SI (match_operand:DI 1 "register_operand" "a"))
+  (truncate:SI (match_operand:DI 2 "register_operand" "b"))])))
(clobber (reg:DI 23))
(clobber (reg:DI 28))]
   "TARGET_EXPLICIT_RELOCS && TARGET_ABI_OSF"
@@ -795,8 +797,8 @@
 (define_insn "*divmodsi_internal_er_1"
   [(set (match_operand:DI 0 "register_operand" "=c")
(sign_extend:DI (match_operator:SI 3 "divmod_operator"
-[(match_operand:DI 1 "register_operand" "a")
- (match_operand:DI 2 "register_operand" "b")])))
+[(truncate:SI (match_operand:DI 1 "register_operand" "a"))
+ (truncate:SI (match_operand:DI 2 "register_operand" "b"))])))
(use (match_operand:DI 4 "register_operand" "c"))
(use (match_operand 5 "const_int_operand"))
(clobber (reg:DI 23))
@@ -808,9 +810,10 @@
 
 (define_insn "*divmodsi_internal"
   [(set (match_operand:DI 0 "register_operand" "=c")
-   (sign_extend:DI (match_operator:SI 3 "divmod_operator"
-   [(match_operand:DI 1 "register_operand" "a")
-(match_operand:DI 2 "register_operand" "b")])))
+   (sign_extend:DI
+(match_operator:SI 3 "divmod_operator"
+ [(truncate:SI (match_operand:DI 1 "register_operand" "a"))
+  (truncate:SI (match_operand:DI 2 "register_operand" "b"))])))
(clobber (reg:DI 23))
(clobber (reg:DI 28))]
   "TARGET_ABI_OSF"
diff --git a/gcc/config/alpha/constraints.md b/gcc/config/alpha/constraints.md
index 0d001ba26f1..4383f1fa895 100644
--- a/gcc/config/alpha/constraints.md
+++ b/gcc/config/alpha/constraints.md
@@ -27,7 +27,7 @@
  "General register 24, input to division routine")
 
 (define_register_constraint "b" "R25_REG"
- "General register 24, input to division routine")
+ "General register 25, input to division routine")
 
 (define_register_constraint "c" "R27_REG"
  "General register 27, function call address")
diff --git a/gcc/testsuite/gcc.target/alpha/pr115297.c 
b/gcc/testsuite/gcc.target/alpha/pr115297.c
new file mode 100644
index 000..4d5890ec8d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/alpha/pr115297.c
@@ -0,0 +1,13 @@
+/* PR target/115297 */
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+enum { BPF_F_USER_BUILD_ID } __bpf_get_stack_size;
+long __bpf_get_stack_flags, bpf_get_stack___trans_tmp_2;
+
+void bpf_get_s

[gcc(refs/users/aoliva/heads/testme)] [libstdc++] add _GLIBCXX_CLANG to workaround predefined clang

2024-05-31 Thread Alexandre Oliva via Gcc-cvs

https://gcc.gnu.org/g:68673b6784aa74e7209e8d54c35124799c0c8fc4

commit 68673b6784aa74e7209e8d54c35124799c0c8fc4
Author: Alexandre Oliva 
Date:   Fri May 31 09:10:55 2024 -0300

[libstdc++] add _GLIBCXX_CLANG to workaround predefined __clang__

A proprietary embedded operating system that uses clang as its primary
compiler ships headers that require __clang__ to be defined.  Defining
that macro causes libstdc++ to adopt workarounds that work for clang
but that break for GCC.

So, introduce a _GLIBCXX_CLANG macro, and a convention to test for it
rather than for __clang__, so that a GCC variant that adds -D__clang__
to satisfy system headers can also -D_GLIBCXX_CLANG=0 to avoid
workarounds that are not meant for GCC.

I've left fast_float and ryu files alone, their tests for __clang__
don't seem to be harmful for GCC, they don't include bits/c++config,
and patching such third-party files would just make trouble for
updating them without visible benefit.


for  libstdc++-v3/ChangeLog

* include/bits/c++config (_GLIBCXX_CLANG): Define or undefine.
* include/bits/locale_facets_nonio.tcc: Test for it.
* include/bits/stl_bvector.h: Likewise.
* include/c_compatibility/stdatomic.h: Likewise.
* include/experimental/bits/simd.h: Likewise.
* include/experimental/bits/simd_builtin.h: Likewise.
* include/experimental/bits/simd_detail.h: Likewise.
* include/experimental/bits/simd_x86.h: Likewise.
* include/experimental/simd: Likewise.
* include/std/complex: Likewise.
* include/std/ranges: Likewise.
* include/std/variant: Likewise.
* include/pstl/pstl_config.h: Likewise, when defined(__GLIBCXX__).

Diff:
---
 libstdc++-v3/include/bits/c++config   | 13 -
 libstdc++-v3/include/bits/locale_facets_nonio.tcc |  2 +-
 libstdc++-v3/include/bits/stl_bvector.h   |  2 +-
 libstdc++-v3/include/c_compatibility/stdatomic.h  |  2 +-
 libstdc++-v3/include/experimental/bits/simd.h | 14 +++---
 libstdc++-v3/include/experimental/bits/simd_builtin.h |  4 ++--
 libstdc++-v3/include/experimental/bits/simd_detail.h  |  8 
 libstdc++-v3/include/experimental/bits/simd_x86.h | 12 ++--
 libstdc++-v3/include/experimental/simd|  2 +-
 libstdc++-v3/include/pstl/pstl_config.h   |  4 ++--
 libstdc++-v3/include/std/complex  |  4 ++--
 libstdc++-v3/include/std/ranges   |  8 
 libstdc++-v3/include/std/variant  |  2 +-
 13 files changed, 44 insertions(+), 33 deletions(-)

diff --git a/libstdc++-v3/include/bits/c++config 
b/libstdc++-v3/include/bits/c++config
index b57e3f338e9..6dca2d9467a 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -481,9 +481,20 @@ _GLIBCXX_END_NAMESPACE_VERSION
 // Define if compatibility should be provided for -mlong-double-64.
 #undef _GLIBCXX_LONG_DOUBLE_COMPAT
 
+// Use an alternate macro to test for clang, so as to provide an easy
+// workaround for systems (such as vxworks) whose headers require
+// __clang__ to be defined, even when compiling with GCC.
+#if !defined _GLIBCXX_CLANG && defined __clang__
+# define _GLIBCXX_CLANG __clang__
+// Turn -D_GLIBCXX_CLANG=0 into -U_GLIBCXX_CLANG, so that
+// _GLIBCXX_CLANG can be tested as defined, just like __clang__.
+#elif !_GLIBCXX_CLANG
+# undef _GLIBCXX_CLANG
+#endif
+
 // Define if compatibility should be provided for alternative 128-bit long
 // double formats. Not possible for Clang until __ibm128 is supported.
-#ifndef __clang__
+#ifndef _GLIBCXX_CLANG
 #undef _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT
 #endif
 
diff --git a/libstdc++-v3/include/bits/locale_facets_nonio.tcc 
b/libstdc++-v3/include/bits/locale_facets_nonio.tcc
index 8f67be5a614..72136f42f08 100644
--- a/libstdc++-v3/include/bits/locale_facets_nonio.tcc
+++ b/libstdc++-v3/include/bits/locale_facets_nonio.tcc
@@ -1465,7 +1465,7 @@ _GLIBCXX_END_NAMESPACE_LDBL_OR_CXX11
   ctype<_CharT> const& __ctype = use_facet >(__loc);
   __err = ios_base::goodbit;
   bool __use_state = false;
-#if __GNUC__ >= 5 && !defined(__clang__)
+#if __GNUC__ >= 5 && !defined(_GLIBCXX_CLANG)
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wpmf-conversions"
   // Nasty hack.  The C++ standard mandates that get invokes the do_get
diff --git a/libstdc++-v3/include/bits/stl_bvector.h 
b/libstdc++-v3/include/bits/stl_bvector.h
index d567e26f4e4..52153cadf8f 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -185,7 +185,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 void
 _M_assume_normalized() const
 {
-#if __has_attribute(__assume__) && !defined(__clang__)
+#if __has_attribute(__assume__) && !defined(_GLIBCXX_CLAN

[gcc/aoliva/heads/testme] (36 commits) [libstdc++] add _GLIBCXX_CLANG to workaround predefined __c

2024-05-31 Thread Alexandre Oliva via Gcc-cvs

The branch 'aoliva/heads/testme' was updated to point to:

 68673b6784a... [libstdc++] add _GLIBCXX_CLANG to workaround predefined __c

It previously pointed to:

 ca809ee3fbe... [testsuite] [powerpc] adjust -m32 counts for fold-vec-extra

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  ca809ee... [testsuite] [powerpc] adjust -m32 counts for fold-vec-extra
  0276651... [libstdc++-v3] [rtems] enable filesystem support
  1c34040... [tree-prof] skip if errors were seen [PR113681]
  1b22d42... [testsuite] [arm] add effective target and options for pacb
  d34a3eb... add explicit ABI and align options to pr88233.c
  0bb10f1... [rs6000] adjust return_pc debug attrs
  99047b7... enable adjustment of return_pc debug attrs


Summary of changes (added commits):
---

  68673b6... [libstdc++] add _GLIBCXX_CLANG to workaround predefined __c
  ac5c6c9... [testsuite] [powerpc] adjust -m32 counts for fold-vec-extra (*)
  1d71818... [testsuite] conditionalize dg-additional-sources on target  (*)
  5955c18... [libstdc++-v3] [rtems] enable filesystem support (*)
  b6c6d5a... Support vcond_mask_qiqi and friends. (*)
  ef27b91... Don't reduce estimated unrolled size for innermost loop. (*)
  bdc264a... [testsuite] conditionalize dg-additional-sources on target  (*)
  c9842f9... tree-ssa-pre.c/115214(ICE in find_or_generate_expression, a (*)
  c68bd7e... Revert "resource.cc: Replace calls to find_basic_block with (*)
  afe48a4... Revert "resource.cc (mark_target_live_regs): Remove check f (*)
  c31a9d3... Revert "resource.cc: Remove redundant conditionals" (*)
  d815d9a... Daily bump. (*)
  86b98d9... C23: fix aliasing for structures/unions with incomplete typ (*)
  915440e... MIPS16: Mark $2/$3 as clobbered if GP is used (*)
  9a92e5e... MIPS/testsuite: Fix bseli.b fail in msa-builtins.c (*)
  d1a1f7e... PR modula2/115276 bugfix libgm2 wraptime.InitTM returns NIL (*)
  547143d... match: Add support for `a ^ CST` to bitwise_inverted_equal_ (*)
  0a9154d... Match: Add maybe_bit_not instead of plain matching (*)
  39263ed... aarch64: Split aarch64_combinev16qi before RA [PR115258] (*)
  d22eaec... libstdc++: Use RAII to replace try/catch blocks (*)
  b24b081... Delete a file due to push error (*)
  9c74718... vect: Unify bbs in loop_vec_info and bb_vec_info (*)
  eff0004... c++: pragma target and static init [PR109753] (*)
  3ae02dc... [to-be-committed] [RISC-V] Use pack to handle repeating con (*)
  ff41abd... c++: add module extensions (*)
  18f4779... libgomp: Enable USM for AMD APUs and MI200 devices (*)
  4ccb336... libgomp: Enable USM for some nvptx devices (*)
  19c491d... c-family: add hints for strerror (*)
  f46eaad... tree-optimization/115252 - enhance peeling for gaps avoidan (*)
  1065a7d... tree-optimization/114435 - pcom left around copies confusin (*)
  9c6e75a... Fix link failure of GNAT tools on 32-bit SPARC/Linux (*)
  499d001... i386: Fix ix86_option override after change [PR 113719] (*)
  58b8c87... c++: canonicity of fn types w/ instantiated eh specs [PR115 (*)
  2f97d98... Fix memory leak. (*)
  a99ebb8... libstdc++: Build libbacktrace and 19_diagnostics/stacktrace (*)
  241a6cc... libstdc++: Avoid MMX return types from __builtin_shufflevec (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/aoliva/heads/testme' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.

[gcc/aoliva/heads/testbase] (35 commits) [testsuite] [powerpc] adjust -m32 counts for fold-vec-extra

2024-05-31 Thread Alexandre Oliva via Gcc-cvs

The branch 'aoliva/heads/testbase' was updated to point to:

 ac5c6c90a7f... [testsuite] [powerpc] adjust -m32 counts for fold-vec-extra

It previously pointed to:

 b644126237a... Align tight&hot loop without considering max skipping bytes

Diff:

Summary of changes (added commits):
---

  ac5c6c9... [testsuite] [powerpc] adjust -m32 counts for fold-vec-extra (*)
  1d71818... [testsuite] conditionalize dg-additional-sources on target  (*)
  5955c18... [libstdc++-v3] [rtems] enable filesystem support (*)
  b6c6d5a... Support vcond_mask_qiqi and friends. (*)
  ef27b91... Don't reduce estimated unrolled size for innermost loop. (*)
  bdc264a... [testsuite] conditionalize dg-additional-sources on target  (*)
  c9842f9... tree-ssa-pre.c/115214(ICE in find_or_generate_expression, a (*)
  c68bd7e... Revert "resource.cc: Replace calls to find_basic_block with (*)
  afe48a4... Revert "resource.cc (mark_target_live_regs): Remove check f (*)
  c31a9d3... Revert "resource.cc: Remove redundant conditionals" (*)
  d815d9a... Daily bump. (*)
  86b98d9... C23: fix aliasing for structures/unions with incomplete typ (*)
  915440e... MIPS16: Mark $2/$3 as clobbered if GP is used (*)
  9a92e5e... MIPS/testsuite: Fix bseli.b fail in msa-builtins.c (*)
  d1a1f7e... PR modula2/115276 bugfix libgm2 wraptime.InitTM returns NIL (*)
  547143d... match: Add support for `a ^ CST` to bitwise_inverted_equal_ (*)
  0a9154d... Match: Add maybe_bit_not instead of plain matching (*)
  39263ed... aarch64: Split aarch64_combinev16qi before RA [PR115258] (*)
  d22eaec... libstdc++: Use RAII to replace try/catch blocks (*)
  b24b081... Delete a file due to push error (*)
  9c74718... vect: Unify bbs in loop_vec_info and bb_vec_info (*)
  eff0004... c++: pragma target and static init [PR109753] (*)
  3ae02dc... [to-be-committed] [RISC-V] Use pack to handle repeating con (*)
  ff41abd... c++: add module extensions (*)
  18f4779... libgomp: Enable USM for AMD APUs and MI200 devices (*)
  4ccb336... libgomp: Enable USM for some nvptx devices (*)
  19c491d... c-family: add hints for strerror (*)
  f46eaad... tree-optimization/115252 - enhance peeling for gaps avoidan (*)
  1065a7d... tree-optimization/114435 - pcom left around copies confusin (*)
  9c6e75a... Fix link failure of GNAT tools on 32-bit SPARC/Linux (*)
  499d001... i386: Fix ix86_option override after change [PR 113719] (*)
  58b8c87... c++: canonicity of fn types w/ instantiated eh specs [PR115 (*)
  2f97d98... Fix memory leak. (*)
  a99ebb8... libstdc++: Build libbacktrace and 19_diagnostics/stacktrace (*)
  241a6cc... libstdc++: Avoid MMX return types from __builtin_shufflevec (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/aoliva/heads/testbase' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.

[gcc r15-942] nvptx target: Global constructor, destructor support, via nvptx-tools 'ld'

2024-05-31 Thread Thomas Schwinge via Gcc-cvs

https://gcc.gnu.org/g:d9c90c82d900fdae95df4499bf5f0a4ecb903b53

commit r15-942-gd9c90c82d900fdae95df4499bf5f0a4ecb903b53
Author: Thomas Schwinge 
Date:   Tue May 28 23:20:29 2024 +0200

nvptx target: Global constructor, destructor support, via nvptx-tools 'ld'

The function attributes 'constructor', 'destructor', and 'init_priority' now
work, as do the C++ features making use of this.  Test cases with effective
target 'global_constructor' and 'init_priority' now generally work, and
'check-gcc-c++' test results greatly improve; no more
"sorry, unimplemented: global constructors not supported on this target".

For proper execution test results, this depends on


"ld: Global constructor/destructor support".

gcc/
* config/nvptx/nvptx.h: Configure global constructor, destructor
support.
gcc/testsuite/
* gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is
'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may apper
in identifiers.
* lib/target-supports.exp
(check_effective_target_global_constructor): Enable for nvptx.
libgcc/
* config/nvptx/crt0.c (__gbl_ctors): New weak function.
(__main): Invoke it.
* config/nvptx/gbl-ctors.c: New.
* config/nvptx/t-nvptx: Configure global constructor, destructor
support.

Diff:
---
 gcc/config/nvptx/nvptx.h   | 14 +++-
 .../gcc.dg/no_profile_instrument_function-attr-1.c |  2 +-
 gcc/testsuite/lib/target-supports.exp  |  3 +-
 libgcc/config/nvptx/crt0.c | 12 
 libgcc/config/nvptx/gbl-ctors.c| 74 ++
 libgcc/config/nvptx/t-nvptx|  9 ++-
 6 files changed, 109 insertions(+), 5 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h
index e282aad1b73..74f4a68924c 100644
--- a/gcc/config/nvptx/nvptx.h
+++ b/gcc/config/nvptx/nvptx.h
@@ -356,7 +356,19 @@ struct GTY(()) machine_function
 #define MOVE_MAX 8
 #define MOVE_RATIO(SPEED) 4
 #define FUNCTION_MODE QImode
-#define HAS_INIT_SECTION 1
+
+/* Implement global constructor, destructor support in a conceptually simpler
+   way than using 'collect2' (the program): implement the respective
+   functionality in the nvptx-tools 'ld'.  This however still requires the
+   compiler-side effects corresponding to 'USE_COLLECT2': the global
+   constructor, destructor support functions need to have external linkage, and
+   therefore names that are "unique across the whole link".  Use
+   '!targetm.have_ctors_dtors' to achieve this (..., and thus don't need to
+   provide 'targetm.asm_out.constructor', 'targetm.asm_out.destructor').  */
+#define TARGET_HAVE_CTORS_DTORS false
+
+/* See 'libgcc/config/nvptx/crt0.c' for wrapping of 'main'.  */
+#define HAS_INIT_SECTION
 
 /* The C++ front end insists to link against libstdc++ -- which we don't build.
Tell it to instead link against the innocuous libgcc.  */
diff --git a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c 
b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c
index 909f8a68479..5b4101cf596 100644
--- a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c
+++ b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c
@@ -18,7 +18,7 @@ int main ()
   return foo ();
 }
 
-/* { dg-final { scan-tree-dump-times "__gcov0\[._\]main.* = PROF_edge_counter" 
1 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "__gcov0\[$._\]main.* = 
PROF_edge_counter" 1 "optimized"} } */
 /* { dg-final { scan-tree-dump-times "__gcov_indirect_call_profiler_v" 1 
"optimized" } } */
 /* { dg-final { scan-tree-dump-times "__gcov_time_profiler_counter = " 1 
"optimized" } } */
 /* { dg-final { scan-tree-dump-times "__gcov_init" 1 "optimized" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index f0f6da52275..a3992faab5e 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -942,8 +942,7 @@ proc check_effective_target_nonlocal_goto {} {
 # Return 1 if global constructors are supported, 0 otherwise.
 
 proc check_effective_target_global_constructor {} {
-if { [istarget nvptx-*-*]
-|| [istarget bpf-*-*] } {
+if { [istarget bpf-*-*] } {
return 0
 }
 return 1
diff --git a/libgcc/config/nvptx/crt0.c b/libgcc/config/nvptx/crt0.c
index e37a6fb40d3..47e8ec44c19 100644
--- a/libgcc/config/nvptx/crt0.c
+++ b/libgcc/config/nvptx/crt0.c
@@ -32,6 +32,16 @@ void *__nvptx_stacks[32] __attribute__((shared,nocommon));
 /* Likewise for -muniform-simt.  */
 unsigned __nvptx_uni[32] __attribute__((shared,nocommon));
 
+/* Global constructor/destructor support.  Dummy; if necessary, overridden

[gcc r15-941] tree-optimization/115278 - fix DSE in if-conversion wrt volatiles

2024-05-31 Thread Richard Biener via Gcc-cvs

https://gcc.gnu.org/g:65dbe0ab7cdaf2aa84b09a74e594f0faacf1945c

commit r15-941-g65dbe0ab7cdaf2aa84b09a74e594f0faacf1945c
Author: Richard Biener 
Date:   Fri May 31 10:14:25 2024 +0200

tree-optimization/115278 - fix DSE in if-conversion wrt volatiles

The following adds the missing guard for volatile stores to the
embedded DSE in the loop if-conversion pass.

PR tree-optimization/115278
* tree-if-conv.cc (ifcvt_local_dce): Do not DSE volatile stores.

* g++.dg/vect/pr115278.cc: New testcase.

Diff:
---
 gcc/testsuite/g++.dg/vect/pr115278.cc | 38 +++
 gcc/tree-if-conv.cc   |  4 +++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/vect/pr115278.cc 
b/gcc/testsuite/g++.dg/vect/pr115278.cc
new file mode 100644
index 000..331075fb278
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr115278.cc
@@ -0,0 +1,38 @@
+// { dg-do compile }
+// { dg-require-effective-target c++11 }
+// { dg-additional-options "-fdump-tree-optimized" }
+
+#include 
+
+const int runs = 92;
+
+union BitfieldStructUnion {
+struct {
+uint64_t a : 17;
+uint64_t padding: 39;
+uint64_t b : 8;
+} __attribute__((packed));
+
+struct {
+uint32_t value_low;
+uint32_t value_high;
+} __attribute__((packed));
+
+BitfieldStructUnion(uint32_t value_low, uint32_t value_high) : 
value_low(value_low), value_high(value_high) {}
+};
+
+volatile uint32_t *WRITE = (volatile unsigned*)0x42;
+
+void buggy() {
+for (int i = 0; i < runs; i++) {
+BitfieldStructUnion rt{*WRITE, *WRITE};
+
+rt.a = 99;
+rt.b = 1;
+
+*WRITE = rt.value_low;
+*WRITE = rt.value_high;
+}
+}
+
+// { dg-final { scan-tree-dump-times "\\\*WRITE\[^\r\n\]* ={v} " 2 "optimized" 
} }
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index 09d99fb9dda..c4c3ed41a44 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -3381,7 +3381,9 @@ ifcvt_local_dce (class loop *loop)
   gimple_stmt_iterator gsiprev = gsi;
   gsi_prev (&gsiprev);
   stmt = gsi_stmt (gsi);
-  if (gimple_store_p (stmt) && gimple_vdef (stmt))
+  if (!gimple_has_volatile_ops (stmt)
+ && gimple_store_p (stmt)
+ && gimple_vdef (stmt))
{
  tree lhs = gimple_get_lhs (stmt);
  ao_ref write;

[gcc r15-940] fix: valid compiler optimization may fail the test

2024-05-31 Thread Marc Poulhi?s via Gcc-cvs

https://gcc.gnu.org/g:e0ab5ee9bed5cbad9ae344a23ff0d302b8279d32

commit r15-940-ge0ab5ee9bed5cbad9ae344a23ff0d302b8279d32
Author: Marc Poulhiès 
Date:   Thu May 23 11:57:54 2024 +0200

fix: valid compiler optimization may fail the test

cxa4001 may fail with "Exception not raised" when the compiler omits the
calls to To_Mapping, in accordance with 10.2.1(18/3):

  "If a library unit is declared pure, then the implementation is
  permitted to omit a call on a library-level subprogram of the library
  unit if the results are not needed after the call"

Using the result of both To_Mapping calls prevents the compiler from
omitting them.

"The corrected test will be available on the ACAA web site
(http://www.ada-auth.org/), and will be issued with the Modified Tests List
version 2.6K, 3.1DD, and 4.1GG."

gcc/testsuite/ChangeLog:

* ada/acats/tests/cxa/cxa4001.a: Use function result.

Diff:
---
 gcc/testsuite/ada/acats/tests/cxa/cxa4001.a | 12 
 1 file changed, 12 insertions(+)

diff --git a/gcc/testsuite/ada/acats/tests/cxa/cxa4001.a 
b/gcc/testsuite/ada/acats/tests/cxa/cxa4001.a
index d850acd4a72..52fabc3d514 100644
--- a/gcc/testsuite/ada/acats/tests/cxa/cxa4001.a
+++ b/gcc/testsuite/ada/acats/tests/cxa/cxa4001.a
@@ -185,6 +185,12 @@ begin
   begin
  Bad_Map := Maps.To_Mapping(From => "aa", To => "yz");
  Report.Failed("Exception not raised with repeated character");
+
+ if Report.Equal (Character'Pos('y'),
+  Character'Pos(Maps.Value(Bad_Map, 'a'))) then
+-- Use the map to avoid optimization.
+Report.Comment ("Shouldn't get here.");
+ end if;
   exception
  when Translation_Error => null;  -- OK
  when others=> 
@@ -200,6 +206,12 @@ begin
   begin
  Bad_Map := Maps.To_Mapping("abc", "yz");
  Report.Failed("Exception not raised with unequal parameter lengths");
+
+ if Report.Equal (Character'Pos('y'),
+  Character'Pos(Maps.Value(Bad_Map, 'a'))) then
+-- Use the map to avoid optimization.
+Report.Comment ("Shouldn't get here.");
+ end if;
   exception
  when Translation_Error => null;  -- OK
  when others=>

[gcc r15-939] build: Include minor version in config.gcc unsupported message

2024-05-31 Thread Rainer Orth via Gcc-cvs

https://gcc.gnu.org/g:37fafc63e732c51900d2d998b6df6433d9ca6e2f

commit r15-939-g37fafc63e732c51900d2d998b6df6433d9ca6e2f
Author: Rainer Orth 
Date:   Fri May 31 11:29:19 2024 +0200

build: Include minor version in config.gcc unsupported message

It has been pointed out to me that when moving Solaris 11.3 from
config.gcc's obsolete to unsupported list, I'd forgotten to also move
the minor version info, leading to confusing

*** Configuration i386-pc-solaris2.11 not supported

instead of the correct

*** Configuration i386-pc-solaris2.11.3 not supported

This patch fixes this oversight.

Tested on i386-pc-solaris2.11 (11.3 and 11.4).

2024-05-30  Rainer Orth  

gcc:
* config.gcc: Move ${target_min} from obsolete to unsupported
message.

Diff:
---
 gcc/config.gcc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index a37113bd00a..e500ba63e32 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -276,7 +276,7 @@ case ${target} in
| nios2*-*-*\
  )
 if test "x$enable_obsolete" != xyes; then
-  echo "*** Configuration ${target}${target_min} is obsolete." >&2
+  echo "*** Configuration ${target} is obsolete." >&2
   echo "*** Specify --enable-obsolete to build it anyway." >&2
   echo "*** Support will be REMOVED in the next major release of GCC," >&2
   echo "*** unless a maintainer comes forward." >&2
@@ -328,7 +328,7 @@ case ${target}${target_min} in
  | *-*-sysv*   \
  | vax-*-vms*  \
  )
-   echo "*** Configuration ${target} not supported" 1>&2
+   echo "*** Configuration ${target}${target_min} not supported" 1>&2
exit 1
;;
 esac

[gcc r15-938] Fix some opindex for some options [PR115022]

2024-05-31 Thread Andrew Pinski via Gcc-cvs

https://gcc.gnu.org/g:a0d60660f2aae2d79685f73d568facb2397582d8

commit r15-938-ga0d60660f2aae2d79685f73d568facb2397582d8
Author: Andrew Pinski 
Date:   Wed May 29 20:40:31 2024 -0700

Fix some opindex for some options [PR115022]

While looking at the index I noticed that some options had
`-` in the front for the index which is wrong. And then
I noticed there was no index for `mcmodel=` for targets or had
used `-mcmodel` incorrectly.

This fixes both of those and regnerates the urls files see that
`-mcmodel=` option now has an url associated with it.

gcc/ChangeLog:

PR target/115022
* doc/invoke.texi (fstrub=disable): Fix opindex.
(minline-memops-threshold): Fix opindex.
(mcmodel=): Add opindex and fix them.
* common.opt.urls: Regenerate.
* config/aarch64/aarch64.opt.urls: Regenerate.
* config/bpf/bpf.opt.urls: Regenerate.
* config/i386/i386.opt.urls: Regenerate.
* config/loongarch/loongarch.opt.urls: Regenerate.
* config/nds32/nds32-elf.opt.urls: Regenerate.
* config/nds32/nds32-linux.opt.urls: Regenerate.
* config/or1k/or1k.opt.urls: Regenerate.
* config/riscv/riscv.opt.urls: Regenerate.
* config/rs6000/aix64.opt.urls: Regenerate.
* config/rs6000/linux64.opt.urls: Regenerate.
* config/sparc/sparc.opt.urls: Regenerate.

Signed-off-by: Andrew Pinski 

Diff:
---
 gcc/common.opt.urls |  3 +++
 gcc/config/aarch64/aarch64.opt.urls |  3 ++-
 gcc/config/bpf/bpf.opt.urls |  3 +++
 gcc/config/i386/i386.opt.urls   |  3 ++-
 gcc/config/loongarch/loongarch.opt.urls |  2 +-
 gcc/config/nds32/nds32-elf.opt.urls |  2 +-
 gcc/config/nds32/nds32-linux.opt.urls   |  2 +-
 gcc/config/or1k/or1k.opt.urls   |  3 ++-
 gcc/config/riscv/riscv.opt.urls |  3 ++-
 gcc/config/rs6000/aix64.opt.urls|  3 ++-
 gcc/config/rs6000/linux64.opt.urls  |  3 ++-
 gcc/config/sparc/sparc.opt.urls |  2 +-
 gcc/doc/invoke.texi | 17 +++--
 13 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/gcc/common.opt.urls b/gcc/common.opt.urls
index 10462e40874..1f2eb67c8e0 100644
--- a/gcc/common.opt.urls
+++ b/gcc/common.opt.urls
@@ -1339,6 +1339,9 @@ 
UrlSuffix(gcc/Optimize-Options.html#index-fstrict-aliasing)
 fstrict-overflow
 UrlSuffix(gcc/Code-Gen-Options.html#index-fstrict-overflow)
 
+fstrub=disable
+UrlSuffix(gcc/Instrumentation-Options.html#index-fstrub_003ddisable)
+
 fstrub=strict
 UrlSuffix(gcc/Instrumentation-Options.html#index-fstrub_003dstrict)
 
diff --git a/gcc/config/aarch64/aarch64.opt.urls 
b/gcc/config/aarch64/aarch64.opt.urls
index 993634c52f8..4fa90384378 100644
--- a/gcc/config/aarch64/aarch64.opt.urls
+++ b/gcc/config/aarch64/aarch64.opt.urls
@@ -18,7 +18,8 @@ 
UrlSuffix(gcc/AArch64-Options.html#index-mfix-cortex-a53-843419)
 mlittle-endian
 UrlSuffix(gcc/AArch64-Options.html#index-mlittle-endian)
 
-; skipping UrlSuffix for 'mcmodel=' due to finding no URLs
+mcmodel=
+UrlSuffix(gcc/AArch64-Options.html#index-mcmodel_003d)
 
 mtp=
 UrlSuffix(gcc/AArch64-Options.html#index-mtp)
diff --git a/gcc/config/bpf/bpf.opt.urls b/gcc/config/bpf/bpf.opt.urls
index 8c1e5f86d5c..1e8873a899f 100644
--- a/gcc/config/bpf/bpf.opt.urls
+++ b/gcc/config/bpf/bpf.opt.urls
@@ -33,3 +33,6 @@ UrlSuffix(gcc/eBPF-Options.html#index-msmov)
 mcpu=
 UrlSuffix(gcc/eBPF-Options.html#index-mcpu-5)
 
+minline-memops-threshold=
+UrlSuffix(gcc/eBPF-Options.html#index-minline-memops-threshold)
+
diff --git a/gcc/config/i386/i386.opt.urls b/gcc/config/i386/i386.opt.urls
index 40e8a844936..9384b0b3187 100644
--- a/gcc/config/i386/i386.opt.urls
+++ b/gcc/config/i386/i386.opt.urls
@@ -40,7 +40,8 @@ UrlSuffix(gcc/x86-Options.html#index-march-16)
 mlarge-data-threshold=
 UrlSuffix(gcc/x86-Options.html#index-mlarge-data-threshold)
 
-; skipping UrlSuffix for 'mcmodel=' due to finding no URLs
+mcmodel=
+UrlSuffix(gcc/x86-Options.html#index-mcmodel_003d-7)
 
 mcpu=
 UrlSuffix(gcc/x86-Options.html#index-mcpu-14)
diff --git a/gcc/config/loongarch/loongarch.opt.urls 
b/gcc/config/loongarch/loongarch.opt.urls
index 9ed5d7b5596..f7545f65103 100644
--- a/gcc/config/loongarch/loongarch.opt.urls
+++ b/gcc/config/loongarch/loongarch.opt.urls
@@ -58,7 +58,7 @@ mrecip
 UrlSuffix(gcc/LoongArch-Options.html#index-mrecip)
 
 mcmodel=
-UrlSuffix(gcc/LoongArch-Options.html#index-mcmodel)
+UrlSuffix(gcc/LoongArch-Options.html#index-mcmodel_003d-1)
 
 mdirect-extern-access
 UrlSuffix(gcc/LoongArch-Options.html#index-mdirect-extern-access)
diff --git a/gcc/config/nds32/nds32-elf.opt.urls 
b/gcc/config/nds32/nds32-elf.opt.urls
index 3ae1efe7312..e5432b62863 100644
--- a/gcc/config/nds32/nds32-elf.opt.urls
+++ b/gcc/config/nds32/nds32-elf.opt.urls
@@ -1,5 +1,5 @@
 ; Autogenerated by regenerate-opt-urls.py from gcc/config/nds32/nds

[gcc r15-937] testsuite: Adjust several dg-additional-files-options calls [PR115294]

2024-05-31 Thread Rainer Orth via Gcc-cvs

https://gcc.gnu.org/g:7e322d576eb6a87607215196bec62d3348e65b0e

commit r15-937-g7e322d576eb6a87607215196bec62d3348e65b0e
Author: Rainer Orth 
Date:   Fri May 31 09:29:38 2024 +0200

testsuite: Adjust several dg-additional-files-options calls [PR115294]

A recent patch

commit bdc264a16e327c63d133131a695a202fbbc0a6a0
Author: Alexandre Oliva 
Date:   Thu May 30 02:06:48 2024 -0300

[testsuite] conditionalize dg-additional-sources on target and type

added two additional args to dg-additional-files-options.
Unfortunately, this completely broke several testsuites like

ERROR: tcl error sourcing 
/vol/gcc/src/hg/master/local/libatomic/testsuite/../../gcc/testsuite/lib/gcc-dg.exp.
wrong # args: should be "dg-additional-files-options options source dest 
type"

since the patch forgot to adjust some of the callers.

This patch fixes that.

Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
x86_64-pc-linux-gnu.

2024-05-31  Rainer Orth  

libatomic:
PR testsuite/115294
* testsuite/lib/libatomic.exp (libatomic_target_compile): Pass new
dg-additional-files-options args.

libgomp:
PR testsuite/115294
* testsuite/lib/libgomp.exp (libgomp_target_compile): Pass new
dg-additional-files-options args.

libitm:
PR testsuite/115294
* testsuite/lib/libitm.exp (libitm_target_compile): Pass new
dg-additional-files-options args.

libphobos:
PR testsuite/115294
* testsuite/lib/libphobos.exp (libphobos_target_compile): Pass new
dg-additional-files-options args.

libvtv:
PR testsuite/115294
* testsuite/lib/libvtv.exp (libvtv_target_compile): Pass new
dg-additional-files-options args.

Diff:
---
 libatomic/testsuite/lib/libatomic.exp | 2 +-
 libgomp/testsuite/lib/libgomp.exp | 2 +-
 libitm/testsuite/lib/libitm.exp   | 2 +-
 libphobos/testsuite/lib/libphobos.exp | 2 +-
 libvtv/testsuite/lib/libvtv.exp   | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/libatomic/testsuite/lib/libatomic.exp 
b/libatomic/testsuite/lib/libatomic.exp
index 432a67e12e9..ed6ba806732 100644
--- a/libatomic/testsuite/lib/libatomic.exp
+++ b/libatomic/testsuite/lib/libatomic.exp
@@ -214,7 +214,7 @@ proc libatomic_target_compile { source dest type options } {
set options [concat "$ALWAYS_CFLAGS" $options]
 }
 
-set options [dg-additional-files-options $options $source]
+set options [dg-additional-files-options $options $source $dest $type]
 
 set result [target_compile $source $dest $type $options]
 
diff --git a/libgomp/testsuite/lib/libgomp.exp 
b/libgomp/testsuite/lib/libgomp.exp
index cab926a798b..7c109262916 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -296,7 +296,7 @@ proc libgomp_target_compile { source dest type options } {
set options [concat "$ALWAYS_CFLAGS" $options]
 }
 
-set options [dg-additional-files-options $options $source]
+set options [dg-additional-files-options $options $source $dest $type]
 
 set result [target_compile $source $dest $type $options]
 
diff --git a/libitm/testsuite/lib/libitm.exp b/libitm/testsuite/lib/libitm.exp
index 61bbfa0c923..3e60797c3e3 100644
--- a/libitm/testsuite/lib/libitm.exp
+++ b/libitm/testsuite/lib/libitm.exp
@@ -217,7 +217,7 @@ proc libitm_target_compile { source dest type options } {
set options [concat "$ALWAYS_CFLAGS" $options]
 }
 
-set options [dg-additional-files-options $options $source]
+set options [dg-additional-files-options $options $source $dest $type]
 
 set result [target_compile $source $dest $type $options]
 
diff --git a/libphobos/testsuite/lib/libphobos.exp 
b/libphobos/testsuite/lib/libphobos.exp
index d4aa433ddc1..a4a2fe3d56d 100644
--- a/libphobos/testsuite/lib/libphobos.exp
+++ b/libphobos/testsuite/lib/libphobos.exp
@@ -281,7 +281,7 @@ proc libphobos_target_compile { source dest type options } {
 lappend options "compiler=$gdc_final"
 lappend options "timeout=[timeout_value]"
 
-set options [dg-additional-files-options $options $source]
+set options [dg-additional-files-options $options $source $dest $type]
 set comp_output [target_compile $source $dest $type $options]
 
 return $comp_output
diff --git a/libvtv/testsuite/lib/libvtv.exp b/libvtv/testsuite/lib/libvtv.exp
index 4b71c9ce7bc..bfd03d7d258 100644
--- a/libvtv/testsuite/lib/libvtv.exp
+++ b/libvtv/testsuite/lib/libvtv.exp
@@ -212,7 +212,7 @@ proc libvtv_target_compile { source dest type options } {
set options [concat "$ALWAYS_CFLAGS" $options]
 }
 
-set options [dg-additional-files-options $options $source]
+set options [dg-additional-files-options $options $source $dest $t

[gcc r14-10263] vect: Fix access size alignment assumption [PR115192]

2024-05-31 Thread Richard Sandiford via Gcc-cvs

https://gcc.gnu.org/g:36575f5fe491d86b6851ff3f47cbfb7dad0fc8ae

commit r14-10263-g36575f5fe491d86b6851ff3f47cbfb7dad0fc8ae
Author: Richard Sandiford 
Date:   Fri May 31 08:22:55 2024 +0100

vect: Fix access size alignment assumption [PR115192]

create_intersect_range_checks checks whether two access ranges
a and b are alias-free using something equivalent to:

  end_a <= start_b || end_b <= start_a

It has two ways of doing this: a "vanilla" way that calculates
the exact exclusive end pointers, and another way that uses the
last inclusive aligned pointers (and changes the comparisons
accordingly).  The comment for the latter is:

  /* Calculate the minimum alignment shared by all four pointers,
 then arrange for this alignment to be subtracted from the
 exclusive maximum values to get inclusive maximum values.
 This "- min_align" is cumulative with a "+ access_size"
 in the calculation of the maximum values.  In the best
 (and common) case, the two cancel each other out, leaving
 us with an inclusive bound based only on seg_len.  In the
 worst case we're simply adding a smaller number than before.

The problem is that the associated code implicitly assumed that the
access size was a multiple of the pointer alignment, and so the
alignment could be carried over to the exclusive end pointer.

The testcase started failing after g:9fa5b473b5b8e289b6542
because that commit improved the alignment information for
the accesses.

gcc/
PR tree-optimization/115192
* tree-data-ref.cc (create_intersect_range_checks): Take the
alignment of the access sizes into account.

gcc/testsuite/
PR tree-optimization/115192
* gcc.dg/vect/pr115192.c: New test.

(cherry picked from commit a0fe4fb1c8d7804515845dd5d2a814b3c7a1ccba)

Diff:
---
 gcc/testsuite/gcc.dg/vect/pr115192.c | 28 
 gcc/tree-data-ref.cc |  5 -
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr115192.c 
b/gcc/testsuite/gcc.dg/vect/pr115192.c
new file mode 100644
index 000..923d377c1bb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr115192.c
@@ -0,0 +1,28 @@
+#include "tree-vect.h"
+
+int data[4 * 16 * 16] __attribute__((aligned(16)));
+
+__attribute__((noipa)) void
+foo (__SIZE_TYPE__ n)
+{
+  for (__SIZE_TYPE__ i = 1; i < n; ++i)
+{
+  data[i * n * 4] = data[(i - 1) * n * 4] + 1;
+  data[i * n * 4 + 1] = data[(i - 1) * n * 4 + 1] + 2;
+}
+}
+
+int
+main ()
+{
+  check_vect ();
+
+  data[0] = 10;
+  data[1] = 20;
+
+  foo (3);
+
+  if (data[24] != 12 || data[25] != 24)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-data-ref.cc b/gcc/tree-data-ref.cc
index f37734b5340..654a8220214 100644
--- a/gcc/tree-data-ref.cc
+++ b/gcc/tree-data-ref.cc
@@ -73,6 +73,7 @@ along with GCC; see the file COPYING3.  If not see
 
 */
 
+#define INCLUDE_ALGORITHM
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -2640,7 +2641,9 @@ create_intersect_range_checks (class loop *loop, tree 
*cond_expr,
 Because the maximum values are inclusive, there is an alias
 if the maximum value of one segment is equal to the minimum
 value of the other.  */
-  min_align = MIN (dr_a.align, dr_b.align);
+  min_align = std::min (dr_a.align, dr_b.align);
+  min_align = std::min (min_align, known_alignment (dr_a.access_size));
+  min_align = std::min (min_align, known_alignment (dr_b.access_size));
   cmp_code = LT_EXPR;
 }

37 matches

Mail list logo