Re: [Patch, Fortran] libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

2022-12-13 Thread Tobias Burnus

Hi Harald,

On 13.12.22 23:27, Harald Anlauf wrote:

On 13.12.22 at 22:41, Tobias Burnus wrote:

Back to differences: 'diff -U0 -p -w' against the last GCC 11 branch
shows:

...
@@ -35,0 +37,2 @@ export_proto(cfi_desc_to_gfc_desc);
+/* NOTE: Since GCC 12, the FE generates code to do the conversion
+   directly without calling this function.  */
@@ -63 +66 @@ cfi_desc_to_gfc_desc (gfc_array_void *d,
-  d->dtype.version = s->version;
+  d->dtype.version = 0;


I was wondering what the significance of "version" is.
In ISO_Fortran_binding.h we seem to always have
   #define CFI_VERSION 1
and it did not change with gcc-12.


The version is 1 for CFI but it is 0 for GFC. However, as we do not
check the GFC version anywhere and it is not publicly exposed, it does
not really matter. Still, "d->dtype.version = 0;" matches what the
compiler itself produces – and for consistency, setting it to 0 is
better than setting it to 1 (via CFI's version field).

Actually 'dtype.version' is not really set anywhere; at least
gfc_get_dtype_rank_type(...) does not set it; zero initialization is
most common but it could be also some random value. In libgfortran,
GFC_DTYPE_CLEAR explicitly sets it to 0.

@@ -100,2 +110,2 @@ gfc_desc_to_cfi_desc (CFI_cdesc_t **d_pt
-d = malloc (sizeof (CFI_cdesc_t)
-   + (CFI_type_t)(CFI_MAX_RANK * sizeof (CFI_dim_t)));
+d = calloc (1, (sizeof (CFI_cdesc_t)
+   + (CFI_type_t)(CFI_MAX_RANK * sizeof (CFI_dim_t))));
@@ -107 +117 @@ gfc_desc_to_cfi_desc (CFI_cdesc_t **d_pt
-  d->version = s->dtype.version;
+  d->version = CFI_VERSION;


This treatment of "version" was the equivalent to the above that
confused me.  Assuming we were to change CFI_VERSION in gcc-13+,
is this the right choice here regarding backward compatibility?


I don't think we will change CFI version any time soon as we rather
closely follow the Fortran standard and I do not see any changes which
are required there.

NOTE: As s->dtype.version is either 0 or some random value, setting
version in the CFI / ISO C descriptor to 1, be it as literal or as macro
constant, makes it the same as CFI_VERSION.
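To illustrate the point about the two version fields, here is a minimal
sketch; it is not the libgfortran code. The struct layouts and the names
mock_gfc_dtype, mock_cfi_desc, MOCK_CFI_VERSION and mock_gfc_to_cfi are
made up for illustration. Only the idea is taken from the hunk above:
zero-initialise with calloc, then set the version from the constant
instead of copying the unspecified GFC field.

```c
#include <stdlib.h>

#define MOCK_CFI_VERSION 1   /* stands in for CFI_VERSION (assumption) */

/* Hypothetical, heavily simplified stand-ins for the two descriptors.  */
struct mock_gfc_dtype { int version; int rank; };
struct mock_cfi_desc  { int version; int rank; };

/* Build a freshly allocated CFI-style descriptor from a GFC-style
   dtype.  Returns NULL if the allocation fails; the caller frees.  */
struct mock_cfi_desc *
mock_gfc_to_cfi (const struct mock_gfc_dtype *s)
{
  /* calloc zero-fills, so no field is left indeterminate.  */
  struct mock_cfi_desc *d = calloc (1, sizeof *d);
  if (d == NULL)
    return NULL;
  d->rank = s->rank;
  /* Do not copy s->version: on the GFC side it is 0 or unspecified.  */
  d->version = MOCK_CFI_VERSION;
  return d;
}
```

Whatever value the source version field happens to hold, the resulting
descriptor always reports MOCK_CFI_VERSION.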

And: I don't think we will change CFI_VERSION or the structure of the
CFI array descriptor any time soon; there does not seem to be any need
for it, it matches the Fortran standard one well (and no changes seem to
be planned on that side) and, finally, changing an array descriptor is
painful!

However, using '1;  /* CFI_VERSION in GCC 11 and at time of writing. */'
would also work – but I would expect that we will go through all CFI
users if we ever change the descriptor (and bump the version), possibly
adding version-number dependent code.


So besides the "version" question ok from my side.


I hope I could answer the latter.

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


[PATCH (pushed)] docs: document --param=ipa-sra-ptrwrap-growth-factor

2022-12-13 Thread Martin Liška
gcc/ChangeLog:

* doc/invoke.texi: Document ipa-sra-ptrwrap-growth-factor.
---
 gcc/doc/invoke.texi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0459714d100..7dc1d45e275 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15520,6 +15520,11 @@ parameters only when their cumulative size is less or equal to
 @option{ipa-sra-ptr-growth-factor} times the size of the original
 pointer parameter.
 
+@item ipa-sra-ptrwrap-growth-factor
+Additional maximum allowed growth of total size of new parameters
+that ipa-sra replaces a pointer to an aggregate with,
+if it points to a local variable that the caller never writes to.
+
 @item ipa-sra-max-replacements
 Maximum pieces of an aggregate that IPA-SRA tracks.  As a
 consequence, it is also the maximum number of replacements of a formal
-- 
2.38.1



[PATCH] tree-optimization/107617 - big-endian .LEN_STORE VN

2022-12-13 Thread Richard Biener via Gcc-patches
The following fixes a mistake in interpreting .LEN_STORE definitions
during value-numbering when in big-endian mode.  We cannot offset
the encoding of the RHS but instead encode to an offsetted position
which is then treated correctly by the endian aware copying code.
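The offset arithmetic described above can be sketched in isolation as
follows. This is only an illustration that mirrors the expressions in
the patch; the helper names len_store_size_bits, len_store_rhs_off and
encode_start_byte are hypothetical, and BITS_PER_UNIT is assumed to be 8.

```c
#include <stdbool.h>

#define BITS_PER_UNIT 8   /* assumed byte width for this sketch */

/* Bit size of the partial definition made by a length-limited store,
   using the same arithmetic as the patch: (len + -bias) bytes.  */
long
len_store_size_bits (long len, long bias)
{
  return (len + -bias) * BITS_PER_UNIT;
}

/* Offset into the stored value: zero on little-endian, and the
   (negative) difference between stored size and full vector size on
   big-endian.  */
long
len_store_rhs_off (bool big_endian, long size_bits, long vec_bits)
{
  return big_endian ? size_bits - vec_bits : 0;
}

/* Byte position in the scratch buffer at which the value is encoded:
   a negative rhs_off shifts the write position forward instead of
   offsetting the encoding of the value itself.  */
long
encode_start_byte (long rhs_off)
{
  return rhs_off < 0 ? -rhs_off / BITS_PER_UNIT : 0;
}
```

For a 4-byte store from a 128-bit vector with zero bias, the partial
definition is 32 bits wide; on a big-endian target rhs_off is then
32 - 128 = -96 bits, so the value is encoded starting 12 bytes into the
buffer.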

Bootstrapped and tested on x86_64-unknown-linux-gnu and on s390
by Robin, pushed.

PR tree-optimization/107617
* tree-ssa-sccvn.cc (vn_walk_cb_data::push_partial_def):
Handle negative pd.rhs_off.
(vn_reference_lookup_3): Properly provide pd.rhs_off
for .LEN_STORE on big-endian targets.
---
 gcc/tree-ssa-sccvn.cc | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index b9f289b6eca..fa2f65df159 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -2090,7 +2090,7 @@ vn_walk_cb_data::push_partial_def (pd_data pd,
len = ROUND_UP (pd.size, BITS_PER_UNIT) / BITS_PER_UNIT;
  memset (this_buffer, 0, len);
}
-  else
+  else if (pd.rhs_off >= 0)
{
  len = native_encode_expr (pd.rhs, this_buffer, bufsize,
(MAX (0, -pd.offset)
@@ -2105,6 +2105,24 @@ vn_walk_cb_data::push_partial_def (pd_data pd,
  return (void *)-1;
}
}
+  else /* negative pd.rhs_off indicates we want to chop off first bits */
+   {
+ if (-pd.rhs_off >= bufsize)
+   return (void *)-1;
+ len = native_encode_expr (pd.rhs,
+   this_buffer + -pd.rhs_off / BITS_PER_UNIT,
+   bufsize - -pd.rhs_off / BITS_PER_UNIT,
+   MAX (0, -pd.offset) / BITS_PER_UNIT);
+ if (len <= 0
+ || len < (ROUND_UP (pd.size, BITS_PER_UNIT) / BITS_PER_UNIT
+   - MAX (0, -pd.offset) / BITS_PER_UNIT))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "Failed to encode %u "
+"partial definitions\n", ndefs);
+ return (void *)-1;
+   }
+   }
 
   unsigned char *p = buffer;
   HOST_WIDE_INT size = pd.size;
@@ -3349,10 +3367,13 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
}
  else if (fn == IFN_LEN_STORE)
{
- pd.rhs_off = 0;
  pd.offset = offset2i;
  pd.size = (tree_to_uhwi (len)
 + -tree_to_shwi (bias)) * BITS_PER_UNIT;
+ if (BYTES_BIG_ENDIAN)
+   pd.rhs_off = pd.size - tree_to_uhwi (TYPE_SIZE (vectype));
+ else
+   pd.rhs_off = 0;
  if (ranges_known_overlap_p (offset, maxsize,
  pd.offset, pd.size))
return data->push_partial_def (pd, set, set,
-- 
2.35.3


[PATCH] RISC-V: Support VSETVL PASS for RVV support

2022-12-13 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This patch adds the VSETVL pass for RVV.
1. Optimization quality and performance are guaranteed by LCM (lazy
   code motion).
2. It is based on the RTL_SSA framework to get better optimization
   opportunities.
3. It also propagates VL/VTYPE demand information backward across
   blocks, walking the CFG in reverse order via RTL_SSA.
4. It has been tested with more than 200 testcases for the VLMAX
   AVL case (only VLMAX, since we do not yet have intrinsics to
   test non-VLMAX).
5. The AVL model will be supported in the next patch.

gcc/ChangeLog:

* config.gcc: Add riscv-vsetvl.o.
* config/riscv/riscv-passes.def (INSERT_PASS_BEFORE): Add VSETVL PASS 
location.
* config/riscv/riscv-protos.h (make_pass_vsetvl): New function.
(enum avl_type): New enum.
(get_ta): New function.
(get_ma): Ditto.
(get_avl_type): Ditto.
(calculate_ratio): Ditto.
(enum tail_policy): New enum.
(enum mask_policy): Ditto.
* config/riscv/riscv-v.cc (calculate_ratio): New function.
(emit_pred_op): Change the VLMAX mov codegen.
(get_ta): New function.
(get_ma): Ditto.
(enum tail_policy): Change enum.
(get_prefer_tail_policy): New function.
(enum mask_policy): Change enum.
(get_prefer_mask_policy): New function.
* config/riscv/t-riscv: Add riscv-vsetvl.o
* config/riscv/vector.md (): Adjust attribute and pattern for VSETVL 
PASS.
(@vlmax_avl): Ditto.
(@vsetvl_no_side_effects): Delete.
(vsetvl_vtype_change_only): New MD pattern.
(@vsetvl_discard_result): Ditto.
* config/riscv/riscv-vsetvl.cc: New file.
* config/riscv/riscv-vsetvl.h: New file.

---
 gcc/config.gcc|2 +-
 gcc/config/riscv/riscv-passes.def |1 +
 gcc/config/riscv/riscv-protos.h   |   15 +
 gcc/config/riscv/riscv-v.cc   |  102 +-
 gcc/config/riscv/riscv-vsetvl.cc  | 2509 +
 gcc/config/riscv/riscv-vsetvl.h   |  344 
 gcc/config/riscv/t-riscv  |8 +
 gcc/config/riscv/vector.md|  131 +-
 8 files changed, 3076 insertions(+), 36 deletions(-)
 create mode 100644 gcc/config/riscv/riscv-vsetvl.cc
 create mode 100644 gcc/config/riscv/riscv-vsetvl.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index b5eda046033..1eb76c6c076 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -518,7 +518,7 @@ pru-*-*)
;;
 riscv*)
cpu_type=riscv
-   extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o"
+   extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o"
extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
d_target_objs="riscv-d.o"
extra_headers="riscv_vector.h"
diff --git a/gcc/config/riscv/riscv-passes.def b/gcc/config/riscv/riscv-passes.def
index 23ef8ac6114..d2d48f231aa 100644
--- a/gcc/config/riscv/riscv-passes.def
+++ b/gcc/config/riscv/riscv-passes.def
@@ -18,3 +18,4 @@
.  */
 
 INSERT_PASS_AFTER (pass_rtl_store_motion, 1, pass_shorten_memrefs);
+INSERT_PASS_BEFORE (pass_sched2, 1, pass_vsetvl);
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index e17e003f8e2..cfd0f284f91 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -96,6 +96,7 @@ extern void riscv_parse_arch_string (const char *, struct gcc_options *, locatio
 extern bool riscv_hard_regno_rename_ok (unsigned, unsigned);
 
 rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt);
+rtl_opt_pass * make_pass_vsetvl (gcc::context *ctxt);
 
 /* Information about one CPU we know about.  */
 struct riscv_cpu_info {
@@ -131,6 +132,12 @@ enum vlmul_type
   LMUL_F4 = 6,
   LMUL_F2 = 7,
 };
+
+enum avl_type
+{
+  NONVLMAX,
+  VLMAX,
+};
 /* Routines implemented in riscv-vector-builtins.cc.  */
 extern void init_builtins (void);
 extern const char *mangle_builtin_type (const_tree);
@@ -145,17 +152,25 @@ extern bool legitimize_move (rtx, rtx, machine_mode);
 extern void emit_pred_op (unsigned, rtx, rtx, machine_mode);
 extern enum vlmul_type get_vlmul (machine_mode);
 extern unsigned int get_ratio (machine_mode);
+extern int get_ta (rtx);
+extern int get_ma (rtx);
+extern int get_avl_type (rtx);
+extern unsigned int calculate_ratio (unsigned int, enum vlmul_type);
 enum tail_policy
 {
   TAIL_UNDISTURBED = 0,
   TAIL_AGNOSTIC = 1,
+  TAIL_ANY = 2,
 };
 
 enum mask_policy
 {
   MASK_UNDISTURBED = 0,
   MASK_AGNOSTIC = 1,
+  MASK_ANY = 2,
 };
+enum tail_policy get_prefer_tail_policy ();
+enum mask_policy get_prefer_mask_policy ();
 }
 
 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 13ee33938bb..f02a048f76d 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.

Re: [PATCH v2 0/3] btf: fix BTF for extern items [PR106773]

2022-12-13 Thread Indu Bhagat via Gcc-patches

On 12/13/22 10:44, David Faust wrote:

[Changes from v1:
  - Remove #defines for LINKAGE_* values, instead mirror enums from
linux/btf.h to include/btf.h and use those.
  - Fix BTF generation for extern variable with both non-defining and
defining decls in the same CU. Add a test for this.
  - Update several comments per review feedback. ]

Hi,

This series fixes the issues reported in target/PR106773. I decided to
split it into three commits, as there are ultimately three distinct
issues and fixes. See each patch for details.

Tested on bpf-unknown-none and x86_64-linux-gnu, no known regressions.

OK to push?
Thanks.



Hi David,

LGTM.

Thanks


David Faust (3):
   btf: add 'extern' linkage for variables [PR106773]
   btf: fix 'extern const void' variables [PR106773]
   btf: correct generation for extern funcs [PR106773]

  gcc/btfout.cc | 184 +-
  .../gcc.dg/debug/btf/btf-datasec-2.c  |  28 +++
  .../gcc.dg/debug/btf/btf-function-6.c |  19 ++
  gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c |  25 +++
  .../gcc.dg/debug/btf/btf-variables-4.c|  24 +++
  .../gcc.dg/debug/btf/btf-variables-5.c|  19 ++
  include/btf.h |  29 ++-
  7 files changed, 276 insertions(+), 52 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-function-6.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-variables-5.c





[PATCH] RISC-V: Fix RVV machine mode attribute configuration

2022-12-13 Thread juzhe . zhong
From: Ju-Zhe Zhong 

The attribute configuration of each machine mode was added in the previous
patch.
While testing the VSETVL pass I noticed that some of the entries are incorrect.
This patch corrects them.
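As a rough cross-check of the corrected table entries, the ratio column
appears to be SEW divided by LMUL. The sketch below is an assumption on
my side, not code from the patch: the enum and the helper sew_lmul_ratio
are hypothetical, the values LMUL_F4 = 6 and LMUL_F2 = 7 match the
riscv-protos.h context shown elsewhere in this digest, and the remaining
values are assumed to follow the RVV vlmul encoding. Treating mask modes
as SEW = 8 is likewise an assumption that happens to match the table.

```c
/* Hypothetical mirror of the vlmul encoding used in this sketch.  */
enum vlmul
{
  LMUL_1 = 0,
  LMUL_2 = 1,
  LMUL_4 = 2,
  LMUL_8 = 3,
  LMUL_RESERVED = 4,
  LMUL_F8 = 5,
  LMUL_F4 = 6,
  LMUL_F2 = 7
};

/* Return SEW / LMUL.  Fractional LMUL multiplies instead of divides.
   LMUL_RESERVED yields 0, matching the "LMUL_RESERVED, 0" entries.  */
unsigned int
sew_lmul_ratio (unsigned int sew, enum vlmul lmul)
{
  switch (lmul)
    {
    case LMUL_1:  return sew;
    case LMUL_2:  return sew / 2;
    case LMUL_4:  return sew / 4;
    case LMUL_8:  return sew / 8;
    case LMUL_F2: return sew * 2;
    case LMUL_F4: return sew * 4;
    case LMUL_F8: return sew * 8;
    default:      return 0;
    }
}
```

With SEW = 32 this gives 16 for LMUL_2, 32 for LMUL_1 and 64 for LMUL_F2,
which agrees with the corrected second ratio of VNx4SI, VNx2SI and
VNx1SI in the patch.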

gcc/ChangeLog:

* config/riscv/riscv-vector-switch.def (ENTRY): Correct attributes.

---
 gcc/config/riscv/riscv-vector-switch.def | 38 
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-switch.def b/gcc/config/riscv/riscv-vector-switch.def
index a51f45be487..ec12be84661 100644
--- a/gcc/config/riscv/riscv-vector-switch.def
+++ b/gcc/config/riscv/riscv-vector-switch.def
@@ -95,16 +95,16 @@ TODO: FP16 vector needs support of 'zvfh', we don't support it yet.  */
 #endif
 
 /* Mask modes. Disable VNx64BImode when TARGET_MIN_VLEN == 32.  */
-ENTRY (VNx64BI, TARGET_MIN_VLEN > 32, LMUL_F8, 64, LMUL_RESERVED, 0)
-ENTRY (VNx32BI, true, LMUL_F4, 32, LMUL_RESERVED, 0)
-ENTRY (VNx16BI, true, LMUL_F2, 16, LMUL_RESERVED, 0)
-ENTRY (VNx8BI, true, LMUL_1, 8, LMUL_RESERVED, 0)
-ENTRY (VNx4BI, true, LMUL_2, 4, LMUL_RESERVED, 0)
-ENTRY (VNx2BI, true, LMUL_4, 2, LMUL_RESERVED, 0)
-ENTRY (VNx1BI, true, LMUL_8, 1, LMUL_RESERVED, 0)
+ENTRY (VNx64BI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_8, 1)
+ENTRY (VNx32BI, true, LMUL_8, 1, LMUL_4, 2)
+ENTRY (VNx16BI, true, LMUL_4, 2, LMUL_2, 4)
+ENTRY (VNx8BI, true, LMUL_2, 4, LMUL_1, 8)
+ENTRY (VNx4BI, true, LMUL_1, 8, LMUL_F2, 16)
+ENTRY (VNx2BI, true, LMUL_F2, 16, LMUL_F4, 32)
+ENTRY (VNx1BI, true, LMUL_F4, 32, LMUL_F8, 64)
 
 /* SEW = 8. Disable VNx64QImode when TARGET_MIN_VLEN == 32.  */
-ENTRY (VNx64QI, TARGET_MIN_VLEN > 32, LMUL_8, 1, LMUL_RESERVED, 0)
+ENTRY (VNx64QI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_8, 1)
 ENTRY (VNx32QI, true, LMUL_8, 1, LMUL_4, 2)
 ENTRY (VNx16QI, true, LMUL_4, 2, LMUL_2, 4)
 ENTRY (VNx8QI, true, LMUL_2, 4, LMUL_1, 8)
@@ -113,7 +113,7 @@ ENTRY (VNx2QI, true, LMUL_F2, 16, LMUL_F4, 32)
 ENTRY (VNx1QI, true, LMUL_F4, 32, LMUL_F8, 64)
 
 /* SEW = 16. Disable VNx32HImode when TARGET_MIN_VLEN == 32.  */
-ENTRY (VNx32HI, TARGET_MIN_VLEN > 32, LMUL_8, 2, LMUL_RESERVED, 0)
+ENTRY (VNx32HI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_8, 2)
 ENTRY (VNx16HI, true, LMUL_8, 2, LMUL_4, 4)
 ENTRY (VNx8HI, true, LMUL_4, 4, LMUL_2, 8)
 ENTRY (VNx4HI, true, LMUL_2, 8, LMUL_1, 16)
@@ -121,7 +121,7 @@ ENTRY (VNx2HI, true, LMUL_1, 16, LMUL_F2, 32)
 ENTRY (VNx1HI, true, LMUL_F2, 32, LMUL_F4, 64)
 
 /* TODO:Disable all FP16 vector, enable them when 'zvfh' is supported.  */
-ENTRY (VNx32HF, false, LMUL_8, 2, LMUL_RESERVED, 0)
+ENTRY (VNx32HF, false, LMUL_RESERVED, 0, LMUL_8, 2)
 ENTRY (VNx16HF, false, LMUL_8, 2, LMUL_4, 4)
 ENTRY (VNx8HF, false, LMUL_4, 4, LMUL_2, 8)
 ENTRY (VNx4HF, false, LMUL_2, 8, LMUL_1, 16)
@@ -131,18 +131,18 @@ ENTRY (VNx1HF, false, LMUL_F2, 32, LMUL_F4, 64)
 /* SEW = 32. Disable VNx16SImode when TARGET_MIN_VLEN == 32.
For single-precision floating-point, we need TARGET_VECTOR_FP32 ==
RVV_ENABLE.  */
-ENTRY (VNx16SI, TARGET_MIN_VLEN > 32, LMUL_8, 4, LMUL_RESERVED, 0)
+ENTRY (VNx16SI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_8, 4)
 ENTRY (VNx8SI, true, LMUL_8, 4, LMUL_4, 8)
-ENTRY (VNx4SI, true, LMUL_4, 8, LMUL_2, 4)
-ENTRY (VNx2SI, true, LMUL_2, 16, LMUL_1, 2)
-ENTRY (VNx1SI, true, LMUL_1, 32, LMUL_F2, 1)
+ENTRY (VNx4SI, true, LMUL_4, 8, LMUL_2, 16)
+ENTRY (VNx2SI, true, LMUL_2, 16, LMUL_1, 32)
+ENTRY (VNx1SI, true, LMUL_1, 32, LMUL_F2, 64)
 
-ENTRY (VNx16SF, TARGET_VECTOR_FP32 && (TARGET_MIN_VLEN > 32), LMUL_8, 4,
-   LMUL_RESERVED, 0)
+ENTRY (VNx16SF, TARGET_VECTOR_FP32 && (TARGET_MIN_VLEN > 32), LMUL_RESERVED, 0,
+   LMUL_8, 4)
 ENTRY (VNx8SF, TARGET_VECTOR_FP32, LMUL_8, 4, LMUL_4, 8)
-ENTRY (VNx4SF, TARGET_VECTOR_FP32, LMUL_4, 8, LMUL_2, 4)
-ENTRY (VNx2SF, TARGET_VECTOR_FP32, LMUL_2, 16, LMUL_1, 2)
-ENTRY (VNx1SF, TARGET_VECTOR_FP32, LMUL_1, 32, LMUL_F2, 1)
+ENTRY (VNx4SF, TARGET_VECTOR_FP32, LMUL_4, 8, LMUL_2, 16)
+ENTRY (VNx2SF, TARGET_VECTOR_FP32, LMUL_2, 16, LMUL_1, 32)
+ENTRY (VNx1SF, TARGET_VECTOR_FP32, LMUL_1, 32, LMUL_F2, 64)
 
 /* SEW = 64. Enable when TARGET_MIN_VLEN > 32.
For double-precision floating-point, we need TARGET_VECTOR_FP64 ==
-- 
2.36.3



[PATCH] RISC-V: Change vlmul printing rule

2022-12-13 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This is a preparatory patch for the following VSETVL pass patch: the
current vlmul printing rule does not provide the information the
VSETVL pass needs. I have split the fix out into a separate patch.
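The new printing rule amounts to a direct mapping from the VLMUL enum
value to its assembler suffix. A stand-alone sketch of that mapping is
given below; the enum and the function vlmul_suffix are hypothetical
names for illustration (the enum values are assumed to follow the RVV
vlmul encoding), and unlike the patch, which calls gcc_unreachable ()
for unknown values, this sketch returns NULL.

```c
#include <stddef.h>

/* Hypothetical mirror of the vlmul encoding used in this sketch.  */
enum vlmul
{
  LMUL_1 = 0,
  LMUL_2 = 1,
  LMUL_4 = 2,
  LMUL_8 = 3,
  LMUL_RESERVED = 4,
  LMUL_F8 = 5,
  LMUL_F4 = 6,
  LMUL_F2 = 7
};

/* Map a VLMUL enum value to the suffix printed in the assembly,
   or NULL if the value has no valid suffix.  */
const char *
vlmul_suffix (unsigned int vlmul)
{
  switch (vlmul)
    {
    case LMUL_1:  return "m1";
    case LMUL_2:  return "m2";
    case LMUL_4:  return "m4";
    case LMUL_8:  return "m8";
    case LMUL_F8: return "mf8";
    case LMUL_F4: return "mf4";
    case LMUL_F2: return "mf2";
    default:      return NULL;
    }
}
```

Because the operand now carries the enum value itself, the printer no
longer has to recover LMUL from a machine mode and its size.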

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vlmax_vsetvl): Pass through VLMUL enum 
instead of machine mode.
* config/riscv/riscv-vector-builtins-bases.cc: Ditto.
* config/riscv/riscv.cc (riscv_print_operand): Print LMUL by enum vlmul 
instead of machine mode.

---
 gcc/config/riscv/riscv-v.cc   |  2 +-
 .../riscv/riscv-vector-builtins-bases.cc  |  2 +-
 gcc/config/riscv/riscv.cc | 52 ++-
 3 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 4992ff2470c..13ee33938bb 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -115,7 +115,7 @@ emit_vlmax_vsetvl (machine_mode vmode)
 
   emit_insn (
 gen_vsetvl_no_side_effects (Pmode, vl, RVV_VLMAX, gen_int_mode (sew, Pmode),
-   gen_int_mode ((unsigned int) vmode, Pmode),
+   gen_int_mode (get_vlmul (vmode), Pmode),
const1_rtx, const1_rtx));
   return vl;
 }
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 231b63a610d..ffeb1b25fbc 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -73,7 +73,7 @@ public:
 gen_int_mode (GET_MODE_BITSIZE (inner_mode), Pmode));
 
 /* LMUL.  */
-e.add_input_operand (Pmode, gen_int_mode ((unsigned int) mode, Pmode));
+e.add_input_operand (Pmode, gen_int_mode (get_vlmul (mode), Pmode));
 
 /* TA.  */
 e.add_input_operand (Pmode, gen_int_mode (1, Pmode));
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2d380aa42cb..ff07d4a3843 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4272,30 +4272,34 @@ riscv_print_operand (FILE *file, rtx op, int letter)
  }
else if (code == CONST_INT)
  {
-   /* The value in the operand is the unsigned int value
-  converted from (enum machine_mode).
-  This RTX is generated as follows:
-
-  machine_mode mode = XXXmode;
-  operand = gen_int_mode ((unsigned int)mode, Pmode);
-
-  So we convert it back into machine_mode and then calculate
-  the LMUL according to GET_MODE_SIZE.  */
-
-   machine_mode rvv_mode = (machine_mode) UINTVAL (op);
-   /* For rvv mask modes, we can not calculate LMUL simpily according
-  to BYTES_PER_RISCV_VECTOR. When rvv_mode = VNx4BImode.
-  Set SEW = 8, LMUL = 1 by default if TARGET_MIN_VLEN == 32.
-  Set SEW = 8, LMUL = 1 / 2 by default if TARGET_MIN_VLEN > 32.  */
-   bool bool_p = GET_MODE_CLASS (rvv_mode) == MODE_VECTOR_BOOL;
-   poly_int64 m1_size = BYTES_PER_RISCV_VECTOR;
-   poly_int64 rvv_size
- = bool_p ? GET_MODE_NUNITS (rvv_mode) : GET_MODE_SIZE (rvv_mode);
-   bool fractional_p = known_lt (rvv_size, BYTES_PER_RISCV_VECTOR);
-   unsigned int factor
- = fractional_p ? exact_div (m1_size, rvv_size).to_constant ()
-: exact_div (rvv_size, m1_size).to_constant ();
-   asm_fprintf (file, "%s%d", fractional_p ? "mf" : "m", factor);
+   /* If it is a const_int value, it denotes the VLMUL field enum.  */
+   unsigned int vlmul = UINTVAL (op);
+   switch (vlmul)
+ {
+ case riscv_vector::LMUL_1:
+   asm_fprintf (file, "%s", "m1");
+   break;
+ case riscv_vector::LMUL_2:
+   asm_fprintf (file, "%s", "m2");
+   break;
+ case riscv_vector::LMUL_4:
+   asm_fprintf (file, "%s", "m4");
+   break;
+ case riscv_vector::LMUL_8:
+   asm_fprintf (file, "%s", "m8");
+   break;
+ case riscv_vector::LMUL_F8:
+   asm_fprintf (file, "%s", "mf8");
+   break;
+ case riscv_vector::LMUL_F4:
+   asm_fprintf (file, "%s", "mf4");
+   break;
+ case riscv_vector::LMUL_F2:
+   asm_fprintf (file, "%s", "mf2");
+   break;
+ default:
+   gcc_unreachable ();
+ }
  }
else
  output_operand_lossage ("invalid vector constant");
-- 
2.36.3



[PATCH] RISC-V: Fix RVV mask mode size

2022-12-13 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This patch fixes the size of the RVV mask modes. Mask mode sizes are currently
adjusted to the size of a whole RVV register (LMUL = 1), which not only ties
each mask type (for example, vbool32_t) to vint8m1_t but also increases memory
consumption.

I noticed this issue while developing the VSETVL pass. Since it is not part of
the VSETVL support, I have separated it into its own fix patch.

gcc/ChangeLog:

* config/riscv/riscv-modes.def (ADJUST_BYTESIZE): Reduce RVV mask mode 
size.
* config/riscv/riscv.cc (riscv_v_adjust_bytesize): New function.
(riscv_modes_tieable_p): Don't tie mask modes which will create issue.
* config/riscv/riscv.h (riscv_v_adjust_bytesize): New function.

---
 gcc/config/riscv/riscv-modes.def | 14 
 gcc/config/riscv/riscv.cc| 61 
 gcc/config/riscv/riscv.h |  1 +
 3 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def
index 556b5c55253..339b41b32eb 100644
--- a/gcc/config/riscv/riscv-modes.def
+++ b/gcc/config/riscv/riscv-modes.def
@@ -64,13 +64,13 @@ ADJUST_ALIGNMENT (VNx16BI, 1);
 ADJUST_ALIGNMENT (VNx32BI, 1);
 ADJUST_ALIGNMENT (VNx64BI, 1);
 
-ADJUST_BYTESIZE (VNx1BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx2BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx4BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx8BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx32BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_nunits (VNx64BImode, 8));
+ADJUST_BYTESIZE (VNx1BI, riscv_v_adjust_bytesize (VNx1BImode, 1));
+ADJUST_BYTESIZE (VNx2BI, riscv_v_adjust_bytesize (VNx2BImode, 1));
+ADJUST_BYTESIZE (VNx4BI, riscv_v_adjust_bytesize (VNx4BImode, 1));
+ADJUST_BYTESIZE (VNx8BI, riscv_v_adjust_bytesize (VNx8BImode, 1));
+ADJUST_BYTESIZE (VNx16BI, riscv_v_adjust_bytesize (VNx16BImode, 2));
+ADJUST_BYTESIZE (VNx32BI, riscv_v_adjust_bytesize (VNx32BImode, 4));
+ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_bytesize (VNx64BImode, 8));
 
 /*
| Mode| MIN_VLEN=32 | MIN_VLEN=32 | MIN_VLEN=64 | MIN_VLEN=64 |
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 1198a08b13e..2d380aa42cb 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -979,6 +979,46 @@ riscv_v_adjust_nunits (machine_mode mode, int scale)
   return scale;
 }
 
+/* Call from ADJUST_BYTESIZE in riscv-modes.def. Return the correct
+   BYTES for corresponding MODE_VECTOR_BOOL machine_mode.  */
+
+poly_int64
+riscv_v_adjust_bytesize (machine_mode mode, int scale)
+{
+  /* According to RVV ISA, each BOOL element occupy 1-bit.
+ However, GCC assume each BOOL element occupy at least
+ 1-bytes. ??? TODO: Maybe we can adjust it and support
+ 1-bit BOOL in the future 
+
+ One solution is to adjust all MODE_VECTOR_BOOL with
+ the same size which is LMUL = 1. However, for VNx1BImode
+ which only occupy a small fractional bytes of a single
+ LMUL = 1 size that is wasting memory usage and increasing
+ memory access traffic.
+
+ Ideally, a RVV mask datatype like 'vbool64_t' for example
+ which is VNx1BI when TARGET_MIN_VLEN > 32 should be the
+ BYTESIZE of 1/8 of vint8mf8_t (VNx1QImode) according to RVV
+ ISA. However, GCC can not support 1-bit bool value, we can
+ only adjust the BYTESIZE to the smallest size which the
+ BYTESIZE of vint8mf8_t (VNx1QImode).
+
+ Base on this circumstance, we can model MODE_VECOR_BOOL
+ as small bytesize as possible so that we could reduce
+ memory traffic and memory consuming.  */
+
+  /* Only adjust BYTESIZE of RVV mask mode.  */
+  gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL);
+  if (riscv_v_ext_vector_mode_p (mode))
+{
+  if (known_lt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR))
+   return GET_MODE_SIZE (mode);
+  else
+   return BYTES_PER_RISCV_VECTOR;
+}
+  return scale;
+}
+
 /* Return true if X is a valid address for machine mode MODE.  If it is,
fill in INFO appropriately.  STRICT_P is true if REG_OK_STRICT is in
effect.  */
@@ -5735,6 +5775,27 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
 static bool
 riscv_modes_tieable_p (machine_mode mode1, machine_mode mode2)
 {
+  if (riscv_v_ext_vector_mode_p (mode1) && riscv_v_ext_vector_mode_p (mode2))
+{
+  /* Base on the riscv_v_adjust_bytesize, RVV mask mode is not
+accurately modeled. For example, we model VNx1BI as the
+BYTESIZE of VNx1QImode even though VNx1BI should be the
+1/8 of VNx1QImode BYTESIZE. We shouldn't allow them to be
+tieable each other since it produce incorrect codegen.
+
+For example:
+  if (cond == 0) {
+

PING [PATCH, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2022-12-13 Thread HAO CHEN GUI via Gcc-patches
Hi,
   Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601909.html

Thanks
Gui Haochen

On 2022/9/21 13:13, HAO CHEN GUI wrote:
> Hi,
>   This patch adds a new insn for vector splat with small V2DI constants on P8.
> If the value of the constant is in the range -16 to 15 and is not 0 or -1, it
> can be loaded with vspltisw and vupkhsw on P8. This should be more efficient
> than loading the vector from the TOC.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-09-21 Haochen Gui 
> 
> gcc/
>   PR target/104124
>   * config/rs6000/altivec.md (*altivec_vupkhs_direct): Renamed
>   to...
>   (altivec_vupkhs_direct): ...this.
>   * config/rs6000/constraints.md (wT constraint): New constant for a
>   vector constraint that can be loaded with vspltisw and vupkhsw.
>   * config/rs6000/predicates.md (vspltisw_constant_split): New
>   predicate for wT constraint.
>   * config/rs6000/rs6000-protos.h (vspltisw_constant_p): Add declaration.
>   * config/rs6000/rs6000.cc (easy_altivec_constant): Call
>   vspltisw_constant_p to judge if a V2DI constant can be synthesized with
>   a vspltisw and a vupkhsw.
>   * (vspltisw_constant_p): New function to return true if OP mode is
>   V2DI and can be synthesized with ISA 2.07 instruction vupkhsw and
>   vspltisw.
>   * gcc/config/rs6000/vsx.md (*vspltisw_v2di_split): New insn to load up
>   constants with vspltisw and vupkhsw.
> 
> gcc/testsuite/
>   PR target/104124
>   * gcc.target/powerpc/p8-splat.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 2c4940f2e21..185414df021 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -2542,7 +2542,7 @@ (define_insn "altivec_vupkhs"
>  }
>[(set_attr "type" "vecperm")])
> 
> -(define_insn "*altivec_vupkhs_direct"
> +(define_insn "altivec_vupkhs_direct"
>[(set (match_operand:VP 0 "register_operand" "=v")
>   (unspec:VP [(match_operand: 1 "register_operand" "v")]
>UNSPEC_VUNPACK_HI_SIGN_DIRECT))]
> diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
> index 5a44a92142e..f65dea6e0c7 100644
> --- a/gcc/config/rs6000/constraints.md
> +++ b/gcc/config/rs6000/constraints.md
> @@ -150,6 +150,10 @@ (define_constraint "wS"
>"@internal Vector constant that can be loaded with XXSPLTIB & sign 
> extension."
>(match_test "xxspltib_constant_split (op, mode)"))
> 
> +(define_constraint "wT"
> +  "@internal Vector constant that can be loaded with vspltisw & vupkhsw."
> +  (match_test "vspltisw_constant_split (op, mode)"))
> +
>  ;; ISA 3.0 DS-form instruction that has the bottom 2 bits 0 and no update 
> form.
>  ;; Used by LXSD/STXSD/LXSSP/STXSSP.  In contrast to "Y", the multiple-of-four
>  ;; offset is enforced for 32-bit too.
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index b1fcc69bb60..00cf60bbe58 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -694,6 +694,19 @@ (define_predicate "xxspltib_constant_split"
>return num_insns > 1;
>  })
> 
> +;; Return true if the operand is a constant that can be loaded with a vspltisw
> +;; instruction and then a vupkhsw instruction.
> +
> +(define_predicate "vspltisw_constant_split"
> +  (match_code "const_vector,vec_duplicate")
> +{
> +  int value = 32;
> +
> +  if (!vspltisw_constant_p (op, mode, &value))
> +return false;
> +
> +  return true;
> +})
> 
>  ;; Return 1 if the operand is constant that can loaded directly with a 
> XXSPLTIB
>  ;; instruction.
> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
> index b3c16e7448d..45f3d044eee 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int,
> 
>  extern int easy_altivec_constant (rtx, machine_mode);
>  extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
> +extern bool vspltisw_constant_p (rtx, machine_mode, int *);
>  extern int vspltis_shifted (rtx);
>  extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
>  extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index df491bee2ea..984624026c2 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -6292,6 +6292,12 @@ easy_altivec_constant (rtx op, machine_mode mode)
> && INTVAL (CONST_VECTOR_ELT (op, 1)) == -1)
>   return 8;
> 
> +  /* If V2DI constant is within RANGE (-16, 15), it can be synthesized 
> with
> +  a vspltisw and a vupkhsw.  */
> +  int value = 32;
> +  if (vspltisw_constant_p (op, mode, &value))
> + return 8;
> +
>

[PATCH] [x86] x86: Don't add crtfastmath.o for -shared and add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-13 Thread liuhongt via Gcc-patches
Don't add crtfastmath.o for -shared to avoid changing the MXCSR
register when loading a shared library.  crtfastmath.o will be used
only when building executables.
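
As a rough illustration of what "changing the MXCSR register" means here, the
sketch below computes the MXCSR value with the FTZ and DAZ bits set. The bit
positions (DAZ = bit 6, FTZ = bit 15) are the documented x86 ones; the helper
names are hypothetical, and this is not the actual crtfastmath.o code.

```c
#include <stdint.h>

/* MXCSR control bits as documented for x86 SSE.  */
#define MXCSR_DAZ (UINT32_C (1) << 6)   /* denormals-are-zero */
#define MXCSR_FTZ (UINT32_C (1) << 15)  /* flush-to-zero */

/* Hypothetical helper: return CSR with both DAZ and FTZ enabled,
   leaving every other bit unchanged.  */
static uint32_t
mxcsr_enable_daz_ftz (uint32_t csr)
{
  return csr | MXCSR_DAZ | MXCSR_FTZ;
}

/* Hypothetical helper: nonzero if both DAZ and FTZ are set in CSR.  */
static int
mxcsr_has_daz_ftz (uint32_t csr)
{
  uint32_t both = MXCSR_DAZ | MXCSR_FTZ;
  return (csr & both) == both;
}
```

On an x86 target with SSE, such a value would typically be read and written
with _mm_getcsr and _mm_setcsr from <xmmintrin.h>. Because MXCSR is per-thread
state, doing this from a shared library's startup code silently changes the
floating-point behavior of the program that loads it, which is the problem the
patch avoids.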

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR target/55522
PR target/36821
* config/i386/gnu-user-common.h (GNU_USER_TARGET_MATHFILE_SPEC):
Link crtfastmath.o when -mdaz-ftz is specified; do not link it
when -shared is specified.
* config/i386/i386.opt (mdaz-ftz): New option.
* doc/invoke.texi (x86 options): Document -mdaz-ftz.
---
 gcc/config/i386/gnu-user-common.h |  2 +-
 gcc/config/i386/i386.opt  |  4 
 gcc/doc/invoke.texi   | 10 +-
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/gnu-user-common.h b/gcc/config/i386/gnu-user-common.h
index cab9be2bfb7..02e4a2192a4 100644
--- a/gcc/config/i386/gnu-user-common.h
+++ b/gcc/config/i386/gnu-user-common.h
@@ -47,7 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Similar to standard GNU userspace, but adding -ffast-math support.  */
 #define GNU_USER_TARGET_MATHFILE_SPEC \
-  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
+  "%{Ofast|ffast-math|funsafe-math-optimizations|mdaz-ftz:%{!shared:crtfastmath.o%s}} \
%{mpc32:crtprec32.o%s} \
%{mpc64:crtprec64.o%s} \
%{mpc80:crtprec80.o%s}"
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index fb4e57ada7c..8fd222db857 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -420,6 +420,10 @@ mpc80
 Target RejectNegative
 Set 80387 floating-point precision to 80-bit.
 
+mdaz-ftz
+Target RejectNegative
+Set the FTZ and DAZ Flags.
+
 mpreferred-stack-boundary=
 Target RejectNegative Joined UInteger Var(ix86_preferred_stack_boundary_arg)
 Attempt to keep stack aligned to this power of 2.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cb40b38b73a..670e3767fbd 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1433,7 +1433,7 @@ See RS/6000 and PowerPC Options.
 -m96bit-long-double  -mlong-double-64  -mlong-double-80  -mlong-double-128 @gol
 -mregparm=@var{num}  -msseregparm @gol
 -mveclibabi=@var{type}  -mvect8-ret-in-mem @gol
--mpc32  -mpc64  -mpc80  -mstackrealign @gol
+-mpc32  -mpc64  -mpc80  -mdaz-ftz -mstackrealign @gol
 -momit-leaf-frame-pointer  -mno-red-zone  -mno-tls-direct-seg-refs @gol
 -mcmodel=@var{code-model}  -mabi=@var{name}  -maddress-mode=@var{mode} @gol
 -m32  -m64  -mx32  -m16  -miamcu  -mlarge-data-threshold=@var{num} @gol
@@ -32752,6 +32752,14 @@ are enabled by default; routines in such libraries could suffer significant
 loss of accuracy, typically through so-called ``catastrophic cancellation'',
 when this option is used to set the precision to less than extended precision.
 
+@item -mdaz-ftz
+@opindex mdaz-ftz
+
+the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the MXCSR register
+are used to control floating-point calculations.SSE and AVX instructions
+including scalar and vector instructions could benefit from enabling the FTZ
+and DAZ flags when @option{-mdaz-ftz} is specified.
+
 @item -mstackrealign
 @opindex mstackrealign
 Realign the stack at entry.  On the x86, the @option{-mstackrealign}
-- 
2.27.0



Re: [PATCH] Fix intrin name in Intel CMPccXADD

2022-12-13 Thread Hongtao Liu via Gcc-patches
On Wed, Dec 14, 2022 at 9:46 AM Haochen Jiang via Gcc-patches
 wrote:
>
> Hi all,
>
> We usually use only one "_", not two "__", as the prefix in intrinsic names.
>
> This patch aims to fix the intrin name for CMPccXADD.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu. Ok for trunk?
OK. I think there is no backward compatibility issue, since they were only
upstreamed about 2 months ago (within the same GCC 13 cycle).
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
> * config/i386/cmpccxaddintrin.h
> (__cmpccxadd_epi32): Rename to _cmpccxadd_epi32.
> (__cmpccxadd_epi64): Rename to _cmpccxadd_epi64.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/cmpccxadd-1.c: Fix intrin name.
> * gcc.target/i386/cmpccxadd-2.c: Ditto.
> ---
>  gcc/config/i386/cmpccxaddintrin.h   |  8 +--
>  gcc/testsuite/gcc.target/i386/cmpccxadd-1.c | 64 ++---
>  gcc/testsuite/gcc.target/i386/cmpccxadd-2.c | 64 ++---
>  3 files changed, 68 insertions(+), 68 deletions(-)
>
> diff --git a/gcc/config/i386/cmpccxaddintrin.h b/gcc/config/i386/cmpccxaddintrin.h
> index 1afa03bd08a..11fce1f5e50 100644
> --- a/gcc/config/i386/cmpccxaddintrin.h
> +++ b/gcc/config/i386/cmpccxaddintrin.h
> @@ -58,23 +58,23 @@ typedef enum {
>  #ifdef __OPTIMIZE__
>  extern __inline int
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> -__cmpccxadd_epi32 (int *__A, int __B, int __C, const _CMPCCX_ENUM __D)
> +_cmpccxadd_epi32 (int *__A, int __B, int __C, const _CMPCCX_ENUM __D)
>  {
>return __builtin_ia32_cmpccxadd (__A, __B, __C, __D);
>  }
>
>  extern __inline long long
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> -__cmpccxadd_epi64 (long long *__A, long long __B, long long __C,
> +_cmpccxadd_epi64 (long long *__A, long long __B, long long __C,
>const _CMPCCX_ENUM __D)
>  {
>return __builtin_ia32_cmpccxadd64 (__A, __B, __C, __D);
>  }
>  #else
> -#define __cmpccxadd_epi32(A,B,C,D) \
> +#define _cmpccxadd_epi32(A,B,C,D) \
>__builtin_ia32_cmpccxadd ((int *) (A), (int) (B), (int) (C), \
> (_CMPCCX_ENUM) (D))
> -#define __cmpccxadd_epi64(A,B,C,D) \
> +#define _cmpccxadd_epi64(A,B,C,D) \
>__builtin_ia32_cmpccxadd64 ((long long *) (A), (long long) (B), \
>   (long long) (C), (_CMPCCX_ENUM) (D))
>  #endif
> diff --git a/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c b/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c
> index c825717e29e..537b79b8d2d 100644
> --- a/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c
> +++ b/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c
> @@ -26,36 +26,36 @@ long long e, f;
>  void extern
>  cmpccxadd_test(void)
>  {
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_O);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_O);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NO);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NO);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_B);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_B);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NB);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NB);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_Z);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_Z);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NZ);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NZ);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_BE);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_BE);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NBE);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NBE);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_S);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_S);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NS);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NS);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_P);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_P);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NP);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NP);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_L);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_L);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NL);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NL);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_LE);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_LE);
> -  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NLE);
> -  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NLE);
> +  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_O);
> +  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_O);
> +  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_NO);
> +  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_NO);
> +  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_B);
> +  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_B);
> +  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_NB);
> +  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_NB);
> +  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_Z);
> +  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_Z);
> +  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_NZ);
> +  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_NZ);
> +  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_BE);
> +  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_BE);
> +  b = _cmpccxadd_e

[PATCH] Fix intrin name in Intel CMPccXADD

2022-12-13 Thread Haochen Jiang via Gcc-patches
Hi all,

We usually use only one "_", not two "__", as the prefix in intrinsic names.

This patch aims to fix the intrin name for CMPccXADD.

Bootstrapped and regtested on x86_64-pc-linux-gnu. Ok for trunk?

BRs,
Haochen

gcc/ChangeLog:

* config/i386/cmpccxaddintrin.h
(__cmpccxadd_epi32): Rename to _cmpccxadd_epi32.
(__cmpccxadd_epi64): Rename to _cmpccxadd_epi64.

gcc/testsuite/ChangeLog:

* gcc.target/i386/cmpccxadd-1.c: Fix intrin name.
* gcc.target/i386/cmpccxadd-2.c: Ditto.
---
 gcc/config/i386/cmpccxaddintrin.h   |  8 +--
 gcc/testsuite/gcc.target/i386/cmpccxadd-1.c | 64 ++---
 gcc/testsuite/gcc.target/i386/cmpccxadd-2.c | 64 ++---
 3 files changed, 68 insertions(+), 68 deletions(-)

diff --git a/gcc/config/i386/cmpccxaddintrin.h b/gcc/config/i386/cmpccxaddintrin.h
index 1afa03bd08a..11fce1f5e50 100644
--- a/gcc/config/i386/cmpccxaddintrin.h
+++ b/gcc/config/i386/cmpccxaddintrin.h
@@ -58,23 +58,23 @@ typedef enum {
 #ifdef __OPTIMIZE__
 extern __inline int
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-__cmpccxadd_epi32 (int *__A, int __B, int __C, const _CMPCCX_ENUM __D)
+_cmpccxadd_epi32 (int *__A, int __B, int __C, const _CMPCCX_ENUM __D)
 {
   return __builtin_ia32_cmpccxadd (__A, __B, __C, __D);
 }
 
 extern __inline long long
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-__cmpccxadd_epi64 (long long *__A, long long __B, long long __C,
+_cmpccxadd_epi64 (long long *__A, long long __B, long long __C,
   const _CMPCCX_ENUM __D)
 {
   return __builtin_ia32_cmpccxadd64 (__A, __B, __C, __D);
 }
 #else
-#define __cmpccxadd_epi32(A,B,C,D) \
+#define _cmpccxadd_epi32(A,B,C,D) \
   __builtin_ia32_cmpccxadd ((int *) (A), (int) (B), (int) (C), \
(_CMPCCX_ENUM) (D))
-#define __cmpccxadd_epi64(A,B,C,D) \
+#define _cmpccxadd_epi64(A,B,C,D) \
   __builtin_ia32_cmpccxadd64 ((long long *) (A), (long long) (B), \
  (long long) (C), (_CMPCCX_ENUM) (D))
 #endif
diff --git a/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c b/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c
index c825717e29e..537b79b8d2d 100644
--- a/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c
+++ b/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c
@@ -26,36 +26,36 @@ long long e, f;
 void extern
 cmpccxadd_test(void)
 {
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_O);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_O);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NO);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NO);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_B);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_B);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NB);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NB);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_Z);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_Z);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NZ);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NZ);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_BE);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_BE);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NBE);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NBE);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_S);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_S);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NS);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NS);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_P);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_P);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NP);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NP);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_L);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_L);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NL);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NL);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_LE);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_LE);
-  b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NLE);
-  e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NLE);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_O);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_O);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_NO);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_NO);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_B);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_B);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_NB);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_NB);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_Z);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_Z);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_NZ);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_NZ);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_BE);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_BE);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_NBE);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_NBE);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_S);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_S);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_NS);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_NS);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_P);
+  e = _cmpccxadd_epi64 (d, e, f, _CMPCCX_P);
+  b = _cmpccxadd_epi32 (a, b, c, _CMPCCX_NP);
+  e = 

[GCC-10][committed] libphobos: Fix std.path.expandTilde raising onOutOfMemory

2022-12-13 Thread Iain Buclaw via Gcc-patches
Hi,

This patch backports from mainline a fix for std.path.expandTilde
erroneously raising onOutOfMemory after a failed call to `getpwnam_r()'.
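
The decision logic of the fix can be sketched in C as follows. This only
mirrors the errno classification visible in the diff below; the enum and
function names are hypothetical and do not exist in libphobos.

```c
#include <errno.h>

/* Hypothetical outcome of inspecting errno after a failed lookup.  */
enum lookup_action
{
  LOOKUP_RETRY_LARGER_BUFFER, /* buffer too small: grow it and retry */
  LOOKUP_USER_NOT_FOUND,      /* give up quietly, leave the path as-is */
  LOOKUP_FATAL                /* anything else: treated as out of memory */
};

/* Classify the errno value left behind by getpwnam_r, following the
   switch statement introduced by the patch.  */
static enum lookup_action
classify_getpwnam_errno (int err)
{
  switch (err)
    {
    case ERANGE:
    case 0: /* On BSD and OSX, errno can be left at 0 instead of ERANGE.  */
      return LOOKUP_RETRY_LARGER_BUFFER;

    case ENOENT:
    case ESRCH:
    case EBADF:
    case EPERM:
      /* The given name or uid was not found.  */
      return LOOKUP_USER_NOT_FOUND;

    default:
      return LOOKUP_FATAL;
    }
}
```

Before the fix, every value other than ERANGE and 0 fell into the fatal case,
so a lookup for a nonexistent user could raise an out-of-memory error.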

Regression tested on x86_64-linux-gnu/-m32/-mx32, committed to
releases/gcc-10 branch.

Regards,
Iain.

---
libphobos/ChangeLog:

* src/std/path.d (expandTilde): Handle more errno codes that could be
left set by getpwnam_r.
---
 libphobos/src/std/path.d | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/libphobos/src/std/path.d b/libphobos/src/std/path.d
index 4a435efba6c..d250953ee1c 100644
--- a/libphobos/src/std/path.d
+++ b/libphobos/src/std/path.d
@@ -3850,7 +3850,7 @@ string expandTilde(string inputPath) nothrow
 version (Posix)
 {
 import core.exception : onOutOfMemoryError;
-import core.stdc.errno : errno, ERANGE;
+import core.stdc.errno : errno, EBADF, ENOENT, EPERM, ERANGE, ESRCH;
 import core.stdc.stdlib : malloc, free, realloc;
 
 /*  Joins a path from a C string to the remainder of path.
@@ -3950,7 +3950,7 @@ string expandTilde(string inputPath) nothrow
 scope(exit) free(extra_memory);
 
 passwd result;
-while (1)
+loop: while (1)
 {
 extra_memory = cast(char*) realloc(extra_memory, 
extra_memory_size * char.sizeof);
 if (extra_memory == null)
@@ -3969,10 +3969,23 @@ string expandTilde(string inputPath) nothrow
 break;
 }
 
-if (errno != ERANGE &&
+switch (errno)
+{
+case ERANGE:
 // On BSD and OSX, errno can be left at 0 instead of 
set to ERANGE
-errno != 0)
-onOutOfMemoryError();
+case 0:
+break;
+
+case ENOENT:
+case ESRCH:
+case EBADF:
+case EPERM:
+// The given name or uid was not found.
+break loop;
+
+default:
+onOutOfMemoryError();
+}
 
 // extra_memory isn't large enough
 import core.checkedint : mulu;
-- 
2.37.2



[GCC-11][committed] libphobos: Backport library and bindings fixes from mainline

2022-12-13 Thread Iain Buclaw via Gcc-patches
Hi,

This patch backports some fixes for the libphobos library from mainline
that fix build and testsuite failures.

Regression tested on x86_64-linux-gnu/-m32/-mx32, committed to
releases/gcc-11 branch.

D Runtime changes:

- Fix MIPS64 bindings for CRuntime_UClibc.

Phobos changes:

- Fix std.path.expandTilde erroneously raising onOutOfMemory
  after a failed call to getpwnam_r().
- Use GENERIC_IO on CRuntime_UClibc port of std.stdio.

libphobos/ChangeLog:

* libdruntime/core/stdc/fenv.d: Compile in MIPS uClibc bindings on
MIPS_Any targets.
* libdruntime/core/stdc/math.d: Likewise.
* libdruntime/core/sys/posix/dlfcn.d: Likewise.
* libdruntime/core/sys/posix/setjmp.d: Add MIPS64 definitions for
CRuntime_UClibc.
* libdruntime/core/sys/posix/sys/types.d: Likewise.
* src/std/path.d (expandTilde): Handle more errno codes that could be
left set by getpwnam_r.
* src/std/stdio.d: Set CRuntime_UClibc as GENERIC_IO target.
---
 libphobos/libdruntime/core/stdc/fenv.d|  2 +-
 libphobos/libdruntime/core/stdc/math.d|  2 +-
 libphobos/libdruntime/core/sys/posix/dlfcn.d  |  2 +-
 libphobos/libdruntime/core/sys/posix/setjmp.d | 16 +
 .../libdruntime/core/sys/posix/sys/types.d| 12 ++
 libphobos/src/std/path.d  | 23 +++
 libphobos/src/std/stdio.d |  3 +--
 7 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/libphobos/libdruntime/core/stdc/fenv.d b/libphobos/libdruntime/core/stdc/fenv.d
index 3002c022613..665f383167d 100644
--- a/libphobos/libdruntime/core/stdc/fenv.d
+++ b/libphobos/libdruntime/core/stdc/fenv.d
@@ -481,7 +481,7 @@ else version (CRuntime_UClibc)
 
 alias fexcept_t = ushort;
 }
-else version (MIPS32)
+else version (MIPS_Any)
 {
 struct fenv_t
 {
diff --git a/libphobos/libdruntime/core/stdc/math.d b/libphobos/libdruntime/core/stdc/math.d
index 2de6e579575..2a965444f2c 100644
--- a/libphobos/libdruntime/core/stdc/math.d
+++ b/libphobos/libdruntime/core/stdc/math.d
@@ -113,7 +113,7 @@ else version (CRuntime_UClibc)
 ///
 enum int FP_ILOGBNAN  = int.min;
 }
-else version (MIPS32)
+else version (MIPS_Any)
 {
 ///
 enum int FP_ILOGB0= -int.max;
diff --git a/libphobos/libdruntime/core/sys/posix/dlfcn.d b/libphobos/libdruntime/core/sys/posix/dlfcn.d
index f6476ec3106..ff24896cdb6 100644
--- a/libphobos/libdruntime/core/sys/posix/dlfcn.d
+++ b/libphobos/libdruntime/core/sys/posix/dlfcn.d
@@ -316,7 +316,7 @@ else version (CRuntime_UClibc)
 enum RTLD_LOCAL = 0;
 enum RTLD_NODELETE  = 0x01000;
 }
-else version (MIPS32)
+else version (MIPS_Any)
 {
 enum RTLD_LAZY  = 0x0001;
 enum RTLD_NOW   = 0x0002;
diff --git a/libphobos/libdruntime/core/sys/posix/setjmp.d b/libphobos/libdruntime/core/sys/posix/setjmp.d
index b98d321a883..547e52e8edc 100644
--- a/libphobos/libdruntime/core/sys/posix/setjmp.d
+++ b/libphobos/libdruntime/core/sys/posix/setjmp.d
@@ -366,6 +366,22 @@ else version (CRuntime_UClibc)
 double[6] __fpregs;
 }
 }
+else version (MIPS64)
+{
+struct __jmp_buf
+{
+long __pc;
+long __sp;
+long[8] __regs;
+long __fp;
+long __gp;
+int __fpc_csr;
+version (MIPS_N64)
+double[8] __fpregs;
+else
+double[6] __fpregs;
+}
+}
 else
 static assert(0, "unimplemented");
 
diff --git a/libphobos/libdruntime/core/sys/posix/sys/types.d b/libphobos/libdruntime/core/sys/posix/sys/types.d
index abcea99019f..529df1bae82 100644
--- a/libphobos/libdruntime/core/sys/posix/sys/types.d
+++ b/libphobos/libdruntime/core/sys/posix/sys/types.d
@@ -1277,6 +1277,18 @@ else version (CRuntime_UClibc)
 enum __SIZEOF_PTHREAD_BARRIER_T = 20;
 enum __SIZEOF_PTHREAD_BARRIERATTR_T = 4;
  }
+ else version (MIPS64)
+ {
+enum __SIZEOF_PTHREAD_ATTR_T= 56;
+enum __SIZEOF_PTHREAD_MUTEX_T   = 40;
+enum __SIZEOF_PTHREAD_MUTEXATTR_T   = 4;
+enum __SIZEOF_PTHREAD_COND_T= 48;
+enum __SIZEOF_PTHREAD_CONDATTR_T= 4;
+enum __SIZEOF_PTHREAD_RWLOCK_T  = 56;
+enum __SIZEOF_PTHREAD_RWLOCKATTR_T  = 8;
+enum __SIZEOF_PTHREAD_BARRIER_T = 32;
+enum __SIZEOF_PTHREAD_BARRIERATTR_T = 4;
+ }
  else version (ARM)
  {
 enum __SIZEOF_PTHREAD_ATTR_T = 36;
diff --git a/libphobos/src/std/path.d b/libphobos/src/std/path.d
index 4a435efba6c..d250953ee1c 100644
--- a/libphobos/src/std/path.d
+++ b/libphobos/src/std/path.d
@@ -3850,7 +3850,7 @@ string expandTilde(string inputPath) nothrow
 version (Posix)
 {
 impor

[GCC-12][committed] libphobos: Backport library and bindings fixes from mainline

2022-12-13 Thread Iain Buclaw via Gcc-patches
Hi,

This patch backports some fixes for the libphobos library from mainline
that fix build and testsuite failures.

Regression tested on x86_64-linux-gnu/-m32/-mx32, committed to
releases/gcc-12 branch.

D Runtime changes:

- Fix MIPS64 bindings for CRuntime_UClibc.

Phobos changes:

- Fix std.path.expandTilde erroneously raising onOutOfMemory
  after a failed call to getpwnam_r().
- Fix std.random unittest failures on ILP32 targets.
- Use GENERIC_IO on CRuntime_UClibc port of std.stdio.

libphobos/ChangeLog:

* libdruntime/core/stdc/fenv.d: Compile in MIPS uClibc bindings on
MIPS_Any targets.
* libdruntime/core/stdc/math.d: Likewise.
* libdruntime/core/sys/posix/dlfcn.d: Likewise.
* libdruntime/core/sys/posix/setjmp.d: Add MIPS64 definitions for
CRuntime_UClibc.
* libdruntime/core/sys/posix/sys/types.d: Likewise.
* src/std/path.d (expandTilde): Handle more errno codes that could be
left set by getpwnam_r.
* src/std/random.d: Use D_LP64 in unittests.
* src/std/stdio.d: Set CRuntime_UClibc as GENERIC_IO target.
---
 libphobos/libdruntime/core/stdc/fenv.d|  2 +-
 libphobos/libdruntime/core/stdc/math.d|  2 +-
 libphobos/libdruntime/core/sys/posix/dlfcn.d  |  2 +-
 libphobos/libdruntime/core/sys/posix/setjmp.d | 16 +
 .../libdruntime/core/sys/posix/sys/types.d| 12 ++
 libphobos/src/std/path.d  | 23 +++
 libphobos/src/std/random.d| 14 +--
 libphobos/src/std/stdio.d |  3 +--
 8 files changed, 57 insertions(+), 17 deletions(-)

diff --git a/libphobos/libdruntime/core/stdc/fenv.d b/libphobos/libdruntime/core/stdc/fenv.d
index 88123fb16a6..5242ba9d4e2 100644
--- a/libphobos/libdruntime/core/stdc/fenv.d
+++ b/libphobos/libdruntime/core/stdc/fenv.d
@@ -483,7 +483,7 @@ else version (CRuntime_UClibc)
 
 alias fexcept_t = ushort;
 }
-else version (MIPS32)
+else version (MIPS_Any)
 {
 struct fenv_t
 {
diff --git a/libphobos/libdruntime/core/stdc/math.d b/libphobos/libdruntime/core/stdc/math.d
index 0393ea52c07..51fd68f9fc3 100644
--- a/libphobos/libdruntime/core/stdc/math.d
+++ b/libphobos/libdruntime/core/stdc/math.d
@@ -120,7 +120,7 @@ else version (CRuntime_UClibc)
 ///
 enum int FP_ILOGBNAN  = int.min;
 }
-else version (MIPS32)
+else version (MIPS_Any)
 {
 ///
 enum int FP_ILOGB0= -int.max;
diff --git a/libphobos/libdruntime/core/sys/posix/dlfcn.d b/libphobos/libdruntime/core/sys/posix/dlfcn.d
index a9519ca234a..24fa3787ec4 100644
--- a/libphobos/libdruntime/core/sys/posix/dlfcn.d
+++ b/libphobos/libdruntime/core/sys/posix/dlfcn.d
@@ -387,7 +387,7 @@ else version (CRuntime_UClibc)
 enum RTLD_LOCAL = 0;
 enum RTLD_NODELETE  = 0x01000;
 }
-else version (MIPS32)
+else version (MIPS_Any)
 {
 enum RTLD_LAZY  = 0x0001;
 enum RTLD_NOW   = 0x0002;
diff --git a/libphobos/libdruntime/core/sys/posix/setjmp.d b/libphobos/libdruntime/core/sys/posix/setjmp.d
index 91e3a19d081..5a15d82d2ee 100644
--- a/libphobos/libdruntime/core/sys/posix/setjmp.d
+++ b/libphobos/libdruntime/core/sys/posix/setjmp.d
@@ -370,6 +370,22 @@ else version (CRuntime_UClibc)
 double[6] __fpregs;
 }
 }
+else version (MIPS64)
+{
+struct __jmp_buf
+{
+long __pc;
+long __sp;
+long[8] __regs;
+long __fp;
+long __gp;
+int __fpc_csr;
+version (MIPS_N64)
+double[8] __fpregs;
+else
+double[6] __fpregs;
+}
+}
 else
 static assert(0, "unimplemented");
 
diff --git a/libphobos/libdruntime/core/sys/posix/sys/types.d b/libphobos/libdruntime/core/sys/posix/sys/types.d
index ec229dd3b2b..3e515c4c68e 100644
--- a/libphobos/libdruntime/core/sys/posix/sys/types.d
+++ b/libphobos/libdruntime/core/sys/posix/sys/types.d
@@ -1140,6 +1140,18 @@ else version (CRuntime_UClibc)
 enum __SIZEOF_PTHREAD_BARRIER_T = 20;
 enum __SIZEOF_PTHREAD_BARRIERATTR_T = 4;
  }
+ else version (MIPS64)
+ {
+enum __SIZEOF_PTHREAD_ATTR_T= 56;
+enum __SIZEOF_PTHREAD_MUTEX_T   = 40;
+enum __SIZEOF_PTHREAD_MUTEXATTR_T   = 4;
+enum __SIZEOF_PTHREAD_COND_T= 48;
+enum __SIZEOF_PTHREAD_CONDATTR_T= 4;
+enum __SIZEOF_PTHREAD_RWLOCK_T  = 56;
+enum __SIZEOF_PTHREAD_RWLOCKATTR_T  = 8;
+enum __SIZEOF_PTHREAD_BARRIER_T = 32;
+enum __SIZEOF_PTHREAD_BARRIERATTR_T = 4;
+ }
  else version (ARM)
  {
 enum __SIZEOF_PTHREAD_ATTR_T = 36;
diff --git a/libphobos/src/std/path.d b/libphobos/src/std/path.d
index de180fcc548..777d8b924dd 

Re: [Patch, Fortran] libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

2022-12-13 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

On 13.12.22 at 22:41, Tobias Burnus wrote:

Note: We only talk about those two functions, the other functions are used
by both GCC <= 11 and GCC >= 12.


yes.


Fortunately, these functions matter most as they map GFC internals to CFI
internals or vice versa. Most other functions are user-callable; there,
incompatibilities are less likely to show up, and GCC 11 users could also
profit from the fixes. It looks as if CFI_section and CFI_select_part had
some larger changes, likewise CFI_setpointer.

Back to differences: 'diff -U0 -p -w' against the last GCC 11 branch shows:

...
@@ -35,0 +37,2 @@ export_proto(cfi_desc_to_gfc_desc);
+/* NOTE: Since GCC 12, the FE generates code to do the conversion
+   directly without calling this function.  */
@@ -63 +66 @@ cfi_desc_to_gfc_desc (gfc_array_void *d,
-  d->dtype.version = s->version;
+  d->dtype.version = 0;


I was wondering what the significance of "version" is.
In ISO_Fortran_binding.h we seem to always have

#define CFI_VERSION 1

and it did not change with gcc-12.

If it is just decoration (for now), then it is not important.
I just didn't see where it is used.


@@ -76,0 +80 @@ cfi_desc_to_gfc_desc (gfc_array_void *d,
+  if (GFC_DESCRIPTOR_DATA (d))
@@ -79,3 +83,7 @@ cfi_desc_to_gfc_desc (gfc_array_void *d,
-  GFC_DESCRIPTOR_LBOUND(d, n) = (index_type)s->dim[n].lower_bound;
-  GFC_DESCRIPTOR_UBOUND(d, n) = (index_type)(s->dim[n].extent
-   + s->dim[n].lower_bound - 1);

+   CFI_index_t lb = 1;
+
+   if (s->attribute != CFI_attribute_other)
+ lb = s->dim[n].lower_bound;
+
+   GFC_DESCRIPTOR_LBOUND(d, n) = (index_type)lb;
+   GFC_DESCRIPTOR_UBOUND(d, n) = (index_type)(s->dim[n].extent + lb - 1);


Oh, now I see that on 11-branch in trans-expr.c we set a hard-coded
attribute = 2 instead of using CFI_attribute_other, even if that was
available as a macro defining the very same value.  Thus it is ok.
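
The lower-bound hunk above can be read as the following small sketch. It is
only an illustration, not the libgfortran code: the struct and function names
(CfiDim, GfcDim, convert_dim) are hypothetical stand-ins for the real
descriptor types, kAttributeOther carries the value 2 mentioned above, and the
value 0 for the pointer attribute is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-ins for the CFI and GFC dimension records.
using index_t = std::ptrdiff_t;

constexpr int kAttributePointer = 0;  // assumed value of CFI_attribute_pointer
constexpr int kAttributeOther = 2;    // value of CFI_attribute_other per the mail

struct CfiDim { index_t lower_bound; index_t extent; };
struct GfcDim { index_t lbound; index_t ubound; };

// Mirrors the "+" lines of the hunk: the lower bound defaults to 1 and the
// CFI lower bound is only honoured when the attribute is not "other".
GfcDim convert_dim(const CfiDim &s, int attribute) {
  index_t lb = 1;
  if (attribute != kAttributeOther)
    lb = s.lower_bound;
  return GfcDim{lb, s.extent + lb - 1};
}

// Small usage check: a 5-element dimension whose CFI lower bound is 0.
void convert_dim_demo() {
  GfcDim other = convert_dim(CfiDim{0, 5}, kAttributeOther);
  assert(other.lbound == 1 && other.ubound == 5);
  GfcDim ptr = convert_dim(CfiDim{0, 5}, kAttributePointer);
  assert(ptr.lbound == 0 && ptr.ubound == 4);
}
```

In both cases the extent stays 5; only the bounds are shifted.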


@@ -89,0 +98,2 @@ export_proto(gfc_desc_to_cfi_desc);
+/* NOTE: Since GCC 12, the FE generates code to do the conversion
+   directly without calling this function.  */
@@ -100,2 +110,2 @@ gfc_desc_to_cfi_desc (CFI_cdesc_t **d_pt
-    d = malloc (sizeof (CFI_cdesc_t)
-   + (CFI_type_t)(CFI_MAX_RANK * sizeof (CFI_dim_t)));
+    d = calloc (1, (sizeof (CFI_cdesc_t)
+   + (CFI_type_t)(CFI_MAX_RANK * sizeof (CFI_dim_t))));
@@ -107 +117 @@ gfc_desc_to_cfi_desc (CFI_cdesc_t **d_pt
-  d->version = s->dtype.version;
+  d->version = CFI_VERSION;


This treatment of "version" was the equivalent to the above that
confused me.  Assuming we were to change CFI_VERSION in gcc-13+,
is this the right choice here regarding backward compatibility?


Probably yes. I don't have a better suggestion. The problem is that it
usually only matters in some corner cases, like in the PR, where an
argument is not passed directly to the GFC→CFI conversion; instead, a
Fortran function is first called with TYPE(*) and only then is the argument
passed on. – Such cases are usually not in the testsuite. (With GCC 12 we
have a rather complex testsuite, but obviously it also does not cover
everything.)


Well, I understand we have to draw a line here, whether we
reproduce every bug in <= gcc-11 where the user may have
attempted to implement a workaround.  That might be tough.


Well, in the real world there are larger installations with large
software stacks, and it is easier said to "compile each component
with the same compiler version" than done...


I concur – but there were really many fixes for the array descriptor /
TS29113 in GCC 12.

It is simply not possible to fix tons of bugs and be 100% compatible
with the working bits of the old version – especially if they only work
if one does not look sharply at the result. (Like here, where 'type' is
wrong, which does not matter except in programs which use it.)


True.  I was only thinking of the 90+ percent of valid and common
uses that we really consider important.

So besides the "version" question ok from my side.

Thanks,
Harald


Thanks,

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 
80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: 
Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; 
Registergericht München, HRB 106955







Re: [PATCH] libgccjit: Allow comparing vector types

2022-12-13 Thread David Malcolm via Gcc-patches
On Tue, 2022-12-13 at 16:27 -0500, Antoni Boucher wrote:
> Thanks!
> 
> David: you mentioned gcc 10. For now, I only intend to make changes to
> the next release (13). Is this OK or should I backport all my fixes to
> all active releases? (I'm not sure what GCC's policies are here.)

I think it varies by subproject within GCC.

Given that this could arguably be an RFE rather than a bugfix, and that
rustc_codegen_gcc is likely the primary user of this stuff, I leave the
decision of which branches to you.  If you only want it in trunk for
gcc 13 onwards, then that's fine by me.

Thanks again for the patch
Dave 

> 
> On Tue, 2022-12-13 at 16:24 -0500, David Malcolm wrote:
> > On Mon, 2022-12-12 at 21:31 -0500, Antoni Boucher via Jit wrote:
> > > Hi.
> > > This fixes bug 108078.
> > > Thanks for the review.
> > 
> > [...snip...]
> > 
> > > diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
> > > index 5d7c7177cc3..4ec0fff4843 100644
> > > --- a/gcc/jit/jit-recording.h
> > > +++ b/gcc/jit/jit-recording.h
> > > @@ -806,6 +806,15 @@ public:
> > >  
> > >    void replay_into (replayer *) final override;
> > >  
> > > +  virtual bool is_same_type_as (type *other)
> > 
> > This would be better with a "final override" (and without the
> > "virtual").
> > 
> > > +  {
> > > +    vector_type *other_vec_type = other->dyn_cast_vector_type ();
> > > +    if (other_vec_type == NULL)
> > > +  return false;
> > > +    return get_num_units () == other_vec_type->get_num_units ()
> > > +  && get_element_type () == other_vec_type->get_element_type ();
> > > +  }
> > > +
> > 
> > OK for active branches with that nit fixed (though for gcc 10 you'd
> > have to spell final and override as "FINAL" and "OVERRIDE" due to
> > needing to be buildable with a C++98 compiler; not sure if gcc 10's
> > libgccjit even has vector types though).
> > 
> > [...snip...]
> > 
> > Thanks for the patch
> > 
> > Dave
> > 
> 



Re: [Patch, Fortran] libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

2022-12-13 Thread Tobias Burnus

Hi Harald,

On 13.12.22 21:53, Harald Anlauf via Gcc-patches wrote:


I now did so - except for three fixes (cf. changelog). See also
PR: https://gcc.gnu.org/PR108056

There is no testcase as it needs to be compiled by GCC <= 11 and then
run with linking (dynamically) to a GCC 12 or 13 libgfortran.


I've looked at the resulting ISO_Fortran_binding.c vs. the 11-branch
version and am still trying to understand the resulting differences
in the code, in what respect they might be relevant or not.


Hmm, if I run a diff, I do not see many differences.

Note: We only talk about those two functions, the other functions are used
by both GCC <= 11 and GCC >= 12.

Fortunately, these functions matter most as they map GFC internals to CFI
internals or vice versa. Most other functions are user-callable; there,
incompatibilities are less likely to show up, and GCC 11 users could also
profit from the fixes. It looks as if CFI_section and CFI_select_part had
some larger changes, likewise CFI_setpointer.

Back to differences: 'diff -U0 -p -w' against the last GCC 11 branch shows:

...
@@ -35,0 +37,2 @@ export_proto(cfi_desc_to_gfc_desc);
+/* NOTE: Since GCC 12, the FE generates code to do the conversion
+   directly without calling this function.  */
@@ -63 +66 @@ cfi_desc_to_gfc_desc (gfc_array_void *d,
-  d->dtype.version = s->version;
+  d->dtype.version = 0;
@@ -76,0 +80 @@ cfi_desc_to_gfc_desc (gfc_array_void *d,
+  if (GFC_DESCRIPTOR_DATA (d))
@@ -79,3 +83,7 @@ cfi_desc_to_gfc_desc (gfc_array_void *d,
-  GFC_DESCRIPTOR_LBOUND(d, n) = (index_type)s->dim[n].lower_bound;
-  GFC_DESCRIPTOR_UBOUND(d, n) = (index_type)(s->dim[n].extent
-   + s->dim[n].lower_bound - 1);
+   CFI_index_t lb = 1;
+
+   if (s->attribute != CFI_attribute_other)
+ lb = s->dim[n].lower_bound;
+
+   GFC_DESCRIPTOR_LBOUND(d, n) = (index_type)lb;
+   GFC_DESCRIPTOR_UBOUND(d, n) = (index_type)(s->dim[n].extent + lb - 1);
@@ -89,0 +98,2 @@ export_proto(gfc_desc_to_cfi_desc);
+/* NOTE: Since GCC 12, the FE generates code to do the conversion
+   directly without calling this function.  */
@@ -100,2 +110,2 @@ gfc_desc_to_cfi_desc (CFI_cdesc_t **d_pt
-d = malloc (sizeof (CFI_cdesc_t)
-   + (CFI_type_t)(CFI_MAX_RANK * sizeof (CFI_dim_t)));
+d = calloc (1, (sizeof (CFI_cdesc_t)
+   + (CFI_type_t)(CFI_MAX_RANK * sizeof (CFI_dim_t))));
@@ -107 +117 @@ gfc_desc_to_cfi_desc (CFI_cdesc_t **d_pt
-  d->version = s->dtype.version;
+  d->version = CFI_VERSION;
@@ -153 +163 @@ void *CFI_address (const CFI_cdesc_t *dv
...


Given that this is a somewhat delicate situation we're in, is there
a set of tests that I could run *manually* (i.e. compile with gcc-11
and link with gcc-12/13) to verify that this best-effort fix should
be good enough for the common user?

Just a suggestion of a few "randomly" chosen tests?


Probably yes. I don't have a better suggestion. The problem is that it
usually only matters in some corner cases, like in the PR, where an
argument is not passed directly to the GFC→CFI conversion; instead, a
Fortran function is first called with TYPE(*) and only then is the argument
passed on. – Such cases are usually not in the testsuite. (With GCC 12 we
have a rather complex testsuite, but obviously it also does not cover
everything.)



Note: It is strongly recommended to use GCC 12 (or 13) with array-descriptor
C interop as many issues were fixed. [...]


Well, in the real world there are larger installations with large
software stacks, and it is easier said to "compile each component
with the same compiler version" than done...


I concur – but there were really many fixes for the array descriptor /
TS29113 in GCC 12.

It is simply not possible to fix tons of bugs and be 100% compatible
with the working bits of the old version – especially if they only work
if one does not look sharply at the result. (Like here, where 'type' is
wrong, which does not matter except in programs which use it.)

Thanks,

Tobias



Re: [PATCH] libgccjit: Allow comparing vector types

2022-12-13 Thread Antoni Boucher via Gcc-patches
Thanks!

David: you mentioned gcc 10. For now, I only intend to make changes to
the next release (13). Is this OK or should I backport all my fixes to
all active releases? (I'm not sure what GCC's policies are here.)

On Tue, 2022-12-13 at 16:24 -0500, David Malcolm wrote:
> On Mon, 2022-12-12 at 21:31 -0500, Antoni Boucher via Jit wrote:
> > Hi.
> > This fixes bug 108078.
> > Thanks for the review.
> 
> [...snip...]
> 
> > diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
> > index 5d7c7177cc3..4ec0fff4843 100644
> > --- a/gcc/jit/jit-recording.h
> > +++ b/gcc/jit/jit-recording.h
> > @@ -806,6 +806,15 @@ public:
> >  
> >    void replay_into (replayer *) final override;
> >  
> > +  virtual bool is_same_type_as (type *other)
> 
> This would be better with a "final override" (and without the
> "virtual").
> 
> > +  {
> > +    vector_type *other_vec_type = other->dyn_cast_vector_type ();
> > +    if (other_vec_type == NULL)
> > +  return false;
> > +    return get_num_units () == other_vec_type->get_num_units ()
> > +  && get_element_type () == other_vec_type->get_element_type ();
> > +  }
> > +
> 
> OK for active branches with that nit fixed (though for gcc 10 you'd
> have to spell final and override as "FINAL" and "OVERRIDE" due to
> needing to be buildable with a C++98 compiler; not sure if gcc 10's
> libgccjit even has vector types though).
> 
> [...snip...]
> 
> Thanks for the patch
> 
> Dave
> 



Re: [PATCH] libgccjit: Allow comparing vector types

2022-12-13 Thread David Malcolm via Gcc-patches
On Mon, 2022-12-12 at 21:31 -0500, Antoni Boucher via Jit wrote:
> Hi.
> This fixes bug 108078.
> Thanks for the review.

[...snip...]

> diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
> index 5d7c7177cc3..4ec0fff4843 100644
> --- a/gcc/jit/jit-recording.h
> +++ b/gcc/jit/jit-recording.h
> @@ -806,6 +806,15 @@ public:
>  
>void replay_into (replayer *) final override;
>  
> +  virtual bool is_same_type_as (type *other)

This would be better with a "final override" (and without the
"virtual").

> +  {
> +vector_type *other_vec_type = other->dyn_cast_vector_type ();
> +if (other_vec_type == NULL)
> +  return false;
> +return get_num_units () == other_vec_type->get_num_units ()
> +  && get_element_type () == other_vec_type->get_element_type ();
> +  }
> +

OK for active branches with that nit fixed (though for gcc 10 you'd
have to spell final and override as "FINAL" and "OVERRIDE" due to
needing to be buildable with a C++98 compiler; not sure if gcc 10's
libgccjit even has vector types though).
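
For readers not familiar with the nit, here is a self-contained sketch of the
"final override" spelling applied to a comparison like the one in the patch.
The classes below are heavily simplified, hypothetical stand-ins for the
jit-recording.h classes (the constructor and member layout are assumptions);
only the shape of is_same_type_as follows the quoted hunk.

```cpp
#include <cassert>

class vector_type;

// Hypothetical, reduced base class; the real recording::type has far more.
class type {
public:
  virtual ~type() = default;
  virtual vector_type *dyn_cast_vector_type() { return nullptr; }
  virtual bool is_same_type_as(type *other) { return this == other; }
};

class vector_type : public type {
public:
  vector_type(type *element_type, unsigned num_units)
      : m_element_type(element_type), m_num_units(num_units) {}

  vector_type *dyn_cast_vector_type() final override { return this; }

  // "final override" without "virtual": the compiler now verifies that this
  // really overrides a base-class virtual and rejects further overriding.
  bool is_same_type_as(type *other) final override {
    vector_type *other_vec_type = other->dyn_cast_vector_type();
    if (other_vec_type == nullptr)
      return false;
    return get_num_units() == other_vec_type->get_num_units()
           && get_element_type() == other_vec_type->get_element_type();
  }

  unsigned get_num_units() const { return m_num_units; }
  type *get_element_type() const { return m_element_type; }

private:
  type *m_element_type;
  unsigned m_num_units;
};

// Usage check: two distinct vector objects with equal shape compare equal.
void vector_type_demo() {
  type elem;
  vector_type a(&elem, 4), b(&elem, 4), c(&elem, 8);
  assert(a.is_same_type_as(&b));
  assert(!a.is_same_type_as(&c));
  assert(!a.is_same_type_as(&elem));
}
```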

[...snip...]

Thanks for the patch

Dave



Re: [PATCH V6] rs6000: Optimize cmp on rotated 16bits constant

2022-12-13 Thread Segher Boessenkool
Hi!

Sorry for the tardiness.

On Mon, Aug 29, 2022 at 11:42:16AM +0800, Jiufu Guo wrote:
> When checking eq/ne with a constant which has only 16bits, it can be
> optimized to check the rotated data.  By this, the constant building
> is optimized.
> 
> As the example in PR103743:
> For "in == 0x8000000000000000LL", this patch generates:
> rotldi %r3,%r3,16
> cmpldi %cr0,%r3,32768
> instead:
> li %r9,-1
> rldicr %r9,%r9,0,0
> cmpd %cr0,%r3,%r9

FWIW, I find the winnt assembler syntax very hard to read, and I doubt
I am the only one.

So you're doing
  rotldi 3,3,16 ; cmpldi 3,0x8000
instead of
  li 9,-1 ; rldicr 9,9,0,0 ; cmpd 3,9
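
As a sanity check of the equivalence, here is a minimal sketch, assuming the
compared constant is 0x8000000000000000 (which is what the li/rldicr pair
builds). The helper names are made up for illustration and are not part of
the patch.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 64-bit value left by n bits (n taken modulo 64).
constexpr std::uint64_t rotl64(std::uint64_t x, unsigned n) {
  n &= 63;
  return n == 0 ? x : (x << n) | (x >> (64 - n));
}

// Direct form: the full 64-bit constant has to be materialised first.
constexpr bool eq_direct(std::uint64_t in) {
  return in == 0x8000000000000000ULL;
}

// Rotated form: rotate the input left by 16, then compare against a 16-bit
// immediate. Rotation is a bijection, so both tests agree for every input.
constexpr bool eq_rotated(std::uint64_t in) {
  return rotl64(in, 16) == 0x8000ULL;
}

// A few spot checks of the equivalence.
void rotate_compare_demo() {
  const std::uint64_t samples[] = {0, 1, 0x8000ULL, 0x8000000000000000ULL,
                                   0xFFFFFFFFFFFFFFFFULL};
  for (std::uint64_t v : samples)
    assert(eq_direct(v) == eq_rotated(v));
}
```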

> +/* Check if C can be rotated from an immediate which starts (as 64bit integer)
> +   with at least CLZ bits zero.
> +
> +   Return the number by which C can be rotated from the immediate.
> +   Return -1 if C can not be rotated as from.  */
> +
> +int
> +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz)

The name does not say what the function does.  Can you think of a better
name?

Maybe it is better to not return magic values anyway?  So perhaps

bool
can_be_done_as_compare_of_rotate (unsigned HOST_WIDE_INT c, int clz, int *rot)

(with *rot written if the return value is true).
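
To make the interface idea concrete, a rough sketch follows. It is not the
algorithm from the patch (which works from leading-zero counts); it simply
tries all 64 rotations, and the name find_compare_rotation is hypothetical.
The point is only the shape: a bool result, with *rot written solely on
success.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 64-bit value left by n bits (n taken modulo 64).
constexpr std::uint64_t rotl64(std::uint64_t x, unsigned n) {
  n &= 63;
  return n == 0 ? x : (x << n) | (x >> (64 - n));
}

// Returns true if some left-rotation of C fits a 16-bit compare immediate,
// either unsigned (0..0xFFFF) or sign-extended negative (top 49 bits set).
// *ROT is written only when the result is true.
bool find_compare_rotation(std::uint64_t c, int *rot) {
  for (int r = 0; r < 64; ++r) {
    std::uint64_t v = rotl64(c, static_cast<unsigned>(r));
    bool fits_unsigned = v <= 0xFFFFULL;
    bool fits_negative = v >= 0xFFFFFFFFFFFF8000ULL;
    if (fits_unsigned || fits_negative) {
      *rot = r;
      return true;
    }
  }
  return false;
}

// Usage check: no magic return values, the caller tests the bool.
void find_rotation_demo() {
  int rot = -123;
  assert(find_compare_rotation(0x8000000000000000ULL, &rot));
  assert(rotl64(0x8000000000000000ULL, static_cast<unsigned>(rot)) <= 0xFFFFULL);
  int untouched = -123;
  assert(!find_compare_rotation(0x123456789ABCDEF0ULL, &untouched));
  assert(untouched == -123);
}
```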

> +  /* case c. xx10.0xx: rotate 'clz + 1' bits firstly, then check case b.

s/firstly/first/

> +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi.  */
> +
> +bool
> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c)

No _p please, this function is not a predicate (at least, the name does
not say what it tests).  So a better name please.  This matters even
more for extern functions (like this one) because the function
implementation is always farther away so you do not easily have all
interface details in mind.  Good names help :-)

> +(define_code_iterator eqne [eq ne])
> +(define_code_attr EQNE [(eq "EQ") (ne "NE")])

Just <code> or <CODE> should work?

Please fix these things.  Almost there :-)


Segher


[PATCH] libstdc++: Deliver names of C functions in

2022-12-13 Thread Björn Schäpers
From: Björn Schäpers 

One could add (), these are not part of __name. One could also try to
check upfront if __cxa_demangle should be called at all.

-- >8 --

Tested on i686-w64-mingw32.

__cxa_demangle only demangles C++ names. For all C functions, extern "C"
functions, and main, it returns -2; in that case, just use the given name.
Otherwise the name is kept empty, which doesn't look nice in the
stacktrace.

libstdc++-v3/ChangeLog:

* include/std/stacktrace (stacktrace_entry::_S_demangle): Use
raw __name if __cxa_demangle could not demangle it.

Signed-off-by: Björn Schäpers 
---
 libstdc++-v3/include/std/stacktrace | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libstdc++-v3/include/std/stacktrace b/libstdc++-v3/include/std/stacktrace
index 83c6463b0d8..5baf2dcdaca 100644
--- a/libstdc++-v3/include/std/stacktrace
+++ b/libstdc++-v3/include/std/stacktrace
@@ -219,6 +219,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   &__status);
   if (__status == 0)
__s = __str;
+  else
+   __s = __name;
   __builtin_free(__str);
   return __s;
 }
-- 
2.38.1
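
The fallback described in the commit message can be tried outside of
<stacktrace> with a small standalone sketch. This is not the library code:
the helper name demangle_or_raw is made up, and it assumes a toolchain that
provides <cxxabi.h> (GCC or Clang).

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <cxxabi.h>

// Return the demangled form of NAME, or NAME itself when __cxa_demangle
// reports a failure (e.g. status -2 for names that are not valid C++
// mangled names, such as C function names).
std::string demangle_or_raw(const char *name) {
  int status = 0;
  char *str = abi::__cxa_demangle(name, nullptr, nullptr, &status);
  std::string result = (status == 0 && str != nullptr) ? str : name;
  std::free(str);  // free(nullptr) is a no-op, so this is safe on failure
  return result;
}

// Usage check: a mangled C++ name is demangled, a C name is passed through.
void demangle_demo() {
  assert(demangle_or_raw("_Z3foov") == "foo()");
  assert(demangle_or_raw("main") == "main");
}
```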



Re: [Patch, Fortran] libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

2022-12-13 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

On 13.12.22 at 17:29, Tobias Burnus wrote:

This is a 12/13 regression as some changes to fix the GFC/CFI descriptor
that went into GCC 12 fail with the (bogus) descriptor passed by a
GCC-11-compiled program.

As later GCC 12 changes moved the descriptor conversion to the front end,
those functions are only in libgfortran.so to cater for old programs. Richard
suggested in the PR that the best way is to move to the GCC 11 version,
such that libgfortran.so won't regress.


that appears to be the most reasonable & practical way to go.


I now did so - except for three fixes (cf. changelog). See also
PR: https://gcc.gnu.org/PR108056

There is no testcase as it needs to be compiled by GCC <= 11 and then
run with linking (dynamically) to a GCC 12 or 13 libgfortran.


I've verified that the testcase in the PR does not crash with
the re-modified libgfortran.

I've looked at the resulting ISO_Fortran_binding.c vs. the 11-branch
version and am still trying to understand the resulting differences
in the code, in what respect they might be relevant or not.

Given that this is a somewhat delicate situation we're in, is there
a set of tests that I could run *manually* (i.e. compile with gcc-11
and link with gcc-12/13) to verify that this best-effort fix should
be good enough for the common user?

Just a suggestion of a few "randomly" chosen tests?


OK for mainline and GCC 12?

  * * *

Note: It is strongly recommended to use GCC 12 (or 13) with array-descriptor
C interop as many issues were fixed. Take the testcase in the PR: in GCC 11,
the type arriving in libgomp is BT_ASSUME ('type(*)'). But as the effective
argument is passed as an array descriptor throughout, the 'float' (real(4))
type info is actually preservable (as in GCC 12; cf. the testcase of
comment 0 and my comment in the PR for the C part of the testcase).(*)


Well, in the real world there are larger installations with large
software stacks, and it is easier said to "compile each component
with the same compiler version" than done...

(I've just had another personal experience during my daytime job.)

Thanks for your effort so far!

Harald


Tobias

((*) This is not possible if using a scalar 'type(*)' or a
non-array-descriptor array in between. I think GCC 12 uses 'CFI_other' in
the information-is-lost case.)




Re: [PATCH Rust front-end v4 46/46] gccrs: Add README, CONTRIBUTING and compiler logo

2022-12-13 Thread Joseph Myers
On Tue, 13 Dec 2022, Martin Liška wrote:

> If the Rust folks are willing to use Sphinx, then yes, I'm going to 
> prepare a common infrastructure (baseconf.py, common license files and a 
> common Makefile). So something similar to what I prepared for the Sphinx 
> conversion that didn't make it.

I suggest putting this in a directory such as gcc/doc/sphinx/ (rather than 
the top-level doc/ that was used in the Sphinx conversion).

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH v2 3/3] btf: correct generation for extern funcs [PR106773]

2022-12-13 Thread David Faust via Gcc-patches
[Changes from v1:
 - Add enum btf_func_linkage to include/btf.h and use it.
 - Minor updates to comments based on review. ]

The eBPF loader expects to find entries for functions declared as extern
in the corresponding BTF_KIND_DATASEC record, but we were not generating
these entries.

This patch adds support for the 'extern' linkage of function types in
BTF, and creates entries for them in BTF_KIND_DATASEC records as needed.
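
As a rough illustration of what such an entry carries, here is a standalone
sketch. The enum and struct mirror the names used in this series and in the
kernel's BTF UAPI header, but the helper make_extern_func_secinfo and its
parameter are hypothetical; in particular, it takes an already final BTF type
id rather than reproducing the compiler's internal id bookkeeping.

```cpp
#include <cassert>
#include <cstdint>

// Linkage values for BTF_KIND_FUNC records.
enum btf_func_linkage {
  BTF_FUNC_STATIC = 0,
  BTF_FUNC_GLOBAL = 1,
  BTF_FUNC_EXTERN = 2,
};

// One entry of a BTF_KIND_DATASEC record.
struct btf_var_secinfo {
  std::uint32_t type;    // BTF id of the VAR or FUNC the entry refers to
  std::uint32_t offset;  // offset within the section
  std::uint32_t size;    // size in bytes
};

// Hypothetical helper: build the DATASEC entry for an extern function.
// Offset and size are both unknown at compile time and left as zero.
btf_var_secinfo make_extern_func_secinfo(std::uint32_t func_btf_id) {
  btf_var_secinfo info{};
  info.type = func_btf_id;
  info.offset = 0;
  info.size = 0;
  return info;
}

// Usage check.
void extern_func_demo() {
  btf_var_secinfo e = make_extern_func_secinfo(7);
  assert(e.type == 7 && e.offset == 0 && e.size == 0);
  assert(BTF_FUNC_EXTERN == 2);
}
```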

PR target/106773

gcc/

* btfout.cc (get_section_name): New function.
(btf_collect_datasec): Use it here. Process functions, marking them
'extern' and generating DATASEC entries for them as appropriate. Move
creation of BTF_KIND_FUNC records to here...
(btf_dtd_emit_preprocess_cb): ... from here.

gcc/testsuite/

* gcc.dg/debug/btf/btf-datasec-2.c: New test.
* gcc.dg/debug/btf/btf-function-6.c: New test.

include/

* btf.h (enum btf_func_linkage): New.
(struct btf_var_secinfo): Update comments with notes about extern
functions.
---
 gcc/btfout.cc | 129 --
 .../gcc.dg/debug/btf/btf-datasec-2.c  |  28 
 .../gcc.dg/debug/btf/btf-function-6.c |  19 +++
 include/btf.h |  18 ++-
 4 files changed, 148 insertions(+), 46 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-function-6.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 204b11d4e9f..a423fabc0b5 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -290,7 +290,35 @@ btf_datasec_push_entry (ctf_container_ref ctfc, const char *secname,
   ds.entries.safe_push (info);
 
   datasecs.safe_push (ds);
-  num_types_created++;
+}
+
+
+/* Return the section name, as of interest to btf_collect_datasec, for the
+   given symtab node.  Note that this deliberately returns NULL for objects
+   which do not go in a section btf_collect_datasec cares about.  */
+static const char *
+get_section_name (symtab_node *node)
+{
+  const char *section_name = node->get_section ();
+
+  if (section_name == NULL)
+{
+  switch (categorize_decl_for_section (node->decl, 0))
+   {
+   case SECCAT_BSS:
+ section_name = ".bss";
+ break;
+   case SECCAT_DATA:
+ section_name = ".data";
+ break;
+   case SECCAT_RODATA:
+ section_name = ".rodata";
+ break;
+   default:;
+   }
+}
+
+  return section_name;
 }
 
 /* Construct all BTF_KIND_DATASEC records for CTFC. One such record is created
@@ -301,7 +329,60 @@ btf_datasec_push_entry (ctf_container_ref ctfc, const char *secname,
 static void
 btf_collect_datasec (ctf_container_ref ctfc)
 {
-  /* See cgraph.h struct symtab_node, which varpool_node extends.  */
+  cgraph_node *func;
+  FOR_EACH_FUNCTION (func)
+{
+  dw_die_ref die = lookup_decl_die (func->decl);
+  if (die == NULL)
+   continue;
+
+  ctf_dtdef_ref dtd = ctf_dtd_lookup (ctfc, die);
+  if (dtd == NULL)
+   continue;
+
+  /* Functions actually get two types: a BTF_KIND_FUNC_PROTO, and
+also a BTF_KIND_FUNC.  But the CTF container only allocates one
+type per function, which matches closely with BTF_KIND_FUNC_PROTO.
+For each such function, also allocate a BTF_KIND_FUNC entry.
+These will be output later.  */
+  ctf_dtdef_ref func_dtd = ggc_cleared_alloc<ctf_dtdef_t> ();
+  func_dtd->dtd_data = dtd->dtd_data;
+  func_dtd->dtd_data.ctti_type = dtd->dtd_type;
+  func_dtd->linkage = dtd->linkage;
+  func_dtd->dtd_type = num_types_added + num_types_created;
+
+  /* Only the BTF_KIND_FUNC type actually references the name. The
+BTF_KIND_FUNC_PROTO is always anonymous.  */
+  dtd->dtd_data.ctti_name = 0;
+
+  vec_safe_push (funcs, func_dtd);
+  num_types_created++;
+
+  /* Mark any 'extern' funcs and add DATASEC entries for them.  */
+  if (DECL_EXTERNAL (func->decl))
+   {
+ func_dtd->linkage = BTF_FUNC_EXTERN;
+
+ const char *section_name = get_section_name (func);
+ /* Note: get_section_name () returns NULL for functions in text
+section.  This is intentional, since we do not want to generate
+DATASEC entries for them.  */
+ if (section_name == NULL)
+   continue;
+
+ struct btf_var_secinfo info;
+
+ /* +1 for the sentinel type not in the types map.  */
+ info.type = func_dtd->dtd_type + 1;
+
+ /* Both zero at compile time.  */
+ info.size = 0;
+ info.offset = 0;
+
+ btf_datasec_push_entry (ctfc, section_name, info);
+   }
+}
+
   varpool_node *node;
   FOR_EACH_VARIABLE (node)
 {
@@ -313,28 +394,13 @@ btf_collect_datasec (ctf_container_ref ctfc)
   if (dvd == NULL)
continue;
 
-  const char *section_name = node->get_section ();
   /* Mark extern variables.  */

[PATCH v2 2/3] btf: fix 'extern const void' variables [PR106773]

2022-12-13 Thread David Faust via Gcc-patches
[Changes from v1: Minor updates to comments per review. ]

The eBPF loader expects to find BTF_KIND_VAR records for references to
extern const void symbols. We were mistakenly identifying these as
unsupported types, and as a result skipping emitting VAR records for
them.

In addition, the internal DWARF representation from which BTF is
produced does not generate 'const' modifier DIEs for the void type,
which meant in BTF the 'const' qualifier was dropped for 'extern const
void' variables. This patch also adds support for generating a const
void type in BTF to correct emission for these variables.

PR target/106773

gcc/

* btfout.cc (btf_collect_datasec): Correct size of void entries.
(btf_dvd_emit_preprocess_cb): Do not skip emitting variables which
refer to void types.
(btf_init_postprocess): Create 'const void' type record if needed and
adjust variables to refer to it as appropriate.

gcc/testsuite/

* gcc.dg/debug/btf/btf-pr106773.c: New test.
---
 gcc/btfout.cc | 44 +--
 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c | 25 +++
 2 files changed, 65 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 677e8324424..204b11d4e9f 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -350,6 +350,8 @@ btf_collect_datasec (ctf_container_ref ctfc)
   tree size = DECL_SIZE_UNIT (node->decl);
   if (tree_fits_uhwi_p (size))
info.size = tree_to_uhwi (size);
+  else if (VOID_TYPE_P (TREE_TYPE (node->decl)))
+   info.size = 1;
 
   /* Offset is left as 0 at compile time, to be filled in by loaders such
 as libbpf.  */
@@ -441,7 +443,7 @@ btf_dvd_emit_preprocess_cb (ctf_dvdef_ref *slot, ctf_container_ref arg_ctfc)
 return 1;
 
   /* Do not add variables which refer to unsupported types.  */
-  if (btf_removed_type_p (var->dvd_type))
+  if (!voids.contains (var->dvd_type) && btf_removed_type_p (var->dvd_type))
 return 1;
 
   arg_ctfc->ctfc_vars_list[num_vars_added] = var;
@@ -1075,15 +1077,49 @@ btf_init_postprocess (void)
 {
   ctf_container_ref tu_ctfc = ctf_get_tu_ctfc ();
 
-  size_t i;
-  size_t num_ctf_types = tu_ctfc->ctfc_types->elements ();
-
   holes.create (0);
   voids.create (0);
 
   num_types_added = 0;
   num_types_created = 0;
 
+  /* Workaround for 'const void' variables.  These variables are sometimes used
+ in eBPF programs to address kernel symbols.  DWARF does not generate const
+ qualifier on void type, so we would incorrectly emit these variables
+ without the const qualifier.
+ Unfortunately we need the TREE node to know it was const, and we need
+ to create the const modifier type (if needed) now, before making the types
+ list.  So we can't avoid iterating with FOR_EACH_VARIABLE here, and then
+ again when creating the DATASEC entries.  */
+  ctf_id_t constvoid_id = CTF_NULL_TYPEID;
+  varpool_node *var;
+  FOR_EACH_VARIABLE (var)
+{
+  if (!var->decl)
+   continue;
+
+  tree type = TREE_TYPE (var->decl);
+  if (type && VOID_TYPE_P (type) && TYPE_READONLY (type))
+   {
+ dw_die_ref die = lookup_decl_die (var->decl);
+ if (die == NULL)
+   continue;
+
+ ctf_dvdef_ref dvd = ctf_dvd_lookup (tu_ctfc, die);
+ if (dvd == NULL)
+   continue;
+
+ /* Create the 'const' modifier type for void.  */
+ if (constvoid_id == CTF_NULL_TYPEID)
+   constvoid_id = ctf_add_reftype (tu_ctfc, CTF_ADD_ROOT,
+   dvd->dvd_type, CTF_K_CONST, NULL);
+ dvd->dvd_type = constvoid_id;
+   }
+}
+
+  size_t i;
+  size_t num_ctf_types = tu_ctfc->ctfc_types->elements ();
+
   if (num_ctf_types)
 {
   init_btf_id_map (num_ctf_types + 1);
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c b/gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c
new file mode 100644
index 00000000000..f90fa773a4b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c
@@ -0,0 +1,25 @@
+/* Test BTF generation for extern const void symbols.
+   BTF_KIND_VAR records should be emitted for such symbols if they are used,
+   as well as a corresponding entry in the appropriate DATASEC record.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* Expect 1 variable record only for foo, with 'extern' (2) linkage.  */
+/* { dg-final { scan-assembler-times "\[\t \]0xe000000\[\t \]+\[^\n\]*btv_info" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x2\[\t \]+\[^\n\]*btv_linkage" 1 } } */
+
+/* { dg-final { scan-assembler-times "ascii \"foo.0\"\[\t \]+\[^\n\]*btf_string" 1 } } */
+
+/* { dg-final { scan-assembler-times "0\[\t \]+\[^\n\]*bts_offset" 1 } } */
+/* { dg-final { scan-assembler-times "1\[\t \]+\[^\n\]*bts_size" 1 } } */
+
+extern const void foo __attribute__((weak)) __attribute__((secti

[PATCH v2 1/3] btf: add 'extern' linkage for variables [PR106773]

2022-12-13 Thread David Faust via Gcc-patches
[Changes from v1:
 - Add enum btf_var_linkage in include/btf.h and use that instead of
   local #defines.
 - Fix BTF generation for extern variable with both non-defining and
   defining decls in the same CU. Add a test for this. ]

Add support for the 'extern' linkage value for BTF_KIND_VAR records,
which is used for variables declared as extern in the source file.

This also fixes a bug with BTF generation for extern variables which
have both a non-defining declaration and a defining declaration in the
same CU.
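
For orientation, the values the new tests scan for can be reproduced with a
small sketch. The linkage enumerators follow the names used in this series;
the info-word layout (kind in bits 24..28, kflag in bit 31, vlen in the low
16 bits) is the BTF encoding, while the helper names btf_type_info and
var_linkage_for are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t BTF_KIND_VAR = 14;

// Linkage values stored in the btv_linkage field of a BTF_KIND_VAR record.
enum btf_var_linkage {
  BTF_VAR_STATIC = 0,
  BTF_VAR_GLOBAL_ALLOCATED = 1,
  BTF_VAR_GLOBAL_EXTERN = 2,
};

// Pack the BTF type "info" word.
constexpr std::uint32_t btf_type_info(std::uint32_t kind, bool kflag,
                                      std::uint32_t vlen) {
  return (static_cast<std::uint32_t>(kflag) << 31) | (kind << 24)
         | (vlen & 0xffffu);
}

// Hypothetical classification helper: extern declarations get linkage 2,
// file-local statics 0, and everything else 1.
constexpr btf_var_linkage var_linkage_for(bool is_extern, bool is_static) {
  return is_extern ? BTF_VAR_GLOBAL_EXTERN
                   : (is_static ? BTF_VAR_STATIC : BTF_VAR_GLOBAL_ALLOCATED);
}

// Usage check against the expectations in btf-variables-4.c:
// "extern int a", "extern const int b", "int c", "static const int d".
void var_linkage_demo() {
  assert(btf_type_info(BTF_KIND_VAR, false, 0) == 0x0e000000u);
  assert(var_linkage_for(true, false) == BTF_VAR_GLOBAL_EXTERN);
  assert(var_linkage_for(false, false) == BTF_VAR_GLOBAL_ALLOCATED);
  assert(var_linkage_for(false, true) == BTF_VAR_STATIC);
}
```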

PR target/106773

gcc/

* btfout.cc (btf_collect_datasec): Mark extern variables as such.
(btf_dvd_emit_preprocess_cb): Skip non-defining extern variable decl
if there is a defining decl for the same variable.
(btf_asm_varent): Accommodate 'extern' linkage.

gcc/testsuite/

* gcc.dg/debug/btf/btf-variables-4.c: New test.
* gcc.dg/debug/btf/btf-variables-5.c: New test.

include/

* btf.h (enum btf_var_linkage): New.
(struct btf_var): Update comment to note 'extern' linkage.
---
 gcc/btfout.cc | 11 -
 .../gcc.dg/debug/btf/btf-variables-4.c| 24 +++
 .../gcc.dg/debug/btf/btf-variables-5.c| 19 +++
 include/btf.h | 11 -
 4 files changed, 63 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-variables-5.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index aef9fd70a28..677e8324424 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -314,6 +314,9 @@ btf_collect_datasec (ctf_container_ref ctfc)
continue;
 
   const char *section_name = node->get_section ();
+  /* Mark extern variables.  */
+  if (DECL_EXTERNAL (node->decl))
+   dvd->dvd_visibility = BTF_VAR_GLOBAL_EXTERN;
 
   if (section_name == NULL)
{
@@ -431,6 +434,12 @@ btf_dvd_emit_preprocess_cb (ctf_dvdef_ref *slot, ctf_container_ref arg_ctfc)
 {
   ctf_dvdef_ref var = (ctf_dvdef_ref) * slot;
 
+  /* If this is an extern variable declaration with a defining declaration
+ later, skip it so that only the defining declaration is emitted.
+ This is the same case, fix and reasoning as in CTF; see PR105089.  */
+  if (ctf_dvd_ignore_lookup (arg_ctfc, var->dvd_key))
+return 1;
+
   /* Do not add variables which refer to unsupported types.  */
   if (btf_removed_type_p (var->dvd_type))
 return 1;
@@ -676,7 +685,7 @@ btf_asm_varent (ctf_dvdef_ref var)
   dw2_asm_output_data (4, var->dvd_name_offset, "btv_name");
   dw2_asm_output_data (4, BTF_TYPE_INFO (BTF_KIND_VAR, 0, 0), "btv_info");
   dw2_asm_output_data (4, get_btf_id (var->dvd_type), "btv_type");
-  dw2_asm_output_data (4, (var->dvd_visibility ? 1 : 0), "btv_linkage");
+  dw2_asm_output_data (4, var->dvd_visibility, "btv_linkage");
 }
 
 /* Asm'out a member description following a BTF_KIND_STRUCT or
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c
new file mode 100644
index 000..d77600bae1c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c
@@ -0,0 +1,24 @@
+/* Test BTF generation for extern variables.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* Expect 4 variables.  */
+/* { dg-final { scan-assembler-times "\[\t \]0xe00\[\t 
\]+\[^\n\]*btv_info" 4 } } */
+
+/* 2 extern, 1 global, 1 static.  */
+/* { dg-final { scan-assembler-times "\[\t \]0\[\t \]+\[^\n\]*btv_linkage" 1 } 
} */
+/* { dg-final { scan-assembler-times "\[\t \]0x1\[\t \]+\[^\n\]*btv_linkage" 1 
} } */
+/* { dg-final { scan-assembler-times "\[\t \]0x2\[\t \]+\[^\n\]*btv_linkage" 2 
} } */
+
+extern int a;
+extern const int b;
+int c;
+static const int d = 5;
+
+int foo (int x)
+{
+  c = a + b + x;
+
+  return c + d;
+}
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-variables-5.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-variables-5.c
new file mode 100644
index 000..8aae76cacab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-variables-5.c
@@ -0,0 +1,19 @@
+/* Test BTF generation for extern variable with both non-defining and
+   defining declarations.
+
+   In this case, only a single variable record should be emitted,
+   with 'global' linkage. However two array types will be generated.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* Expect 1 variable with global (1) linkage.  */
+/* { dg-final { scan-assembler-times "\[\t \]0xe00\[\t 
\]+\[^\n\]*btv_info" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x1\[\t \]+\[^\n\]*btv_linkage" 1 
} } */
+
+/* Expect 2 array types, one of which is unsized.  */
+/* { dg-final { scan-assembler-times "\[\t \]0x4\[\t \]+\[^\n\]*bta_nelems" 1 
} } */
+/* { dg-final { scan-assembler-times "\[\t \]0\[\t \]+\[^\n\]*bta_nelems" 1 } 
} */
+
+extern const char FOO[];
+const char FOO[] = "foo";
di

[PATCH v2 0/3] btf: fix BTF for extern items [PR106773]

2022-12-13 Thread David Faust via Gcc-patches
[Changes from v1:
 - Remove #defines for LINKAGE_* values, instead mirror enums from
   linux/btf.h to include/btf.h and use those.
 - Fix BTF generation for extern variable with both non-defining and
   defining decls in the same CU. Add a test for this.
 - Update several comments per review feedback. ]

Hi,

This series fixes the issues reported in target/PR106773. I decided to
split it into three commits, as there are ultimately three distinct
issues and fixes. See each patch for details.
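
To illustrate the first of the three issues, here is a minimal sketch of the linkage encoding. The enum values are the ones from linux/btf.h that this series mirrors into include/btf.h; the two helper functions are hypothetical names made up for this sketch (they are not GCC functions) and only model how the emitted btv_linkage value changes.

```c
#include <assert.h>
#include <stdint.h>

/* Linkage values as defined in linux/btf.h.  */
enum btf_var_linkage
{
  BTF_VAR_STATIC = 0,
  BTF_VAR_GLOBAL_ALLOCATED = 1,
  BTF_VAR_GLOBAL_EXTERN = 2
};

/* Hypothetical model of the old emission: every non-zero visibility
   collapsed to 1, so 'extern' could not be told apart from 'global'.  */
static uint32_t
old_btv_linkage (uint32_t dvd_visibility)
{
  return dvd_visibility ? 1 : 0;
}

/* Hypothetical model of the new emission: the value is passed through
   unchanged, so 'extern' (2) survives.  */
static uint32_t
new_btv_linkage (uint32_t dvd_visibility)
{
  return dvd_visibility;
}
```

For example, assert (new_btv_linkage (BTF_VAR_GLOBAL_EXTERN) == 2) holds, whereas the old model yields 1 for the same input.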

Tested on bpf-unknown-none and x86_64-linux-gnu, no known regressions.

OK to push?
Thanks.

David Faust (3):
  btf: add 'extern' linkage for variables [PR106773]
  btf: fix 'extern const void' variables [PR106773]
  btf: correct generation for extern funcs [PR106773]

 gcc/btfout.cc | 184 +-
 .../gcc.dg/debug/btf/btf-datasec-2.c  |  28 +++
 .../gcc.dg/debug/btf/btf-function-6.c |  19 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c |  25 +++
 .../gcc.dg/debug/btf/btf-variables-4.c|  24 +++
 .../gcc.dg/debug/btf/btf-variables-5.c|  19 ++
 include/btf.h |  29 ++-
 7 files changed, 276 insertions(+), 52 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-function-6.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-variables-5.c

-- 
2.38.1



Re: [PATCH 4/9] ipa-sra: Treat REFERENCE_TYPES as always dereferencable

2022-12-13 Thread Martin Jambor
Hi,

On Mon, Dec 12 2022, Jan Hubicka wrote:
>>
[...]
>> gcc/ChangeLog:
>> 
>> 2022-11-11  Martin Jambor  
>> 
>>  PR ipa/103585
>>  * ipa-sra.c (struct gensum_param_access): New field load_count.
>>  (struct gensum_param_desc): New field safe_ref, adjusted comments.
>>  (by_ref_count): Renamed to unsafe_by_ref_count, adjusted all uses.
>>  (dump_gensum_access): Dump the new field.
>>  (dump_gensum_param_descriptor): Likewise.
>>  (create_parameter_descriptors): Set safe_ref field, move setting
>>  by_ref forward.  Only increment unsafe_by_ref_count for unsafe
>>  by_ref parameters.
>>  (allocate_access): Initialize new field.
>>  (mark_param_dereference): Adjust indentation.  Only add data to
>>  bb_dereferences for unsafe by_ref parameters.
>>  (scan_expr_access): For loads, accumulate BB counts.
>>  (dereference_probable_p): New function.
>>  (check_gensum_access): Fix leading comment, add parameter FUN.
>>  Check cumulative counts of loads for safe by_ref accesses instead
>>  of dereferences.
>>  (process_scan_results): Do not propagate dereference distances for
>>  safe by_ref parameters.  Pass fun to check_gensum_access.  Safe
>>  by_ref params do not need the postdominance check.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 2022-11-11  Martin Jambor  
>> 
>> * g++.dg/ipa/ipa-sra-5.C: New test
>> -/* Perform basic checks on ACCESS to PARM described by DESC and all its
>> -   children, return true if the parameter cannot be split, otherwise return
>> -   true and update *TOTAL_SIZE and *ONLY_CALLS.  ENTRY_BB_INDEX must be the
>> -   index of the entry BB in the function of PARM.  */
>> +/* Return true if the ACCESS loads happen frequently enough in FUN to risk
>> +   moving them to the caller and only pass the result.  */
>>  
>>  static bool
>> -check_gensum_access (tree parm, gensum_param_desc *desc,
>> +dereference_probable_p (struct function *fun, gensum_param_access *access)
>> +{
>> +  return access->load_count
>> +>= ENTRY_BLOCK_PTR_FOR_FN (fun)->count.apply_scale (1, 2);
>
> We may want to have --param for this.
>

OK, I have added a percentage-style parameter.  I plan to commit the
following after the last round of testing (I have already checked the
documentation bits with make info and make pdf).

Thanks!

Martin


C++ and especially Fortran pass data by reference.  Unlike pointers,
references cannot point just anywhere and so can be assumed to be
safely dereferencable.  This patch teaches IPA-SRA to treat them as
such and avoid the dance we do to prove that we can move loads from
them to the caller.

When we do not know that a dereference will happen all the time, we
need a heuristic so that we do not force memory accesses that normally
happen only rarely.  The patch simply uses the (possibly guessed)
profile and checks whether the (expected) number of loads is at least
half of the function invocations - the half is now configurable with
a param, as requested by Honza.
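
The threshold check can be sketched roughly as below. This is only an illustration with plain integers standing in for GCC's profile counts; the function name and signature are made up for this sketch and do not match the real dereference_probable_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if the expected number of loads reaches THRESHOLD_PERCENT
   percent of the function's entry count.  Assumption: the counts are
   small enough that multiplying by 100 does not overflow; real profile
   counts need more careful scaling.  */
static bool
dereference_probable (uint64_t load_count, uint64_t entry_count,
                      unsigned threshold_percent)
{
  assert (threshold_percent <= 100);
  /* Cross-multiply instead of dividing so that small counts do not
     lose precision to integer division.  */
  return load_count * 100 >= entry_count * threshold_percent;
}
```

With a threshold of 50 this reproduces the original "at least half" rule: 50 loads per 100 invocations qualify, 49 do not.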

gcc/ChangeLog:

2022-12-13  Martin Jambor  

PR ipa/103585
* params.opt (ipa-sra-deref-prob-threshold): New parameter.
* doc/invoke.texi (ipa-sra-deref-prob-threshold): Document it.
* ipa-sra.cc (struct gensum_param_access): New field load_count.
(struct gensum_param_desc): New field safe_ref, adjusted comments.
(by_ref_count): Renamed to unsafe_by_ref_count, adjusted all uses.
(dump_gensum_access): Dump the new field.
(dump_gensum_param_descriptor): Likewise.
(create_parameter_descriptors): Set safe_ref field, move setting
by_ref forward.  Only increment unsafe_by_ref_count for unsafe
by_ref parameters.
(allocate_access): Initialize new field.
(mark_param_dereference): Adjust indentation.  Only add data to
bb_dereferences for unsafe by_ref parameters.
(scan_expr_access): For loads, accumulate BB counts.
(dereference_probable_p): New function.
(check_gensum_access): Fix leading comment, add parameter FUN.
Check cumulative counts of loads for safe by_ref accesses instead
of dereferences.
(process_scan_results): Do not propagate dereference distances for
safe by_ref parameters.  Pass fun to check_gensum_access.  Safe
by_ref params do not need the postdominance check.

gcc/testsuite/ChangeLog:

2022-11-11  Martin Jambor  

* g++.dg/ipa/ipa-sra-5.C: New test
---
 gcc/doc/invoke.texi  |   5 ++
 gcc/ipa-sra.cc   | 101 +++
 gcc/params.opt   |   4 ++
 gcc/testsuite/g++.dg/ipa/ipa-sra-5.C |  23 ++
 4 files changed, 105 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/ipa-sra-5.C

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cb40b38b73a..8eae914f10e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15508,6 +15508,11 @@ the par

Re: [PATCH 3/9] ipa-cp: Leave removal of unused parameters to IPA-SRA

2022-12-13 Thread Martin Jambor
Hi,

On Mon, Dec 12 2022, Jan Hubicka wrote:
>>
[...]
>> diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
>> index cc031ebed0f..172ea353c49 100644
>> --- a/gcc/ipa-cp.cc
>> +++ b/gcc/ipa-cp.cc
>> @@ -3722,7 +3722,10 @@ estimate_local_effects (struct cgraph_node *node)
>>  &removable_params_cost);
>>int devirt_bonus = devirtualization_time_bonus (node, &avals);
>>if (always_const || devirt_bonus
>> -  || (removable_params_cost && node->can_change_signature))
>> +  || (removable_params_cost
>> +  && node->can_change_signature
>> +  /* If IPA-SRA can do it, it can do it better.  */
>> +  && !node->can_be_local_p ()))
> Perhaps we could dump that into dump file (i.e. not cloning because
> ipa-sra will do it later).  That could save me from debugging session
> in future :)
> OK with that change.
> Honza

OK, I plan to commit the following after a last round of testing.
Thanks!

Martin


Looking at some benchmarks I have noticed many cases when IPA-CP
cloned a function for all contexts just because it knew that some
parameters were not used at all.  Then IPA-SRA looked at the function
and cloned it again to split another parameter or two.  The latter
pass is better equipped to detect when parameters can be altogether
removed and so the IPA-CP cloning was for no good reason.

This patch simply alters IPA-CP not to do that in the situations
where IPA-SRA can (for nodes which can be made local), with the
additional dumping requested by Honza.

gcc/ChangeLog:

2022-12-13  Martin Jambor  

* ipa-cp.cc (clone_for_param_removal_p): New function.
(estimate_local_effects): Call it before considering cloning
just to remove unused parameters.
---
 gcc/ipa-cp.cc | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index cc031ebed0f..300bec54bbd 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -3700,6 +3700,29 @@ get_max_overall_size (cgraph_node *node)
   return max_new_size;
 }
 
+/* Return true if NODE should be cloned just for a parameter removal, possibly
+   dumping a reason if not.  */
+
+static bool
+clone_for_param_removal_p (cgraph_node *node)
+{
+  if (!node->can_change_signature)
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "  Not considering cloning to remove parameters, "
+"function cannot change signature.\n");
+  return false;
+}
+  if (node->can_be_local_p ())
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "  Not considering cloning to remove parameters, "
+"IPA-SRA can do it potentially better.\n");
+  return false;
+}
+  return true;
+}
+
 /* Iterate over known values of parameters of NODE and estimate the local
effects in terms of time and size they have.  */
 
@@ -3722,7 +3745,7 @@ estimate_local_effects (struct cgraph_node *node)
&removable_params_cost);
   int devirt_bonus = devirtualization_time_bonus (node, &avals);
   if (always_const || devirt_bonus
-  || (removable_params_cost && node->can_change_signature))
+  || (removable_params_cost && clone_for_param_removal_p (node)))
 {
   struct caller_statistics stats;
   ipa_call_estimates estimates;
-- 
2.38.1




[Patch] OpenMP: Parse align clause in allocate directive in C/C++

2022-12-13 Thread Tobias Burnus

We have working parsing support for the 'allocate' directive
(failing immediately with a 'sorry' after parsing).

To be in line with the rest of the allocat(e,or) etc. handling,
it makes sense to take care of 'align' as well, which is what this
patch does - it still fails with a 'sorry' after parsing.

OK for mainline?

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP: Parse align clause in allocate directive in C/C++

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_allocate): Parse align
	clause and check for restrictions.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_allocate): Parse align
	clause.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/allocate-5.c: Extend for align clause.

 gcc/c/c-parser.cc| 88 
 gcc/cp/parser.cc | 58 +-
 gcc/testsuite/c-c++-common/gomp/allocate-5.c | 36 
 3 files changed, 144 insertions(+), 38 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 1bbb39f9b08..62c302748dd 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -18819,32 +18819,71 @@ c_parser_oacc_wait (location_t loc, c_parser *parser, char *p_name)
   return stmt;
 }
 
-/* OpenMP 5.0:
-   # pragma omp allocate (list)  [allocator(allocator)]  */
+/* OpenMP 5.x:
+   # pragma omp allocate (list)  clauses
+
+   OpenMP 5.0 clause:
+   allocator (omp_allocator_handle_t expression)
+
+   OpenMP 5.1 additional clause:
+   align (int expression)] */
 
 static void
 c_parser_omp_allocate (location_t loc, c_parser *parser)
 {
+  tree alignment = NULL_TREE;
   tree allocator = NULL_TREE;
   tree nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_ALLOCATE, NULL_TREE);
-  if (c_parser_next_token_is (parser, CPP_COMMA)
-  && c_parser_peek_2nd_token (parser)->type == CPP_NAME)
-c_parser_consume_token (parser);
-  if (c_parser_next_token_is (parser, CPP_NAME))
+  do
 {
+  if (c_parser_next_token_is (parser, CPP_COMMA)
+	  && c_parser_peek_2nd_token (parser)->type == CPP_NAME)
+	c_parser_consume_token (parser);
+  if (!c_parser_next_token_is (parser, CPP_NAME))
+	break;
   matching_parens parens;
   const char *p = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value);
   c_parser_consume_token (parser);
-  if (strcmp ("allocator", p) != 0)
-	error_at (c_parser_peek_token (parser)->location,
-		  "expected %");
-  else if (parens.require_open (parser))
+  location_t expr_loc = c_parser_peek_token (parser)->location;
+  if (strcmp ("align", p) != 0 && strcmp ("allocator", p) != 0)
 	{
-	  location_t expr_loc = c_parser_peek_token (parser)->location;
-	  c_expr expr = c_parser_expr_no_commas (parser, NULL);
-	  expr = convert_lvalue_to_rvalue (expr_loc, expr, false, true);
-	  allocator = expr.value;
-	  allocator = c_fully_fold (allocator, false, NULL);
+	  error_at (c_parser_peek_token (parser)->location,
+		"expected % or %");
+	  break;
+	}
+  if (!parens.require_open (parser))
+	break;
+
+  c_expr expr = c_parser_expr_no_commas (parser, NULL);
+  expr = convert_lvalue_to_rvalue (expr_loc, expr, false, true);
+  expr_loc = c_parser_peek_token (parser)->location;
+  if (p[2] == 'i' && alignment)
+	{
+	  error_at (expr_loc, "too many %qs clauses", "align");
+	  break;
+	}
+  else if (p[2] == 'i')
+	{
+	  alignment = c_fully_fold (expr.value, false, NULL);
+	  if (TREE_CODE (alignment) != INTEGER_CST
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (alignment))
+	  || tree_int_cst_sgn (alignment) != 1
+	  || !integer_pow2p (alignment))
+	{
+	  error_at (expr_loc, "% clause argument needs to be "
+  "positive constant power of two integer "
+  "expression");
+	  alignment = NULL_TREE;
+	}
+	}
+  else if (allocator)
+	{
+	  error_at (expr_loc, "too many %qs clauses", "allocator");
+	  break;
+	}
+  else
+	{
+	  allocator = c_fully_fold (expr.value, false, NULL);
 	  tree orig_type
 	= expr.original_type ? expr.original_type : TREE_TYPE (allocator);
 	  orig_type = TYPE_MAIN_VARIANT (orig_type);
@@ -18853,20 +18892,23 @@ c_parser_omp_allocate (location_t loc, c_parser *parser)
 	  || TYPE_NAME (orig_type)
 		 != get_identifier ("omp_allocator_handle_t"))
 	{
-	  error_at (expr_loc, "% clause allocator expression "
-"has type %qT rather than "
-"%",
-TREE_TYPE (allocator));
+	  error_at (expr_loc,
+			"% clause allocator expression has type "
+			"%qT rather than %",
+			TREE_TYPE (allocator));
 	  allocator = NULL_TREE;
 	}
-	  parens.skip_until_found_close (parser);
 	}
-}
+  parens.skip_until_found_close (parser);
+} while (true);
   c_parser_skip_to_pragma_eol (parser);
 
-  if (allocator)
+  if (allocator || alignm

RE: [PATCH]AArch64 Fix ILP32 tbranch

2022-12-13 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Tamar Christina 
> Sent: Tuesday, December 13, 2022 5:14 PM
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> Marcus Shawcroft ; Kyrylo Tkachov
> ; Richard Sandiford
> 
> Subject: [PATCH]AArch64 Fix ILP32 tbranch
> 
> Hi All,
> 
> the baremetal builds are currently broken because the shift ends up in the
> wrong representation if the mode is SImode and the shift amount is 31.  To
> fix this, create the rtx constant with an explicit mode so the backend
> passes know which representation it needs to take.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> Build aarch64-none-elf with ILP32 multilib and no issues
> 
> Ok for master?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64.md (tbranch_3): Use
> gen_int_mode.
> 
> --- inline copy of patch --
> diff --git a/gcc/config/aarch64/aarch64.md
> b/gcc/config/aarch64/aarch64.md
> index
> d749c98eef63de4b92e589a167af823416f6a71d..6c27fb89e663d6ed845b41da
> f32476c2a58a169c 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -957,7 +957,7 @@ (define_expand "tbranch_3"
>  {
>rtx bitvalue = gen_reg_rtx (mode);
>rtx reg = gen_lowpart (mode, operands[0]);
> -  rtx val = GEN_INT (1UL << UINTVAL (operands[1]));
> +  rtx val = gen_int_mode (HOST_WIDE_INT_1U << UINTVAL (operands[1]),
> mode);
>emit_insn (gen_and3 (bitvalue, reg, val));
>operands[1] = const0_rtx;
>operands[0] = aarch64_gen_compare_reg (, bitvalue,
> 
> 
> 
> 
> --


[PATCH]AArch64 Fix ILP32 tbranch

2022-12-13 Thread Tamar Christina via Gcc-patches
Hi All,

the baremetal builds are currently broken because the shift ends up in the wrong
representation if the mode is SImode and the shift amount is 31.  To fix this,
create the rtx constant with an explicit mode so the backend passes know which
representation it needs to take.
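
As a rough illustration of what "representation" means here: GCC keeps integer constants sign-extended from the precision of their mode, so bit 31 of a 32-bit mode is expected to appear as a negative host value rather than as 0x80000000. The helper below is a hypothetical stand-in written for this sketch (it is not the GCC function), showing that canonicalization on plain 64-bit host integers.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the low PRECISION bits of VALUE and sign-extend them into a
   64-bit host integer.  Converting an out-of-range unsigned value to
   int64_t is implementation-defined; GCC and Clang wrap modulo 2^64.  */
static int64_t
canonical_for_precision (uint64_t value, unsigned precision)
{
  assert (precision >= 1 && precision <= 64);
  if (precision == 64)
    return (int64_t) value;

  uint64_t mask = (UINT64_C (1) << precision) - 1;
  uint64_t sign = UINT64_C (1) << (precision - 1);
  value &= mask;
  /* (x ^ sign) - sign sign-extends a PRECISION-bit quantity.  */
  return (int64_t) ((value ^ sign) - sign);
}
```

For a 32-bit precision, 1 << 31 canonicalizes to -2147483648 while 1 << 30 stays 1073741824, which is why only the shift amount 31 exposed the problem.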

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Built aarch64-none-elf with ILP32 multilib, also with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64.md (tbranch_3): Use gen_int_mode.

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 
d749c98eef63de4b92e589a167af823416f6a71d..6c27fb89e663d6ed845b41daf32476c2a58a169c
 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -957,7 +957,7 @@ (define_expand "tbranch_3"
 {
   rtx bitvalue = gen_reg_rtx (mode);
   rtx reg = gen_lowpart (mode, operands[0]);
-  rtx val = GEN_INT (1UL << UINTVAL (operands[1]));
+  rtx val = gen_int_mode (HOST_WIDE_INT_1U << UINTVAL (operands[1]), 
mode);
   emit_insn (gen_and3 (bitvalue, reg, val));
   operands[1] = const0_rtx;
   operands[0] = aarch64_gen_compare_reg (, bitvalue,




-- 





[Patch] Fortran: Extend align-clause checks of OpenMP's allocate clause

2022-12-13 Thread Tobias Burnus

I missed that 'align' needs to be a power of 2 - contrary to 'aligned',
which does not have this restriction for some odd reason.
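
For reference, the combined condition (positive and a power of two) can be sketched in plain C as below. The function name is made up for this illustration; the patch itself relies on GCC's own helper in the Fortran front end.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if ALIGNMENT is acceptable as an 'align' argument in the
   sense described above: strictly positive and a power of two.  */
static bool
valid_align_argument (long long alignment)
{
  /* A power of two has exactly one bit set, so clearing the lowest set
     bit with x & (x - 1) must leave zero.  The positivity test comes
     first so that zero and negative values are rejected outright.  */
  return alignment > 0 && (alignment & (alignment - 1)) == 0;
}
```

This accepts 32, 64 and 128 and rejects 31 and 101, matching the OK and dg-error cases in the new test.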

OK for mainline?

Tobias
-
Fortran: Extend align-clause checks of OpenMP's allocate directive

gcc/fortran/ChangeLog:

	* openmp.cc (resolve_omp_clauses): Check also for
	power of two.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/allocate-3.f90: Fix ALIGN
	usage, remove unused -fdump-tree-original.
	* testsuite/libgomp.fortran/allocate-4.f90: New.

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 686f924b47a..5468cc97969 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -7315,11 +7315,12 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 	  || n->u.align->ts.type != BT_INTEGER
 	  || n->u.align->rank != 0
 	  || gfc_extract_int (n->u.align, &alignment)
-	  || alignment <= 0)
+	  || alignment <= 0
+	  || !pow2p_hwi (alignment))
 	{
-	  gfc_error ("ALIGN modifier requires a scalar positive "
-			 "constant integer alignment expression at %L",
-			 &n->u.align->where);
+	  gfc_error ("ALIGN modifier requires at %L a scalar positive "
+			 "constant integer alignment expression that is a "
+			 "power of two", &n->u.align->where);
 	  break;
 	}
 	}

diff --git a/libgomp/testsuite/libgomp.fortran/allocate-3.f90 b/libgomp/testsuite/libgomp.fortran/allocate-3.f90
index a39819164d6..1fa0bb932c3 100644
--- a/libgomp/testsuite/libgomp.fortran/allocate-3.f90
+++ b/libgomp/testsuite/libgomp.fortran/allocate-3.f90
@@ -1,5 +1,4 @@
 ! { dg-do compile }
-! { dg-additional-options "-fdump-tree-original" }
 
 use omp_lib
 implicit none
@@ -23,6 +22,7 @@ integer :: q, x,y,z
 ! { dg-error "Object 'omp_high_bw_mem_alloc' is not a variable" "" { target *-*-* } .-1 }
 !$omp end parallel
 
-!$omp parallel allocate( align(q) : x) firstprivate(x) ! { dg-error "31:ALIGN modifier requires a scalar positive constant integer alignment expression at" }
+!$omp parallel allocate( align(128) : x) firstprivate(x) ! OK
 !$omp end parallel
+
 end
diff --git a/libgomp/testsuite/libgomp.fortran/allocate-4.f90 b/libgomp/testsuite/libgomp.fortran/allocate-4.f90
new file mode 100644
index 000..ddb507ba8e4
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/allocate-4.f90
@@ -0,0 +1,42 @@
+! { dg-do compile }
+
+
+subroutine test()
+use iso_c_binding, only: c_intptr_t
+implicit none
+integer, parameter :: omp_allocator_handle_kind = 1 !! <<<
+integer (kind=omp_allocator_handle_kind), &
+ parameter :: omp_high_bw_mem_alloc = 4
+integer :: q, x,y,z
+integer, parameter :: cnst(2) = [64, 101]
+
+!$omp parallel allocate( omp_high_bw_mem_alloc : x)  firstprivate(x) ! { dg-error "Expected integer expression of the 'omp_allocator_handle_kind' kind" }
+!$omp end parallel
+
+!$omp parallel allocate( allocator (omp_high_bw_mem_alloc) : x)  firstprivate(x) ! { dg-error "Expected integer expression of the 'omp_allocator_handle_kind' kind" }
+!$omp end parallel
+
+!$omp parallel allocate( align (q) : x)  firstprivate(x) ! { dg-error "32:ALIGN modifier requires at \\(1\\) a scalar positive constant integer alignment expression that is a power of two" }
+!$omp end parallel
+
+!$omp parallel allocate( align (32) : x)  firstprivate(x) ! OK
+!$omp end parallel
+
+!$omp parallel allocate( align(q) : x) firstprivate(x) ! { dg-error "31:ALIGN modifier requires at \\(1\\) a scalar positive constant integer alignment expression that is a power of two" }
+!$omp end parallel
+
+!$omp parallel allocate( align(cnst(1)) : x ) firstprivate(x) ! OK
+!$omp end parallel
+
+!$omp parallel allocate( align(cnst(2)) : x) firstprivate(x)  ! { dg-error "31:ALIGN modifier requires at \\(1\\) a scalar positive constant integer alignment expression that is a power of two" }
+!$omp end parallel
+
+!$omp parallel allocate( align( 31) :x) firstprivate(x)  ! { dg-error "32:ALIGN modifier requires at \\(1\\) a scalar positive constant integer alignment expression that is a power of two" }
+!$omp end parallel
+
+!$omp parallel allocate( align (32.0): x) firstprivate(x)  ! { dg-error "32:ALIGN modifier requires at \\(1\\) a scalar positive constant integer alignment expression that is a power of two" }
+!$omp end parallel
+
+!$omp parallel allocate( align(cnst ) : x ) firstprivate(x)  ! { dg-error "31:ALIGN modifier requires at \\(1\\) a scalar positive constant integer alignment expression that is a power of two" }
+!$omp end parallel
+end


[Patch, Fortran] libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

2022-12-13 Thread Tobias Burnus

This is a 12/13 regression, as some changes to fix the GFC/CFI descriptor
conversion that went into GCC 12 fail with the (bogus) descriptor passed by
a GCC-11-compiled program.

As later GCC 12 changes moved the descriptor conversion to the front end,
those functions are only in libgfortran.so to cater for old programs.
Richard suggested in the PR that the best way is to move to the GCC 11
version, such that libgfortran.so won't regress.

I now did so - except for three fixes (cf. changelog). See also
PR: https://gcc.gnu.org/PR108056

There is no testcase as it needs to be compiled by GCC <= 11 and then
run with linking (dynamically) to a GCC 12 or 13 libgfortran.

OK for mainline and GCC 12?

 * * *

Note: It is strongly recommended to use GCC 12 (or 13) with array-descriptor
C interop as many issues were fixed. Take the testcase in the PR: in GCC 11,
the type arriving in libgfortran is BT_ASSUME ('type(*)'). But as the effective
argument is passed as an array descriptor throughout, the 'float' (real(4)) type
info is actually preservable (as GCC 12 does; cf. the testcase in comment 0 and
my comment in the PR for the C part of the testcase).(*)

Tobias

((*) This is not possible if using a scalar 'type(*)' or a non-array-descriptor
array in between. I think GCC 12 uses 'CFI_other' in the information-is-lost 
case.)
-
libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

Since GCC 12, the conversion between the array descriptor formats - the
internal (GFC) and the C binding one (CFI) - moved to the compiler itself
such that the cfi_desc_to_gfc_desc/gfc_desc_to_cfi_desc functions are only
used with older code (GCC 9 to 11).  The newly added checks caused asserts
as older code did not pass the proper values (e.g. real(4) as effective
argument arrived as BT_ASSUME type as the effective type got lost in between).

As proposed in the PR, revert to the GCC 11 version - known bugs are better
than some fixes and new issues. Still, GCC 12 is much better in terms of
TS29113 support and should really be used.

This patch uses the current libgfortran version of the GCC 11 branch, except
it fixes the GFC version number (which is 0), uses calloc instead of malloc,
and sets the lower bound to 1 instead of keeping it as is for
CFI_attribute_other.
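
The effect of the malloc-to-calloc switch can be sketched as follows. The struct layout, the MOCK_MAX_RANK value and the function name are simplified assumptions made up for this illustration; they are not the real CFI_cdesc_t or libgfortran code.

```c
#include <assert.h>
#include <stdlib.h>

#define MOCK_MAX_RANK 15   /* assumption; stands in for CFI_MAX_RANK */

struct mock_dim
{
  long lower_bound, extent, sm;
};

/* Simplified stand-in for a C descriptor with a trailing dim array.  */
struct mock_cdesc
{
  void *base_addr;
  size_t elem_len;
  int version;
  signed char rank;
  struct mock_dim dim[];   /* flexible array member */
};

/* Allocate a descriptor with room for the maximum rank.  With calloc,
   every field that is not set explicitly afterwards reads as zero
   instead of holding whatever malloc happened to return.  */
static struct mock_cdesc *
alloc_zeroed_desc (void)
{
  return calloc (1, sizeof (struct mock_cdesc)
                    + MOCK_MAX_RANK * sizeof (struct mock_dim));
}
```

The caller owns the result and releases it with free ().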

libgfortran/ChangeLog:

	PR libfortran/108056
	* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc,
	gfc_desc_to_cfi_desc): Mostly revert to GCC 11 version for
	those backward-compatibility-only functions.

diff --git a/libgfortran/runtime/ISO_Fortran_binding.c b/libgfortran/runtime/ISO_Fortran_binding.c
index 342df4275b9..e63a717a69b 100644
--- a/libgfortran/runtime/ISO_Fortran_binding.c
+++ b/libgfortran/runtime/ISO_Fortran_binding.c
@@ -39,60 +39,31 @@ export_proto(cfi_desc_to_gfc_desc);
 void
 cfi_desc_to_gfc_desc (gfc_array_void *d, CFI_cdesc_t **s_ptr)
 {
-  signed char type;
-  size_t size;
   int n;
+  index_type kind;
   CFI_cdesc_t *s = *s_ptr;
 
   if (!s)
 return;
 
-  /* Verify descriptor.  */
-  switch (s->attribute)
-{
-case CFI_attribute_pointer:
-case CFI_attribute_allocatable:
-  break;
-case CFI_attribute_other:
-  if (s->base_addr)
-	break;
-  runtime_error ("Nonallocatable, nonpointer actual argument to BIND(C) "
-		 "dummy argument where the effective argument is either "
-		 "not allocated or not associated");
-  break;
-default:
-  runtime_error ("Invalid attribute type %d in CFI_cdesc_t descriptor",
-		 (int) s->attribute);
-  break;
-}
   GFC_DESCRIPTOR_DATA (d) = s->base_addr;
+  GFC_DESCRIPTOR_TYPE (d) = (signed char)(s->type & CFI_type_mask);
+  kind = (index_type)((s->type - (s->type & CFI_type_mask)) >> CFI_type_kind_shift);
 
   /* Correct the unfortunate difference in order with types.  */
-  type = (signed char)(s->type & CFI_type_mask);
-  switch (type)
-{
-case CFI_type_Character:
-  type = BT_CHARACTER;
-  break;
-case CFI_type_struct:
-  type = BT_DERIVED;
-  break;
-case CFI_type_cptr:
-  /* FIXME: PR 100915.  GFC descriptors do not distinguish between
-	 CFI_type_cptr and CFI_type_cfunptr.  */
-  type = BT_VOID;
-  break;
-default:
-  break;
-}
-
-  GFC_DESCRIPTOR_TYPE (d) = type;
-  GFC_DESCRIPTOR_SIZE (d) = s->elem_len;
+  if (GFC_DESCRIPTOR_TYPE (d) == BT_CHARACTER)
+GFC_DESCRIPTOR_TYPE (d) = BT_DERIVED;
+  else if (GFC_DESCRIPTOR_TYPE (d) == BT_DERIVED)
+GFC_DESCRIPTOR_TYPE (d) = BT_CHARACTER;
+
+  if (!s->rank || s->dim[0].sm == (CFI_index_t)s->elem_len)
+GFC_DESCRIPTOR_SIZE (d) = s->elem_len;
+  else if (GFC_DESCRIPTOR_TYPE (d) != BT_DERIVED)
+GFC_DESCRIPTOR_SIZE (d) = kind;
+  else
+GFC_DESCRIPTOR_SIZE (d) = s->elem_len;
 
   d->dtype.version = 

[OG12][committed] OpenMP, libgomp: Handle unified shared memory in omp_target_is_accessible.

2022-12-13 Thread Marcel Vollweiler

This patch handles Unified Shared Memory (USM) in the implementation of the
OpenMP runtime routine omp_target_is_accessible.
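
The resulting decision order can be sketched with a mock device as below. Everything here (the struct, the flag value, the callback and the sample devices) is hypothetical and only mirrors the shape of the check; it omits the device lookup and is not libgomp's internal API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_CAP_SHARED_MEM 1u   /* hypothetical capability flag */

struct mock_device
{
  unsigned capabilities;
  /* Optional callback telling whether PTR is a USM pointer; may be NULL
     for devices without USM support.  */
  bool (*is_usm_ptr) (const void *ptr);
};

/* Sample callback that treats every pointer as USM.  */
static bool
treat_all_as_usm (const void *ptr)
{
  (void) ptr;
  return true;
}

/* Shared memory makes everything accessible; otherwise ask the device
   whether the pointer is a USM pointer; otherwise it is not accessible.  */
static bool
mock_is_accessible (const struct mock_device *dev, const void *ptr)
{
  if (dev == NULL)
    return false;
  if (dev->capabilities & MOCK_CAP_SHARED_MEM)
    return true;
  if (dev->is_usm_ptr && dev->is_usm_ptr (ptr))
    return true;
  return false;
}

/* Three sample devices covering the three outcomes.  */
static const struct mock_device shared_dev = { MOCK_CAP_SHARED_MEM, NULL };
static const struct mock_device usm_dev = { 0, treat_all_as_usm };
static const struct mock_device plain_dev = { 0, NULL };
```

Here shared_dev and usm_dev report a pointer as accessible, while plain_dev and a NULL device do not.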

A previous patch was submitted some months ago
(https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594187.html) but not yet
reviewed due to dependencies on the Unified Shared Memory implementation.
Although USM is not yet in mainline, the corresponding patches were already
committed to OG12. I rebased, updated, and committed my patch to OG12
(devel/omp/gcc-12 branch).

I tested the patch with nvptx offloading (x86_64-linux and PowerPC) without
regressions. Since USM is not supported for all gcn targets, I tested gcn with
offloading for x86_64-linux on various targets (gfx90a, gfx908, gfx906, gfx803)
- also without regressions.

Marcel
-
commit 9044b7efb3518de180a5b3168615b7e12d93eea8
Author: Marcel Vollweiler 
Date:   Tue Dec 13 12:04:48 2022 +

OpenMP, libgomp: Handle unified shared memory in omp_target_is_accessible

This patch handles Unified Shared Memory (USM) in the OpenMP runtime routine
omp_target_is_accessible.

libgomp/ChangeLog:

* target.c (omp_target_is_accessible): Handle unified shared memory.
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: Updated.
* testsuite/libgomp.fortran/target-is-accessible-1.f90: Updated.
* testsuite/libgomp.c-c++-common/target-is-accessible-2.c: New test.
* testsuite/libgomp.fortran/target-is-accessible-2.f90: New test.

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index 32bcc84..a0d0271 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,11 @@
+2022-12-13  Marcel Vollweiler  
+
+   * target.c (omp_target_is_accessible): Handle unified shared memory.
+   * testsuite/libgomp.c-c++-common/target-is-accessible-1.c: Updated.
+   * testsuite/libgomp.fortran/target-is-accessible-1.f90: Updated.
+   * testsuite/libgomp.c-c++-common/target-is-accessible-2.c: New test.
+   * testsuite/libgomp.fortran/target-is-accessible-2.f90: New test.
+
 2022-12-12  Tobias Burnus  
 
Backported from master:
diff --git a/libgomp/target.c b/libgomp/target.c
index 50709f0..2cd8e2a 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -5067,9 +5067,13 @@ omp_target_is_accessible (const void *ptr, size_t size, int device_num)
   if (devicep == NULL)
 return false;
 
-  /* TODO: Unified shared memory must be handled when available.  */
+  if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
+return true;
 
-  return devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM;
+  if (devicep->is_usm_ptr_func && devicep->is_usm_ptr_func ((void *) ptr))
+return true;
+
+  return false;
 }
 
 int
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
index 2e75c63..e7f9cf2 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
@@ -1,3 +1,5 @@
+/* { dg-do run } */
+
 #include 
 
 int
@@ -6,7 +8,8 @@ main ()
   int d = omp_get_default_device ();
   int id = omp_get_initial_device ();
   int n = omp_get_num_devices ();
-  void *p;
+  int i = 42;
+  void *p = &i;
 
   if (d < 0 || d >= n)
 d = id;
@@ -26,23 +29,28 @@ main ()
   if (omp_target_is_accessible (p, sizeof (int), n + 1))
 __builtin_abort ();
 
-  /* Currently, a host pointer is accessible if the device supports shared
- memory or omp_target_is_accessible is executed on the host. This
- test case must be adapted when unified shared memory is avialable.  */
   int a[128];
   for (int d = 0; d <= omp_get_num_devices (); d++)
 {
+  /* SHARED_MEM is 1 if and only if host and device share the same memory.
+OMP_TARGET_IS_ACCESSIBLE should not return 0 for shared memory.  */
   int shared_mem = 0;
   #pragma omp target map (alloc: shared_mem) device (d)
shared_mem = 1;
-  if (omp_target_is_accessible (p, sizeof (int), d) != shared_mem)
+
+  if (shared_mem && !omp_target_is_accessible (p, sizeof (int), d))
+   __builtin_abort ();
+
+  /* USM is disabled by default.  Hence OMP_TARGET_IS_ACCESSIBLE should
+return 0 if shared_mem is false.  */
+  if (!shared_mem && omp_target_is_accessible (p, sizeof (int), d))
__builtin_abort ();
 
-  if (omp_target_is_accessible (a, 128 * sizeof (int), d) != shared_mem)
+  if (shared_mem && !omp_target_is_accessible (a, 128 * sizeof (int), d))
__builtin_abort ();
 
   for (int i = 0; i < 128; i++)
-   if (omp_target_is_accessible (&a[i], sizeof (int), d) != shared_mem)
+   if (shared_mem && !omp_target

Re: [PATCH] i386: Fix up *concat*_{5,6,7} patterns [PR108044]

2022-12-13 Thread Jakub Jelinek via Gcc-patches
On Tue, Dec 13, 2022 at 01:21:54PM +0100, Uros Bizjak wrote:
> On Tue, Dec 13, 2022 at 10:20 AM Jakub Jelinek  wrote:
> >
> > Hi!
> >
> > The following patch fixes 2 issues with the *concat3_5 and
> > *concat3_{6,7} patterns.
> > One is that if the destination is memory rather than register, then
> > we can't use movabsq and so can't support all the possible immediates.
> > I see 3 possibilities to fix that.  One would be to use
> > x86_64_hilo_int_operand predicate instead of const_scalar_int_operand
> > and thus not match it at all during combine in such cases, but that
> > unnecessarily pessimizes also the case when it is loaded into register
> > where we can use movabsq.
> > Another one is what is implemented in the patch, use Wd constraint
> > for the integer on 64-bit if destination is memory and X (didn't find
> > more appropriate one which would accept any const_int/const_wide_int
> > and the value checking is done in the conditions) otherwise.
> 
> Perhaps you should use "n" instead of "X".

I was afraid of the PIC stuff in:
(define_constraint "n"
  "Matches a non-symbolic integer constant."
  (and (match_test "CONST_SCALAR_INT_P (op)")
   (match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))
but now that I look at i386.cc (legitimate_pic_operand_p), you're right,
for CONST_INT and CONST_WIDE_INT it always returns true, so n is fine.

I'll retest with "n".

Jakub



Ping---[V3][PATCH 2/2] Add a new warning option -Wstrict-flex-arrays.

2022-12-13 Thread Qing Zhao via Gcc-patches
Richard, 

Have you reached a decision on this one?
Do we need this warning option for GCC?

thanks.

Qing

> On Dec 6, 2022, at 11:18 AM, Qing Zhao  wrote:
> 
> '-Wstrict-flex-arrays'
> Warn about improper usages of flexible array members according to
> the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
> the trailing array field of a structure if it's available,
> otherwise according to the LEVEL of the option
> '-fstrict-flex-arrays=LEVEL'.
> 
> This option is effective only when LEVEL is bigger than 0.
> Otherwise, it will be ignored with a warning.
> 
> when LEVEL=1, warnings will be issued for a trailing array
> reference of a structure that has 2 or more elements if the
> trailing array is referenced as a flexible array member.
> 
> when LEVEL=2, in addition to LEVEL=1, additional warnings will be
> issued for a trailing one-element array reference of a structure if
> the array is referenced as a flexible array member.
> 
> when LEVEL=3, in addition to LEVEL=2, additional warnings will be
> issued for a trailing zero-length array reference of a structure if
> the array is referenced as a flexible array member.
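
To make the three levels concrete, here is a minimal sketch of the trailing
array flavours involved.  The struct and function names (with_fam, with_one,
with_four, fam_new, fam_sum) are invented for illustration, and the comments
only restate the quoted option text; they are not verified against the
patch's actual diagnostics.  A zero-length trailing array (int data[0], a GNU
extension) would be the additional LEVEL=3 case and is left out to keep the
example standard C.

```c
#include <stdlib.h>

/* Trailing-array flavours; comments restate the quoted LEVEL description.  */
struct with_fam  { size_t n; int data[];  };  /* C99 flexible array member */
struct with_one  { size_t n; int data[1]; };  /* flagged from LEVEL=2 if used as flexible */
struct with_four { size_t n; int data[4]; };  /* flagged from LEVEL=1 if used as flexible */

/* Allocate a with_fam holding N ints, filled with 0..N-1.
   Returns NULL if the allocation fails.  */
static struct with_fam *
fam_new (size_t n)
{
  struct with_fam *p = malloc (sizeof *p + n * sizeof p->data[0]);
  if (p != NULL)
    {
      p->n = n;
      for (size_t i = 0; i < n; i++)
        p->data[i] = (int) i;
    }
  return p;
}

/* Sum the stored elements.  Indexing beyond a declared bound is only
   well-defined for the [] form used here.  */
static long
fam_sum (const struct with_fam *p)
{
  long sum = 0;
  for (size_t i = 0; i < p->n; i++)
    sum += p->data[i];
  return sum;
}
```

Code that allocates extra space behind with_one or with_four and indexes past
the declared bound is the pattern the quoted text says the option reports.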
> 
> gcc/ChangeLog:
> 
>   * doc/invoke.texi: Document -Wstrict-flex-arrays option.
>   * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two more
>   arguments.
>   (array_bounds_checker::check_array_ref): Issue warnings for
>   -Wstrict-flex-arrays.
>   * opts.cc (finish_options): Issue warning for unsupported combination
>   of -Wstrict_flex_arrays and -fstrict-flex-array.
>   * tree-vrp.cc (execute_ranger_vrp): Enable the pass when
>   warn_strict_flex_array is true.
> 
> gcc/c-family/ChangeLog:
> 
>   * c.opt (Wstrict-flex-arrays): New option.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/Warray-bounds-flex-arrays-1.c: Update testing case with
>   -Wstrict-flex-arrays.
>   * gcc.dg/Warray-bounds-flex-arrays-2.c: Likewise.
>   * gcc.dg/Warray-bounds-flex-arrays-3.c: Likewise.
>   * gcc.dg/Warray-bounds-flex-arrays-4.c: Likewise.
>   * gcc.dg/Warray-bounds-flex-arrays-5.c: Likewise.
>   * gcc.dg/Warray-bounds-flex-arrays-6.c: Likewise.
>   * c-c++-common/Wstrict-flex-arrays.c: New test.
>   * gcc.dg/Wstrict-flex-arrays-2.c: New test.
>   * gcc.dg/Wstrict-flex-arrays-3.c: New test.
>   * gcc.dg/Wstrict-flex-arrays.c: New test.
> ---
> gcc/c-family/c.opt|   5 +
> gcc/doc/invoke.texi   |  27 -
> gcc/gimple-array-bounds.cc| 103 ++
> gcc/opts.cc   |   8 ++
> .../c-c++-common/Wstrict-flex-arrays.c|   9 ++
> .../gcc.dg/Warray-bounds-flex-arrays-1.c  |   5 +-
> .../gcc.dg/Warray-bounds-flex-arrays-2.c  |   6 +-
> .../gcc.dg/Warray-bounds-flex-arrays-3.c  |   7 +-
> .../gcc.dg/Warray-bounds-flex-arrays-4.c  |   5 +-
> .../gcc.dg/Warray-bounds-flex-arrays-5.c  |   6 +-
> .../gcc.dg/Warray-bounds-flex-arrays-6.c  |   7 +-
> gcc/testsuite/gcc.dg/Wstrict-flex-arrays-2.c  |  39 +++
> gcc/testsuite/gcc.dg/Wstrict-flex-arrays-3.c  |  39 +++
> gcc/testsuite/gcc.dg/Wstrict-flex-arrays.c|  39 +++
> gcc/tree-vrp.cc   |   2 +-
> 15 files changed, 273 insertions(+), 34 deletions(-)
> create mode 100644 gcc/testsuite/c-c++-common/Wstrict-flex-arrays.c
> create mode 100644 gcc/testsuite/gcc.dg/Wstrict-flex-arrays-2.c
> create mode 100644 gcc/testsuite/gcc.dg/Wstrict-flex-arrays-3.c
> create mode 100644 gcc/testsuite/gcc.dg/Wstrict-flex-arrays.c
> 
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 0d0ad0a6374..33edeefd285 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -976,6 +976,11 @@ Wstringop-truncation
> C ObjC C++ LTO ObjC++ Var(warn_stringop_truncation) Warning Init (1) 
> LangEnabledBy(C ObjC C++ LTO ObjC++, Wall)
> Warn about truncation in string manipulation functions like strncat and 
> strncpy.
> 
> +Wstrict-flex-arrays
> +C C++ Var(warn_strict_flex_arrays) Warning
> +Warn about inproper usages of flexible array members
> +according to the level of -fstrict-flex-arrays.
> +
> Wsuggest-attribute=format
> C ObjC C++ ObjC++ Var(warn_suggest_attribute_format) Warning
> Warn about functions which might be candidates for format attributes.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 726392409b6..4402b0427ef 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -398,7 +398,7 @@ Objective-C and Objective-C++ Dialects}.
> -Wstrict-aliasing=n  -Wstrict-overflow  -Wstrict-overflow=@var{n} @gol
> -Wstring-compare @gol
> -Wno-stringop-overflow -Wno-stringop-overread @gol
> --Wno-stringop-truncation @gol
> +-Wno-stringop-truncation -Wstrict-flex-arrays @gol
> -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}malloc@r{]}
>  @gol
> -Wswitch  -Wno-switch-bool  -Wswitch-

Re: [PATCH] gcov: Fix -fprofile-update=atomic

2022-12-13 Thread Richard Biener via Gcc-patches
On Fri, Dec 9, 2022 at 2:56 PM Sebastian Huber wrote:
>
> The code coverage support uses counters to determine which edges in the
> control flow graph were executed.  If a counter overflows, then the code
> coverage information is invalid.  Therefore the counter type should be a
> 64-bit integer.  In multithreaded applications, it is important that the
> counter increments are atomic.  This is not the case by default.  The user
> can enable atomic counter increments through the -fprofile-update=atomic and
> -fprofile-update=prefer-atomic options.
>
> If the hardware supports 64-bit atomic operations, then everything is fine.
> If not and -fprofile-update=prefer-atomic was chosen by the user, then
> non-atomic counter increments will be used.  However, if the hardware does
> not support the required atomic operations and -fprofile-update=atomic was
> chosen by the user, then a warning was issued and a forced fall-back to
> non-atomic operations was done.  This is probably not what a user wants.
> There is still hardware on the market which does not have atomic operations
> and is used for multithreaded applications.  A user who selects
> -fprofile-update=atomic wants consistent code coverage data and not random
> data.
>
> This patch removes the fall-back to non-atomic operations for
> -fprofile-update=atomic.  If atomic operations in hardware are not available,
> then a library call to libatomic is emitted.  To mitigate potential
> performance issues, an optimization for systems which only support 32-bit
> atomic operations is provided.  Here, the edge counter increments are done
> like this:
>
>   low = __atomic_add_fetch_4 (&counter.low, 1, MEMMODEL_RELAXED);
>   high_inc = low == 0 ? 1 : 0;
>   __atomic_add_fetch_4 (&counter.high, high_inc, MEMMODEL_RELAXED);
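
The three-line increment quoted above can be tried out in isolation.  The
following is a minimal sketch using the generic GCC/Clang __atomic builtins;
struct split_counter and both function names are hypothetical and do not come
from the patch, which emits the equivalent operations as GIMPLE.

```c
#include <stdint.h>

/* Hypothetical layout: one 64-bit counter stored as two 32-bit halves.  */
struct split_counter
{
  uint32_t low;
  uint32_t high;
};

/* Increment the low half atomically; when it wraps to zero, carry one
   into the high half.  Adding zero otherwise keeps the sequence
   branch-free, as in the quoted snippet.  */
static void
split_counter_inc (struct split_counter *c)
{
  uint32_t low = __atomic_add_fetch (&c->low, 1, __ATOMIC_RELAXED);
  uint32_t high_inc = (low == 0) ? 1 : 0;
  __atomic_add_fetch (&c->high, high_inc, __ATOMIC_RELAXED);
}

/* Combine both halves into a 64-bit value.  */
static uint64_t
split_counter_read (struct split_counter *c)
{
  uint32_t high = __atomic_load_n (&c->high, __ATOMIC_RELAXED);
  uint32_t low = __atomic_load_n (&c->low, __ATOMIC_RELAXED);
  return ((uint64_t) high << 32) | low;
}
```

Each half is updated atomically, but the pair is not: a reader running
concurrently with an increment that carries can observe the new low half
together with the old high half.  The sketch therefore only gives an exact
total once all incrementing threads have finished.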

You check for compare_and_swapsi and the old code checks
TYPE_PRECISION (gcov_type_node) > 32 to determine whether 8 byte or 4 byte
gcov_type is used.  But you do not seem to handle the case where
neither SImode nor DImode atomic operations are available?  Can we instead
do

  if (gcov_type_size == 4)
    can_support_atomic4
      = HAVE_sync_compare_and_swapsi || HAVE_atomic_compare_and_swapsi;
  else if (gcov_type_size == 8)
    can_support_atomic8
      = HAVE_sync_compare_and_swapdi || HAVE_atomic_compare_and_swapdi;

  if (flag_profile_update == PROFILE_UPDATE_ATOMIC
      && !can_support_atomic4 && !can_support_atomic8)
    {
      warning (0, "target does not support atomic profile update, "
               "single mode is selected");
      flag_profile_update = PROFILE_UPDATE_SINGLE;
    }

thus retain the diagnostic & fallback for this case?  The existing code also
suggests there might be targets with HImode or TImode counters?

In another mail you mentioned that for gen_time_profiler this doesn't work,
but its code relies on flag_profile_update as well.  So do we need to split
the flag somehow, or continue using the PROFILE_UPDATE_SINGLE fallback when
we are doing more than just edge profiling?

Thanks,
Richard.

> gcc/ChangeLog:
>
> * tree-profile.cc (split_atomic_increment): New.
> (gimple_gen_edge_profiler): Split the atomic edge counter increment in
> two 32-bit atomic operations if necessary.
> (tree_profiling): Remove profile update warning and fall-back.  Set
> split_atomic_increment if necessary.
> ---
>  gcc/tree-profile.cc | 81 +
>  1 file changed, 59 insertions(+), 22 deletions(-)
>
> diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
> index 2beb49241f2..1d326dde59a 100644
> --- a/gcc/tree-profile.cc
> +++ b/gcc/tree-profile.cc
> @@ -73,6 +73,17 @@ static GTY(()) tree ic_tuple_var;
>  static GTY(()) tree ic_tuple_counters_field;
>  static GTY(()) tree ic_tuple_callee_field;
>
> +/* If the user selected atomic profile counter updates
> +   (-fprofile-update=atomic), then the counter updates will be done atomically.
> +   Ideally, this is done through atomic operations in hardware.  If the
> +   hardware supports only 32-bit atomic increments and gcov_type_node is a
> +   64-bit integer type, then for the profile edge counters the increment is
> +   performed through two separate 32-bit atomic increments.  This case is
> +   indicated by the split_atomic_increment variable being true.  If the
> +   hardware does not support atomic operations at all, then a library call to
> +   libatomic is emitted.  */
> +static bool split_atomic_increment;
> +
>  /* Do initialization work for the edge profiler.  */
>
>  /* Add code:
> @@ -242,30 +253,59 @@ gimple_init_gcov_profiler (void)
>  void
>  gimple_gen_edge_profiler (int edgeno, edge e)
>  {
> -  tree one;
> -
> -  one = build_int_cst (gcov_type_node, 1);
> +  const char *name = "PROF_edge_counter";
> +  tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
> +  tree one = build_int_cst (gcov_type_node, 1);
>
>if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
>

Re: Patch [0/3] for PR target/107299 (GCC does not build on PowerPC when long double is IEEE 128-bit)

2022-12-13 Thread Segher Boessenkool
On Tue, Dec 06, 2022 at 04:03:35PM +0100, Jakub Jelinek wrote:
> On Tue, Dec 06, 2022 at 08:56:09AM -0600, Segher Boessenkool wrote:
> > >   In the past, _Float128 was a C extended type,
> > >   but now it is a part of the C/C++ 2x standards.
> > 
> > Only if you select a new enough -std=, it still is an extended type if
> > not?
> 
> No, as an extension _Float{16,32,64,128}{,x} are available (where the backend
> has support for such IEEE format) even in older C or C++ modes,
> similarly the {f,F}{16,32,64,128} suffixes on literals (with pedwarn
> on everything but C++23).  In C++ it is in all language modes treated as
> distinct type from __float128, and _FloatNN is handled as extended floating
> point type per C++23 rules, while __float128 is not.

Right, so there is no "in the past" here, the actual situation is quite
different.

Reviewing these patches is harder and a lot more work than writing them
can ever have been :-( :-( :-(


Segher


[PATCH] tree-optimization/105801 - CCP and .DEFERRED_INIT

2022-12-13 Thread Richard Biener via Gcc-patches
This makes sure we treat .DEFERRED_INIT as producing UNDEFINED so
we can continue optimizing uninitialized uses the same as without
-ftrivial-auto-var-init=zero.  For the testcase this means we
catch the return 1 optimization opportunity at CCP rather than
only at FRE which already does the right thing here.
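
Why UNDEFINED helps here can be seen from the usual optimistic meet used in
constant propagation.  The following is a simplified three-point model, not
GCC's actual lattice code (which also tracks bit masks and more); the type
and function names are invented for the illustration.

```c
/* Simplified CCP-style lattice: UNDEFINED is the optimistic top,
   VARYING the pessimistic bottom.  */
enum lattice_kind { UNDEFINED, CONSTANT, VARYING };

struct lattice_value
{
  enum lattice_kind kind;
  long constant;   /* meaningful only when kind == CONSTANT */
};

/* Meet of two values flowing into a PHI node: UNDEFINED yields to the
   other operand, equal constants stay constant, anything else is
   VARYING.  */
static struct lattice_value
lattice_meet (struct lattice_value a, struct lattice_value b)
{
  if (a.kind == UNDEFINED)
    return b;
  if (b.kind == UNDEFINED)
    return a;
  if (a.kind == CONSTANT && b.kind == CONSTANT && a.constant == b.constant)
    return a;
  return (struct lattice_value) { VARYING, 0 };
}
```

In the testcase, the value of i at the return is the merge of the constant 1
and the .DEFERRED_INIT result.  If the latter is UNDEFINED, the meet is the
constant 1; if it were VARYING, the merge would be VARYING and CCP could not
fold the return.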

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to
trunk so far.

Richard.

PR tree-optimization/105801
* tree-ssa-ccp.cc (likely_value): .DEFERRED_INIT produces
UNDEFINED.
* doc/invoke.texi (ftrivial-auto-var-init): Explicitly
mention we treat variables without an initializer as
undefined also for optimization purposes.

* gcc.dg/tree-ssa/ssa-ccp-43.c: New testcase.
---
 gcc/doc/invoke.texi|  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-43.c | 12 
 gcc/tree-ssa-ccp.cc|  4 
 3 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-43.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cb40b38b73a..13371972fd1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13208,7 +13208,8 @@ disclosure and use.
 GCC still considers an automatic variable that doesn't have an explicit
 initializer as uninitialized, @option{-Wuninitialized} and
 @option{-Wanalyzer-use-of-uninitialized-value} will still report
-warning messages on such automatic variables.
+warning messages on such automatic variables and the compiler will
+perform optimization as if the variable were uninitialized.
 With this option, GCC will also initialize any padding of automatic variables
 that have structure or union types to zeroes.
 However, the current implementation cannot initialize automatic variables that
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-43.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-43.c
new file mode 100644
index 000..3e0a3d659d1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-43.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O -ftrivial-auto-var-init=zero -fdump-tree-ccp1" } */
+
+int foo (int flag)
+{
+  int i;
+  if (flag)
+i = 1;
+  return i;
+}
+
+/* { dg-final { scan-tree-dump "return 1;" "ccp1" } } */
diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index 68e69bfe129..0d47289b31d 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -722,6 +722,10 @@ likely_value (gimple *stmt)
   if (gimple_has_volatile_ops (stmt))
 return VARYING;
 
+  /* .DEFERRED_INIT produces undefined.  */
+  if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT))
+return UNDEFINED;
+
   /* Arrive here for more complex cases.  */
   has_constant_operand = false;
   has_undefined_operand = false;
-- 
2.35.3


Re: [pushed] c++: build initializer_list in a loop [PR105838]

2022-12-13 Thread Stephan Bergmann via Gcc-patches

On 08/12/2022 19:45, Jason Merrill via Gcc-patches wrote:

Tested x86_64-pc-linux-gnu, applying to trunk.


Bisecting shows this started to break


$ cat test.cc
#include 
template struct ConstCharArrayDetector;
template struct ConstCharArrayDetector { using Type = int; };
struct OUString {
template OUString(T &, typename ConstCharArrayDetector::Type = 0);
};
struct Sequence { Sequence(std::initializer_list); };
Sequence f() { return {""}; }



$ g++ -fsyntax-only test.cc
test.cc: In function ‘Sequence f()’:
test.cc:8:26: error: no matching function for call to ‘OUString::OUString(const char* const)’
8 | Sequence f() { return {""}; }
  |  ^
test.cc:5:26: note: candidate: ‘template OUString::OUString(T&, typename ConstCharArrayDetector::Type)’
5 | template OUString(T &, typename ConstCharArrayDetector::Type = 0);
  |  ^~~~
test.cc:5:26: note:   template argument deduction/substitution failed:
test.cc: In substitution of ‘template OUString::OUString(T&, typename ConstCharArrayDetector::Type) [with T = const char* const]’:
test.cc:8:26:   required from here
test.cc:5:26: error: invalid use of incomplete type ‘struct ConstCharArrayDetector’
test.cc:2:27: note: declaration of ‘struct ConstCharArrayDetector’
2 | template struct ConstCharArrayDetector;
  |   ^~
test.cc:4:8: note: candidate: ‘constexpr OUString::OUString(const OUString&)’
4 | struct OUString {
  |^~~~
test.cc:4:8: note:   no known conversion for argument 1 from ‘const char* const’ to ‘const OUString&’
test.cc:4:8: note: candidate: ‘constexpr OUString::OUString(OUString&&)’
test.cc:4:8: note:   no known conversion for argument 1 from ‘const char* const’ to ‘OUString&&’
test.cc:8:26: error: conversion from ‘const char* const’ to non-scalar type ‘const OUString’ requested
8 | Sequence f() { return {""}; }
  |  ^




Re: [PATCH 1/3] Rework 128-bit complex multiply and divide, PR target/107299

2022-12-13 Thread Segher Boessenkool
On Tue, Dec 13, 2022 at 01:14:39AM -0500, Michael Meissner wrote:
> On Mon, Dec 12, 2022 at 06:20:14PM +0800, Kewen.Lin wrote:
> > Without or with patch #1, the below ICE in libgcc exists, the ICE should
> > have nothing to do with the special handling for building_libgcc in patch
> > #1.  I think patch #2 which makes _Float128 and __float128 use the same
> > internal type fixes that ICE.
> > 
> > I still don't get the point why we need the special handling for
> > building_libgcc, I also tested on top of patch #1 and #2 w/ and w/o the
> > special handling for building_libgcc, both bootstrapped and regress-tested.
> > 
> > Could you have a double check?
> 
> As long as patch #2 and #3 are installed, we don't need the special handling
> for building_libgcc.  Good catch.
> 
> I will send out a replacement patch for it.

Please send a complete new series replacing this one.  Thanks.


Segher


Re: Rust front-end patches v4

2022-12-13 Thread Arthur Cohen

Hi Martin,

On 12/13/22 14:30, Martin Liška wrote:

On 12/13/22 14:26, Arthur Cohen wrote:

Thank you, and congratulations, to all the contributors.


We thank you!! Congratulations.


Thank you :)


I have one question: do you have a list of supported architectures Rust FE
can support right now?


You can have a look at what architectures we support on our github 
repository: https://github.com/Rust-GCC/gccrs


We pass our own testsuite on i386, x86_64 and arm64. As mentioned by 
Mark, the other architectures currently fail due to two testcases which 
do not work properly on big-endian systems, and one unsupported 
testcase on 64-bit PowerPC.


It shouldn't be too hard to fix these failures, and I am working on it.

We've also got contributions from Iain Buclaw, which will help 
support more target-specific features. These will be upstreamed soon, 
as we update GCC's master with our current development branch.


> What are your plans for the GCC 13.1 release?

We aim to be able to compile the entirety of Rust's core library, 
libcore 1.49, before the release of GCC 13.1. There aren't too many 
features left, and we hope to get to them quickly. We also want to 
integrate at least part of a borrow-checker into gccrs, as soon as possible.


This will surely unearth a lot of bugs for us to fix in time for April :)


Cheers,
Martin



All the best,

Arthur





Re: Rust front-end patches v4

2022-12-13 Thread Arthur Cohen
We've also added one more commit, which only affects files inside the 
Rust front-end folder. This commit adds an experimental flag, which 
blocks the compilation of Rust code when not used. We hope this helps 
indicate to users that the compiler is not yet ready, but can still be 
experimented with :)


We plan on removing that flag as soon as possible, but in the meantime, 
we think it will help avoid creating a divide within the Rust ecosystem, 
as well as avoid wasting Rust crate maintainers' time.


Thanks again,

Arthur


Re: Rust front-end patches v4

2022-12-13 Thread Martin Liška
On 12/13/22 14:26, Arthur Cohen wrote:
> Thank you, and congratulations, to all the contributors.

We thank you!! Congratulations.

I have one question: do you have a list of supported architectures Rust FE
can support right now? What are your plans for the GCC 13.1 release?

Cheers,
Martin

> 
> All the best,
> 
> Arthur



Re: Rust front-end patches v4

2022-12-13 Thread Arthur Cohen

Hi everyone,

I have pushed the commits onto master. Thanks again for all the reviews 
and support.


Thank you, and congratulations, to all the contributors.

All the best,

Arthur

On 12/6/22 12:03, Richard Biener via Gcc-rust wrote:

On Tue, Dec 6, 2022 at 11:11 AM  wrote:


This patchset contains the fixed version of our most recent patchset. We
have fixed most of the issues noted in the previous round of reviews, and are
keeping some for later as they would otherwise create too many conflicts with
our updated development branch.

Similarly to the previous round of patches, this patchset does not contain any
new features - only fixes for the reviews of the v3. New features will follow
shortly once that first patchset is merged.

Once again, thank you to all the contributors who made this possible and
especially to Philip Herron for his dedication to the project.


Thanks a lot - this is OK to merge now, thanks for your patience and I'm
looking forward to the future improvements.

Thanks,
Richard.


You can see the current status of our work on our branch:
https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/devel/rust/master

Patch status table:

An 'x' in the first column indicates a patch has been reviewed.
One in the second column indicates that a patch has been approved.

|0001-Use-DW_ATE_UTF-for-the-Rust-char-type.patch   |x|x|
|0002-gccrs-Add-necessary-hooks-for-a-Rust-front-end-tests.patch|x|x|
|0003-gccrs-Add-Debug-info-testsuite.patch  | | |
|0004-gccrs-Add-link-cases-testsuite.patch  | | |
|0005-gccrs-Add-general-compilation-test-cases.patch| | |
|0006-gccrs-Add-execution-test-cases.patch  | | |
|0007-gccrs-Add-gcc-check-target-check-rust.patch   |x| |
|0008-gccrs-Add-Rust-front-end-base-AST-data-structures.patch   | | |
|0009-gccrs-Add-definitions-of-Rust-Items-in-AST-data-stru.patch| | |
|0010-gccrs-Add-full-definitions-of-Rust-AST-data-structur.patch| | |
|0011-gccrs-Add-Rust-AST-visitors.patch | | |
|0012-gccrs-Add-Lexer-for-Rust-front-end.patch  |x| |
|0013-gccrs-Add-Parser-for-Rust-front-end-pt.1.patch| | |
|0014-gccrs-Add-Parser-for-Rust-front-end-pt.2.patch| | |
|0015-gccrs-Add-expansion-pass-for-the-Rust-front-end.patch | | |
|0016-gccrs-Add-name-resolution-pass-to-the-Rust-front-end.patch| | |
|0017-gccrs-Add-declarations-for-Rust-HIR.patch | | |
|0018-gccrs-Add-HIR-definitions-and-visitor-framework.patch | | |
|0019-gccrs-Add-AST-to-HIR-lowering-pass.patch  | | |
|0020-gccrs-Add-wrapper-for-make_unique.patch   | | |
|0021-gccrs-Add-port-of-FNV-hash-used-during-legacy-symbol.patch| | |
|0022-gccrs-Add-Rust-ABI-enum-helpers.patch | | |
|0023-gccrs-Add-Base62-implementation.patch | | |
|0024-gccrs-Add-implementation-of-Optional.patch| | |
|0025-gccrs-Add-attributes-checker.patch| | |
|0026-gccrs-Add-helpers-mappings-canonical-path-and-lang-i.patch| | |
|0027-gccrs-Add-type-resolution-and-trait-solving-pass.patch| | |
|0028-gccrs-Add-Rust-type-information.patch | | |
|0029-gccrs-Add-remaining-type-system-transformations.patch | | |
|0030-gccrs-Add-unsafe-checks-for-Rust.patch| | |
|0031-gccrs-Add-const-checker.patch | | |
|0032-gccrs-Add-privacy-checks.patch| | |
|0033-gccrs-Add-dead-code-scan-on-HIR.patch | | |
|0034-gccrs-Add-unused-variable-scan.patch  | | |
|0035-gccrs-Add-metadata-output-pass.patch  | | |
|0036-gccrs-Add-base-for-HIR-to-GCC-GENERIC-lowering.patch  | | |
|0037-gccrs-Add-HIR-to-GCC-GENERIC-lowering-for-all-nodes.patch |x|x|
|0038-gccrs-Add-HIR-to-GCC-GENERIC-lowering-entry-point.patch   |x|x|
|0039-gccrs-These-are-wrappers-ported-from-reusing-gccgo.patch  | | |
|0040-gccrs-Add-GCC-Rust-front-end-Make-lang.in.patch   |x| |
|0041-gccrs-Add-config-lang.in.patch|x|x|
|0042-gccrs-Add-lang-spec.h.patch   | | |
|0043-gccrs-Add-lang.opt.patch  |x| |
|0044-gccrs-Add-compiler-driver.patch   | | |
|0045-gccrs-Compiler-proper-interface-kicks-off-the-pipeli.patch| | |
|0046-gccrs-Add-README-CONTRIBUTING-and-compiler-logo.patch | | |

Patches 34 to 39 and 44 to 45 interact with common GCC APIs:

0034-gccrs-Add-unused-variable-scan.patch
0035-gccrs-Add-metadata-output-pass.patch
0036-gccrs-Add-base-for-HIR-to-GCC-GENERIC-lowering.patch
0037-gccrs-Add-HIR-to-GCC-GENERIC-lowering-for-all-nodes.patch
0038-gccrs-Add-HIR-to-GCC-GENERIC-lowering-entry-point.patch
0039-gccrs-These-are-wrappers-ported-from-reusing-gccgo.patch
0044-gccrs-Add-compiler-driver.patch
0045-gccrs-Compiler-proper-interface-kicks

Re: [PATCH Rust front-end v4 46/46] gccrs: Add README, CONTRIBUTING and compiler logo

2022-12-13 Thread Martin Liška
On 12/13/22 02:43, Joseph Myers wrote:
> On Fri, 9 Dec 2022, Martin Liška wrote:
> 
>> On 12/6/22 11:14, arthur.co...@embecosm.com wrote:
>>> |We still need to write out a documentation section, but these READMEs will 
>>> help in the meantime.|
>>
>> Hello.
>>
>> Just a quick comment: The Sphinx conversion didn't make it for all GCC 
>> manuals. However, you can still use Sphinx for a newly created manual, 
>> similarly to what libgccjit or Ada manuals do.
> 
> I would also encourage people using Sphinx for a newly created manual to 
> consider setting up common build infrastructure for such manuals, possibly 
> based on that used in the attempted Sphinx conversion.  It may be easier 
> to get common infrastructure for such manuals into shape if it's initially 
> only being used for one or two manuals - that is, if the addition of such 
> infrastructure isn't done at the same time as converting any existing 
> manuals to use Sphinx, or even converting any existing manuals using 
> Sphinx to use such infrastructure.
> 

Hi.

If the Rust folks are willing to use Sphinx, then yes, I'm going to prepare a
common infrastructure (baseconf.py, common license files and a common Makefile).
So something similar to what I prepared for the Sphinx conversion that didn't
make it.

Cheers,
Martin


Re: [PATCH 8/9] ipa-sra: Make scan_expr_access bail out on uninteresting expressions

2022-12-13 Thread Richard Biener via Gcc-patches
On Mon, 12 Dec 2022, Jan Hubicka wrote:

> > > Hi,
> > > 
> > > I'm re-posting patches which I have posted at the end of stage 1 but
> > > which have not passed review yet.
> > > 
> > > 8<
> > > 
> > > I have noticed that scan_expr_access passes all the expressions it
> > > gets to get_ref_base_and_extent even when we are really only
> > > interested in memory accesses.  So bail out when the expression is
> > > something clearly uninteresting.
> > > 
> > > Bootstrapped and tested individually when I originally posted it and
> > > now bootstrapped and LTO-bootstrapped and tested as part of the whole
> > > series.  OK for master?
> > > 
> > > 
> > > gcc/ChangeLog:
> > > 
> > > 2021-12-14  Martin Jambor  
> > > 
> > >   * ipa-sra.c (scan_expr_access): Bail out early if expr is something we
> > >   clearly do not need to pass to get_ref_base_and_extent.
> > > ---
> > >  gcc/ipa-sra.cc | 5 +
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
> > > index 93fceeafc73..3646d71468c 100644
> > > --- a/gcc/ipa-sra.cc
> > > +++ b/gcc/ipa-sra.cc
> > > @@ -1748,6 +1748,11 @@ scan_expr_access (tree expr, gimple *stmt, isra_scan_context ctx,
> > >|| TREE_CODE (expr) == REALPART_EXPR)
> > >  expr = TREE_OPERAND (expr, 0);
> > >  
> > > +  if (!handled_component_p (expr)
> > > +  && !DECL_P (expr)
> > > +  && TREE_CODE (expr) != MEM_REF)
> > > +return;
> > > Is this needed because get_ref_base_and_extent crashes if given SSA_NAME
> > > or something else, or is it just an optimization?
> > > Perhaps Richi will know if there is a better test for this.

Also the code preceding the above

  if (TREE_CODE (expr) == BIT_FIELD_REF
  || TREE_CODE (expr) == IMAGPART_EXPR
  || TREE_CODE (expr) == REALPART_EXPR)
expr = TREE_OPERAND (expr, 0); 

but get_ref_base_and_extent shouldn't crash on anything here.  The 
question is what you want 'expr' to be?  The comment of the function
says CTX specifies that, but doesn't constrain the CALL case (does
it have to be a memory argument)?

With allowing handled_component_p but above not handling
VIEW_CONVERT_EXPR you leave the possibility of VIEW_CONVERT_EXPR (d_1)
slipping through.  Since the non-memory cases will have at most
one wrapping handled_component get_ref_base_and_extent should be
reasonably cheap, so maybe just cut off SSA_NAME, ADDR_EXPR and
CONSTANT_CLASS_P at the start of the function?

> Looking at:
> 
> static inline bool
> gimple_assign_load_p (const gimple *gs)
> {
>   tree rhs;
>   if (!gimple_assign_single_p (gs))
> return false;
>   rhs = gimple_assign_rhs1 (gs);
>   if (TREE_CODE (rhs) == WITH_SIZE_EXPR)
> return true;
>   rhs = get_base_address (rhs);
>   return (DECL_P (rhs)
>   || TREE_CODE (rhs) == MEM_REF || TREE_CODE (rhs) == TARGET_MEM_REF);
> } 
> 
> I wonder if we don't want to avoid get_base_address (which is loopy) and
> use same check and move it into a new predicate that is more convenient
> to use?
> 
> Honza
> > 
> > Honza
> > > +
> > > >base = get_ref_base_and_extent (expr, &poffset, &psize, &pmax_size, &reverse);
> > >  
> > >if (TREE_CODE (base) == MEM_REF)
> > > -- 
> > > 2.38.1
> > > 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


[PATCH] build: doc: Obsolete Solaris 11.3 support

2022-12-13 Thread Rainer Orth
This patch implements the Solaris 11.[0-3] obsoletion just announced in

https://gcc.gnu.org/pipermail/gcc/2022-December/240322.html

Bootstrapped without regressions on Solaris 11.3 (i386-pc-solaris2.11,
sparc-sun-solaris2.11 without and with --enable-obsolete) and 11.4.

Ok for trunk?

While I've been extra careful with the config.gcc part to make it work
correctly in native and cross configurations, it would be good if some
build maintainer could check.

The trouble is that config.guess doesn't include the minor version in
the triple and even if that were to change now, it's guaranteed to break
lots of code that doesn't expect this, so I'm doing the determination
locally.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2022-12-09  Rainer Orth  

gcc:
* config.gcc: Determine Solaris minor version.
Obsolete *-*-solaris2.11.[0-3]*.
* doc/install.texi (Specific, *-*-solaris2*): Document it.

# HG changeset patch
# Parent  224d7e66257de134e767773473a133a1e4372118
build: doc: Obsolete Solaris 11.3 support

diff --git a/gcc/config.gcc b/gcc/config.gcc
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -246,14 +246,25 @@ target_type_format_char='@'
 xm_file=
 md_file=
 
+# Determine Solaris minor version
+case ${target}:`uname -v` in
+  # Only do this on Solaris.  Illumos uses illumos-* instead.
+  *-*-solaris2.11*:11.*)
+# Restrict to native configurations.
+if test x$host = x$target; then
+  uname_version="`uname -v`"
+  # Prepend dot as needed below.
+  target_min=".`expr "$uname_version" : '11\.\([0-9]*\)'`"
+fi
+;;
+esac
+
 # Obsolete configurations.
-case ${target} in
-  *)
-  ;;
-  obsoleted-target \
+case ${target}${target_min} in
+*-*-solaris2.11.[0-3]*		\
  )
 if test "x$enable_obsolete" != xyes; then
-  echo "*** Configuration ${target} is obsolete." >&2
+  echo "*** Configuration ${target}${target_min} is obsolete." >&2
   echo "*** Specify --enable-obsolete to build it anyway." >&2
   echo "*** Support will be REMOVED in the next major release of GCC," >&2
   echo "*** unless a maintainer comes forward." >&2
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -4825,6 +4825,8 @@ supported as cross-compilation target on
 @c alone is too unspecific and must be avoided.
 @anchor{x-x-solaris2}
 @heading *-*-solaris2*
+Support for Solaris 11.3 and earlier has been obsoleted in GCC 13, but
+can still be enabled by configuring with @option{--enable-obsolete}.
 Support for Solaris 10 has been removed in GCC 10.  Support for Solaris
 9 has been removed in GCC 5.  Support for Solaris 8 has been removed in
 GCC 4.8.  Support for Solaris 7 has been removed in GCC 4.6.
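
For readers less used to the shell idioms in the config.gcc hunk above, here is a rough C model of the same decision. It is only a sketch and not part of the patch: the function names solaris_minor and solaris_is_obsolete are made up for illustration, and the model looks at the `uname -v` string alone, ignoring the host/target and triple checks the real code performs. The digit extraction mirrors the `expr ... : '11\.\([0-9]*\)'` match, and the obsoletion test mirrors the `.11.[0-3]*` glob, which only inspects the first digit of the minor version.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the run of digits that follows a leading "11." into OUT.
   OUT is left empty when the string does not start with "11.".  */
static void
solaris_minor (const char *uname_v, char *out, size_t outsz)
{
  size_t n = 0;

  if (outsz == 0)
    return;
  out[0] = '\0';
  if (strncmp (uname_v, "11.", 3) != 0)
    return;
  for (const char *p = uname_v + 3;
       isdigit ((unsigned char) *p) && n + 1 < outsz; p++)
    out[n++] = *p;
  out[n] = '\0';
}

/* Nonzero when the minor version starts with a digit in 0..3,
   which is what the shell glob [0-3]* accepts.  */
static int
solaris_is_obsolete (const char *uname_v)
{
  char minor[16];

  solaris_minor (uname_v, minor, sizeof minor);
  return minor[0] >= '0' && minor[0] <= '3';
}
```

With this model a version string such as "11.3" is treated as obsolete while one such as "11.4.42.111.0" is not; both strings are illustrative inputs rather than captured output.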


Re: [PATCH] i386: Fix up *concat*_{5,6,7} patterns [PR108044]

2022-12-13 Thread Uros Bizjak via Gcc-patches
On Tue, Dec 13, 2022 at 10:20 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The following patch fixes 2 issues with the *concat3_5 and
> *concat3_{6,7} patterns.
> One is that if the destination is memory rather than register, then
> we can't use movabsq and so can't support all the possible immediates.
> I see 3 possibilities to fix that.  One would be to use
> x86_64_hilo_int_operand predicate instead of const_scalar_int_operand
> and thus not match it at all during combine in such cases, but that
> unnecessarily pessimizes also the case when it is loaded into register
> where we can use movabsq.
> Another one is what is implemented in the patch, use Wd constraint
> for the integer on 64-bit if destination is memory and X (didn't find
> more appropriate one which would accept any const_int/const_wide_int
> and the value checking is done in the conditions) otherwise.

Perhaps you should use "n" instead of "X".

> Yet another option would be to add match_scratch to the pattern and use
> it with =X constraints except for the =o case for 64-bit non-Wd where it
> would give a single DImode register (rather than 2).
>
> Another thing is that if one half of the constant is
> ix86_endbr_immediate_operand, then for -fcf-protection=branch we
> force those constants into memory and that might not work properly
> with -fpic.  So we should refuse to match with such constants.
> OT, seems for movabsq we don't check that and happily allow the endbr
> pattern in the immediate.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk,
> or do you prefer another way to do it (see above)?

> 2022-12-13  Jakub Jelinek  
>
> PR target/108044
> * config/i386/i386.md (*concat3_5, *concat3_6,
> *concat3_7): Split alternative with =ro output constraint
> into =r,o,o and use Wd input constraint for the last alternative which
> is enabled for TARGET_64BIT.  Reject ix86_endbr_immediate_operand
> in the input constant.
>
> * gcc.target/i386/pr108044-1.c: New test.
> * gcc.target/i386/pr108044-2.c: New test.
> * gcc.target/i386/pr108044-3.c: New test.
> * gcc.target/i386/pr108044-4.c: New test.

OK with or without the change to "n" constraint, although I would
prefer "n", since "X" can perhaps result in some yet unknown
surprises.

Thanks,
Uros.
> --- gcc/config/i386/i386.md.jj  2022-12-08 14:55:38.807303856 +0100
> +++ gcc/config/i386/i386.md 2022-12-12 10:37:09.332995296 +0100
> @@ -11470,11 +11470,11 @@ (define_insn_and_split "*concat  })
>
>  (define_insn_and_split "*concat3_5"
> -  [(set (match_operand:DWI 0 "nonimmediate_operand" "=ro")
> +  [(set (match_operand:DWI 0 "nonimmediate_operand" "=r,o,o")
> (any_or_plus:DWI
> - (ashift:DWI (match_operand:DWI 1 "register_operand" "r")
> + (ashift:DWI (match_operand:DWI 1 "register_operand" "r,r,r")
>   (match_operand:DWI 2 "const_int_operand"))
> - (match_operand:DWI 3 "const_scalar_int_operand")))]
> + (match_operand:DWI 3 "const_scalar_int_operand" "X,X,Wd")))]
>"INTVAL (operands[2]) ==  * BITS_PER_UNIT / 2
> && (mode == DImode
> ? CONST_INT_P (operands[3])
> @@ -11482,7 +11482,12 @@ (define_insn_and_split "*concat : CONST_INT_P (operands[3])
> ? INTVAL (operands[3]) >= 0
> : CONST_WIDE_INT_NUNITS (operands[3]) == 2
> -&& CONST_WIDE_INT_ELT (operands[3], 1) == 0)"
> +&& CONST_WIDE_INT_ELT (operands[3], 1) == 0)
> +   && !(CONST_INT_P (operands[3])
> +   ? ix86_endbr_immediate_operand (operands[3], VOIDmode)
> +   : ix86_endbr_immediate_operand (GEN_INT (CONST_WIDE_INT_ELT (operands[3],
> +0)),
> +   VOIDmode))"
>"#"
>"&& reload_completed"
>[(clobber (const_int 0))]
> @@ -11491,16 +11496,17 @@ (define_insn_and_split "*concatsplit_double_concat (mode, operands[0], op3,
>gen_lowpart (mode, operands[1]));
>DONE;
> -})
> +}
> +  [(set_attr "isa" "*,nox64,x64")])
>
>  (define_insn_and_split "*concat3_6"
> -  [(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
> +  [(set (match_operand: 0 "nonimmediate_operand" "=r,o,o,r")
> (any_or_plus:
>   (ashift:
> (zero_extend:
> - (match_operand:DWIH 1 "nonimmediate_operand" "r,m"))
> + (match_operand:DWIH 1 "nonimmediate_operand" "r,r,r,m"))
> (match_operand: 2 "const_int_operand"))
> - (match_operand: 3 "const_scalar_int_operand")))]
> + (match_operand: 3 "const_scalar_int_operand" "X,X,Wd,X")))]
>"INTVAL (operands[2]) ==  * BITS_PER_UNIT
> && (mode == DImode
> ? CONST_INT_P (operands[3])
> @@ -11508,7 +11514,12 @@ (define_insn_and_split "*concat : CONST_INT_P (operands[3])
> ? INTVAL (operands[3]) >= 0
> : CONST_WIDE_INT_NUNITS (operands[3]) == 2
> -&& CONST

[PATCH] tree-optimization/108076 - if-conversion and forced labels

2022-12-13 Thread Richard Biener via Gcc-patches
When doing if-conversion we simply throw away labels without checking
whether they are possibly targets of non-local gotos or have their
address taken.  The following rectifies this and refuses to if-convert
such loops.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/108076
* tree-if-conv.cc (if_convertible_loop_p_1): Reject blocks
with non-local or forced labels that we later remove
labels from.

* gcc.dg/torture/pr108076.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr108076.c | 17 +
 gcc/tree-if-conv.cc | 14 --
 2 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr108076.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr108076.c b/gcc/testsuite/gcc.dg/torture/pr108076.c
new file mode 100644
index 000..ebe2e51bee0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr108076.c
@@ -0,0 +1,17 @@
+/* { dg-do link } */
+
+static void *j;
+int v, g;
+__attribute__((__leaf__)) int atoi (const char *);
+
+int
+main ()
+{
+  j = &&lab1;
+  &&lab2;
+  atoi ("42");
+lab1:
+lab2:
+  if (v)
+goto *j;
+}
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index 64b20b4a9e1..0807201cefb 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -1433,10 +1433,20 @@ if_convertible_loop_p_1 (class loop *loop, vec *refs)
   basic_block bb = ifc_bbs[i];
   gimple_stmt_iterator gsi;
 
+  bool may_have_nonlocal_labels
+   = bb_with_exit_edge_p (loop, bb) || bb == loop->latch;
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
switch (gimple_code (gsi_stmt (gsi)))
  {
  case GIMPLE_LABEL:
+   if (!may_have_nonlocal_labels)
+ {
+   tree label
+ = gimple_label_label (as_a <glabel *> (gsi_stmt (gsi)));
+   if (DECL_NONLOCAL (label) || FORCED_LABEL (label))
+ return false;
+ }
+   /* Fallthru.  */
  case GIMPLE_ASSIGN:
  case GIMPLE_CALL:
  case GIMPLE_DEBUG:
@@ -2627,8 +2637,8 @@ remove_conditions_and_labels (loop_p loop)
   basic_block bb = ifc_bbs[i];
 
   if (bb_with_exit_edge_p (loop, bb)
-|| bb == loop->latch)
-  continue;
+ || bb == loop->latch)
+   continue;
 
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
switch (gimple_code (gsi_stmt (gsi)))
-- 
2.35.3
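
As a small illustration of the kind of label the patch above is concerned with, the following sketch uses the GNU C "labels as values" extension to take the address of a label inside a loop body and jump to it with a computed goto. The function name sum_nonnegative is invented for this example, and nothing here claims that GCC would attempt if-conversion on this particular loop; it only shows what an address-taken label looks like in source form.

```c
#include <assert.h>

/* Sum the non-negative elements of A, skipping negative ones through
   a computed goto.  Requires the GNU C labels-as-values extension.  */
static int
sum_nonnegative (const int *a, int n)
{
  int sum = 0;

  for (int i = 0; i < n; i++)
    {
      void *skip = &&next;	/* Address of a label inside the loop body.  */

      if (a[i] < 0)
	goto *skip;		/* Computed goto to that label.  */
      sum += a[i];
    next:;
    }
  return sum;
}
```

Because the address of `next` escapes into a variable, the label cannot simply be discarded without first checking who may still jump to it, which is the situation the commit message describes.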


Re: [PATCH] vect-patterns: Fix up vect_recog_rotate_pattern [PR108064]

2022-12-13 Thread Richard Biener via Gcc-patches



> On 13.12.2022 at 10:28, Jakub Jelinek via Gcc-patches wrote:
> 
> Hi!
> 
> Since vect_recog_rotate_pattern has been extended to work also
> on signed types in r13-1100 we miscompile the testcase below.
> vect_recog_rotate_pattern actually emits correct scalar code into
> the pattern def sequence (in particular cast to utype, doing the
> 2 shifts in utype so that the right shift is logical and not arithmetic,
> or and then cast back to the signed type), but it didn't supply vectype
> for most of those pattern statements, which means that the generic handling
> fills it up later with the vectype provided by vect_recog_rotate_pattern.
> The problem is that it is vectype of the result of the whole pattern,
> i.e. vector of signed values in this case, while the conversion to utype,
> 2 shifts and or (everything with utype lhs in scalar code) should have
> uvectype as STMT_VINFO_VECTYPE.

What an interesting trap…

> Fixed with following patch, bootstrapped/regtested on x86_64-linux and
> i686-linux, ok for trunk?

Ok.

Thanks,
Richard 


> 2022-12-13  Jakub Jelinek  
> 
>PR tree-optimization/108064
>* tree-vect-patterns.cc (vect_recog_rotate_pattern): Pass uvectype
>as 4th argument to append_pattern_def_seq for statements with lhs
>with utype type.
> 
>* gcc.c-torture/execute/pr108064.c: New test.
> 
> --- gcc/tree-vect-patterns.cc.jj 2022-12-05 11:10:37.0 +0100
> +++ gcc/tree-vect-patterns.cc 2022-12-12 13:14:23.356628767 +0100
> @@ -3113,7 +3113,7 @@ vect_recog_rotate_pattern (vec_info *vin
> {
>   def = vect_recog_temp_ssa_var (utype, NULL);
>   def_stmt = gimple_build_assign (def, NOP_EXPR, oprnd0);
> -  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
> +  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt, uvectype);
>   oprnd0 = def;
> }
> 
> @@ -3137,7 +3137,7 @@ vect_recog_rotate_pattern (vec_info *vin
> {
>   def = vect_recog_temp_ssa_var (utype, NULL);
>   def_stmt = gimple_build_assign (def, NOP_EXPR, oprnd1);
> -  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
> +  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt, uvectype);
> }
>   stype = TREE_TYPE (def);
> 
> @@ -3185,13 +3185,13 @@ vect_recog_rotate_pattern (vec_info *vin
>   def_stmt = gimple_build_assign (var1, rhs_code == LROTATE_EXPR
>? LSHIFT_EXPR : RSHIFT_EXPR,
>  oprnd0, def);
> -  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
> +  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt, uvectype);
> 
>   var2 = vect_recog_temp_ssa_var (utype, NULL);
>   def_stmt = gimple_build_assign (var2, rhs_code == LROTATE_EXPR
>? RSHIFT_EXPR : LSHIFT_EXPR,
>  oprnd0, def2);
> -  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
> +  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt, uvectype);
> 
>   /* Pattern detected.  */
>   vect_pattern_detected ("vect_recog_rotate_pattern", last_stmt);
> @@ -3202,7 +3202,7 @@ vect_recog_rotate_pattern (vec_info *vin
> 
>   if (!useless_type_conversion_p (type, utype))
> {
> -  append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt);
> +  append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt, uvectype);
>   tree result = vect_recog_temp_ssa_var (type, NULL);
>   pattern_stmt = gimple_build_assign (result, NOP_EXPR, var);
> }
> --- gcc/testsuite/gcc.c-torture/execute/pr108064.c.jj 2022-12-12 13:22:29.875542508 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr108064.c 2022-12-12 13:21:32.516377957 +0100
> @@ -0,0 +1,28 @@
> +/* PR tree-optimization/108064 */
> +
> +static inline short
> +foo (short value)
> +{
> +  return ((value >> 8) & 0xff) | ((value & 0xff) << 8);
> +}
> +
> +__attribute__((noipa))
> +void
> +bar (short *d, const short *s)
> +{
> +  for (unsigned long i = 0; i < 4; i++)
> +d[i] = foo (s[i]);
> +}
> +
> +int
> +main ()
> +{
> +  short a[4] __attribute__((aligned (16))) = { 0xff, 0, 0, 0 };
> +  short b[4] __attribute__((aligned (16)));
> +  short c[4] __attribute__((aligned (16)));
> +
> +  bar (b, a);
> +  bar (c, b);
> +  if (a[0] != c[0])
> +__builtin_abort ();
> +}
> 
>Jakub
> 
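
The scalar sequence described in the quoted explanation (cast to the unsigned type, two shifts, or, cast back) can be sketched in plain C as follows. This is only an illustration of why the shifts belong in the unsigned type; the function name rotate16_by_8 is invented here and the code is not taken from the vectorizer.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a signed 16-bit value by 8 bits, doing the shifts in the
   corresponding unsigned type.  */
static int16_t
rotate16_by_8 (int16_t value)
{
  uint16_t u = (uint16_t) value;	/* Cast to the unsigned type.  */
  uint16_t hi = (uint16_t) (u << 8);	/* Left shift, truncated to 16 bits.  */
  uint16_t lo = (uint16_t) (u >> 8);	/* Logical right shift: no sign bits.  */
  uint16_t r = (uint16_t) (hi | lo);	/* Or the two halves together.  */

  /* Converting a value above INT16_MAX back to int16_t is
     implementation-defined; GCC wraps modulo 2^16.  */
  return (int16_t) r;
}
```

Had the right shift been done on the signed value instead, an input such as -256 (bit pattern 0xff00) would typically shift in copies of the sign bit, so the low byte of the result would be polluted. That is the scalar analogue of vector statements being given the signed vectype rather than uvectype.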


Re: [PATCH] libstdc++: Update backtrace-rename.h

2022-12-13 Thread Jakub Jelinek via Gcc-patches
On Tue, Dec 13, 2022 at 09:50:23AM +, Jonathan Wakely via Gcc-patches wrote:
> On Tue, 13 Dec 2022 at 09:44, Jakub Jelinek  wrote:
> >
> > Hi!
> >
> > When writing the r13-4629 commit log I've realized that libsanitizer
> > isn't the only place which nowadays renames libbacktrace symbols,
> > libstdc++ does that too.
> >
> > Ok for trunk if this passes bootstrap/regtest?
> 
> OK, thanks.
> 
> When we move the backtrace symbols from libstdc++_libbacktrace.a into
> libstdc++.so we probably want to look into removing the symbols we
> don't actually use. Renaming them to our private namespace is good,
> but not including them in the library at all would be better.

Most of them I assume are actually used, the reason they aren't static
is that libbacktrace contains multiple TUs and some APIs are used to
interface between the TUs.  __attribute__((visibility ("hidden"))) would
work for shared libraries and targets where it actually works, but
we still have the *.a libraries where such symbols are visible, so I think
some renaming is needed.

Though, my understanding of backtrace_uncompress_{lzma,zdebug,zstd} is that
those are there just as small wrappers for make check purposes only,
so I bet those symbols could be easily removed (say by not defining them
at all if their names are macros, then we could keep them in
backtrace-rename.h as is).

> > 2022-12-13  Jakub Jelinek  
> >
> > * src/libbacktrace/backtrace-rename.h (backtrace_uncompress_zstd):
> > Define.
> >
> > --- libstdc++-v3/src/libbacktrace/backtrace-rename.h.jj 2022-09-01 09:37:58.452624676 +0200
> > +++ libstdc++-v3/src/libbacktrace/backtrace-rename.h 2022-12-13 10:41:14.551699599 +0100
> > @@ -16,6 +16,7 @@
> >  #define backtrace_syminfo __glibcxx_backtrace_syminfo
> >  #define backtrace_uncompress_lzma __glibcxx_backtrace_uncompress_lzma
> >  #define backtrace_uncompress_zdebug __glibcxx_backtrace_uncompress_zdebug
> > +#define backtrace_uncompress_zstd __glibcxx_backtrace_uncompress_zstd
> >  #define backtrace_vector_finish __glibcxx_backtrace_vector_finish
> >  #define backtrace_vector_grow __glibcxx_backtrace_vector_grow
> >  #define backtrace_vector_release __glibcxx_backtrace_vector_release

Jakub



Re: [PATCH] c++, libstdc++: Add typeinfo for _Float{16, 32, 64, 128, 32x, 64x} and __bf16 types [PR108075]

2022-12-13 Thread Jonathan Wakely via Gcc-patches
On Tue, 13 Dec 2022 at 09:40, Jakub Jelinek wrote:
>
> Hi!
>
> The following patch adds typeinfos for the extended floating point
> types and _Float{32,64}x.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

The libstdc++ parts look good, thanks.


> 2022-12-13  Jakub Jelinek  
>
> PR libstdc++/108075
> gcc/cp/
> * rtti.cc (emit_support_tinfos): Add pointers to
> {bfloat16,float{16,32,64,128,32x,64x,128x}}_type_node to fundamentals
> array.
> gcc/testsuite/
> * g++.dg/cpp23/ext-floating13.C: New test.
> libstdc++-v3/
> * config/abi/pre/gnu.ver (CXXABI_1.3.14): Export
> _ZTIDF[0-9]*[_bx], _ZTIPDF[0-9]*[_bx] and _ZTIPKDF[0-9]*[_bx].
> * testsuite/util/testsuite_abi.cc (check_version): Handle
> CXXABI_1.3.14.
>
> --- gcc/cp/rtti.cc.jj   2022-10-17 12:29:33.519016406 +0200
> +++ gcc/cp/rtti.cc  2022-12-12 15:23:48.244190755 +0100
> @@ -1603,7 +1603,9 @@ emit_support_tinfos (void)
>  &long_long_integer_type_node, &long_long_unsigned_type_node,
>  &float_type_node, &double_type_node, &long_double_type_node,
>  &dfloat32_type_node, &dfloat64_type_node, &dfloat128_type_node,
> -&nullptr_type_node,
> +&bfloat16_type_node, &float16_type_node, &float32_type_node,
> +&float64_type_node, &float128_type_node, &float32x_type_node,
> +&float64x_type_node, &float128x_type_node, &nullptr_type_node,
>  0
>};
>int ix;
> --- gcc/testsuite/g++.dg/cpp23/ext-floating13.C.jj 2022-12-12 15:38:51.357009408 +0100
> +++ gcc/testsuite/g++.dg/cpp23/ext-floating13.C 2022-12-12 15:39:04.568816597 +0100
> @@ -0,0 +1,35 @@
> +// P1467R9 - Extended floating-point types and standard names.
> +// { dg-do link { target c++23 } }
> +// { dg-options "" }
> +
> +#include <typeinfo>
> +
> +#ifdef __STDCPP_FLOAT16_T__
> +const std::type_info &a = typeid(decltype(0.0f16));
> +#endif
> +#ifdef __STDCPP_BFLOAT16_T__
> +const std::type_info &b = typeid(decltype(0.0bf16));
> +#endif
> +#ifdef __STDCPP_FLOAT32_T__
> +const std::type_info &c = typeid(decltype(0.0f32));
> +#endif
> +#ifdef __STDCPP_FLOAT64_T__
> +const std::type_info &d = typeid(decltype(0.0f64));
> +#endif
> +#ifdef __STDCPP_FLOAT128_T__
> +const std::type_info &e = typeid(decltype(0.0f128));
> +#endif
> +#ifdef __FLT32X_MAX__
> +const std::type_info &f = typeid(decltype(0.0f32x));
> +#endif
> +#ifdef __FLT64X_MAX__
> +const std::type_info &g = typeid(decltype(0.0f64x));
> +#endif
> +#ifdef __FLT128X_MAX__
> +const std::type_info &h = typeid(decltype(0.0f128x));
> +#endif
> +
> +int
> +main ()
> +{
> +}
> --- libstdc++-v3/config/abi/pre/gnu.ver.jj 2022-11-11 08:15:45.646183974 +0100
> +++ libstdc++-v3/config/abi/pre/gnu.ver 2022-12-12 15:34:08.178142084 +0100
> @@ -2794,6 +2794,16 @@ CXXABI_1.3.13 {
>
>  } CXXABI_1.3.12;
>
> +CXXABI_1.3.14 {
> +
> +# typeinfo for _Float{16,32,64,128,32x,64x,128x} and
> +# __bf16
> +_ZTIDF[0-9]*[_bx];
> +_ZTIPDF[0-9]*[_bx];
> +_ZTIPKDF[0-9]*[_bx];
> +
> +} CXXABI_1.3.13;
> +
>  # Symbols in the support library (libsupc++) supporting transactional memory.
>  CXXABI_TM_1 {
>
> --- libstdc++-v3/testsuite/util/testsuite_abi.cc.jj 2022-09-12 11:30:14.224870022 +0200
> +++ libstdc++-v3/testsuite/util/testsuite_abi.cc 2022-12-12 15:46:41.036156477 +0100
> @@ -230,6 +230,7 @@ check_version(symbol& test, bool added)
>known_versions.push_back("CXXABI_1.3.11");
>known_versions.push_back("CXXABI_1.3.12");
>known_versions.push_back("CXXABI_1.3.13");
> +  known_versions.push_back("CXXABI_1.3.14");
>known_versions.push_back("CXXABI_IEEE128_1.3.13");
>known_versions.push_back("CXXABI_TM_1");
>known_versions.push_back("CXXABI_FLOAT128");
> @@ -251,7 +252,7 @@ check_version(symbol& test, bool added)
>bool latestp = (test.version_name == "GLIBCXX_3.4.31"
>   // XXX remove next line when baselines have been regenerated.
>  || test.version_name == "GLIBCXX_IEEE128_3.4.30"
> -|| test.version_name == "CXXABI_1.3.13"
> +|| test.version_name == "CXXABI_1.3.14"
>  || test.version_name == "CXXABI_FLOAT128"
>  || test.version_name == "CXXABI_TM_1");
>if (added && !latestp)
>
> Jakub
>



Re: [PATCH] libstdc++: Update backtrace-rename.h

2022-12-13 Thread Jonathan Wakely via Gcc-patches
On Tue, 13 Dec 2022 at 09:44, Jakub Jelinek  wrote:
>
> Hi!
>
> When writing the r13-4629 commit log I've realized that libsanitizer
> isn't the only place which nowadays renames libbacktrace symbols,
> libstdc++ does that too.
>
> Ok for trunk if this passes bootstrap/regtest?

OK, thanks.

When we move the backtrace symbols from libstdc++_libbacktrace.a into
libstdc++.so we probably want to look into removing the symbols we
don't actually use. Renaming them to our private namespace is good,
but not including them in the library at all would be better.


> 2022-12-13  Jakub Jelinek  
>
> * src/libbacktrace/backtrace-rename.h (backtrace_uncompress_zstd):
> Define.
>
> --- libstdc++-v3/src/libbacktrace/backtrace-rename.h.jj 2022-09-01 09:37:58.452624676 +0200
> +++ libstdc++-v3/src/libbacktrace/backtrace-rename.h 2022-12-13 10:41:14.551699599 +0100
> @@ -16,6 +16,7 @@
>  #define backtrace_syminfo __glibcxx_backtrace_syminfo
>  #define backtrace_uncompress_lzma __glibcxx_backtrace_uncompress_lzma
>  #define backtrace_uncompress_zdebug __glibcxx_backtrace_uncompress_zdebug
> +#define backtrace_uncompress_zstd __glibcxx_backtrace_uncompress_zstd
>  #define backtrace_vector_finish __glibcxx_backtrace_vector_finish
>  #define backtrace_vector_grow __glibcxx_backtrace_vector_grow
>  #define backtrace_vector_release __glibcxx_backtrace_vector_release
>
> Jakub
>



[PATCH] libstdc++: Update backtrace-rename.h

2022-12-13 Thread Jakub Jelinek via Gcc-patches
Hi!

When writing the r13-4629 commit log I've realized that libsanitizer
isn't the only place which nowadays renames libbacktrace symbols,
libstdc++ does that too.

Ok for trunk if this passes bootstrap/regtest?

2022-12-13  Jakub Jelinek  

* src/libbacktrace/backtrace-rename.h (backtrace_uncompress_zstd):
Define.

--- libstdc++-v3/src/libbacktrace/backtrace-rename.h.jj 2022-09-01 09:37:58.452624676 +0200
+++ libstdc++-v3/src/libbacktrace/backtrace-rename.h 2022-12-13 10:41:14.551699599 +0100
@@ -16,6 +16,7 @@
 #define backtrace_syminfo __glibcxx_backtrace_syminfo
 #define backtrace_uncompress_lzma __glibcxx_backtrace_uncompress_lzma
 #define backtrace_uncompress_zdebug __glibcxx_backtrace_uncompress_zdebug
+#define backtrace_uncompress_zstd __glibcxx_backtrace_uncompress_zstd
 #define backtrace_vector_finish __glibcxx_backtrace_vector_finish
 #define backtrace_vector_grow __glibcxx_backtrace_vector_grow
 #define backtrace_vector_release __glibcxx_backtrace_vector_release

Jakub



[PATCH] c++, libstdc++: Add typeinfo for _Float{16,32,64,128,32x,64x} and __bf16 types [PR108075]

2022-12-13 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch adds typeinfos for the extended floating point
types and _Float{32,64}x.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-12-13  Jakub Jelinek  

PR libstdc++/108075
gcc/cp/
* rtti.cc (emit_support_tinfos): Add pointers to
{bfloat16,float{16,32,64,128,32x,64x,128x}}_type_node to fundamentals
array.
gcc/testsuite/
* g++.dg/cpp23/ext-floating13.C: New test.
libstdc++-v3/
* config/abi/pre/gnu.ver (CXXABI_1.3.14): Export
_ZTIDF[0-9]*[_bx], _ZTIPDF[0-9]*[_bx] and _ZTIPKDF[0-9]*[_bx].
* testsuite/util/testsuite_abi.cc (check_version): Handle
CXXABI_1.3.14.

--- gcc/cp/rtti.cc.jj   2022-10-17 12:29:33.519016406 +0200
+++ gcc/cp/rtti.cc  2022-12-12 15:23:48.244190755 +0100
@@ -1603,7 +1603,9 @@ emit_support_tinfos (void)
 &long_long_integer_type_node, &long_long_unsigned_type_node,
 &float_type_node, &double_type_node, &long_double_type_node,
 &dfloat32_type_node, &dfloat64_type_node, &dfloat128_type_node,
-&nullptr_type_node,
+&bfloat16_type_node, &float16_type_node, &float32_type_node,
+&float64_type_node, &float128_type_node, &float32x_type_node,
+&float64x_type_node, &float128x_type_node, &nullptr_type_node,
 0
   };
   int ix;
--- gcc/testsuite/g++.dg/cpp23/ext-floating13.C.jj 2022-12-12 15:38:51.357009408 +0100
+++ gcc/testsuite/g++.dg/cpp23/ext-floating13.C 2022-12-12 15:39:04.568816597 +0100
@@ -0,0 +1,35 @@
+// P1467R9 - Extended floating-point types and standard names.
+// { dg-do link { target c++23 } }
+// { dg-options "" }
+
+#include <typeinfo>
+
+#ifdef __STDCPP_FLOAT16_T__
+const std::type_info &a = typeid(decltype(0.0f16));
+#endif
+#ifdef __STDCPP_BFLOAT16_T__
+const std::type_info &b = typeid(decltype(0.0bf16));
+#endif
+#ifdef __STDCPP_FLOAT32_T__
+const std::type_info &c = typeid(decltype(0.0f32));
+#endif
+#ifdef __STDCPP_FLOAT64_T__
+const std::type_info &d = typeid(decltype(0.0f64));
+#endif
+#ifdef __STDCPP_FLOAT128_T__
+const std::type_info &e = typeid(decltype(0.0f128));
+#endif
+#ifdef __FLT32X_MAX__
+const std::type_info &f = typeid(decltype(0.0f32x));
+#endif
+#ifdef __FLT64X_MAX__
+const std::type_info &g = typeid(decltype(0.0f64x));
+#endif
+#ifdef __FLT128X_MAX__
+const std::type_info &h = typeid(decltype(0.0f128x));
+#endif
+
+int
+main ()
+{
+}
--- libstdc++-v3/config/abi/pre/gnu.ver.jj 2022-11-11 08:15:45.646183974 +0100
+++ libstdc++-v3/config/abi/pre/gnu.ver 2022-12-12 15:34:08.178142084 +0100
@@ -2794,6 +2794,16 @@ CXXABI_1.3.13 {
 
 } CXXABI_1.3.12;
 
+CXXABI_1.3.14 {
+
+# typeinfo for _Float{16,32,64,128,32x,64x,128x} and
+# __bf16
+_ZTIDF[0-9]*[_bx];
+_ZTIPDF[0-9]*[_bx];
+_ZTIPKDF[0-9]*[_bx];
+
+} CXXABI_1.3.13;
+
 # Symbols in the support library (libsupc++) supporting transactional memory.
 CXXABI_TM_1 {
 
--- libstdc++-v3/testsuite/util/testsuite_abi.cc.jj 2022-09-12 11:30:14.224870022 +0200
+++ libstdc++-v3/testsuite/util/testsuite_abi.cc 2022-12-12 15:46:41.036156477 +0100
@@ -230,6 +230,7 @@ check_version(symbol& test, bool added)
   known_versions.push_back("CXXABI_1.3.11");
   known_versions.push_back("CXXABI_1.3.12");
   known_versions.push_back("CXXABI_1.3.13");
+  known_versions.push_back("CXXABI_1.3.14");
   known_versions.push_back("CXXABI_IEEE128_1.3.13");
   known_versions.push_back("CXXABI_TM_1");
   known_versions.push_back("CXXABI_FLOAT128");
@@ -251,7 +252,7 @@ check_version(symbol& test, bool added)
   bool latestp = (test.version_name == "GLIBCXX_3.4.31"
  // XXX remove next line when baselines have been regenerated.
 || test.version_name == "GLIBCXX_IEEE128_3.4.30"
-|| test.version_name == "CXXABI_1.3.13"
+|| test.version_name == "CXXABI_1.3.14"
 || test.version_name == "CXXABI_FLOAT128"
 || test.version_name == "CXXABI_TM_1");
   if (added && !latestp)

Jakub



[committed] libsanitizer: Fix up libbacktrace build after r13-4547 [PR108072]

2022-12-13 Thread Jakub Jelinek via Gcc-patches
Hi!

The r13-4547 commit added a new non-static function to libbacktrace,
backtrace_uncompress_zstd, but for the libsanitizer use we need to
rename it so that it is in the __asan_* namespace and doesn't clash
with other copies of libbacktrace.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed
to trunk as obvious.

2022-12-13  Jakub Jelinek  

PR sanitizer/108072
* libbacktrace/backtrace-rename.h (backtrace_uncompress_zstd): Define.

--- libsanitizer/libbacktrace/backtrace-rename.h.jj 2020-09-22 09:49:22.532137619 +0200
+++ libsanitizer/libbacktrace/backtrace-rename.h    2022-12-12 14:09:01.681819952 +0100
@@ -13,6 +13,7 @@
 #define backtrace_syminfo __asan_backtrace_syminfo
 #define backtrace_uncompress_lzma __asan_backtrace_uncompress_lzma
 #define backtrace_uncompress_zdebug __asan_backtrace_uncompress_zdebug
+#define backtrace_uncompress_zstd __asan_backtrace_uncompress_zstd
 #define backtrace_vector_finish __asan_backtrace_vector_finish
 #define backtrace_vector_grow __asan_backtrace_vector_grow
 #define backtrace_vector_release __asan_backtrace_vector_release

Jakub



[PATCH] vect-patterns: Fix up vect_recog_rotate_pattern [PR108064]

2022-12-13 Thread Jakub Jelinek via Gcc-patches
Hi!

Since vect_recog_rotate_pattern was extended in r13-1100 to also work
on signed types, we miscompile the testcase below.
vect_recog_rotate_pattern actually emits correct scalar code into
the pattern def sequence (in particular a cast to utype, the 2 shifts
done in utype so that the right shift is logical and not arithmetic,
the or, and then a cast back to the signed type), but it didn't supply a
vectype for most of those pattern statements, which means that the generic
handling fills it in later with the vectype provided by
vect_recog_rotate_pattern.
The problem is that this is the vectype of the result of the whole pattern,
i.e. a vector of signed values in this case, while the conversion to utype,
the 2 shifts and the or (everything with a utype lhs in the scalar code)
should have uvectype as STMT_VINFO_VECTYPE.
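
To see why the type of the intermediate statements matters, here is a
scalar sketch of the sequence described above (cast to the unsigned
type, two shifts, or, cast back).  It is not the vectorizer's code and
the function names are made up for illustration.  It assumes GCC's
documented behaviour for two implementation-defined cases: right shift
of a negative value is arithmetic, and conversion of an out-of-range
value to a signed type wraps modulo 2^N.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift done in the unsigned type: logical, zeros shifted in.  */
static uint16_t
lshr8_in_utype (int16_t v)
{
  return (uint16_t) ((uint16_t) v >> 8);
}

/* Right shift done in the signed type: arithmetic with GCC, so the
   sign bit is replicated into the upper bits.  */
static int16_t
ashr8_in_signed_type (int16_t v)
{
  return (int16_t) (v >> 8);
}

/* Rotate left by n bits with all intermediate values unsigned.  */
static int16_t
rotl16 (int16_t v, unsigned n)
{
  uint16_t u = (uint16_t) v;           /* cast to utype */
  n %= 16;
  if (n == 0)
    return v;
  uint16_t hi = (uint16_t) (u << n);   /* shift 1, in utype */
  uint16_t lo = (uint16_t) (u >> (16 - n)); /* shift 2, logical */
  return (int16_t) (uint16_t) (hi | lo);    /* or, then cast back */
}
```

For the 16-bit value 0xff00 (-256 as a short) the logical shift yields
0x00ff while the arithmetic one yields 0xffff, which is the kind of
difference that appears when the shifts are carried out on a vector of
signed elements instead of unsigned ones.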

Fixed with the following patch, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2022-12-13  Jakub Jelinek  

PR tree-optimization/108064
* tree-vect-patterns.cc (vect_recog_rotate_pattern): Pass uvectype
as 4th argument to append_pattern_def_seq for statements with lhs
with utype type.

* gcc.c-torture/execute/pr108064.c: New test.

--- gcc/tree-vect-patterns.cc.jj        2022-12-05 11:10:37.0 +0100
+++ gcc/tree-vect-patterns.cc   2022-12-12 13:14:23.356628767 +0100
@@ -3113,7 +3113,7 @@ vect_recog_rotate_pattern (vec_info *vin
 {
   def = vect_recog_temp_ssa_var (utype, NULL);
   def_stmt = gimple_build_assign (def, NOP_EXPR, oprnd0);
-  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
+  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt, uvectype);
   oprnd0 = def;
 }
 
@@ -3137,7 +3137,7 @@ vect_recog_rotate_pattern (vec_info *vin
 {
   def = vect_recog_temp_ssa_var (utype, NULL);
   def_stmt = gimple_build_assign (def, NOP_EXPR, oprnd1);
-  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
+  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt, uvectype);
 }
   stype = TREE_TYPE (def);
 
@@ -3185,13 +3185,13 @@ vect_recog_rotate_pattern (vec_info *vin
   def_stmt = gimple_build_assign (var1, rhs_code == LROTATE_EXPR
? LSHIFT_EXPR : RSHIFT_EXPR,
  oprnd0, def);
-  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
+  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt, uvectype);
 
   var2 = vect_recog_temp_ssa_var (utype, NULL);
   def_stmt = gimple_build_assign (var2, rhs_code == LROTATE_EXPR
? RSHIFT_EXPR : LSHIFT_EXPR,
  oprnd0, def2);
-  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
+  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt, uvectype);
 
   /* Pattern detected.  */
   vect_pattern_detected ("vect_recog_rotate_pattern", last_stmt);
@@ -3202,7 +3202,7 @@ vect_recog_rotate_pattern (vec_info *vin
 
   if (!useless_type_conversion_p (type, utype))
 {
-  append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt);
+  append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt, uvectype);
   tree result = vect_recog_temp_ssa_var (type, NULL);
   pattern_stmt = gimple_build_assign (result, NOP_EXPR, var);
 }
--- gcc/testsuite/gcc.c-torture/execute/pr108064.c.jj   2022-12-12 13:22:29.875542508 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr108064.c  2022-12-12 13:21:32.516377957 +0100
@@ -0,0 +1,28 @@
+/* PR tree-optimization/108064 */
+
+static inline short
+foo (short value)
+{
+  return ((value >> 8) & 0xff) | ((value & 0xff) << 8);
+}
+
+__attribute__((noipa))
+void
+bar (short *d, const short *s)
+{
+  for (unsigned long i = 0; i < 4; i++)
+d[i] = foo (s[i]);
+}
+
+int
+main ()
+{
+  short a[4] __attribute__((aligned (16))) = { 0xff, 0, 0, 0 };
+  short b[4] __attribute__((aligned (16)));
+  short c[4] __attribute__((aligned (16)));
+
+  bar (b, a);
+  bar (c, b);
+  if (a[0] != c[0])
+__builtin_abort ();
+}

Jakub



[PATCH] i386: Fix up *concat*_{5,6,7} patterns [PR108044]

2022-12-13 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch fixes 2 issues with the *concat3_5 and
*concat3_{6,7} patterns.
One is that if the destination is memory rather than a register, then
we can't use movabsq and so can't support all the possible immediates.
I see 3 possibilities to fix that.  One would be to use the
x86_64_hilo_int_operand predicate instead of const_scalar_int_operand
and thus not match it at all during combine in such cases, but that
also unnecessarily pessimizes the case when it is loaded into a register,
where we can use movabsq.
Another one is what is implemented in the patch: use the Wd constraint
for the integer on 64-bit if the destination is memory, and X otherwise
(I didn't find a more appropriate one which would accept any
const_int/const_wide_int, and the value checking is done in the
conditions).
Yet another option would be to add match_scratch to the pattern and use
it with =X constraints except for the =o case for 64-bit non-Wd, where it
would give a single DImode register (rather than 2).
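
As background for the memory-destination restriction: on x86-64 a store
of an immediate to a 64-bit memory location only encodes a 32-bit
immediate that is sign-extended, whereas movabsq into a register accepts
a full 64-bit immediate.  The sketch below models that check for the two
64-bit halves of a double-word constant.  It is an illustration of the
restriction, not GCC's predicate or constraint code, and the function
names are hypothetical.  The unsigned-to-signed conversion relies on
GCC's modulo-2^N behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if V can be encoded as a sign-extended 32-bit immediate.  */
static bool
fits_simm32 (int64_t v)
{
  return v >= INT32_MIN && v <= INT32_MAX;
}

/* True if both 64-bit halves of a 128-bit constant can each be stored
   to memory with a sign-extended 32-bit immediate.  */
static bool
both_halves_fit_simm32 (uint64_t lo, uint64_t hi)
{
  return fits_simm32 ((int64_t) lo) && fits_simm32 ((int64_t) hi);
}
```

A low half of all ones is fine (it is -1 after sign extension), while a
value such as 0x123456789 needs a register and movabsq.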

Another thing is that if one half of the constant is an
ix86_endbr_immediate_operand, then for -fcf-protection=branch we
force those constants into memory and that might not work properly
with -fpic.  So we should refuse to match with such constants.
Off-topic: it seems that for movabsq we don't check that and happily
allow the endbr pattern in the immediate.
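
For readers unfamiliar with the concern: endbr64 is encoded as the bytes
f3 0f 1e fa, so an immediate that contains those four bytes in sequence
places a valid landing pad in the instruction stream.  The following is
a rough sketch of such a scan over a 64-bit immediate.  It is not the
implementation of ix86_endbr_immediate_operand, and the function name is
made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the little-endian byte sequence of IMM contains the
   endbr64 opcode bytes f3 0f 1e fa at any byte offset.  */
static bool
contains_endbr64_bytes (uint64_t imm)
{
  /* The four opcode bytes read as a little-endian 32-bit value.  */
  const uint32_t endbr64 = 0xfa1e0ff3u;

  for (int shift = 0; shift <= 32; shift += 8)
    if ((uint32_t) (imm >> shift) == endbr64)
      return true;
  return false;
}
```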

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk,
or do you prefer another way to do it (see above)?

2022-12-13  Jakub Jelinek  

PR target/108044
* config/i386/i386.md (*concat3_5, *concat3_6,
*concat3_7): Split alternative with =ro output constraint
into =r,o,o and use Wd input constraint for the last alternative which
is enabled for TARGET_64BIT.  Reject ix86_endbr_immediate_operand
in the input constant.

* gcc.target/i386/pr108044-1.c: New test.
* gcc.target/i386/pr108044-2.c: New test.
* gcc.target/i386/pr108044-3.c: New test.
* gcc.target/i386/pr108044-4.c: New test.

--- gcc/config/i386/i386.md.jj  2022-12-08 14:55:38.807303856 +0100
+++ gcc/config/i386/i386.md 2022-12-12 10:37:09.332995296 +0100
@@ -11470,11 +11470,11 @@ (define_insn_and_split "*concat3_5"
-  [(set (match_operand:DWI 0 "nonimmediate_operand" "=ro")
+  [(set (match_operand:DWI 0 "nonimmediate_operand" "=r,o,o")
(any_or_plus:DWI
- (ashift:DWI (match_operand:DWI 1 "register_operand" "r")
+ (ashift:DWI (match_operand:DWI 1 "register_operand" "r,r,r")
  (match_operand:DWI 2 "const_int_operand"))
- (match_operand:DWI 3 "const_scalar_int_operand")))]
+ (match_operand:DWI 3 "const_scalar_int_operand" "X,X,Wd")))]
   "INTVAL (operands[2]) ==  * BITS_PER_UNIT / 2
&& (mode == DImode
? CONST_INT_P (operands[3])
@@ -11482,7 +11482,12 @@ (define_insn_and_split "*concat= 0
: CONST_WIDE_INT_NUNITS (operands[3]) == 2
-&& CONST_WIDE_INT_ELT (operands[3], 1) == 0)"
+&& CONST_WIDE_INT_ELT (operands[3], 1) == 0)
+   && !(CONST_INT_P (operands[3])
+   ? ix86_endbr_immediate_operand (operands[3], VOIDmode)
+   : ix86_endbr_immediate_operand (GEN_INT (CONST_WIDE_INT_ELT (operands[3],
+0)),
+   VOIDmode))"
   "#"
   "&& reload_completed"
   [(clobber (const_int 0))]
@@ -11491,16 +11496,17 @@ (define_insn_and_split "*concatmode, operands[0], op3,
   gen_lowpart (mode, operands[1]));
   DONE;
-})
+}
+  [(set_attr "isa" "*,nox64,x64")])
 
 (define_insn_and_split "*concat3_6"
-  [(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
+  [(set (match_operand: 0 "nonimmediate_operand" "=r,o,o,r")
(any_or_plus:
  (ashift:
(zero_extend:
- (match_operand:DWIH 1 "nonimmediate_operand" "r,m"))
+ (match_operand:DWIH 1 "nonimmediate_operand" "r,r,r,m"))
(match_operand: 2 "const_int_operand"))
- (match_operand: 3 "const_scalar_int_operand")))]
+ (match_operand: 3 "const_scalar_int_operand" "X,X,Wd,X")))]
   "INTVAL (operands[2]) ==  * BITS_PER_UNIT
&& (mode == DImode
? CONST_INT_P (operands[3])
@@ -11508,7 +11514,12 @@ (define_insn_and_split "*concat= 0
: CONST_WIDE_INT_NUNITS (operands[3]) == 2
-&& CONST_WIDE_INT_ELT (operands[3], 1) == 0)"
+&& CONST_WIDE_INT_ELT (operands[3], 1) == 0)
+   && !(CONST_INT_P (operands[3])
+   ? ix86_endbr_immediate_operand (operands[3], VOIDmode)
+   : ix86_endbr_immediate_operand (GEN_INT (CONST_WIDE_INT_ELT (operands[3],
+0)),
+   VOIDmode))"
   "#"
   "&& reload_completed"
   [(clobber (const_int 0))]
@@ -11516,20 +11527,25 @@ (define_insn_and_split "*concatmode, operands[3], mode, 0);
   split_double_concat (mode, operands[0], op3, operands[1]);
   DONE;
-})
+}
+  [(set_attr "i

PING: New reg note REG_CFA_NORESTORE

2022-12-13 Thread Andreas Krebbel via Gcc-patches
Hi,

I need a way to save registers on the stack and generate proper CFI for it.
Since I do not intend to restore them, I needed a way to tell the CFI
generation step about it:

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606128.html

Is this ok for mainline?

Bye,

Andreas


Re: [PATCH 8/9] ipa-sra: Make scan_expr_access bail out on uninteresting expressions

2022-12-13 Thread Richard Biener via Gcc-patches



> On 12.12.2022 at 22:59, Jan Hubicka via Gcc-patches wrote:
> 
> 
>> 
>>> Hi,
>>> 
>>> I'm re-posting patches which I have posted at the end of stage 1 but
>>> which have not passed review yet.
>>> 
>>> 8<
>>> 
>>> I have noticed that scan_expr_access passes all the expressions it
>>> gets to get_ref_base_and_extent even when we are really only
>>> interested in memory accesses.  So bail out when the expression is
>>> something clearly uninteresting.
>>> 
>>> Bootstrapped and tested individually when I originally posted it and
>>> now bootstrapped and LTO-bootstrapped and tested as part of the whole
>>> series.  OK for master?
>>> 
>>> 
>>> gcc/ChangeLog:
>>> 
>>> 2021-12-14  Martin Jambor  
>>> 
>>>* ipa-sra.c (scan_expr_access): Bail out early if expr is something we
>>>clearly do not need to pass to get_ref_base_and_extent.
>>> ---
>>> gcc/ipa-sra.cc | 5 +
>>> 1 file changed, 5 insertions(+)
>>> 
>>> diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
>>> index 93fceeafc73..3646d71468c 100644
>>> --- a/gcc/ipa-sra.cc
>>> +++ b/gcc/ipa-sra.cc
>>> @@ -1748,6 +1748,11 @@ scan_expr_access (tree expr, gimple *stmt, isra_scan_context ctx,
>>>   || TREE_CODE (expr) == REALPART_EXPR)
>>> expr = TREE_OPERAND (expr, 0);
>>> 
>>> +  if (!handled_component_p (expr)
>>> +  && !DECL_P (expr)
>>> +  && TREE_CODE (expr) != MEM_REF)
>>> +return;
>> Is this needed because get_ref_base_and_extent crashes if given an SSA_NAME
>> or something else, or is it just an optimization?
>> Perhaps Richi will know if there is a better test for this.
> Looking at:
> 
> static inline bool
> gimple_assign_load_p (const gimple *gs)
> {
>  tree rhs;
>  if (!gimple_assign_single_p (gs))
>return false;
>  rhs = gimple_assign_rhs1 (gs);
>  if (TREE_CODE (rhs) == WITH_SIZE_EXPR)
>return true;
>  rhs = get_base_address (rhs);
>  return (DECL_P (rhs)
>  || TREE_CODE (rhs) == MEM_REF || TREE_CODE (rhs) == TARGET_MEM_REF);
> } 
> 
> I wonder if we don't want to avoid get_base_address (which is loopy) and
> use the same check and move it into a new predicate that is more convenient
> to use?

We can simplify the above to a single stripping of a handled component and
considering another handled component as a load (register ops are always
single).
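
For reference, the early exit in the quoted patch keeps only three kinds
of expression: handled components, declarations and MEM_REFs.  The toy
model below mirrors that filter on a made-up enum.  None of these names
are GCC's; the real handled_component_p and DECL_P cover more tree codes
than the few listed here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for tree codes.  */
enum toy_code
{
  TOY_SSA_NAME,
  TOY_INTEGER_CST,
  TOY_VAR_DECL,
  TOY_PARM_DECL,
  TOY_MEM_REF,
  TOY_COMPONENT_REF,
  TOY_ARRAY_REF
};

/* Stand-in for handled_component_p: references into an aggregate.  */
static bool
toy_handled_component_p (enum toy_code c)
{
  return c == TOY_COMPONENT_REF || c == TOY_ARRAY_REF;
}

/* Stand-in for DECL_P.  */
static bool
toy_decl_p (enum toy_code c)
{
  return c == TOY_VAR_DECL || c == TOY_PARM_DECL;
}

/* Same shape as the patch's condition, inverted: true when the
   expression would still be analyzed as a memory access.  */
static bool
toy_worth_scanning (enum toy_code c)
{
  return toy_handled_component_p (c) || toy_decl_p (c) || c == TOY_MEM_REF;
}
```

In this model an SSA name or an integer constant is rejected before any
base-and-extent analysis would run.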

Richard 
> 
> Honza
>> 
>> Honza
>>> +
>>>   base = get_ref_base_and_extent (expr, &poffset, &psize, &pmax_size, &reverse);
>>> 
>>>   if (TREE_CODE (base) == MEM_REF)
>>> -- 
>>> 2.38.1
>>>