Re: [x86 PATCH] PR target/107548: Handle vec_select in STV.

2022-12-22 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 23, 2022 at 12:19 AM Roger Sayle  wrote:
>
>
> This patch enhances x86's STV pass to handle VEC_SELECT during general
> scalar chain conversion, performing SImode scalar extraction from V4SI
> and DImode scalar extraction from V2DI vector registers.
>
> The motivating test case from bugzilla is:
>
> typedef unsigned int v4si __attribute__((vector_size(16)));
>
> unsigned int f (v4si a, v4si b)
> {
>   a[0] += b[0];
>   return a[0] + a[1];
> }
>
> currently with -O2 -march=znver2 this generates:
>
> vpextrd $1, %xmm0, %edx
> vmovd   %xmm0, %eax
> addl    %edx, %eax
> vmovd   %xmm1, %edx
> addl    %edx, %eax
> ret
>
> which performs three transfers from the vector unit to the scalar unit,
> and performs the two additions there.  With this patch, we now generate:
>
> vmovdqa %xmm0, %xmm2
> vpshufd $85, %xmm0, %xmm0
> vpaddd  %xmm0, %xmm2, %xmm0
> vpaddd  %xmm1, %xmm0, %xmm0
> vmovd   %xmm0, %eax
> ret
>
> which performs the two additions in the vector unit, and then transfers
> the result to the scalar unit.  Technically the (cheap) movdqa isn't
> needed with better register allocation (or this could be cleaned up
> during peephole2), but even so this transform is still a win.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
>
>
> 2022-12-22  Roger Sayle  
>
> gcc/ChangeLog
> PR target/107548
> * config/i386/i386-features.cc (scalar_chain::add_insn): The
> operands of a VEC_SELECT don't need to be added to the scalar chain.
> (general_scalar_chain::compute_convert_gain) :
> Provide gains for performing STV on a VEC_SELECT.
> (general_scalar_chain::convert_insn): Convert VEC_SELECT to pshufd,
> psrldq or no-op.
> (general_scalar_to_vector_candidate_p): Handle VEC_SELECT of a
> single element from a vector register to a scalar register.
>
> gcc/testsuite/ChangeLog
> PR target/107548
> * gcc.target/i386/pr107548-1.c: New test V4SI case.
> * gcc.target/i386/pr107548-2.c: New test V2DI case.

LGTM.

Thanks,
Uros.
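
To make the before/after assembly in the quoted message easier to follow, here
is a minimal sketch in portable C (GCC/Clang vector extension). It is not code
from the patch. The helper names pshufd_broadcast_imm, emulate_pshufd and
f_vector_style are hypothetical and exist only for illustration. The sketch
shows why the immediate in "vpshufd $85" selects element 1 (85 is 0x55, i.e.
selector 1 repeated for all four destination lanes), and that doing both
additions on the vector side and extracting lane 0 once gives the same result
as the scalar version.

```c
#include <assert.h>

/* GCC/Clang vector extension, as in the test case quoted above. */
typedef unsigned int v4si __attribute__((vector_size(16)));

/* Reference: the motivating function from the PR, unchanged. */
unsigned int f_reference(v4si a, v4si b)
{
  a[0] += b[0];
  return a[0] + a[1];
}

/* Hypothetical helper: pshufd immediate that copies one 32-bit lane into
   all four lanes (two selector bits per destination lane). */
unsigned int pshufd_broadcast_imm(unsigned int lane)
{
  assert(lane < 4);
  return lane | (lane << 2) | (lane << 4) | (lane << 6);
}

/* Hypothetical helper: scalar emulation of pshufd, so the sketch does not
   depend on SSE intrinsics being available. */
v4si emulate_pshufd(v4si src, unsigned int imm)
{
  v4si dst = {0, 0, 0, 0};
  for (int i = 0; i < 4; i++)
    dst[i] = src[(imm >> (2 * i)) & 3];
  return dst;
}

/* Mirrors the shape of the patched sequence: one shuffle, two vector adds,
   and a single transfer of lane 0 to a scalar. */
unsigned int f_vector_style(v4si a, v4si b)
{
  v4si lane1 = emulate_pshufd(a, pshufd_broadcast_imm(1)); /* vpshufd $85 */
  v4si sum = a + lane1;                                     /* vpaddd */
  sum = sum + b;                                            /* vpaddd */
  return sum[0];                                            /* vmovd  */
}
```

Only lane 0 of the vector sums is meaningful here; the other lanes are
computed and discarded, which is the trade the STV conversion makes in
exchange for fewer vector-to-scalar transfers.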


Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-22 Thread Alexander Monakov via Gcc-patches


On Thu, 22 Dec 2022, Qing Zhao wrote:

> > I think scheduling across calls in the pre-RA scheduler is simply an
> > oversight, we do not look at dataflow information and with 50% chance risk
> > extending lifetime of a pseudoregister across a call, causing higher
> > register pressure at the point of the call, and potentially an extra spill.
> 
> I am a little confused, you mean pre-RA scheduler does not look at the data
> flow information at all when scheduling insns across calls currently?

I think it does not inspect liveness info, and may extend lifetime of a pseudo
across a call, transforming
 
  call foo
  reg = 1
  ...
  use reg

to

  reg = 1
  call foo
  ...
  use reg

but this is undesirable, because now register allocation cannot select a
call-clobbered register for 'reg'.

Alexander
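
The two orderings above can be written as a source-level analogy in C. This is
only a sketch of the lifetimes involved, not a reproduction of what sched1 does
on RTL: an optimizing compiler is free to fold or rematerialize the constant,
so the generated code for the two functions may well be identical. The names
def_after_call, def_before_call and check_orderings_agree are hypothetical.

```c
#include <assert.h>

static int calls_made;

/* Stand-in for "call foo" in the message above. */
static void foo(void) { calls_made++; }

/* Original order: reg is defined after the call, so nothing has to
   survive the call on reg's behalf. */
int def_after_call(void)
{
  foo();
  int reg = 1;      /* lifetime of reg starts here */
  return reg + 41;  /* use reg */
}

/* Order after hoisting the definition above the call: reg is now live
   across the call, so it cannot sit in a call-clobbered register. */
int def_before_call(void)
{
  int reg = 1;      /* lifetime of reg now spans the call */
  foo();
  return reg + 41;  /* use reg */
}

/* Both orderings compute the same value; only the lifetime differs. */
void check_orderings_agree(void)
{
  assert(def_after_call() == def_before_call());
}
```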


Re: [x86 PATCH] PR target/106933: Limit TImode STV to SSA-like def-use chains.

2022-12-22 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 23, 2022 at 12:09 AM Roger Sayle  wrote:
>
>
> With many thanks to H.J. for doing all the hard work, this patch resolves
> two P1 regressions; PR target/106933 and PR target/106959.
>
> Although superficially similar, the i386 backend's two scalar-to-vector
> (STV) passes perform their transformations in importantly different ways.
> The original pass converting SImode and DImode operations to V4SImode
> or V2DImode operations is "soft", allowing values to be maintained in
> both integer and vector hard registers.  The newer pass converting TImode
> operations to V1TImode is "hard" (all or nothing) that converts all uses
> of a pseudo to vector form.  To implement this it invokes powerful ju-ju
> calling SET_MODE on a REG_rtx, which due to RTL sharing, often updates
> this pseudo's mode everywhere in the RTL chain.  Hence, TImode STV can only
> be performed when all uses of a pseudo are convertible to V1TImode form.
> To ensure this the STV passes currently use data-flow analysis to inspect
> all DEFs and USEs in a chain.  This works fine for chains that are in
> the usual single assignment form, but the occurrence of uninitialized
> variables, or multiple assignments that split a pseudo's usage into
> several independent chains (lifetimes) can lead to situations where
> some but not all of a pseudo's occurrences need to be updated.  This is
> safe for the SImode/DImode pass, but leads to the above bugs during
> the TImode pass.
>
> My one minor tweak to HJ's patch from comment #4 of bugzilla PR106959
> is to only perform the new single_def_chain_p check for TImode STV; it
> turns out that STV of SImode/DImode min/max operates safely on multiple-def
> chains, and prohibiting this leads to testsuite regressions.  We don't
> (yet) support V1TImode min/max, so this idiom isn't an issue during the
> TImode STV pass.
>
> For the record, the two alternate possible fixes are (i) make the TImode
> STV pass "soft", by eliminating use of SET_MODE, instead using replace_rtx
> with a new pseudo, or (ii) merging "chains" so that multiple DFA
> chains/lifetimes are considered a single STV chain.

I assume these two alternatives would result in much more invasive
surgery, so let's consider these "for the future".

> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
>
>
> 2022-12-22  H.J. Lu  
> Roger Sayle  
>
> gcc/ChangeLog
> PR target/106933
> PR target/106959
> * config/i386/i386-features.cc (single_def_chain_p): New predicate
> function to check that a pseudo's use-def chain is in SSA form.
> (timode_scalar_to_vector_candidate_p): Check that TImode regs that
> are SET_DEST or SET_SRC of an insn match/are single_def_chain_p.
>
> gcc/testsuite/ChangeLog
> PR target/106933
> PR target/106959
> * gcc.target/i386/pr106933-1.c: New test case.
> * gcc.target/i386/pr106933-2.c: Likewise.
> * gcc.target/i386/pr106959-1.c: Likewise.
> * gcc.target/i386/pr106959-2.c: Likewise.
> * gcc.target/i386/pr106959-3.c: Likewise.

OK.

Thanks,
Uros.
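
As a rough illustration of the single-definition requirement described in the
quoted message, here is a toy model in C. It does not use GCC's data-flow API
and is not the code from the patch: struct toy_insn, toy_single_def_p and
toy_timode_candidate_p are hypothetical names, and the "program" is just an
array of (dest, src) register pairs. The assumption being sketched is that a
"hard" conversion, which changes a register's mode everywhere, is only safe
when each register an instruction touches has exactly one definition; a
register with no definition (uninitialized) or several definitions is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction "dest = op(src)".  Registers are small integers and
   -1 means "no register in this slot". */
struct toy_insn { int dest; int src; };

/* Count how many instructions define REG. */
static size_t toy_count_defs(const struct toy_insn *insns, size_t n, int reg)
{
  size_t defs = 0;
  for (size_t i = 0; i < n; i++)
    if (insns[i].dest == reg)
      defs++;
  return defs;
}

/* True only when REG has exactly one definition.  Zero definitions model
   an uninitialized register; two or more model multiple assignments. */
bool toy_single_def_p(const struct toy_insn *insns, size_t n, int reg)
{
  assert(insns != NULL || n == 0);
  return toy_count_defs(insns, n, reg) == 1;
}

/* An instruction is a candidate for the all-or-nothing conversion only if
   both its destination and its source register are single-definition. */
bool toy_timode_candidate_p(const struct toy_insn *insns, size_t n, size_t idx)
{
  assert(idx < n);
  const struct toy_insn *insn = &insns[idx];
  if (insn->dest >= 0 && !toy_single_def_p(insns, n, insn->dest))
    return false;
  if (insn->src >= 0 && !toy_single_def_p(insns, n, insn->src))
    return false;
  return true;
}
```

In this model the check is applied per instruction, so a "soft" pass could
simply skip it, which matches the message's point that the restriction is
only needed for the TImode pass.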


[PATCH 1/1] Fixed typo in RISCV

2022-12-22 Thread jinma via Gcc-patches
From 21904908689318ab81c630adc8cc7067e1a12488 Mon Sep 17 00:00:00 2001
From: Jin Ma 
Date: Fri, 23 Dec 2022 10:42:19 +0800
Subject: [PATCH 1/1] Fixed typo

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc:
---
 gcc/common/config/riscv/riscv-common.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 4b7f777c103..0a89fdaffe2 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1538,7 +1538,7 @@ riscv_check_conds (

   for (itr = conds.begin (); itr != conds.end (); ++itr)
 {
-  /* We'll check march= and mabi= in ohter place.  */
+  /* We'll check march= and mabi= in other place.  */
   if (prefixed_with (*itr, "march=") || prefixed_with (*itr, "mabi="))
continue;

--
2.17.1

Re: Re: [PATCH] RISC-V: Fix incorrect annotation

2022-12-22 Thread Kito Cheng via Gcc-patches
Committed, thanks :)

Ju-Zhe has not yet figured out how to commit from his environment; I am
helping him set it up.

On Wed, Dec 21, 2022 at 7:38 AM Palmer Dabbelt  wrote:
>
> On Tue, 20 Dec 2022 15:33:11 PST (-0800), juzhe.zh...@rivai.ai wrote:
> > Thanks. I received an email from sourceware:
> > "You should now have write access to the source control repository for your 
> > project."
> > It seems that I can merge codes? However, I still don't know how to merge 
> > codes.
>
> You should have a sourceware account, along with an associated private
> key.  With those you should be able to get push access via
> https://gcc.gnu.org/gitwrite.html
>
> >
> >
> > juzhe.zh...@rivai.ai
> >
> > From: Jeff Law
> > Date: 2022-12-21 00:02
> > To: juzhe.zhong
> > CC: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; pal...@dabbelt.com
> > Subject: Re: [PATCH] RISC-V: Fix incorrect annotation
> >
> >
> > On 12/19/22 17:38, juzhe.zhong wrote:
> >> Would you mind merging it for me? I can't merge code.
> > Do you mean you do not have write access to the repository?  If so, that
> > can be easily fixed.
> >
> > https://sourceware.org/cgi-bin/pdw/ps_form.cgi
> >
> > List me as your sponsor.
> >
> > jeff
> >


Re: [PATCH] RISC-V: Update vsetvl/vsetvlmax intrinsics to the latest api name.

2022-12-22 Thread Kito Cheng via Gcc-patches
Committed, thanks :)

On Tue, Dec 20, 2022 at 11:57 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 12/20/22 07:58, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/riscv-vector-builtins-shapes.cc (struct vsetvl_def):
> >  Add "__riscv_" prefix.
> >
> > gcc/testsuite/ChangeLog:
> >
> >  * gcc.target/riscv/rvv/base/vsetvl-1.c: Add "__riscv_" prefix.
> OK
> jeff


Re: [PATCH] RISC-V: Remove side effects of vsetvl/vsetvlmax intrinsics in properties

2022-12-22 Thread Kito Cheng via Gcc-patches
Committed, thanks :)

On Wed, Dec 21, 2022 at 12:00 AM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 12/20/22 07:51, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/riscv-vector-builtins-bases.cc: Remove side effects.
> OK.
> Jeff


Re: [PATCH] RISC-V: Support vle.v/vse.v intrinsics

2022-12-22 Thread Kito Cheng via Gcc-patches
Committed, thanks :)

On Fri, Dec 23, 2022 at 8:57 AM 钟居哲  wrote:
>
> This patch adds the minimum intrinsics support needed for the VSETVL pass to
> support the AVL model. The corresponding unit tests for vle.v/vse.v will be
> added once AVL model support is in place and the VSETVL pass patch is well
> tested.
>
>
> juzhe.zh...@rivai.ai
>
> From: juzhe.zhong
> Date: 2022-12-23 08:52
> To: gcc-patches
> CC: kito.cheng; palmer; Ju-Zhe Zhong
> Subject: [PATCH] RISC-V: Support vle.v/vse.v intrinsics
> From: Ju-Zhe Zhong 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (get_avl_type_rtx): New function.
> * config/riscv/riscv-v.cc (get_avl_type_rtx): Ditto.
> * config/riscv/riscv-vector-builtins-bases.cc (class loadstore): New 
> class.
> (BASE): Ditto.
> * config/riscv/riscv-vector-builtins-bases.h: Ditto.
> * config/riscv/riscv-vector-builtins-functions.def (vle): Ditto.
> (vse): Ditto.
> * config/riscv/riscv-vector-builtins-shapes.cc (build_one): Ditto.
> (struct loadstore_def): Ditto.
> (SHAPE): Ditto.
> * config/riscv/riscv-vector-builtins-shapes.h: Ditto.
> * config/riscv/riscv-vector-builtins-types.def (DEF_RVV_U_OPS): New 
> macro.
> (DEF_RVV_F_OPS): Ditto.
> (vuint8mf8_t): Add corresponding mask type.
> (vuint8mf4_t): Ditto.
> (vuint8mf2_t): Ditto.
> (vuint8m1_t): Ditto.
> (vuint8m2_t): Ditto.
> (vuint8m4_t): Ditto.
> (vuint8m8_t): Ditto.
> (vuint16mf4_t): Ditto.
> (vuint16mf2_t): Ditto.
> (vuint16m1_t): Ditto.
> (vuint16m2_t): Ditto.
> (vuint16m4_t): Ditto.
> (vuint16m8_t): Ditto.
> (vuint32mf2_t): Ditto.
> (vuint32m1_t): Ditto.
> (vuint32m2_t): Ditto.
> (vuint32m4_t): Ditto.
> (vuint32m8_t): Ditto.
> (vuint64m1_t): Ditto.
> (vuint64m2_t): Ditto.
> (vuint64m4_t): Ditto.
> (vuint64m8_t): Ditto.
> (vfloat32mf2_t): Ditto.
> (vfloat32m1_t): Ditto.
> (vfloat32m2_t): Ditto.
> (vfloat32m4_t): Ditto.
> (vfloat32m8_t): Ditto.
> (vfloat64m1_t): Ditto.
> (vfloat64m2_t): Ditto.
> (vfloat64m4_t): Ditto.
> (vfloat64m8_t): Ditto.
> * config/riscv/riscv-vector-builtins.cc (DEF_RVV_TYPE): Adjust for 
> new macro.
> (DEF_RVV_I_OPS): Ditto.
> (DEF_RVV_U_OPS): New macro.
> (DEF_RVV_F_OPS): New macro.
> (use_real_mask_p): New function.
> (use_real_merge_p): Ditto.
> (get_tail_policy_for_pred): Ditto.
> (get_mask_policy_for_pred): Ditto.
> (function_builder::apply_predication): Ditto.
> (function_builder::append_base_name): Ditto.
> (function_builder::append_sew): Ditto.
> (function_expander::add_vundef_operand): Ditto.
> (function_expander::add_mem_operand): Ditto.
> (function_expander::use_contiguous_load_insn): Ditto.
> (function_expander::use_contiguous_store_insn): Ditto.
> * config/riscv/riscv-vector-builtins.def (DEF_RVV_TYPE): Adjust for 
> adding mask type.
> (vbool64_t): Ditto.
> (vbool32_t): Ditto.
> (vbool16_t): Ditto.
> (vbool8_t): Ditto.
> (vbool4_t): Ditto.
> (vbool2_t): Ditto.
> (vbool1_t): Ditto.
> (vint8mf8_t): Ditto.
> (vint8mf4_t): Ditto.
> (vint8mf2_t): Ditto.
> (vint8m1_t): Ditto.
> (vint8m2_t): Ditto.
> (vint8m4_t): Ditto.
> (vint8m8_t): Ditto.
> (vint16mf4_t): Ditto.
> (vint16mf2_t): Ditto.
> (vint16m1_t): Ditto.
> (vint16m2_t): Ditto.
> (vint16m4_t): Ditto.
> (vint16m8_t): Ditto.
> (vint32mf2_t): Ditto.
> (vint32m1_t): Ditto.
> (vint32m2_t): Ditto.
> (vint32m4_t): Ditto.
> (vint32m8_t): Ditto.
> (vint64m1_t): Ditto.
> (vint64m2_t): Ditto.
> (vint64m4_t): Ditto.
> (vint64m8_t): Ditto.
> (vfloat32mf2_t): Ditto.
> (vfloat32m1_t): Ditto.
> (vfloat32m2_t): Ditto.
> (vfloat32m4_t): Ditto.
> (vfloat32m8_t): Ditto.
> (vfloat64m1_t): Ditto.
> (vfloat64m4_t): Ditto.
> * config/riscv/riscv-vector-builtins.h 
> (function_expander::add_output_operand): New function.
> (function_expander::add_all_one_mask_operand): Ditto.
> (function_expander::add_fixed_operand): Ditto.
> (function_expander::vector_mode): Ditto.
> (function_base::apply_vl_p): Ditto.
> (function_base::can_be_overloaded_p): Ditto.
> * config/riscv/riscv-vsetvl.cc (get_vl): Remove the restriction against
> an AVL that is not VLMAX.
> * config/riscv/t-riscv: Add include file.
>
> ---
> gcc/config/riscv/riscv-protos.h   |   1 +
> gcc/config/riscv/riscv-v.cc   |  10 +-
> .../riscv/riscv-vector-builtins-bases.cc  |  49 +++-
> .../riscv/riscv-v

Re: [PATCH] RISC-V: Remove side effects of vsetvl pattern in RTL.

2022-12-22 Thread Kito Cheng via Gcc-patches
Committed, thanks :)

On Tue, Dec 20, 2022 at 11:59 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 12/20/22 07:56, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/riscv-vector-builtins-bases.cc: Change it to no
> >  side effects.
> >  * config/riscv/vector.md (@vsetvl_no_side_effects): New pattern.
> OK
> jeff


Re: [PATCH] RISC-V: Fix multi-line condition format

2022-12-22 Thread Kito Cheng via Gcc-patches
Committed, thanks :)

On Tue, Dec 20, 2022 at 8:28 AM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 12/19/22 16:09, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/riscv-vsetvl.cc (vlmax_avl_insn_p): Fix multi-line
> >  conditional.
> >  (vsetvl_insn_p): Ditto.
> >  (same_bb_and_before_p): Ditto.
> >  (same_bb_and_after_or_equal_p): Ditto.
> OK
> jeff


Re: [PATCH] RISC-V: Fix vle constraints

2022-12-22 Thread Kito Cheng via Gcc-patches
Committed, thanks :)

On Fri, Dec 23, 2022 at 11:33 AM  wrote:
>
> From: Ju-Zhe Zhong 
>
> gcc/ChangeLog:
>
> * config/riscv/vector.md: Fix constraints.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/vle-constraint-1.c: New test.

[PATCH] RISC-V: Fix vle constraints

2022-12-22 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/vector.md: Fix constraints.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/vle-constraint-1.c: New test.

---
 gcc/config/riscv/vector.md|  16 +--
 .../riscv/rvv/base/vle-constraint-1.c | 109 ++
 2 files changed, 117 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vle-constraint-1.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 98b8f701c92..89810b183fc 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -636,18 +636,18 @@
 ;;2. (const_vector:VNx1SF repeat [
 ;;(const_double:SF 0.0 [0x0.0p+0])]).
 (define_insn_and_split "@pred_mov<mode>"
-  [(set (match_operand:V 0 "nonimmediate_operand"        "=vd,  vr,     m,    vr,    vr")
+  [(set (match_operand:V 0 "nonimmediate_operand"        "=vd,    vr,     m,    vr,    vr")
         (if_then_else:V
           (unspec:<VM>
-            [(match_operand:<VM> 1 "vector_mask_operand" " vm, Wc1, vmWc1,   Wc1,   Wc1")
-             (match_operand 4 "vector_length_operand"    " rK,  rK,    rK,    rK,    rK")
-             (match_operand 5 "const_int_operand"        "  i,   i,     i,     i,     i")
-             (match_operand 6 "const_int_operand"        "  i,   i,     i,     i,     i")
-             (match_operand 7 "const_int_operand"        "  i,   i,     i,     i,     i")
+            [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1, vmWc1, vmWc1,   Wc1,   Wc1")
+             (match_operand 4 "vector_length_operand"    "   rK,    rK,    rK,    rK,    rK")
+             (match_operand 5 "const_int_operand"        "    i,     i,     i,     i,     i")
+             (match_operand 6 "const_int_operand"        "    i,     i,     i,     i,     i")
+             (match_operand 7 "const_int_operand"        "    i,     i,     i,     i,     i")
              (reg:SI VL_REGNUM)
              (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-          (match_operand:V 3 "vector_move_operand"       "  m,   m,    vr,    vr, viWc0")
-          (match_operand:V 2 "vector_merge_operand"      "  0,  vu,   vu0,   vu0,   vu0")))]
+          (match_operand:V 3 "vector_move_operand"       "    m,     m,    vr,    vr, viWc0")
+          (match_operand:V 2 "vector_merge_operand"      "    0,    vu,   vu0,   vu0,   vu0")))]
   "TARGET_VECTOR"
   "@
vle.v\t%0,%3%p1
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vle-constraint-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/vle-constraint-1.c
new file mode 100644
index 000..b7cf98bfd9f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vle-constraint-1.c
@@ -0,0 +1,109 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "riscv_vector.h"
+
+/*
+** f1:
+** vsetvli\tzero,4,e32,m1,tu,ma
+** vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vse32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f1 (float * in, float *out)
+{
+vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4);
+vfloat32m1_t v2 = __riscv_vle32_v_f32m1_tu (v, in, 4);
+__riscv_vse32_v_f32m1 (out, v2, 4);
+}
+
+/*
+** f2:
+** vsetvli\t[a-x0-9]+,zero,e8,mf4,ta,ma
+** vlm.v\tv[0-9]+,0\([a-x0-9]+\)
+** vsetvli\tzero,4,e32,m1,ta,ma
+** vle32.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** vse32.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f2 (float * in, float *out)
+{
+vbool32_t mask = *(vbool32_t*)in;
+asm volatile ("":::"memory");
+vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4);
+vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, in, 4);
+__riscv_vse32_v_f32m1 (out, v2, 4);
+}
+
+/*
+** f3:
+** vsetvli\t[a-x0-9]+,zero,e8,mf4,ta,ma
+** vlm.v\tv[0-9]+,0\([a-x0-9]+\)
+** vsetvli\tzero,4,e32,m1,tu,mu
+** vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vle32.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** vse32.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f3 (float * in, float *out)
+{
+vbool32_t mask = *(vbool32_t*)in;
+asm volatile ("":::"memory");
+vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4);
+vfloat32m1_t v2 = __riscv_vle32_v_f32m1_tumu (mask, v, in, 4);
+__riscv_vse32_v_f32m1 (out, v2, 4);
+}
+
+/*
+** f4:
+** vsetvli\tzero,4,e8,mf8,tu,ma
+** vle8\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vle8\.v\tv[0-9]+,0\([a-x0-9]+\)
+** vse8\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f4 (int8_t * in, int8_t *out)
+{
+vint8mf8_t v = __riscv_vle8_v_i8mf8 (in, 4);
+vint8mf8_t v2 = __riscv_vle8_v_i8mf8_tu (v, in, 4);
+__riscv_vse8_v_i8mf8 (out, v2, 4);
+}
+
+/*
+** f5:
+** vsetvli\t[a-x0-9]+,zero,e8,mf8,ta,ma
+** vlm.v\tv[0-9]+,0\([a-x0-9]+\)
+** vsetvli\tzero,4,e8,mf8,ta,ma
+** vle8.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** vse8.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f5 (int8_t * in, int8_t *out)
+{
+vbool64_t mask = *(vboo
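
For readers who do not have the RVV policies in their head, the semantics that
the _tu and _tumu loads in the tests above rely on can be modelled in plain C.
This is a scalar sketch under stated assumptions, not the intrinsic API and not
code from the patch: TOY_VLMAX, toy_vle32_tumu and toy_vle32_tu are
hypothetical names, and the vector register is modelled as a fixed array of
eight floats. It shows that elements at or beyond vl (the tail) and elements
whose mask bit is clear keep the value of the merge operand, which is why the
merge operand has to be a live input to the load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register length for the model, in 32-bit elements. */
enum { TOY_VLMAX = 8 };

/* Unit-stride load with tail-undisturbed, mask-undisturbed policy:
   active elements come from memory, everything else from MERGE.
   Memory is only read for active elements below VL. */
void toy_vle32_tumu(float dst[TOY_VLMAX], const bool mask[TOY_VLMAX],
                    const float merge[TOY_VLMAX], const float *mem, size_t vl)
{
  assert(vl <= TOY_VLMAX);
  for (size_t i = 0; i < TOY_VLMAX; i++) {
    bool active = i < vl && mask[i];
    dst[i] = active ? mem[i] : merge[i];
  }
}

/* Unmasked tail-undisturbed load: the same model with an all-ones mask. */
void toy_vle32_tu(float dst[TOY_VLMAX], const float merge[TOY_VLMAX],
                  const float *mem, size_t vl)
{
  bool all_ones[TOY_VLMAX];
  for (size_t i = 0; i < TOY_VLMAX; i++)
    all_ones[i] = true;
  toy_vle32_tumu(dst, all_ones, merge, mem, vl);
}
```

In this model a tail-agnostic or mask-agnostic load would be free to put
anything in the non-active elements, so it would not need the merge operand
at all.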

build broke, cris-elf: [committed] libstdc++: Implement C++20 time zone support in

2022-12-22 Thread Hans-Peter Nilsson via Gcc-patches
> From: Jonathan Wakely via Gcc-patches 
> Date: Fri, 23 Dec 2022 00:37:04 +0100

> This is the largest missing piece of C++20 support. Only the cxx11 ABI
> is supported, due to the use of std::string in the API for time zones.

> libstdc++-v3/ChangeLog:
> 
>   * acinclude.m4 (GLIBCXX_ZONEINFO_DIR): New macro.
>   * config.h.in: Regenerate.
>   * config/abi/pre/gnu.ver: Export new symbols.
>   * configure: Regenerate.
>   * configure.ac (GLIBCXX_ZONEINFO_DIR): Use new macro.
>   * include/std/chrono (utc_clock::from_sys): Correct handling
>   of leap seconds.
>   (nonexistent_local_time::_M_make_what_str): Define.
>   (ambiguous_local_time::_M_make_what_str): Define.
>   (__throw_bad_local_time): Define new function.
>   (time_zone, tzdb_list, tzdb): Implement all members.
>   (remote_version, zoned_time, get_leap_second_info): Define.
>   * include/std/version: Add comment for __cpp_lib_chrono.
>   * src/c++20/Makefile.am: Add new file.
>   * src/c++20/Makefile.in: Regenerate.
>   * src/c++20/tzdb.cc: New file.
>   * testsuite/lib/libstdc++.exp: Define effective target tzdb.
>   * testsuite/std/time/clock/file/members.cc: Check file_time
>   alias and file_clock::now() member.
>   * testsuite/std/time/clock/gps/1.cc: Likewise for gps_clock.
>   * testsuite/std/time/clock/tai/1.cc: Likewise for tai_clock.
>   * testsuite/std/time/syn_c++20.cc: Uncomment everything except
>   parse.
>   * testsuite/std/time/clock/utc/leap_second_info.cc: New test.
>   * testsuite/std/time/exceptions.cc: New test.
>   * testsuite/std/time/time_zone/get_info_local.cc: New test.
>   * testsuite/std/time/time_zone/get_info_sys.cc: New test.
>   * testsuite/std/time/time_zone/requirements.cc: New test.
>   * testsuite/std/time/tzdb/1.cc: New test.
>   * testsuite/std/time/tzdb/leap_seconds.cc: New test.
>   * testsuite/std/time/tzdb_list/1.cc: New test.
>   * testsuite/std/time/tzdb_list/requirements.cc: New test.
>   * testsuite/std/time/zoned_time/1.cc: New test.
>   * testsuite/std/time/zoned_time/custom.cc: New test.
>   * testsuite/std/time/zoned_time/deduction.cc: New test.
>   * testsuite/std/time/zoned_time/req_neg.cc: New test.
>   * testsuite/std/time/zoned_time/requirements.cc: New test.
>   * testsuite/std/time/zoned_traits.cc: New test.


> +++ b/libstdc++-v3/src/c++20/tzdb.cc

> +  static_assert(sizeof(datetime) == 8 && alignof(datetime) == 4);

This broke build for cris-elf:
x/autotest/hpautotest-gcc1/gcc/libstdc++-v3/src/c++20/tzdb.cc:451:38: error: static assertion failed
  451 |   static_assert(sizeof(datetime) == 8 && alignof(datetime) == 4);
  | ~^~~~
x/autotest/hpautotest-gcc1/gcc/libstdc++-v3/src/c++20/tzdb.cc:451:38: note: the comparison reduces to '(7 == 8)'
make[5]: *** [Makefile:562: tzdb.lo] Error 1

(and I don't think "alignof(datetime) == 4" is true either)

Happy holidays.
brgds, H-P
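
The failure mode can be sketched in C11 with a made-up struct. This is not the
datetime type from tzdb.cc, and the alignas approach shown is only one possible
way to make such an assertion hold; it is not necessarily the fix that was
adopted. struct toy_plain and struct toy_aligned are hypothetical. The reported
"(7 == 8)" suggests the target lays the struct out without trailing padding, so
a size that is 8 on most ABIs is 7 there; requesting the alignment explicitly
makes both the size and the alignment independent of that ABI choice.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas, alignof (C11) */
#include <stdint.h>

/* Hypothetical type with 7 bytes of payload.  On ABIs where int32_t is
   4-byte aligned this is padded to 8 bytes; on an ABI with byte alignment
   for all types it can be 7 bytes with alignment 1. */
struct toy_plain {
  int32_t seconds;
  uint8_t month, day, flags;
};

/* Same fields, but the alignment is requested explicitly, so the struct
   is 4-byte aligned and its size is rounded up to 8 on either kind of ABI. */
struct toy_aligned {
  alignas(4) int32_t seconds;
  uint8_t month, day, flags;
};

/* These hold wherever int32_t is four 8-bit bytes. */
static_assert(sizeof(struct toy_aligned) == 8, "size is fixed by alignas");
static_assert(alignof(struct toy_aligned) == 4, "alignment is fixed by alignas");

/* The plain struct only guarantees an upper bound on typical ABIs. */
static_assert(sizeof(struct toy_plain) >= 7, "payload is 7 bytes");
```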


Re: [PATCH] RISC-V: Support vle.v/vse.v intrinsics

2022-12-22 Thread 钟居哲
This patch adds the minimum intrinsics support needed for the VSETVL pass to
support the AVL model. The corresponding unit tests for vle.v/vse.v will be
added once AVL model support is in place and the VSETVL pass patch is well
tested.


juzhe.zh...@rivai.ai
 
From: juzhe.zhong
Date: 2022-12-23 08:52
To: gcc-patches
CC: kito.cheng; palmer; Ju-Zhe Zhong
Subject: [PATCH] RISC-V: Support vle.v/vse.v intrinsics
From: Ju-Zhe Zhong 
 
---
gcc/config/riscv/riscv-protos.h   |   1 +
gcc/config/riscv/riscv-v.cc   |  10 +-
.../riscv/riscv-vector-builtins-bases.cc  |  49 +++-
.../riscv/riscv-vector-builtins-bases.h   |   2 +
.../riscv/riscv-vector-builtins-functions.def |   3 +
.../riscv/riscv-vector-builtins-shapes.cc |  38 ++-
.../riscv/riscv-vector-builtins-shapes.h  |   1 +
.../riscv/riscv-vector-builtins-types.def |  49 +++-
gcc/config/riscv/riscv-vector-builtins.cc | 236 +++

[PATCH] RISC-V: Support vle.v/vse.v intrinsics

2022-12-22 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-protos.h (get_avl_type_rtx): New function.
* config/riscv/riscv-v.cc (get_avl_type_rtx): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc (class loadstore): New 
class.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.  
* config/riscv/riscv-vector-builtins-functions.def (vle): Ditto.
(vse): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (build_one): Ditto.
(struct loadstore_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_U_OPS): New 
macro.
(DEF_RVV_F_OPS): Ditto.
(vuint8mf8_t): Add corresponding mask type.
(vuint8mf4_t): Ditto.
(vuint8mf2_t): Ditto.
(vuint8m1_t): Ditto.
(vuint8m2_t): Ditto.
(vuint8m4_t): Ditto.
(vuint8m8_t): Ditto.
(vuint16mf4_t): Ditto.
(vuint16mf2_t): Ditto.
(vuint16m1_t): Ditto.
(vuint16m2_t): Ditto.
(vuint16m4_t): Ditto.
(vuint16m8_t): Ditto.
(vuint32mf2_t): Ditto.
(vuint32m1_t): Ditto.
(vuint32m2_t): Ditto.
(vuint32m4_t): Ditto.
(vuint32m8_t): Ditto.
(vuint64m1_t): Ditto.
(vuint64m2_t): Ditto.
(vuint64m4_t): Ditto.
(vuint64m8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
(vfloat64m1_t): Ditto.
(vfloat64m2_t): Ditto.
(vfloat64m4_t): Ditto.
(vfloat64m8_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_TYPE): Adjust for new 
macro.
(DEF_RVV_I_OPS): Ditto.
(DEF_RVV_U_OPS): New macro.
(DEF_RVV_F_OPS): New macro.
(use_real_mask_p): New function.
(use_real_merge_p): Ditto.
(get_tail_policy_for_pred): Ditto.
(get_mask_policy_for_pred): Ditto.
(function_builder::apply_predication): Ditto.
(function_builder::append_base_name): Ditto.
(function_builder::append_sew): Ditto.
(function_expander::add_vundef_operand): Ditto.
(function_expander::add_mem_operand): Ditto.
(function_expander::use_contiguous_load_insn): Ditto.
(function_expander::use_contiguous_store_insn): Ditto.
* config/riscv/riscv-vector-builtins.def (DEF_RVV_TYPE): Adjust for 
adding mask type.
(vbool64_t): Ditto.
(vbool32_t): Ditto.
(vbool16_t): Ditto.
(vbool8_t): Ditto.
(vbool4_t): Ditto.
(vbool2_t): Ditto.
(vbool1_t): Ditto.
(vint8mf8_t): Ditto.
(vint8mf4_t): Ditto.
(vint8mf2_t): Ditto.
(vint8m1_t): Ditto.
(vint8m2_t): Ditto.
(vint8m4_t): Ditto.
(vint8m8_t): Ditto.
(vint16mf4_t): Ditto.
(vint16mf2_t): Ditto.
(vint16m1_t): Ditto.
(vint16m2_t): Ditto.
(vint16m4_t): Ditto.
(vint16m8_t): Ditto.
(vint32mf2_t): Ditto.
(vint32m1_t): Ditto.
(vint32m2_t): Ditto.
(vint32m4_t): Ditto.
(vint32m8_t): Ditto.
(vint64m1_t): Ditto.
(vint64m2_t): Ditto.
(vint64m4_t): Ditto.
(vint64m8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
(vfloat64m1_t): Ditto.
(vfloat64m4_t): Ditto.
* config/riscv/riscv-vector-builtins.h 
(function_expander::add_output_operand): New function.
(function_expander::add_all_one_mask_operand): Ditto.
(function_expander::add_fixed_operand): Ditto.
(function_expander::vector_mode): Ditto.
(function_base::apply_vl_p): Ditto.
(function_base::can_be_overloaded_p): Ditto.
* config/riscv/riscv-vsetvl.cc (get_vl): Remove the restriction on
supporting an AVL that is not VLMAX.
* config/riscv/t-riscv: Add include file.

---
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-v.cc   |  10 +-
 .../riscv/riscv-vector-builtins-bases.cc  |  49 +++-
 .../riscv/riscv-vector-builtins-bases.h   |   2 +
 .../riscv/riscv-vector-builtins-functions.def |   3 +
 .../riscv/riscv-vector-builtins-shapes.cc |  38 ++-
 .../riscv/riscv-vector-builtins-shapes.h  |   1 +
 .../riscv/riscv-vector-builtins-types.def |  49 +++-
 gcc/config/riscv/riscv-vector-builtins.cc | 236 +-
 gcc/config/riscv/riscv-vector-builtins.def| 122 -
 gcc/config/riscv/riscv-vector-builtins.h  |  65 +
 gcc/config/riscv/riscv-vsetvl.cc  |   4 -
 gcc/config/riscv/t-riscv  |   2 +-
 13 files changed, 506 insertions(+), 76 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-p

[committed] libstdc++: Avoid recursion in __nothrow_wait_cv::wait [PR105730]

2022-12-22 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk. Backport to gcc-12 needed too.

-- >8 --

The commit r12-5877-g9e18a25331fa25 removed the incorrect
noexcept-specifier from std::condition_variable::wait and gave the new
symbol version @@GLIBCXX_3.4.30. It also redefined the original symbol
std::condition_variable::wait(unique_lock<mutex>&)@GLIBCXX_3.4.11 as an
alias for a new symbol, __gnu_cxx::__nothrow_wait_cv::wait, which still
has the incorrect noexcept guarantee. That __nothrow_wait_cv::wait is
just a wrapper around the real condition_variable::wait which adds
noexcept and so terminates on a __forced_unwind exception.

This doesn't work on uclibc, possibly due to a dynamic linker bug. When
__nothrow_wait_cv::wait calls the condition_variable::wait function it
binds to the alias symbol, which means it just calls itself recursively
until the stack overflows.

This change avoids the possibility of a recursive call by changing the
__nothrow_wait_cv::wait function so that instead of calling
condition_variable::wait it re-implements it. This requires accessing
the private _M_cond member of condition_variable, so we need to use the
trick of instantiating a template with the member-pointer of the private
member.
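
For readers unfamiliar with the trick mentioned above, here is a minimal
standalone sketch. It is not the libstdc++ code: the Counter class, its
hidden_ member, hidden_ptr and read_hidden are hypothetical names chosen
only for illustration. It relies on the rule that access checking is not
applied to names used in an explicit instantiation.

```cpp
#include <cassert>

// Hypothetical class standing in for std::condition_variable.
class Counter
{
  int hidden_ = 42;   // private, and no accessor is provided
};

namespace
{
  // Pointer-to-member, set as a side effect of initializing cracker<P>::value
  // during program start-up.
  int Counter::* hidden_ptr;

  template<int Counter::* P>
    struct cracker
    { static int Counter::* value; };

  // Initializing this static member also assigns hidden_ptr.
  template<int Counter::* P>
    int Counter::* cracker<P>::value = hidden_ptr = P;

  // An explicit instantiation may name the private member.
  template struct cracker<&Counter::hidden_>;
}

// Reads the private member through the captured pointer-to-member.
int read_hidden(const Counter& c)
{ return c.*hidden_ptr; }
```

The explicit instantiation is the only place that names the private
member; everything else goes through the captured pointer.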

libstdc++-v3/ChangeLog:

PR libstdc++/105730
* src/c++11/compatibility-condvar.cc (__nothrow_wait_cv::wait):
Access private data member of base class and call its wait
member.
---
 .../src/c++11/compatibility-condvar.cc| 22 ++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/src/c++11/compatibility-condvar.cc b/libstdc++-v3/src/c++11/compatibility-condvar.cc
index e3a8b8403ca..3cef3bc0714 100644
--- a/libstdc++-v3/src/c++11/compatibility-condvar.cc
+++ b/libstdc++-v3/src/c++11/compatibility-condvar.cc
@@ -67,6 +67,24 @@ _GLIBCXX_END_NAMESPACE_VERSION
 && defined(_GLIBCXX_HAVE_SYMVER_SYMBOL_RENAMING_RUNTIME_SUPPORT)
 namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
 {
+namespace
+{
+  // Pointer-to-member for private std::condition_variable::_M_cond member.
+  std::__condvar std::condition_variable::* __base_member;
+
+  template<std::__condvar std::condition_variable::* X>
+    struct cracker
+    { static std::__condvar std::condition_variable::* value; };
+
+  // Initializer for this static member also initializes __base_member.
+  template<std::__condvar std::condition_variable::* X>
+    std::__condvar std::condition_variable::*
+      cracker<X>::value = __base_member = X;
+
+  // Explicit instantiation is allowed to access the private member.
+  template class cracker<&std::condition_variable::_M_cond>;
+}
+
 struct __nothrow_wait_cv : std::condition_variable
 {
   void wait(std::unique_lock<std::mutex>&) noexcept;
@@ -76,7 +94,9 @@ __attribute__((used))
 void
 __nothrow_wait_cv::wait(std::unique_lock<std::mutex>& lock) noexcept
 {
-  this->condition_variable::wait(lock);
+  // In theory this could be simply this->std::condition_variable::wait(lock)
+  // but with uclibc that binds to the @GLIBCXX_3.4.11 symbol, see PR 105730.
+  (this->*__base_member).wait(*lock.mutex());
 }
 } // namespace __gnu_cxx
 
-- 
2.38.1



[committed] libstdc++: Add std::format support to <chrono>

2022-12-22 Thread Jonathan Wakely via Gcc-patches
Another big missing piece of C++20 support, but header-only this time so
no new symbol exports. The last thing missing for C++20 <chrono> is
std::chrono::parse.

Tested x86_64-linux, sparc-solaris2.11, powerpc-aix. Pushed to trunk.

-- >8 --

This adds the operator<< overloads and std::formatter specializations
required by C++20 so that <chrono> types can be written to ostreams and
printed with std::format.

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/std/chrono (operator<<): Move to new header.
(nonexistent_local_time::_M_make_what_str): Define correctly.
(ambiguous_local_time::_M_make_what_str): Likewise.
* include/bits/chrono_io.h: New file.
* src/c++20/tzdb.cc (operator<<(ostream&, const Rule&)): Use
new ostream output for month and weekday types.
* testsuite/20_util/duration/io.cc: Test std::format support.
* testsuite/std/time/exceptions.cc: Check what() strings.
* testsuite/std/time/syn_c++20.cc: Uncomment local_time_format.
* testsuite/std/time/time_zone/get_info_local.cc: Enable check
for formatted output of local_info objects.
* testsuite/std/time/clock/file/io.cc: New test.
* testsuite/std/time/clock/gps/io.cc: New test.
* testsuite/std/time/clock/system/io.cc: New test.
* testsuite/std/time/clock/tai/io.cc: New test.
* testsuite/std/time/clock/utc/io.cc: New test.
* testsuite/std/time/day/io.cc: New test.
* testsuite/std/time/format.cc: New test.
* testsuite/std/time/hh_mm_ss/io.cc: New test.
* testsuite/std/time/month/io.cc: New test.
* testsuite/std/time/weekday/io.cc: New test.
* testsuite/std/time/year/io.cc: New test.
* testsuite/std/time/year_month_day/io.cc: New test.
---
 libstdc++-v3/include/Makefile.am  |1 +
 libstdc++-v3/include/Makefile.in  |1 +
 libstdc++-v3/include/bits/chrono_io.h | 2469 +
 libstdc++-v3/include/std/chrono   |  164 +-
 libstdc++-v3/src/c++20/tzdb.cc|   12 +-
 libstdc++-v3/testsuite/20_util/duration/io.cc |   48 +
 .../testsuite/std/time/clock/file/io.cc   |   23 +
 .../testsuite/std/time/clock/gps/io.cc|   24 +
 .../testsuite/std/time/clock/system/io.cc |   72 +
 .../testsuite/std/time/clock/tai/io.cc|   24 +
 .../testsuite/std/time/clock/utc/io.cc|  120 +
 libstdc++-v3/testsuite/std/time/day/io.cc |   75 +
 libstdc++-v3/testsuite/std/time/exceptions.cc |4 +-
 libstdc++-v3/testsuite/std/time/format.cc |  117 +
 .../testsuite/std/time/hh_mm_ss/io.cc |   46 +
 libstdc++-v3/testsuite/std/time/month/io.cc   |   98 +
 libstdc++-v3/testsuite/std/time/syn_c++20.cc  |3 +-
 .../std/time/time_zone/get_info_local.cc  |2 -
 libstdc++-v3/testsuite/std/time/weekday/io.cc |  101 +
 libstdc++-v3/testsuite/std/time/year/io.cc|   89 +
 .../testsuite/std/time/year_month_day/io.cc   |  121 +
 21 files changed, 3465 insertions(+), 149 deletions(-)
 create mode 100644 libstdc++-v3/include/bits/chrono_io.h
 create mode 100644 libstdc++-v3/testsuite/std/time/clock/file/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/clock/gps/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/clock/system/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/clock/tai/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/clock/utc/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/day/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/format.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/hh_mm_ss/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/month/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/weekday/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/year/io.cc
 create mode 100644 libstdc++-v3/testsuite/std/time/year_month_day/io.cc

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 92b5450fc14..e91f4ddd4de 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -175,6 +175,7 @@ bits_headers = \
${bits_srcdir}/char_traits.h \
${bits_srcdir}/charconv.h \
${bits_srcdir}/chrono.h \
+   ${bits_srcdir}/chrono_io.h \
${bits_srcdir}/codecvt.h \
${bits_srcdir}/cow_string.h \
${bits_srcdir}/deque.tcc \
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index 5d00f90a423..06589d53856 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -528,6 +528,7 @@ bits_freestanding = \
 @GLIBCXX_HOSTED_TRUE@  ${bits_srcdir}/char_traits.h \
 @GLIBCXX_HOSTED_TRUE@  ${bits_srcdir}/charconv.h \
 @GLIBCXX_HOSTED_TRUE@  ${bits_srcdir}/chrono.h \
+@GLIBCXX_HOSTED_TRUE@  ${bits_srcdir}/chrono_io.h \
 @GLIBCXX_HOSTED_TRUE@  ${bits_srcdir}/codecvt.h \
 @GLIBCXX_HOSTED_TRUE@  ${bits_s

[committed] libstdc++: Add helper function in <format>

2022-12-22 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, sparc-solaris2.11, powerpc-aix. Pushed to trunk.

-- >8 --

Add a new __format::__write_padded_as_spec helper to remove duplicated
code in formatter specializations.
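
As a rough illustration of what the shared helper does, the sketch below
reproduces the same decision logic in standalone form. It is a
simplification under stated assumptions, not the library code: Spec, Align
and the std::string return type are hypothetical stand-ins for the
internal _Spec, _Align and output-iterator machinery.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-ins for the library-internal alignment and spec types.
enum class Align { Default, Left, Right, Centre };

struct Spec
{
  std::size_t width = 0;          // requested minimum field width
  Align align = Align::Default;   // Default means "not given in the spec"
  char fill = ' ';
};

// Write STR padded to SPEC's width; FALLBACK is used when SPEC has no alignment.
std::string
write_padded_as_spec(std::string_view str, std::size_t estimated_width,
                     const Spec& spec, Align fallback = Align::Left)
{
  // No padding is needed if the text already fills the field.
  if (spec.width <= estimated_width)
    return std::string(str);

  const std::size_t nfill = spec.width - estimated_width;
  const Align align = spec.align != Align::Default ? spec.align : fallback;

  std::size_t left = 0, right = 0;
  switch (align)
    {
    case Align::Right:  left = nfill; break;
    case Align::Centre: left = nfill / 2; right = nfill - left; break;
    default:            right = nfill; break;
    }

  std::string out(left, spec.fill);
  out += str;
  out.append(right, spec.fill);
  return out;
}
```

The fallback argument corresponds to the callers in the patch that differ
only in their default alignment (left for strings, right for the pointer
formatter).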

libstdc++-v3/ChangeLog:

* include/std/format (__format::__write_padded_as_spec): New
function.
(__format::__formatter_str, __format::__formatter_int::format)
(formatter): Use it.
---
 libstdc++-v3/include/std/format | 75 +++--
 1 file changed, 34 insertions(+), 41 deletions(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 9c928371415..98421e8c123 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -597,6 +597,7 @@ namespace __format
   return __dest;
 }
 
+  // Write STR to OUT (and do so efficiently if OUT is a _Sink_iter).
   template
 requires output_iterator<_Out, const _CharT&>
 inline _Out
@@ -668,6 +669,30 @@ namespace __format
   return __out;
 }
 
+  // Write STR to OUT, with alignment and padding as determined by SPEC.
+  // pre: __spec._M_align != _Align_default || __align != _Align_default
+  template<typename _CharT, typename _Out>
+_Out
+__write_padded_as_spec(basic_string_view<type_identity_t<_CharT>> __str,
+  size_t __estimated_width,
+  basic_format_context<_Out, _CharT>& __fc,
+  const _Spec<_CharT>& __spec,
+  _Align __align = _Align_left)
+{
+  size_t __width = __spec._M_get_width(__fc);
+
+  if (__width <= __estimated_width)
+   return __format::__write(__fc.out(), __str);
+
+  const size_t __nfill = __width - __estimated_width;
+
+  if (__spec._M_align)
+   __align = __spec._M_align;
+
+  return __format::__write_padded(__fc.out(), __str, __align, __nfill,
+ __spec._M_fill);
+}
+
   // A lightweight optional.
   struct _Optional_locale
   {
@@ -799,7 +824,7 @@ namespace __format
   }
 
   template
-   typename basic_format_context<_Out, _CharT>::iterator
+   _Out
format(basic_string_view<_CharT> __s,
   basic_format_context<_Out, _CharT>& __fc) const
{
@@ -824,16 +849,8 @@ namespace __format
}
}
 
- size_t __width = _M_spec._M_get_width(__fc);
-
- if (__width <= __estimated_width)
-   return __format::__write(__fc.out(), __s);
-
- const size_t __nfill = __width - __estimated_width;
- _Align __align = _M_spec._M_align ? _M_spec._M_align : _Align_left;
-
- return __format::__write_padded(__fc.out(), __s,
- __align, __nfill, _M_spec._M_fill);
+ return __format::__write_padded_as_spec(__s, __estimated_width,
+ __fc, _M_spec);
}
 
 #if __cpp_lib_format_ranges
@@ -1089,32 +1106,16 @@ namespace __format
  __est_width = __s.size();
}
 
- return _M_format_str(__s, __est_width, __fc);
+ return __format::__write_padded_as_spec(__s, __est_width, __fc,
+ _M_spec);
}
 
   template
typename basic_format_context<_Out, _CharT>::iterator
_M_format_character(_CharT __c,
  basic_format_context<_Out, _CharT>& __fc) const
-   { return _M_format_str({&__c, 1u}, 1, __fc); }
-
-  template
-   typename basic_format_context<_Out, _CharT>::iterator
-   _M_format_str(basic_string_view<_CharT> __str, size_t __est_width,
- basic_format_context<_Out, _CharT>& __fc) const
{
- // TODO: this is identical to last part of __formatter_str::format
- // so refactor to reuse the same code.
-
- size_t __width = _M_spec._M_get_width(__fc);
-
- if (__width <= __est_width)
-   return __format::__write(__fc.out(), __str);
-
- size_t __nfill = __width - __est_width;
- _Align __align = _M_spec._M_align ? _M_spec._M_align : _Align_left;
- return __format::__write_padded(__fc.out(), __str,
- __align, __nfill, _M_spec._M_fill);
+ return __format::__write_padded_as_spec({&__c, 1u}, 1, __fc, _M_spec);
}
 
   template
@@ -2135,20 +2136,12 @@ namespace __format
  __str = wstring_view(__p, __n);
}
 
- size_t __width = _M_spec._M_get_width(__fc);
-
- if (__width <= (size_t)__n)
-   return __format::__write(__fc.out(), __str);
-
- size_t __nfill = __width - __n;
- __format::_Align __align
-   = _M_spec._M_align ? _M_spec._M_align : __format::_Align_right;
- return __format::__write_padded(__fc.out(), __str,
- __align, __nfill, _M_spec._M_fill);
+ return __format::__write_padded_as_spec(__str, __n, __fc, _M_spec,
+  

[pushed] testsuite: don't declare printf in coro.h

2022-12-22 Thread Jason Merrill via Gcc-patches
mingw stdio.h plays horrible games with extern "C++", but it also seems
sloppy for coro.h to declare printf in testcases that will also include
standard headers.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/coro.h: #include  instead of
declaring puts/printf.
* g++.dg/coroutines/torture/mid-suspend-destruction-0.C:
#include .
* g++.dg/coroutines/pr95599.C: Use PRINT instead of puts.
* g++.dg/coroutines/torture/call-00-co-aw-arg.C:
* g++.dg/coroutines/torture/call-01-multiple-co-aw.C:
* g++.dg/coroutines/torture/call-02-temp-co-aw.C:
* g++.dg/coroutines/torture/call-03-temp-ref-co-aw.C:
* g++.dg/coroutines/torture/co-await-00-trivial.C:
* g++.dg/coroutines/torture/co-await-01-with-value.C:
* g++.dg/coroutines/torture/co-await-02-xform.C:
* g++.dg/coroutines/torture/co-await-03-rhs-op.C:
* g++.dg/coroutines/torture/co-await-04-control-flow.C:
* g++.dg/coroutines/torture/co-await-05-loop.C:
* g++.dg/coroutines/torture/co-await-06-ovl.C:
* g++.dg/coroutines/torture/co-await-07-tmpl.C:
* g++.dg/coroutines/torture/co-await-08-cascade.C:
* g++.dg/coroutines/torture/co-await-09-pair.C:
* g++.dg/coroutines/torture/co-await-11-forwarding.C:
* g++.dg/coroutines/torture/co-await-12-operator-2.C:
* g++.dg/coroutines/torture/co-await-13-return-ref.C:
* g++.dg/coroutines/torture/co-await-14-return-ref-to-auto.C:
* g++.dg/coroutines/torture/pr95003.C: Likewise.
---
 gcc/testsuite/g++.dg/coroutines/coro.h   | 5 +
 gcc/testsuite/g++.dg/coroutines/pr95599.C| 2 +-
 gcc/testsuite/g++.dg/coroutines/torture/call-00-co-aw-arg.C  | 4 ++--
 .../g++.dg/coroutines/torture/call-01-multiple-co-aw.C   | 2 +-
 gcc/testsuite/g++.dg/coroutines/torture/call-02-temp-co-aw.C | 2 +-
 .../g++.dg/coroutines/torture/call-03-temp-ref-co-aw.C   | 2 +-
 .../g++.dg/coroutines/torture/co-await-00-trivial.C  | 2 +-
 .../g++.dg/coroutines/torture/co-await-01-with-value.C   | 2 +-
 gcc/testsuite/g++.dg/coroutines/torture/co-await-02-xform.C  | 2 +-
 gcc/testsuite/g++.dg/coroutines/torture/co-await-03-rhs-op.C | 2 +-
 .../g++.dg/coroutines/torture/co-await-04-control-flow.C | 2 +-
 gcc/testsuite/g++.dg/coroutines/torture/co-await-05-loop.C   | 2 +-
 gcc/testsuite/g++.dg/coroutines/torture/co-await-06-ovl.C| 2 +-
 gcc/testsuite/g++.dg/coroutines/torture/co-await-07-tmpl.C   | 2 +-
 .../g++.dg/coroutines/torture/co-await-08-cascade.C  | 2 +-
 gcc/testsuite/g++.dg/coroutines/torture/co-await-09-pair.C   | 2 +-
 .../g++.dg/coroutines/torture/co-await-11-forwarding.C   | 2 +-
 .../g++.dg/coroutines/torture/co-await-12-operator-2.C   | 2 +-
 .../g++.dg/coroutines/torture/co-await-13-return-ref.C   | 2 +-
 .../coroutines/torture/co-await-14-return-ref-to-auto.C  | 2 +-
 .../g++.dg/coroutines/torture/mid-suspend-destruction-0.C| 1 +
 gcc/testsuite/g++.dg/coroutines/torture/pr95003.C| 2 +-
 22 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/gcc/testsuite/g++.dg/coroutines/coro.h b/gcc/testsuite/g++.dg/coroutines/coro.h
index 02d26602727..491177f0cfd 100644
--- a/gcc/testsuite/g++.dg/coroutines/coro.h
+++ b/gcc/testsuite/g++.dg/coroutines/coro.h
@@ -129,10 +129,6 @@ namespace coro = std;
 
 #endif // __has_include()
 
-/* just to avoid cluttering dump files. */
-extern "C" int puts (const char *);
-extern "C" int printf (const char *, ...);
-
 #include  /* for abort () */
 
 #include  /* for std::forward */
@@ -141,6 +137,7 @@ extern "C" int printf (const char *, ...);
 #  define PRINT(X)
 #  define PRINTF (void)
 #else
+#include 
 #  define PRINT(X) puts(X)
 #  define PRINTF printf
 #endif
diff --git a/gcc/testsuite/g++.dg/coroutines/pr95599.C b/gcc/testsuite/g++.dg/coroutines/pr95599.C
index 9376359d378..ec97a4aa165 100644
--- a/gcc/testsuite/g++.dg/coroutines/pr95599.C
+++ b/gcc/testsuite/g++.dg/coroutines/pr95599.C
@@ -64,6 +64,6 @@ int main ()
   PRINTF ("something happened in the wrong order %d, %d, %d, %d, %d, %d, %d\n", a, b, c, d, e, f, g);
   abort ();
 }
-  puts ("main: done");
+  PRINT ("main: done");
   return 0;
 }
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/call-00-co-aw-arg.C b/gcc/testsuite/g++.dg/coroutines/torture/call-00-co-aw-arg.C
index ee108072f69..19e3ec1fe68 100644
--- a/gcc/testsuite/g++.dg/coroutines/torture/call-00-co-aw-arg.C
+++ b/gcc/testsuite/g++.dg/coroutines/torture/call-00-co-aw-arg.C
@@ -68,6 +68,6 @@ int main ()
   abort ();
 }
 
-  puts ("main: done");
+  PRINT ("main: done");
   return 0;
-}
\ No newline at end of file
+}
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/call-01-multiple-co-aw.C b/gcc/testsuite/g++.dg/coroutines/torture/call-01-multiple-co-aw.C
index 0f5785163fc..573f4f86a52 100644
--- a/gcc/testsuite/g++.dg/coroutines/torture/call-01-multiple-co-aw.C
+++ b/gc

[committed] libstdc++: Implement C++20 time zone support in <chrono>

2022-12-22 Thread Jonathan Wakely via Gcc-patches
This is the finished version of the last patch I posted before the end
of stage 1. This is quite late for stage 1 (!) and adds new symbols to
the shared library, but I'm pushing it now as it's an important piece of
C++20 support. As noted in the commit message, the symbols being added
are stable parts of the standard-required API, and all the internals are
hidden behind pimpl classes, so the ABI wouldn't need to change even if
the entire internal implementation was replaced.

Tested x86_64-linux, sparc-solaris2.11, powerpc-aix. Pushed to trunk.

-- >8 --

This is the largest missing piece of C++20 support. Only the cxx11 ABI
is supported, due to the use of std::string in the API for time zones.
For the old gcc4 ABI, utc_clock and leap seconds are supported, but only
using a hardcoded list of leap seconds, no up-to-date tzdb::leap_seconds
information is available, and no time zones or zoned_time conversions.

The implementation currently depends on a tzdata.zi file being provided
by the OS or the user. The expected location is /usr/share/zoneinfo but
that can be changed using --with-libstdcxx-zoneinfo-dir=PATH. On targets
that support it there is also a weak symbol that users can override in
their own program (which also helps with testing):

extern "C++" const char* __gnu_cxx::zoneinfo_dir_override();

If no file is found, a fallback tzdb object will be created which only
contains the "Etc/UTC" and "Etc/GMT" time zones.
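
A sketch of how a program might provide that override, based only on the
declaration quoted above. The directory "/opt/tzdata" is a hypothetical
path used for illustration; it would need to contain a tzdata.zi file, and
the override only has an effect on targets where the library's own
definition is weak.

```cpp
#include <cassert>
#include <cstring>

namespace __gnu_cxx
{
  // User-provided definition intended to replace the library's weak one.
  // "/opt/tzdata" is a made-up location for this example.
  const char* zoneinfo_dir_override()
  { return "/opt/tzdata"; }
}
```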

A leapseconds file is also expected in the same directory, but if that
isn't present then a hardcoded list of leapseconds is used, which is
correct at least as far as 2023-06-28 (and it currently looks like no
leap second will be inserted for a few years).

The tzdata.zi and leapseconds files from https://www.iana.org/time-zones
are in the public domain, so shipping copies of them with GCC would be
an option. However, the tzdata.zi file will rapidly become outdated, so
users should really provide it themselves (or convince their OS vendor
to do so). It would also be possible to implement an alternative parser
for the compiled tzdata files (one per time zone) under
/usr/share/zoneinfo. Those files are present on more operating systems,
but do not contain all the information present in tzdata.zi.
Specifically, the "links" are not present, so that e.g. "UTC" and
"Universal" are distinct time zones, rather than both being links to the
canonical "Etc/UTC" zone. For some platforms those files are hard links
to the same file, but there's no indication which zone is the canonical
name and which is a link. Other platforms just store them in different
inodes anyway. I do not plan to add such an alternative parser for the
compiled files. That would need to be contributed by maintainers or
users of targets that require it, if making tzdata.zi available is not
an option. The library ABI would not need to change for a new tzdb
implementation, because everything in tzdb_list, tzdb and time_zone is
implemented as a pimpl (except for the shared_ptr links between nodes,
described below). That means the new exported symbols added by this
commit should be stable even if the implementation is completely
rewritten.

The information from tzdata.zi is parsed and stored in data structures
that closely model the info in the file. This is a space-efficient
representation that uses less memory than storing every transition for
every time zone.  It also avoids spending time expanding that
information into time zone transitions that might never be needed by the
program.  When a conversion to/from a local time to UTC is requested the
information will be processed to determine the time zone transitions
close to the time being converted.

There is a bug in some time zone transitions. When generating a sys_info
object immediately after one that was previously generated, we need to
find the previous rule that was in effect and note its offset and
letters. This is so that the start time and abbreviation of the new
sys_info will be correct. This only affects time zones that use a format
like "C%sT" where the LETTERS replacing %s are non-empty for standard
time, e.g. "Asia/Shanghai" which uses "CST" for standard time and "CDT"
for daylight time.

The tzdb_list structure maintains a linked list of tzdb nodes using
shared_ptr links. This allows the iterators into the list to share
ownership with the list itself. This offers a non-portable solution to a
lifetime issue in the API. Because tzdb objects can be erased from the
list using tzdb_list::erase_after, separate modules/libraries in a large
program cannot guarantee that any const tzdb& or const time_zone*
remains valid indefinitely. Holding onto a tzdb_list::const_iterator
will extend the tzdb object's lifetime, even if it's erased from the
list. An alternative design would be for the list iterator to hold a
weak_ptr. This would allow users to test whether the tzdb still exists
when the iterator is dereferenced, which is better than just having a
dangling raw pointer. That do
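
The lifetime behaviour described above, where an iterator keeps an erased
node alive, can be sketched with a toy list. This is an illustration
under assumptions, not the tzdb_list implementation: Node, List and
pop_front are hypothetical names, and the real list stores tzdb objects
behind a pimpl.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical node type; the real nodes hold a tzdb.
struct Node
{
  std::string version;
  std::shared_ptr<Node> next;
};

class List
{
  std::shared_ptr<Node> head_;

public:
  // The iterator shares ownership of the node it refers to.
  class const_iterator
  {
    std::shared_ptr<Node> node_;

  public:
    explicit const_iterator(std::shared_ptr<Node> n) : node_(std::move(n)) { }
    const Node& operator*() const { return *node_; }
  };

  void push_front(std::string v)
  { head_ = std::make_shared<Node>(Node{std::move(v), head_}); }

  // Unlinks the first node; it is destroyed only when no iterator holds it.
  void pop_front()
  { if (head_) head_ = head_->next; }

  bool empty() const { return !head_; }

  const_iterator begin() const { return const_iterator(head_); }
};
```

Holding the iterator is what extends the lifetime; a plain reference or
raw pointer obtained from it would still dangle once the last owner goes
away.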

[committed] libstdc++: Add GDB printers for <chrono> types

2022-12-22 Thread Jonathan Wakely via Gcc-patches
These should really have tests for the new types, but I've been using
them heavily for a few weeks and they work well. I would rather get them
committed now and add tests later.

Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdChronoDurationPrinter)
(StdChronoTimePointPrinter, StdChronoZonedTimePrinter)
(StdChronoCalendarPrinter, StdChronoTimeZonePrinter)
(StdChronoLeapSecondPrinter, StdChronoTzdbPrinter)
(StdChronoTimeZoneRulePrinter): New printers.
---
 libstdc++-v3/python/libstdcxx/v6/printers.py | 265 ++-
 1 file changed, 261 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 1abf0a4bce3..7e694f48f28 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -19,6 +19,7 @@ import gdb
 import itertools
 import re
 import sys, os, errno
+from datetime import datetime, timezone
 
 ### Python 2 + Python 3 compatibility code
 
@@ -1871,6 +1872,239 @@ class StdFormatArgsPrinter:
 size = self.val['_M_unpacked_size']
 return "%s with %d arguments" % (typ, size)
 
+def std_ratio_t_tuple(ratio_type):
+# TODO use reduced period i.e. duration::period
+period = ratio_type
+num = period.template_argument(0)
+den = period.template_argument(1)
+return (num, den)
+
+class StdChronoDurationPrinter:
+"Print a std::chrono::duration"
+
+def __init__(self, typename, val):
+self.typename = strip_versioned_namespace(typename)
+self.val = val
+
+def _ratio(self):
+# TODO use reduced period i.e. duration::period
+period = self.val.type.template_argument(1)
+num = period.template_argument(0)
+den = period.template_argument(1)
+return (num, den)
+
+def _suffix(self):
+num, den = self._ratio()
+if num == 1:
+if den == 1:
+return 's'
+if den == 1000:
+return 'ms'
+if den == 1000000:
+return 'us'
+if den == 1000000000:
+return 'ns'
+elif den == 1:
+if num == 60:
+return 'min'
+if num == 3600:
+return 'h'
+if num == 86400:
+return 'd'
+return '[{}]s'.format(num)
+return "[{}/{}]s".format(num, den)
+
+def to_string(self):
+return "std::chrono::duration = { %d%s }" % (self.val['__r'], self._suffix())
+
+
+class StdChronoTimePointPrinter:
+"Print a std::chrono::time_point"
+
+def __init__(self, typename, val):
+self.typename = strip_versioned_namespace(typename)
+self.val = val
+
+def _clock(self):
+clock = self.val.type.template_argument(0)
+name = strip_versioned_namespace(clock.name)
+if name == 'std::chrono::_V2::system_clock' \
+or name == 'std::chrono::system_clock':
+return ('std::chrono::sys_time', 0)
+# XXX need to remove leap seconds from utc, gps, and tai
+#if name == 'std::chrono::utc_clock':
+#return ('std::chrono::utc_time', 0)
+#if name == 'std::chrono::gps_clock':
+#return ('std::chrono::gps_clock time_point', 315964809)
+#if name == 'std::chrono::tai_clock':
+#return ('std::chrono::tai_clock time_point', -378691210)
+if name == 'std::filesystem::__file_clock':
+return ('std::chrono::file_time', 6437664000)
+if name == 'std::chrono::local_t':
+return ('std::chrono::local_time', 0)
+return ('{} time_point'.format(name), None)
+
+def to_string(self):
+clock, offset = self._clock()
+d = self.val['__d']
+r = d['__r']
+printer = StdChronoDurationPrinter(d.type.name, d)
+suffix = printer._suffix()
+time = ''
+if offset is not None:
+num, den = printer._ratio()
+secs = (r * num / den) + offset
+try:
+dt = datetime.fromtimestamp(secs, timezone.utc)
+time = ' [{:%Y-%m-%d %H:%M:%S}]'.format(dt)
+except:
+pass
+return '%s = {%d%s%s}' % (clock, r, suffix, time)
+
+class StdChronoZonedTimePrinter:
+"Print a std::chrono::zoned_time"
+
+def __init__(self, typename, val):
+self.typename = strip_versioned_namespace(typename)
+self.val = val
+
+def to_string(self):
+zone = self.val['_M_zone'].dereference()
+time = self.val['_M_tp']
+return 'std::chrono::zoned_time = {{{} {}}}'.format(zone, time)
+
+
+months = [None, 'January', 'February', 'March', 'April', 'May', 'June',
+  'July', 'August', 'September', 'October', 'November', 'December']
+
+weekdays = ['Sunday', 'Monday', 'Tuesday', '

[x86 PATCH] PR target/107548: Handle vec_select in STV.

2022-12-22 Thread Roger Sayle

This patch enhances x86's STV pass to handle VEC_SELECT during general
scalar chain conversion, performing SImode scalar extraction from V4SI
and DImode scalar extraction from V2DI vector registers.

The motivating test case from bugzilla is:

typedef unsigned int v4si __attribute__((vector_size(16)));

unsigned int f (v4si a, v4si b)
{
  a[0] += b[0];
  return a[0] + a[1];
}

currently with -O2 -march=znver2 this generates:

vpextrd $1, %xmm0, %edx
vmovd   %xmm0, %eax
addl    %edx, %eax
vmovd   %xmm1, %edx
addl    %edx, %eax
ret

which performs three transfers from the vector unit to the scalar unit,
and performs the two additions there.  With this patch, we now generate:

vmovdqa %xmm0, %xmm2
vpshufd $85, %xmm0, %xmm0
vpaddd  %xmm0, %xmm2, %xmm0
vpaddd  %xmm1, %xmm0, %xmm0
vmovd   %xmm0, %eax
ret

which performs the two additions in the vector unit, and then transfers
the result to the scalar unit.  Technically the (cheap) movdqa isn't
needed with better register allocation (or this could be cleaned up
during peephole2), but even so this transform is still a win.
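
As a side note on the "vpshufd $85" above: the PSHUFD immediate holds one
2-bit source-lane index per destination lane, so copying lane 1 into all
four lanes gives 0b01010101 = 85. The helper below is a hypothetical
illustration of that arithmetic, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: PSHUFD imm8 that copies source lane LANE (0..3)
// into all four 32-bit destination lanes.
constexpr std::uint8_t
pshufd_broadcast_imm(unsigned lane)
{
  lane &= 3u;
  return static_cast<std::uint8_t>(lane | (lane << 2) | (lane << 4) | (lane << 6));
}

static_assert(pshufd_broadcast_imm(1) == 85, "lane 1 broadcast is $85");
```

Once the selected element is in lane 0, the final vmovd can extract the
result after the vector additions.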

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-12-22  Roger Sayle  

gcc/ChangeLog
PR target/107548
* config/i386/i386-features.cc (scalar_chain::add_insn): The
operands of a VEC_SELECT don't need to be added to the scalar chain.
(general_scalar_chain::compute_convert_gain) <case VEC_SELECT>:
Provide gains for performing STV on a VEC_SELECT.
(general_scalar_chain::convert_insn): Convert VEC_SELECT to pshufd,
psrldq or no-op.
(general_scalar_to_vector_candidate_p): Handle VEC_SELECT of a
single element from a vector register to a scalar register.

gcc/testsuite/ChangeLog
PR target/107548
* gcc.target/i386/pr107548-1.c: New test V4SI case.
* gcc.target/i386/pr107548-2.c: New test V2DI case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index fd212262..cb21d3b 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -429,6 +429,11 @@ scalar_chain::add_insn (bitmap candidates, unsigned int insn_uid)
   for (ref = DF_INSN_UID_DEFS (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
 if (!HARD_REGISTER_P (DF_REF_REG (ref)))
   analyze_register_chain (candidates, ref);
+
+  /* The operand(s) of VEC_SELECT don't need to be converted/convertible.  */
+  if (def_set && GET_CODE (SET_SRC (def_set)) == VEC_SELECT)
+return;
+
   for (ref = DF_INSN_UID_USES (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
 if (!DF_REF_REG_MEM_P (ref))
   analyze_register_chain (candidates, ref);
@@ -629,6 +634,23 @@ general_scalar_chain::compute_convert_gain ()
  }
break;
 
+ case VEC_SELECT:
+   if (XVECEXP (XEXP (src, 1), 0, 0) == const0_rtx)
+ {
+   // movd (4 bytes) replaced with movdqa (4 bytes).
+   if (!optimize_insn_for_size_p ())
+ igain += ix86_cost->sse_to_integer - ix86_cost->xmm_move;
+ }
+   else
+ {
+   // pshufd; movd replaced with pshufd.
+   if (optimize_insn_for_size_p ())
+ igain += COSTS_N_BYTES (4);
+   else
+ igain += ix86_cost->sse_to_integer;
+ }
+   break;
+
  default:
gcc_unreachable ();
  }
@@ -1167,6 +1189,24 @@ general_scalar_chain::convert_insn (rtx_insn *insn)
   convert_op (&src, insn);
   break;
 
+case VEC_SELECT:
+  if (XVECEXP (XEXP (src, 1), 0, 0) == const0_rtx)
+   src = XEXP (src, 0);
+  else if (smode == DImode)
+   {
+ rtx tmp = gen_lowpart (V1TImode, XEXP (src, 0));
+ dst = gen_lowpart (V1TImode, dst);
+ src = gen_rtx_LSHIFTRT (V1TImode, tmp, GEN_INT (64));
+   }
+  else
+   {
+ rtx tmp = XVECEXP (XEXP (src, 1), 0, 0);
+ rtvec vec = gen_rtvec (4, tmp, tmp, tmp, tmp);
+ rtx par = gen_rtx_PARALLEL (VOIDmode, vec);
+ src = gen_rtx_VEC_SELECT (vmode, XEXP (src, 0), par);
+   }
+  break;
+
 default:
   gcc_unreachable ();
 }
@@ -1917,6 +1957,16 @@ general_scalar_to_vector_candidate_p (rtx_insn *insn, 
enum machine_mode mode)
 case CONST_INT:
   return REG_P (dst);
 
+case VEC_SELECT:
+  /* Excluding MEM_P (dst) avoids interfering with vpextr[dq].  */
+  return REG_P (dst)
+&& REG_P (XEXP (src, 0))
+&& GET_MODE (XEXP (src, 0)) == (mode == DImode ? V2DImode
+   : V4SImode)
+&& GET_CODE (XEXP (src, 1)) == PARALLEL
+&& XVECLEN (XEXP (src, 1), 0) == 1
+

[x86 PATCH] PR target/106933: Limit TImode STV to SSA-like def-use chains.

2022-12-22 Thread Roger Sayle

With many thanks to H.J. for doing all the hard work, this patch resolves
two P1 regressions; PR target/106933 and PR target/106959.

Although superficially similar, the i386 backend's two scalar-to-vector
(STV) passes perform their transformations in importantly different ways.
The original pass converting SImode and DImode operations to V4SImode
or V2DImode operations is "soft", allowing values to be maintained in
both integer and vector hard registers.  The newer pass converting TImode
operations to V1TImode is "hard" (all or nothing), converting all uses
of a pseudo to vector form.  To implement this it invokes powerful ju-ju
calling SET_MODE on a REG_rtx, which due to RTL sharing, often updates
this pseudo's mode everywhere in the RTL chain.  Hence, TImode STV can only
be performed when all uses of a pseudo are convertible to V1TImode form.
To ensure this the STV passes currently use data-flow analysis to inspect
all DEFs and USEs in a chain.  This works fine for chains that are in
the usual single assignment form, but the occurrence of uninitialized
variables, or multiple assignments that split a pseudo's usage into
several independent chains (lifetimes) can lead to situations where
some but not all of a pseudo's occurrences need to be updated.  This is
safe for the SImode/DImode pass, but leads to the above bugs during
the TImode pass.

My one minor tweak to HJ's patch from comment #4 of bugzilla PR106959
is to only perform the new single_def_chain_p check for TImode STV; it
turns out that STV of SImode/DImode min/max operates safely on multiple-def
chains, and prohibiting this leads to testsuite regressions.  We don't
(yet) support V1TImode min/max, so this idiom isn't an issue during the
TImode STV pass.

For the record, the two alternate possible fixes are (i) make the TImode
STV pass "soft", by eliminating use of SET_MODE, instead using replace_rtx
with a new pseudo, or (ii) merge "chains" so that multiple DFA
chains/lifetimes are considered a single STV chain.
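
As a rough illustration of the check this patch adds, here is a toy model
of "exactly one definition per pseudo".  It is only a sketch: ToyDefInfo
and toy_single_def_p are hypothetical names, and the real predicate walks
GCC's data-flow def chain rather than a vector.

```cpp
#include <vector>

// Toy stand-in for data-flow information: for each pseudo register
// number, the list of insn ids that define (set) it.  Hypothetical;
// the real code inspects DF_REG_DEF_CHAIN.
struct ToyDefInfo
{
  std::vector<std::vector<int>> defs_of_reg;
};

// True only when REGNO has exactly one definition.  No definition at all
// (an uninitialized use) and multiple definitions both disqualify the
// register, mirroring the intent of single_def_chain_p.
bool toy_single_def_p(const ToyDefInfo &info, unsigned regno)
{
  return regno < info.defs_of_reg.size()
         && info.defs_of_reg[regno].size() == 1;
}
```

In this model a register with an empty list corresponds to the
uninitialized-variable case, and a list with two entries to a pseudo
whose uses are split across independent lifetimes.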

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-12-22  H.J. Lu  
Roger Sayle  

gcc/ChangeLog
PR target/106933
PR target/106959
* config/i386/i386-features.cc (single_def_chain_p): New predicate
function to check that a pseudo's use-def chain is in SSA form.
(timode_scalar_to_vector_candidate_p): Check that TImode regs that
are SET_DEST or SET_SRC of an insn match/are single_def_chain_p.

gcc/testsuite/ChangeLog
PR target/106933
PR target/106959
* gcc.target/i386/pr106933-1.c: New test case.
* gcc.target/i386/pr106933-2.c: Likewise.
* gcc.target/i386/pr106959-1.c: Likewise.
* gcc.target/i386/pr106959-2.c: Likewise.
* gcc.target/i386/pr106959-3.c: Likewise.

Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index fd212262..4bf8bb3 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -1756,6 +1756,19 @@ pseudo_reg_set (rtx_insn *insn)
   return set;
 }
 
+/* Return true if the register REG is defined in a single DEF chain.
+   If it is defined in more than one DEF chain, we may not be able
+   to convert it in all chains.  */
+
+static bool
+single_def_chain_p (rtx reg)
+{
+  df_ref ref = DF_REG_DEF_CHAIN (REGNO (reg));
+  if (!ref)
+return false;
+  return DF_REF_NEXT_REG (ref) == nullptr;
+}
+
 /* Check if comparison INSN may be transformed into vector comparison.
Currently we transform equality/inequality checks which look like:
(set (reg:CCZ 17 flags) (compare:CCZ (reg:TI x) (reg:TI y)))  */
@@ -1972,9 +1985,14 @@ timode_scalar_to_vector_candidate_p (rtx_insn *insn)
   && !TARGET_SSE_UNALIGNED_STORE_OPTIMAL)
 return false;
 
+  if (REG_P (dst) && !single_def_chain_p (dst))
+return false;
+
   switch (GET_CODE (src))
 {
 case REG:
+  return single_def_chain_p (src);
+
 case CONST_WIDE_INT:
   return true;
 
diff --git a/gcc/testsuite/gcc.target/i386/pr106933-1.c 
b/gcc/testsuite/gcc.target/i386/pr106933-1.c
new file mode 100644
index 000..bcd9576
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106933-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2" } */
+
+short int
+bar (void);
+
+__int128
+empty (void)
+{
+}
+
+__attribute__ ((simd)) int
+foo (__int128 *p)
+{
+  int a = 0x8000;
+
+  *p = empty ();
+
+  return *p == (a < bar ());
+}
+
diff --git a/gcc/testsuite/gcc.target/i386/pr106933-2.c 
b/gcc/testsuite/gcc.target/i386/pr106933-2.c
new file mode 100644
index 000..ac7d07e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106933-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-msse4 -Os" } */
+
+__int128 n;
+
+__int128
+empty (void)
+{
+}
+
+int
+foo (void

Re: [PATCH 3/3] contrib: Add dg-out-generator.pl

2022-12-22 Thread Arsen Arsenović via Gcc-patches

Jason Merrill  writes:

> Aha, I wonder why the original tests have the terminal *?  Testcases elsewhere
> in the testsuite that check for (\n|\r\n|\r) don't use *.  I think I'll drop
> the * from both the tests and the script.
>
> Jason

Yep, that sounds reasonable.  I'm not sure why they have it; as far as I
can tell the original handler didn't emit double newlines or anything
like that either.

Thanks for handling that anyway.
-- 
Arsen Arsenović




Re: [PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-22 Thread Patrick Palka via Gcc-patches
On Thu, 22 Dec 2022, Jason Merrill wrote:

> On 12/22/22 16:41, Patrick Palka wrote:
> > On Thu, 22 Dec 2022, Jason Merrill wrote:
> > 
> > > On 12/22/22 11:31, Patrick Palka wrote:
> > > > On Wed, 21 Dec 2022, Jason Merrill wrote:
> > > > 
> > > > > On 12/21/22 09:52, Patrick Palka wrote:
> > > > > > Here during ahead of time checking of C{}, we indirectly call
> > > > > > get_nsdmi
> > > > > > for C::m from finish_compound_literal, which in turn calls
> > > > > > break_out_target_exprs for C::m's (non-templated) initializer,
> > > > > > during
> > > > > > which we end up building a call to A::~A and checking
> > > > > > expr_noexcept_p
> > > > > > for it (from build_vec_delete_1).  But this is all done with
> > > > > > processing_template_decl set, so the built A::~A call is templated
> > > > > > (whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
> > > > > > expr_noexcept_p doesn't expect and we crash.
> > > > > > 
> > > > > > In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding an
> > > > > > expr_noexcept_p call with !processing_template_decl, which works
> > > > > > here
> > > > > > too.  But it seems to me since the initializer we obtain in
> > > > > > get_nsdmi is
> > > > > > always non-templated, it should be calling break_out_target_exprs
> > > > > > with
> > > > > > processing_template_decl cleared since otherwise the function might
> > > > > > end
> > > > > > up mixing templated and non-templated trees.
> > > > > > 
> > > > > > I'm not sure about this though, perhaps this is not the best fix
> > > > > > here.
> > > > > > Alternatively, when processing_template_decl we could make get_nsdmi
> > > > > > avoid calling break_out_target_exprs at all or something.
> > > > > > Additionally,
> > > > > > perhaps break_out_target_exprs should be a no-op more generally when
> > > > > > processing_template_decl since we shouldn't see any TARGET_EXPRs
> > > > > > inside
> > > > > > a template?
> > > > > 
> > > > > Hmm.
> > > > > 
> > > > > Any time we would call break_out_target_exprs we're dealing with
> > > > > non-dependent
> > > > > expressions; if we're in a template, we're building up an initializer
> > > > > or a
> > > > > call that we'll soon throw away, just for the purpose of checking or
> > > > > type
> > > > > computation.
> > > > > 
> > > > > Furthermore, as you say, the argument is always a non-template tree,
> > > > > whether
> > > > > in get_nsdmi or convert_default_arg.  So having
> > > > > processing_template_decl
> > > > > cleared would be correct.
> > > > > 
> > > > > I don't think we can get away with not calling break_out_target_exprs
> > > > > at
> > > > > all
> > > > > in a template; if nothing else, we would lose immediate invocation
> > > > > expansion.
> > > > > However, we could probably skip the bot_manip tree walk, which should
> > > > > avoid
> > > > > the problem.
> > > > > 
> > > > > Either way we end up returning non-template trees, as we do now, and
> > > > > callers
> > > > > have to deal with transient CONSTRUCTORs containing such (as we do in
> > > > > massage_init_elt).
> > > > 
> > > > Ah I see, makes sense.
> > > > 
> > > > > 
> > > > > Does convert_default_arg not run into the same problem, e.g. when
> > > > > calling
> > > > > 
> > > > > void g(B = {0});
> > > > 
> > > > In practice it seems not, because we don't call convert_default_arg
> > > > when processing_template_decl is set (verified with an assert to
> > > > that effect).  In build_over_call for example we exit early when
> > > > processing_template_decl is set, and return a templated CALL_EXPR
> > > > that doesn't include default arguments at all.  A consequence of
> > > > this is that we don't reject ahead of time a call that would use
> > > > an ill-formed dependent default argument, e.g.
> > > > 
> > > > template
> > > > void g(B = T{0});
> > > > 
> > > > template
> > > > void f() {
> > > >   g();
> > > > }
> > > > 
> > > > since the default argument instantiation would be the responsibility
> > > > of convert_default_arg.
> > > > 
> > > > Thinking hypothetically here, if we do in the future want to include
> > > > default
> > > > arguments in the templated form of a CALL_EXPR,
> > > 
> > > We definitely do not want to; the templated form should be as close as
> > > possible to the source.
> > 
> > Ah, sounds good.
> > 
> > > 
> > > We might want to perform non-dependent conversions to get any errors (such
> > > as
> > > this one) before throwing away the result.  Which would be parallel to
> > > what we
> > > currently do in calling get_nsdmi, and would want the same behavior.
> > 
> > *nod*
> > 
> > > 
> > > > [snip]
> > > 
> > > > shall we go with the original approach to clear
> > > > processing_template_decl directly from get_nsdmi?
> > > 
> > > OK, but then we should also checking_assert !processing_template_decl in
> > > b_o_t_e.
> > 
> > Unfortunately we'd trigger that assert from maybe_constant_value, which
> > potentially calls b_o_t_e with pro

Re: [PATCH] go: fix clang warnings

2022-12-22 Thread Ian Lance Taylor via Gcc-patches
On Wed, Dec 21, 2022 at 12:05 AM Martin Liška  wrote:
>
> The patch fixes the following Clang warnings:
>
> gcc/go/gofrontend/escape.cc:1290:17: warning: private field 'fn_' is not used 
> [-Wunused-private-field]
> gcc/go/gofrontend/escape.cc:3478:19: warning: private field 'context_' is not 
> used [-Wunused-private-field]
> gcc/go/gofrontend/lex.h:564:15: warning: private field 'input_file_name_' is 
> not used [-Wunused-private-field]
> gcc/go/gofrontend/types.cc:5788:20: warning: private field 'call_' is not 
> used [-Wunused-private-field]
> gcc/go/gofrontend/wb.cc:206:9: warning: private field 'gogo_' is not used 
> [-Wunused-private-field]
>
> Ready for master?

Thanks.  Committed to mainline.

Ian


Re: [PATCH 3/3] contrib: Add dg-out-generator.pl

2022-12-22 Thread Jason Merrill via Gcc-patches

On 12/22/22 16:56, Arsen Arsenović wrote:

Hi,

Jason Merrill  writes:

+# Newlines should be more tolerant.
+s/\n$/(\\n|\\r\\n|\\r)*/;


Isn't specifically handling \\r\\n redundant with the * operator?


To the extent of my knowledge, yes; I left that in since the original
tests I was replacing with this script also used this terminator:

-// { dg-output "default std::handle_contract_violation called: .*.C 21 
test::fun .*(\n|\r\n|\r)*" }


Aha, I wonder why the original tests have the terminal *?  Testcases 
elsewhere in the testsuite that check for (\n|\r\n|\r) don't use *.  I 
think I'll drop the * from both the tests and the script.


Jason



Re: [PATCH] phiopt, v2: Adjust instead of reset phires range

2022-12-22 Thread Aldy Hernandez via Gcc-patches
LGTM

On Thu, Dec 22, 2022, 22:44 Jakub Jelinek  wrote:

> On Thu, Dec 22, 2022 at 08:46:33PM +0100, Aldy Hernandez wrote:
> > I haven't looked at your problem above, but have you tried using
> > int_range_max (or even int_range<2>) instead of value_range above?
> >
> > value_range is deprecated and uses the legacy anti-range business,
> > which has a really hard time representing complex ranges, as well as
> > union/intersecting them.
>
> You're right.  With int_range_max it works as I expected.
> And no, floating point isn't possible here.
> If value_range is right now just the legacy single range or anti-range,
> then it explains why it didn't work - while on the first testcase
> we could have anti-range ~[0, 0], on the second case [-128, -1] U [1, 127]
> is turned into simple legacy [-128, 127].
>
> So, ok for trunk if this passes bootstrap/regtest?
>
> 2022-12-22  Jakub Jelinek  
> Aldy Hernandez  
>
> * tree-ssa-phiopt.cc (value_replacement): Instead of resetting
> phires range info, union it with oarg.
>
> --- gcc/tree-ssa-phiopt.cc.jj   2022-12-22 12:52:36.588469821 +0100
> +++ gcc/tree-ssa-phiopt.cc  2022-12-22 13:11:51.145060050 +0100
> @@ -1492,11 +1492,25 @@ value_replacement (basic_block cond_bb,
> break;
>   }
>   if (equal_p)
> -   /* After the optimization PHI result can have value
> -  which it couldn't have previously.
> -  We could instead of resetting it union the range
> -  info with oarg.  */
> -   reset_flow_sensitive_info (gimple_phi_result (phi));
> +   {
> + tree phires = gimple_phi_result (phi);
> + if (SSA_NAME_RANGE_INFO (phires))
> +   {
> + /* After the optimization PHI result can have value
> +which it couldn't have previously.  */
> + int_range_max r;
> + if (get_global_range_query ()->range_of_expr (r,
> phires,
> +   phi))
> +   {
> + int_range<2> tmp (carg, carg);
> + r.union_ (tmp);
> + reset_flow_sensitive_info (phires);
> + set_range_info (phires, r);
> +   }
> + else
> +   reset_flow_sensitive_info (phires);
> +   }
> +   }
>   if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
> {
>   imm_use_iterator imm_iter;
>
>
> Jakub
>
>


Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-22 Thread Segher Boessenkool
On Thu, Dec 22, 2022 at 07:48:28PM +, Joseph Myers wrote:
> On Thu, 22 Dec 2022, Segher Boessenkool wrote:
> > On Wed, Dec 21, 2022 at 09:40:24PM +, Joseph Myers wrote:
> > > On Wed, 21 Dec 2022, Segher Boessenkool wrote:
> > > > Joseph: what do you think about this patch?  Is the workaround it
> > > > removes still useful in any way, do we need to do that some other way if
> > > > we remove this?
> > 
> > You didn't address these questions.  We don't see negative effects from
> > removing this workaround, but it isn't clear (to me) what problems were
> > there that caused you to do this workaround.  Do you remember maybe?  Or
> > can we just delete it and try to forget such worries :-)
> 
> The purpose was to ensure that _Float128's TYPE_PRECISION was at least as 
> large as that of long double, in the case where they both have binary128 
> format.  I think at that time, in GCC 7, it was possible for _Float128 to 
> be KFmode and long double to be TFmode, with those being different modes 
> with the same format.

They still are separate modes :-(  It always is possible to create
KFmode entities (via mode((KF)) if nothing else) and those should behave
exactly the same as TFmode if TFmode is IEEE QP (just like KFmode always
is).

> In my view, it would be best not to have different modes with the same 
> format - not simply ensure types with the same format have the same mode, 
> but avoid multiple modes with the same format existing in the compiler at 
> all.  That is, TFmode should be the same mode as one of KFmode and IFmode 
> (one name should be defined as a macro for the other name, or something 
> similar).

Right, TFmode should be just a different *name* for either IFmode or
KFmode (and both of those modes always exist if either does).

> If you don't have different modes with the same format, many of 
> the problems go away.

I used to have patches for this.  A few problems remained, but this
was very long ago, who knows where we stand now.  I'll recreate those
patches, let's see where that gets us.

Thanks for the help,


Segher


Re: [PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-22 Thread Jason Merrill via Gcc-patches

On 12/22/22 16:41, Patrick Palka wrote:

On Thu, 22 Dec 2022, Jason Merrill wrote:


On 12/22/22 11:31, Patrick Palka wrote:

On Wed, 21 Dec 2022, Jason Merrill wrote:


On 12/21/22 09:52, Patrick Palka wrote:

Here during ahead of time checking of C{}, we indirectly call get_nsdmi
for C::m from finish_compound_literal, which in turn calls
break_out_target_exprs for C::m's (non-templated) initializer, during
which we end up building a call to A::~A and checking expr_noexcept_p
for it (from build_vec_delete_1).  But this is all done with
processing_template_decl set, so the built A::~A call is templated
(whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
expr_noexcept_p doesn't expect and we crash.

In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding an
expr_noexcept_p call with !processing_template_decl, which works here
too.  But it seems to me since the initializer we obtain in get_nsdmi is
always non-templated, it should be calling break_out_target_exprs with
processing_template_decl cleared since otherwise the function might end
up mixing templated and non-templated trees.

I'm not sure about this though, perhaps this is not the best fix here.
Alternatively, when processing_template_decl we could make get_nsdmi
avoid calling break_out_target_exprs at all or something.  Additionally,
perhaps break_out_target_exprs should be a no-op more generally when
processing_template_decl since we shouldn't see any TARGET_EXPRs inside
a template?


Hmm.

Any time we would call break_out_target_exprs we're dealing with
non-dependent
expressions; if we're in a template, we're building up an initializer or a
call that we'll soon throw away, just for the purpose of checking or type
computation.

Furthermore, as you say, the argument is always a non-template tree,
whether
in get_nsdmi or convert_default_arg.  So having processing_template_decl
cleared would be correct.

I don't think we can get away with not calling break_out_target_exprs at
all
in a template; if nothing else, we would lose immediate invocation
expansion.
However, we could probably skip the bot_manip tree walk, which should
avoid
the problem.

Either way we end up returning non-template trees, as we do now, and
callers
have to deal with transient CONSTRUCTORs containing such (as we do in
massage_init_elt).


Ah I see, makes sense.



Does convert_default_arg not run into the same problem, e.g. when calling

void g(B = {0});


In practice it seems not, because we don't call convert_default_arg
when processing_template_decl is set (verified with an assert to
that effect).  In build_over_call for example we exit early when
processing_template_decl is set, and return a templated CALL_EXPR
that doesn't include default arguments at all.  A consequence of
this is that we don't reject ahead of time a call that would use
an ill-formed dependent default argument, e.g.

template
void g(B = T{0});

template
void f() {
  g();
}

since the default argument instantiation would be the responsibility
of convert_default_arg.

Thinking hypothetically here, if we do in the future want to include default
arguments in the templated form of a CALL_EXPR,


We definitely do not want to; the templated form should be as close as
possible to the source.


Ah, sounds good.



We might want to perform non-dependent conversions to get any errors (such as
this one) before throwing away the result.  Which would be parallel to what we
currently do in calling get_nsdmi, and would want the same behavior.


*nod*




[snip]



shall we go with the original approach to clear
processing_template_decl directly from get_nsdmi?


OK, but then we should also checking_assert !processing_template_decl in
b_o_t_e.


Unfortunately we'd trigger that assert from maybe_constant_value, which
potentially calls b_o_t_e with processing_template_decl set.


maybe_constant_value could also clear processing_template_decl; entries 
in cv_cache are non-templated.
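
The "clear the flag around non-templated work" idea discussed above can
be sketched with a scope guard.  This is a toy model only:
in_template_context stands in for a global such as
processing_template_decl, and every name below is hypothetical rather
than taken from the front end.

```cpp
// Toy global flag; nonzero means "currently inside a template".
int in_template_context = 0;

// Saves the flag, clears it for the lifetime of the guard, and restores
// the saved value on scope exit, including early returns.
class ClearTemplateContext
{
  int saved_;
public:
  ClearTemplateContext() : saved_(in_template_context)
  { in_template_context = 0; }
  ~ClearTemplateContext() { in_template_context = saved_; }
  ClearTemplateContext(const ClearTemplateContext &) = delete;
  ClearTemplateContext &operator=(const ClearTemplateContext &) = delete;
};

// Stand-in for work that must only ever see non-templated trees.
// Returns the flag value it observed while the guard was active.
int process_non_templated_initializer()
{
  ClearTemplateContext guard;
  return in_template_context;
}
```

With a guard like this, an assertion that the flag is clear inside the
worker holds no matter which caller reaches it, provided each caller
installs the guard first.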



Bootstrapped and regtested on x86_64-pc-linux-gnu.

PR c++/108116

gcc/cp/ChangeLog:

* init.cc (get_nsdmi): Clear processing_template_decl before
processing the non-templated initializer.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-template24.C: New test.
---
gcc/cp/init.cc|  8 ++-
gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C | 22
+++
2 files changed, 29 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 73e6547c076..c4345ebdaea 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -561,7 +561,8 @@ perform_target_ctor (tree init)
  return init;
}
-/* Return the non-static data initializer for FIELD_DECL MEMBER.  */
+/* Return the non-static data initializer for FIELD_DECL MEMBER.
+   The initializer returned is always non-templated.  */
  static GTY((cache)) decl_tree_cache_map *nsdmi_ins

Re: [PATCH 3/3] contrib: Add dg-out-generator.pl

2022-12-22 Thread Arsen Arsenović via Gcc-patches
Hi,

Jason Merrill  writes:
>> +# Newlines should be more tolerant.
>> +s/\n$/(\\n|\\r\\n|\\r)*/;
>
> Isn't specifically handling \\r\\n redundant with the * operator?

To the extent of my knowledge, yes; I left that in since the original
tests I was replacing with this script also used this terminator:

-// { dg-output "default std::handle_contract_violation called: .*.C 21 
test::fun .*(\n|\r\n|\r)*" }
+// { dg-output {contract violation in function test::fun at .*:21: b > 
0(\n|\r\n|\r)*} }
+// { dg-output {\[continue:on\](\n|\r\n|\r)*} }

That could easily use the simpler [\r\n]* form too:

% regexp {^[\r\n]*$} "\r\n\n\n"
1

Feel free to swap that in too.
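
For anyone wanting to check the equivalence outside Tcl, here is a small
sketch using std::regex.  It is an illustration under the assumption that
the patterns behave the same way as in Tcl's ARE for these simple cases;
full_match and the two pattern variables are hypothetical names.

```cpp
#include <regex>
#include <string>

// Returns true if the whole of `s` is matched by `pattern`.
bool full_match(const std::string &s, const std::string &pattern)
{
  return std::regex_match(s, std::regex(pattern));
}

// The C++ literals below embed real CR/LF characters in the pattern, so
// no regex-level escapes are involved.
const std::string alternation = "(\n|\r\n|\r)*";
const std::string char_class  = "[\r\n]*";
```

Both patterns accept any run of CR and LF characters, so with the
trailing * the \r\n alternative adds nothing.  Without the *, the \r\n
alternative is what lets a single CRLF terminator match as a whole.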

Thanks, have a great night.
-- 
Arsen Arsenović




Re: [PATCH 1/3] libstdc++: Improve output of default contract violation handler [PR107792]

2022-12-22 Thread Jonathan Wakely via Gcc-patches
On Thu, 22 Dec 2022 at 21:41, Jason Merrill via Libstdc++
 wrote:
>
> On 12/22/22 06:03, Arsen Arsenović wrote:
> > From: Jonathan Wakely 
> >
> > Make the output more readable. Don't output anything unless verbose
> > termination is enabled at configure-time.
>
> LGTM if Jonathan agrees.  The testsuite changes should be applied in the
> same commit.

Yup, Arsen and I have been discussing this patch over IRC, I'm happy with it.

>
> > libstdc++-v3/ChangeLog:
> >
> >   PR libstdc++/107792
> >   PR libstdc++/107778
> >   * src/experimental/contract.cc (handle_contract_violation): Make
> >   output more readable.
> > ---
> > Heh, wouldn't be me if I forgot nothing.  Sorry about that.
> >
> > How's this?
> >
> >   libstdc++-v3/src/experimental/contract.cc | 50 ++-
> >   1 file changed, 39 insertions(+), 11 deletions(-)
> >
> > diff --git a/libstdc++-v3/src/experimental/contract.cc 
> > b/libstdc++-v3/src/experimental/contract.cc
> > index c8d2697eddc..2d41a6326cf 100644
> > --- a/libstdc++-v3/src/experimental/contract.cc
> > +++ b/libstdc++-v3/src/experimental/contract.cc
> > @@ -1,4 +1,5 @@
> >   // -*- C++ -*- std::experimental::contract_violation and friends
> > +
> >   // Copyright (C) 2019-2022 Free Software Foundation, Inc.
> >   //
> >   // This file is part of GCC.
> > @@ -23,19 +24,46 @@
> >   // .
> >
> >   #include 
> > -#include 
> > +#if _GLIBCXX_HOSTED && _GLIBCXX_VERBOSE
> > +# include 
> > +#endif
> >
> >   __attribute__ ((weak)) void
> >   handle_contract_violation (const std::experimental::contract_violation 
> > &violation)
> >   {
> > -  std::cerr << "default std::handle_contract_violation called: \n"
> > -<< " " << violation.file_name()
> > -<< " " << violation.line_number()
> > -<< " " << violation.function_name()
> > -<< " " << violation.comment()
> > -<< " " << violation.assertion_level()
> > -<< " " << violation.assertion_role()
> > -<< " " << (int)violation.continuation_mode()
> > -<< std::endl;
> > +#if _GLIBCXX_HOSTED && _GLIBCXX_VERBOSE
> > +  bool level_default_p = violation.assertion_level() == "default";
> > +  bool role_default_p = violation.assertion_role() == "default";
> > +  bool cont_mode_default_p = violation.continuation_mode()
> > +== 
> > std::experimental::contract_violation_continuation_mode::never_continue;
> > +
> > +  const char* modes[]{ "off", "on" }; // Must match enumerators in header.
> > +  std::cerr << "contract violation in function " << 
> > violation.function_name()
> > +<< " at " << violation.file_name() << ':' << violation.line_number()
> > +<< ": " << violation.comment();
> > +
> > +  const char* delimiter = "\n[";
> > +
> > +  if (!level_default_p)
> > +{
> > +  std::cerr << delimiter << "level:" << violation.assertion_level();
> > +  delimiter = ", ";
> > +}
> > +  if (!role_default_p)
> > +{
> > +  std::cerr << delimiter << "role:" << violation.assertion_role();
> > +  delimiter = ", ";
> > +}
> > +  if (!cont_mode_default_p)
> > +{
> > +  std::cerr << delimiter << "continue:"
> > + << modes[(int)violation.continuation_mode() & 1];
> > +  delimiter = ", ";
> > +}
> > +
> > +  if (delimiter[0] == ',')
> > +std::cerr << ']';
> > +
> > +  std::cerr << std::endl;
> > +#endif
> >   }
> > -
>
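
To make the delimiter handling in the quoted handler easier to follow,
here is a sketch that builds the same message into a string.  ToyViolation
is a hypothetical stand-in for std::experimental::contract_violation, and
the bool field replaces the continuation-mode enumeration; it is not the
library code.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the violation object.
struct ToyViolation
{
  std::string function, file, comment;
  unsigned line = 0;
  std::string level = "default";
  std::string role = "default";
  bool continues = false;   // false plays the role of never_continue
};

// Formats the violation the way the quoted patch does: a headline, then
// an optional bracketed list holding only the non-default properties.
std::string format_violation(const ToyViolation &v)
{
  std::ostringstream out;
  out << "contract violation in function " << v.function
      << " at " << v.file << ':' << v.line << ": " << v.comment;

  // The first property printed opens the bracket on a new line; later
  // ones are separated by ", ".
  const char *delimiter = "\n[";
  if (v.level != "default")
    {
      out << delimiter << "level:" << v.level;
      delimiter = ", ";
    }
  if (v.role != "default")
    {
      out << delimiter << "role:" << v.role;
      delimiter = ", ";
    }
  if (v.continues)
    {
      out << delimiter << "continue:on";
      delimiter = ", ";
    }
  // The bracket is closed only if something opened it.
  if (delimiter[0] == ',')
    out << ']';
  out << '\n';
  return out.str();
}
```

When every property is at its default, only the headline line is
produced and no brackets appear.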


Re: [PATCH] Fortran: check for invalid uses of statement functions arguments [PR69604]

2022-12-22 Thread Steve Kargl via Gcc-patches
On Thu, Dec 22, 2022 at 10:13:04PM +0100, Harald Anlauf via Fortran wrote:
> 
> the attached patch adds a check for statement function bodies for
> invalid uses of dummy arguments.  This fixes an ICE-on-invalid.
> 
> Regtested on x86_64-pc-linux-gnu.  OK for mainline?
> 

Yes. Thanks for the patch.

-- 
Steve


Re: [PATCH] c, c++, cgraphunit: Prevent duplicated -Wunused-value warnings [PR108079]

2022-12-22 Thread Jason Merrill via Gcc-patches

On 12/22/22 05:32, Jakub Jelinek wrote:

Hi!

On the following testcase, we warn with -Wunused-value twice, once
in the FEs and later on cgraphunit again with slightly different
wording.

The following patch fixes that by registering a warning suppression in the
FEs when we warn and not warning in cgraphunit anymore if that happened.
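
The mechanism can be pictured with a toy diagnostic engine.  This is a
sketch under stated assumptions: ToyDiagnostics and its members are
hypothetical, and the suppression set is only loosely modelled on the
suppress_warning / warning_suppressed_p pair used in the patch.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy diagnostic engine.  A suppression is keyed on the pair
// (declaration name, warning option).
struct ToyDiagnostics
{
  std::set<std::pair<std::string, std::string>> suppressed;
  std::vector<std::string> emitted;

  bool warning_suppressed_p(const std::string &decl,
                            const std::string &opt) const
  { return suppressed.count(std::make_pair(decl, opt)) != 0; }

  // Front-end pass: warn, then record that this decl has been handled.
  void front_end_check(const std::string &decl)
  {
    emitted.push_back("unused variable '" + decl + "'");
    suppressed.insert(std::make_pair(decl, std::string("-Wunused-variable")));
  }

  // Later pass: skip decls the front end has already diagnosed.
  void late_check(const std::string &decl)
  {
    if (!warning_suppressed_p(decl, "-Wunused-variable"))
      emitted.push_back("'" + decl + "' defined but not used");
  }
};
```

A declaration that went through front_end_check produces one message in
total, while one that only the late pass sees is still diagnosed there.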

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2022-12-22  Jakub Jelinek  

PR c/108079
gcc/
* cgraphunit.cc (check_global_declaration): Don't warn for unused
variables which have OPT_Wunused_variable warning suppressed.
gcc/c/
* c-decl.cc (pop_scope): Suppress OPT_Wunused_variable warning
after diagnosing it.
gcc/cp/
* decl.cc (poplevel): Suppress OPT_Wunused_variable warning
after diagnosing it.
gcc/testsuite/
* c-c++-common/Wunused-var-18.c: New test.

--- gcc/cgraphunit.cc.jj2022-10-18 10:38:48.0 +0200
+++ gcc/cgraphunit.cc   2022-12-21 15:14:34.687939477 +0100
@@ -1122,6 +1122,7 @@ check_global_declaration (symtab_node *s
&& (TREE_CODE (decl) != FUNCTION_DECL
  || (!DECL_STATIC_CONSTRUCTOR (decl)
  && !DECL_STATIC_DESTRUCTOR (decl)))
+  && (! VAR_P (decl) || !warning_suppressed_p (decl, OPT_Wunused_variable))
/* Otherwise, ask the language.  */
&& lang_hooks.decls.warn_unused_global (decl))
  warning_at (DECL_SOURCE_LOCATION (decl),
--- gcc/c/c-decl.cc.jj  2022-12-19 11:08:31.500766238 +0100
+++ gcc/c/c-decl.cc 2022-12-21 14:52:40.251919370 +0100
@@ -1310,7 +1310,10 @@ pop_scope (void)
  && scope != external_scope)
{
  if (!TREE_USED (p))
-   warning (OPT_Wunused_variable, "unused variable %q+D", p);
+   {
+ warning (OPT_Wunused_variable, "unused variable %q+D", p);
+ suppress_warning (p, OPT_Wunused_variable);
+   }
  else if (DECL_CONTEXT (p) == current_function_decl)
warning_at (DECL_SOURCE_LOCATION (p),
OPT_Wunused_but_set_variable,
--- gcc/cp/decl.cc.jj   2022-12-21 09:03:45.437566855 +0100
+++ gcc/cp/decl.cc  2022-12-21 14:51:07.043265263 +0100
@@ -693,6 +693,7 @@ poplevel (int keep, int reverse, int fun
else
  warning_at (DECL_SOURCE_LOCATION (decl),
  OPT_Wunused_variable, "unused variable %qD", 
decl);
+   suppress_warning (decl, OPT_Wunused_variable);
  }
else if (DECL_CONTEXT (decl) == current_function_decl
 // For -Wunused-but-set-variable leave references alone.
--- gcc/testsuite/c-c++-common/Wunused-var-18.c.jj  2022-12-21 
15:28:03.112273963 +0100
+++ gcc/testsuite/c-c++-common/Wunused-var-18.c 2022-12-21 15:27:05.246107581 
+0100
@@ -0,0 +1,10 @@
+/* PR c/108079 */
+/* { dg-do compile } */
+/* { dg-options "-Wunused-variable" } */
+
+int
+main ()
+{
+  static int x;/* { dg-warning "unused variable 'x'" } */
+   /* { dg-bogus "'x' defined but not used" "" { target *-*-* } 
.-1 } */
+}

Jakub





[PATCH] phiopt, v2: Adjust instead of reset phires range

2022-12-22 Thread Jakub Jelinek via Gcc-patches
On Thu, Dec 22, 2022 at 08:46:33PM +0100, Aldy Hernandez wrote:
> I haven't looked at your problem above, but have you tried using
> int_range_max (or even int_range<2>) instead of value_range above?
> 
> value_range is deprecated and uses the legacy anti-range business,
> which has a really hard time representing complex ranges, as well as
> union/intersecting them.

You're right.  With int_range_max it works as I expected.
And no, floating point isn't possible here.
If value_range is right now just the legacy single range or anti-range,
then it explains why it didn't work - while on the first testcase
we could have anti-range ~[0, 0], on the second case [-128, -1] U [1, 127]
is turned into simple legacy [-128, 127].
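
To illustrate the difference, here is a toy range set, assuming nothing
about GCC's actual int_range implementation (Interval, RangeSet,
range_union and hull are hypothetical names).  It shows how a multi-range
representation keeps the hole at 0 until the constant is unioned in,
while a single enclosing range loses it immediately.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Toy integer range set: sorted, disjoint, non-adjacent closed intervals.
using Interval = std::pair<long, long>;
using RangeSet = std::vector<Interval>;

// Union of two range sets, merging intervals that overlap or touch.
// Toy code: it does not guard against overflow at the top of `long`.
RangeSet range_union(RangeSet a, const RangeSet &b)
{
  a.insert(a.end(), b.begin(), b.end());
  std::sort(a.begin(), a.end());
  RangeSet out;
  for (const Interval &iv : a)
    {
      if (!out.empty() && iv.first <= out.back().second + 1)
        out.back().second = std::max(out.back().second, iv.second);
      else
        out.push_back(iv);
    }
  return out;
}

// Collapse to a single enclosing interval, which is what a representation
// limited to one plain range has to do.  Precondition: r is non-empty.
Interval hull(const RangeSet &r)
{
  return Interval(r.front().first, r.back().second);
}
```

Unioning the two-interval set [-128, -1] U [1, 127] with the single value
0 yields exactly [-128, 127]; taking the hull first gives that answer
even before 0 is added, which is the loss of precision described above.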

So, ok for trunk if this passes bootstrap/regtest?

2022-12-22  Jakub Jelinek  
Aldy Hernandez  

* tree-ssa-phiopt.cc (value_replacement): Instead of resetting
phires range info, union it with oarg.

--- gcc/tree-ssa-phiopt.cc.jj   2022-12-22 12:52:36.588469821 +0100
+++ gcc/tree-ssa-phiopt.cc  2022-12-22 13:11:51.145060050 +0100
@@ -1492,11 +1492,25 @@ value_replacement (basic_block cond_bb,
break;
  }
  if (equal_p)
-   /* After the optimization PHI result can have value
-  which it couldn't have previously.
-  We could instead of resetting it union the range
-  info with oarg.  */
-   reset_flow_sensitive_info (gimple_phi_result (phi));
+   {
+ tree phires = gimple_phi_result (phi);
+ if (SSA_NAME_RANGE_INFO (phires))
+   {
+ /* After the optimization PHI result can have value
+which it couldn't have previously.  */
+ int_range_max r;
+ if (get_global_range_query ()->range_of_expr (r, phires,
+   phi))
+   {
+ int_range<2> tmp (carg, carg);
+ r.union_ (tmp);
+ reset_flow_sensitive_info (phires);
+ set_range_info (phires, r);
+   }
+ else
+   reset_flow_sensitive_info (phires);
+   }
+   }
  if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
{
  imm_use_iterator imm_iter;


Jakub
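
A note on the reset_flow_sensitive_info call that precedes set_range_info in the hunk: as Aldy points out elsewhere in this thread, set_range_info is an intersect operation, so on its own it can only narrow what is already recorded.  A toy sketch of that behaviour follows; ToyName and Interval are made-up names, it handles single intervals only, and nothing in it is GCC API.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins; not GCC data structures.
struct Interval
{
  long long lo, hi;  // inclusive
};

struct ToyName
{
  bool has_info = false;
  Interval info = { 0, 0 };

  // Forget whatever was recorded.
  void reset () { has_info = false; }

  // Mimic an intersecting setter: the new range is clipped against any
  // range that is already recorded.
  void set (Interval r)
  {
    if (has_info)
      {
        r.lo = std::max (r.lo, info.lo);
        r.hi = std::min (r.hi, info.hi);
      }
    info = r;
    has_info = true;
  }
};
```

Recording [1, 127] and then setting [0, 127] leaves [1, 127] in this model; only after reset () does setting [0, 127] take effect, which is why a widening update needs the reset first.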



Re: [PATCH 3/3] contrib: Add dg-out-generator.pl

2022-12-22 Thread Jason Merrill via Gcc-patches

On 12/22/22 06:03, Arsen Arsenović wrote:

This script is a helper used to generate dg-output lines from an existing
program output conveniently.  It takes care of escaping Tcl and ARE stuff.

contrib/ChangeLog:

* dg-out-generator.pl: New file.
---
I updated this file to include the proper copyright header, after dkm notified
me that I got it wrong on IRC ;D

  contrib/dg-out-generator.pl | 79 +
  1 file changed, 79 insertions(+)
  create mode 100755 contrib/dg-out-generator.pl

diff --git a/contrib/dg-out-generator.pl b/contrib/dg-out-generator.pl
new file mode 100755
index 000..1e9247165b2
--- /dev/null
+++ b/contrib/dg-out-generator.pl
@@ -0,0 +1,79 @@
+#!/usr/bin/env perl
+#
+# Copyright (C) 2022 Free Software Foundation, Inc.
+# Contributed by Arsen Arsenović.
+#
+# This script is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+
+# This script reads program output on STDIN, and out of it produces a block of
+# dg-output lines that can be yanked at the end of a file.  It will escape
+# special ARE and Tcl constructs automatically.
+#
+# Each argument passed on the standard input is treated as a string to be
+# replaced by ``.*'' in the final result.  This is intended to mask out build
+# paths, filenames, etc.
+#
+# Usage example:
+
+# $ g++-13 -fcontracts -o test \
+#  'g++.dg/contracts/contracts-access1.C' && \
+#   ./test |& dg-out-generator.pl 'g++.dg/contracts/contracts-access1.C'
+# // { dg-output {contract violation in function Base::b at .*:11: pub > 0(\n|\r\n|\r)*} }
+# // { dg-output {\[level:default, role:default, continuation mode:never\](\n|\r\n|\r)*} }
+# // { dg-output {terminate called without an active exception(\n|\r\n|\r)*} }
+
+# You can now freely dump the above into your testcase.
+
+use strict;
+use warnings;
+use POSIX 'floor';
+
+my $escapees = '(' . join ('|', map { quotemeta } @ARGV) . ')';
+
+sub gboundary($)
+{
+  my $str = shift;
+  my $sz = 10.0;
+  for (;;)
+{
+  my $bnd = join '', (map chr 64 + rand 27, 1 .. floor $sz);
+  return $bnd unless index ($str, $bnd) >= 0;
+  $sz += 0.1;
+}
+}
+
+while (<STDIN>)
+  {
+# Escape our escapees.
+my $boundary;
+if (@ARGV) {
+  # Checking this is necessary to avoid a spurious .* between all
+  # characters if no arguments are passed.
+  $boundary = gboundary $_;
+  s/$escapees/$boundary/g;
+}
+
+# Quote stuff special in Tcl ARE.  This step also effectively nulls any
+# concern about escaping.  As long as all curly braces are escaped, the
+# string will, when passing through the braces rule of Tcl, be identical to
+# the input.
+s/([[\]*+?{}()\\])/\\$1/g;
+
+# Newlines should be more tolerant.
+s/\n$/(\\n|\\r\\n|\\r)*/;


Isn't specifically handling \\r\\n redundant with the * operator?
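
The redundancy can be checked mechanically.  The sketch below uses std::regex with its default ECMAScript grammar as a stand-in for Tcl AREs; that is an assumption, since the two dialects are not identical, though alternation and * should behave alike for this pattern.  same_verdict and short_form_matches are made-up helper names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when both patterns give the same full-match verdict on S.
// Hypothetical helper, for illustration only.
bool
same_verdict (const std::string &s)
{
  static const std::regex with_crlf ("(\\n|\\r\\n|\\r)*");
  static const std::regex without_crlf ("(\\n|\\r)*");
  return std::regex_match (s, with_crlf)
	 == std::regex_match (s, without_crlf);
}

// True when the shorter pattern alone accepts S.
bool
short_form_matches (const std::string &s)
{
  static const std::regex without_crlf ("(\\n|\\r)*");
  return std::regex_match (s, without_crlf);
}
```

A "\r\n" sequence is accepted by the shorter pattern as two iterations of the group, so both forms accept exactly the strings made up of \n and \r.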


+# Then split out the boundary, replacing it with .*.
+s/$boundary/.*/g if defined $boundary;
+
+# Then, let's print it in a dg-output block.  If you'd prefer /* keep in
+# mind that if your string contains */ it could terminate the comment
+# early.  Maybe add an extra s!\*/!*()/!g or something.
+print "// { dg-output {$_} }\n";
+  }
+
+# File Local Vars:
+# indent-tabs-mode: nil
+# End:
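
For readers who don't speak Perl, the two substitutions at the heart of the script can be sketched in C++ as below.  This is an illustration, not a port: the argument-masking step (the boundary trick) is left out, and dg_output_line is a made-up name.

```cpp
#include <cassert>
#include <string>

// Turn one line of program output into a dg-output directive, escaping the
// same characters as the script's s/([[\]*+?{}()\\])/\\$1/g substitution.
// Hypothetical helper; the masking of command-line arguments is omitted.
std::string
dg_output_line (std::string line)
{
  // Remember and drop a trailing newline; it is re-added as a tolerant
  // pattern below.
  bool had_newline = !line.empty () && line.back () == '\n';
  if (had_newline)
    line.pop_back ();

  const std::string special = "[]*+?{}()\\";
  std::string escaped;
  for (char c : line)
    {
      if (special.find (c) != std::string::npos)
	escaped += '\\';
      escaped += c;
    }

  // Accept any run of LF, CRLF or CR where the newline was.
  if (had_newline)
    escaped += "(\\n|\\r\\n|\\r)*";

  return "// { dg-output {" + escaped + "} }";
}
```

Feeding it the bracketed "[level:default, ...]" line from the usage example in the header comment gives the same directive as shown there, with \[ and \] escaped.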




Re: [PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-22 Thread Patrick Palka via Gcc-patches
On Thu, 22 Dec 2022, Jason Merrill wrote:

> On 12/22/22 11:31, Patrick Palka wrote:
> > On Wed, 21 Dec 2022, Jason Merrill wrote:
> > 
> > > On 12/21/22 09:52, Patrick Palka wrote:
> > > > Here during ahead of time checking of C{}, we indirectly call get_nsdmi
> > > > for C::m from finish_compound_literal, which in turn calls
> > > > break_out_target_exprs for C::m's (non-templated) initializer, during
> > > > which we end up building a call to A::~A and checking expr_noexcept_p
> > > > for it (from build_vec_delete_1).  But this is all done with
> > > > processing_template_decl set, so the built A::~A call is templated
> > > > (whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
> > > > expr_noexcept_p doesn't expect and we crash.
> > > > 
> > > > In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding a
> > > > expr_noexcept_p call with !processing_template_decl, which works here
> > > > too.  But it seems to me since the initializer we obtain in get_nsdmi is
> > > > always non-templated, it should be calling break_out_target_exprs with
> > > > processing_template_decl cleared since otherwise the function might end
> > > > up mixing templated and non-templated trees.
> > > > 
> > > > I'm not sure about this though, perhaps this is not the best fix here.
> > > > Alternatively, when processing_template_decl we could make get_nsdmi
> > > > avoid calling break_out_target_exprs at all or something.  Additionally,
> > > > perhaps break_out_target_exprs should be a no-op more generally when
> > > > processing_template_decl since we shouldn't see any TARGET_EXPRs inside
> > > > a template?
> > > 
> > > Hmm.
> > > 
> > > Any time we would call break_out_target_exprs we're dealing with
> > > non-dependent expressions; if we're in a template, we're building up an
> > > initializer or a call that we'll soon throw away, just for the purpose
> > > of checking or type computation.
> > > 
> > > Furthermore, as you say, the argument is always a non-template tree,
> > > whether in get_nsdmi or convert_default_arg.  So having
> > > processing_template_decl cleared would be correct.
> > > 
> > > I don't think we can get away with not calling break_out_target_exprs
> > > at all in a template; if nothing else, we would lose immediate
> > > invocation expansion.  However, we could probably skip the bot_manip
> > > tree walk, which should avoid the problem.
> > > 
> > > Either way we end up returning non-template trees, as we do now, and
> > > callers have to deal with transient CONSTRUCTORs containing such (as we
> > > do in massage_init_elt).
> > 
> > Ah I see, makes sense.
> > 
> > > 
> > > Does convert_default_arg not run into the same problem, e.g. when calling
> > > 
> > >void g(B = {0});
> > 
> > In practice it seems not, because we don't call convert_default_arg
> > when processing_template_decl is set (verified with an assert to
> > that effect).  In build_over_call for example we exit early when
> > processing_template_decl is set, and return a templated CALL_EXPR
> > that doesn't include default arguments at all.  A consequence of
> > this is that we don't reject ahead of time a call that would use
> > an ill-formed dependent default argument, e.g.
> > 
> >template
> >void g(B = T{0});
> > 
> >template
> >void f() {
> >  g();
> >}
> > 
> > since the default argument instantiation would be the responsibility
> > of convert_default_arg.
> > 
> > Thinking hypothetically here, if we do in the future want to include default
> > arguments in the templated form of a CALL_EXPR,
> 
> We definitely do not want to; the templated form should be as close as
> possible to the source.

Ah, sounds good.

> 
> We might want to perform non-dependent conversions to get any errors (such as
> this one) before throwing away the result.  Which would be parallel to what we
> currently do in calling get_nsdmi, and would want the same behavior.

*nod*

> 
> > [snip]
> 
> > shall we go with the original approach to clear
> > processing_template_decl directly from get_nsdmi?
> 
> OK, but then we should also checking_assert !processing_template_decl in
> b_o_t_e.

Unfortunately we'd trigger that assert from maybe_constant_value, which
potentially calls b_o_t_e with processing_template_decl set.
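
To spell out why the assert would fire, here is a toy model of the two call paths.  All toy_* names are hypothetical; the real processing_template_decl, get_nsdmi, maybe_constant_value and break_out_target_exprs are far more involved.  One caller clears the flag for the duration of the call via a scoped sentinel, the other calls straight through.

```cpp
#include <cassert>

// Toy stand-in for the global template-processing depth.
int toy_processing_template_decl = 0;

// Scoped guard: remember the flag, clear it, restore it on scope exit.
struct toy_clear_sentinel
{
  int saved;
  toy_clear_sentinel () : saved (toy_processing_template_decl)
  { toy_processing_template_decl = 0; }
  ~toy_clear_sentinel () { toy_processing_template_decl = saved; }
  toy_clear_sentinel (const toy_clear_sentinel &) = delete;
  toy_clear_sentinel &operator= (const toy_clear_sentinel &) = delete;
};

// Stand-in for break_out_target_exprs: reports whether it ran with the
// flag clear, i.e. whether a checking assert placed here would pass.
bool
toy_b_o_t_e ()
{
  return toy_processing_template_decl == 0;
}

// Caller that clears the flag first (the approach taken for get_nsdmi).
bool
toy_get_nsdmi ()
{
  toy_clear_sentinel s;
  return toy_b_o_t_e ();
}

// Caller that does not clear the flag (like the maybe_constant_value path).
bool
toy_maybe_constant_value ()
{
  return toy_b_o_t_e ();
}
```

With the flag set on entry, the first path sees it cleared and restores it afterwards, while the second path reaches the callee with the flag still set.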

> 
> Jason
> 
> > > ?
> > > 
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu.
> > > > 
> > > > PR c++/108116
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * init.cc (get_nsdmi): Clear processing_template_decl before
> > > > processing the non-templated initializer.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/cpp0x/nsdmi-template24.C: New test.
> > > > ---
> > > >gcc/cp/init.cc|  8 ++-
> > > > >gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C | 22 +++
> > > >2 files changed, 29 insertions(+), 1 deletion(-)
> > > >create mode 100644 g

Re: [PATCH 1/3] libstdc++: Improve output of default contract violation handler [PR107792]

2022-12-22 Thread Jason Merrill via Gcc-patches

On 12/22/22 06:03, Arsen Arsenović wrote:

From: Jonathan Wakely 

Make the output more readable. Don't output anything unless verbose
termination is enabled at configure-time.


LGTM if Jonathan agrees.  The testsuite changes should be applied in the 
same commit.



libstdc++-v3/ChangeLog:

PR libstdc++/107792
PR libstdc++/107778
* src/experimental/contract.cc (handle_contract_violation): Make
output more readable.
---
Heh, wouldn't be me if I forgot nothing.  Sorry about that.

How's this?

  libstdc++-v3/src/experimental/contract.cc | 50 ++-
  1 file changed, 39 insertions(+), 11 deletions(-)

diff --git a/libstdc++-v3/src/experimental/contract.cc b/libstdc++-v3/src/experimental/contract.cc
index c8d2697eddc..2d41a6326cf 100644
--- a/libstdc++-v3/src/experimental/contract.cc
+++ b/libstdc++-v3/src/experimental/contract.cc
@@ -1,4 +1,5 @@
  // -*- C++ -*- std::experimental::contract_violation and friends
+
  // Copyright (C) 2019-2022 Free Software Foundation, Inc.
  //
  // This file is part of GCC.
@@ -23,19 +24,46 @@
  // .
  
  #include 

-#include 
+#if _GLIBCXX_HOSTED && _GLIBCXX_VERBOSE
+# include 
+#endif
  
  __attribute__ ((weak)) void

  handle_contract_violation (const std::experimental::contract_violation &violation)
  {
-  std::cerr << "default std::handle_contract_violation called: \n"
-<< " " << violation.file_name()
-<< " " << violation.line_number()
-<< " " << violation.function_name()
-<< " " << violation.comment()
-<< " " << violation.assertion_level()
-<< " " << violation.assertion_role()
-<< " " << (int)violation.continuation_mode()
-<< std::endl;
+#if _GLIBCXX_HOSTED && _GLIBCXX_VERBOSE
+  bool level_default_p = violation.assertion_level() == "default";
+  bool role_default_p = violation.assertion_role() == "default";
+  bool cont_mode_default_p = violation.continuation_mode()
+== std::experimental::contract_violation_continuation_mode::never_continue;
+
+  const char* modes[]{ "off", "on" }; // Must match enumerators in header.
+  std::cerr << "contract violation in function " << violation.function_name()
+<< " at " << violation.file_name() << ':' << violation.line_number()
+<< ": " << violation.comment();
+
+  const char* delimiter = "\n[";
+
+  if (!level_default_p)
+{
+  std::cerr << delimiter << "level:" << violation.assertion_level();
+  delimiter = ", ";
+}
+  if (!role_default_p)
+{
+  std::cerr << delimiter << "role:" << violation.assertion_role();
+  delimiter = ", ";
+}
+  if (!cont_mode_default_p)
+{
+  std::cerr << delimiter << "continue:"
+   << modes[(int)violation.continuation_mode() & 1];
+  delimiter = ", ";
+}
+
+  if (delimiter[0] == ',')
+std::cerr << ']';
+
+  std::cerr << std::endl;
+#endif
  }
-
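
The delimiter handling in the new handler is compact enough that a stand-alone sketch may help when reviewing the expected test output.  The version below builds a string instead of writing to std::cerr and uses a made-up ToyViolation struct in place of std::experimental::contract_violation; the continuation mode is reduced to a bool, which is a simplification.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for std::experimental::contract_violation.
struct ToyViolation
{
  std::string function, file, comment;
  int line = 0;
  std::string level = "default";
  std::string role = "default";
  bool may_continue = false;  // false models never_continue
};

// Build the message as a string instead of writing to std::cerr.
std::string
format_violation (const ToyViolation &v)
{
  std::ostringstream os;
  os << "contract violation in function " << v.function
     << " at " << v.file << ':' << v.line << ": " << v.comment;

  // The first non-default field opens the bracket; later ones use ", ".
  const char *delimiter = "\n[";
  if (v.level != "default")
    {
      os << delimiter << "level:" << v.level;
      delimiter = ", ";
    }
  if (v.role != "default")
    {
      os << delimiter << "role:" << v.role;
      delimiter = ", ";
    }
  if (v.may_continue)
    {
      os << delimiter << "continue:on";
      delimiter = ", ";
    }

  // Close the bracket only if something opened it.
  if (delimiter[0] == ',')
    os << ']';
  os << '\n';
  return os.str ();
}
```

When every field has its default value the second line is omitted entirely, so the output is the single "contract violation in function ..." line.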




Re: [PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-22 Thread Jason Merrill via Gcc-patches

On 12/22/22 11:31, Patrick Palka wrote:

On Wed, 21 Dec 2022, Jason Merrill wrote:


On 12/21/22 09:52, Patrick Palka wrote:

Here during ahead of time checking of C{}, we indirectly call get_nsdmi
for C::m from finish_compound_literal, which in turn calls
break_out_target_exprs for C::m's (non-templated) initializer, during
which we end up building a call to A::~A and checking expr_noexcept_p
for it (from build_vec_delete_1).  But this is all done with
processing_template_decl set, so the built A::~A call is templated
(whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
expr_noexcept_p doesn't expect and we crash.

In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding a
expr_noexcept_p call with !processing_template_decl, which works here
too.  But it seems to me since the initializer we obtain in get_nsdmi is
always non-templated, it should be calling break_out_target_exprs with
processing_template_decl cleared since otherwise the function might end
up mixing templated and non-templated trees.

I'm not sure about this though, perhaps this is not the best fix here.
Alternatively, when processing_template_decl we could make get_nsdmi
avoid calling break_out_target_exprs at all or something.  Additionally,
perhaps break_out_target_exprs should be a no-op more generally when
processing_template_decl since we shouldn't see any TARGET_EXPRs inside
a template?


Hmm.

Any time we would call break_out_target_exprs we're dealing with non-dependent
expressions; if we're in a template, we're building up an initializer or a
call that we'll soon throw away, just for the purpose of checking or type
computation.

Furthermore, as you say, the argument is always a non-template tree, whether
in get_nsdmi or convert_default_arg.  So having processing_template_decl
cleared would be correct.

I don't think we can get away with not calling break_out_target_exprs at all
in a template; if nothing else, we would lose immediate invocation expansion.
However, we could probably skip the bot_manip tree walk, which should avoid
the problem.

Either way we end up returning non-template trees, as we do now, and callers
have to deal with transient CONSTRUCTORs containing such (as we do in
massage_init_elt).


Ah I see, makes sense.



Does convert_default_arg not run into the same problem, e.g. when calling

   void g(B = {0});


In practice it seems not, because we don't call convert_default_arg
when processing_template_decl is set (verified with an assert to
that effect).  In build_over_call for example we exit early when
processing_template_decl is set, and return a templated CALL_EXPR
that doesn't include default arguments at all.  A consequence of
this is that we don't reject ahead of time a call that would use
an ill-formed dependent default argument, e.g.

   template
   void g(B = T{0});

   template
   void f() {
 g();
   }

since the default argument instantiation would be the responsibility
of convert_default_arg.

Thinking hypothetically here, if we do in the future want to include default
arguments in the templated form of a CALL_EXPR,


We definitely do not want to; the templated form should be as close as 
possible to the source.


We might want to perform non-dependent conversions to get any errors 
(such as this one) before throwing away the result.  Which would be 
parallel to what we currently do in calling get_nsdmi, and would want 
the same behavior.



[snip]



shall we go with the original approach to clear
processing_template_decl directly from get_nsdmi?


OK, but then we should also checking_assert !processing_template_decl in 
b_o_t_e.


Jason


?


Bootstrapped and regtested on x86_64-pc-linux-gnu.

PR c++/108116

gcc/cp/ChangeLog:

* init.cc (get_nsdmi): Clear processing_template_decl before
processing the non-templated initializer.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-template24.C: New test.
---
   gcc/cp/init.cc|  8 ++-
   gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C | 22 +++
   2 files changed, 29 insertions(+), 1 deletion(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 73e6547c076..c4345ebdaea 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -561,7 +561,8 @@ perform_target_ctor (tree init)
 return init;
   }
   -/* Return the non-static data initializer for FIELD_DECL MEMBER.  */
+/* Return the non-static data initializer for FIELD_DECL MEMBER.
+   The initializer returned is always non-templated.  */
 static GTY((cache)) decl_tree_cache_map *nsdmi_inst;
   @@ -670,6 +671,11 @@ get_nsdmi (tree member, bool in_ctor, tsubst_flags_t
complain)
 current_class_ptr = build_address (current_class_ref);
   }
   +  /* Since INIT is always non-templated clear processing_template_decl
+ before processing it so that we don't interleave templated and
+ non-templated trees.  */
+  processing_te

Re: [PATCH] c++, driver: Fix -static-libstdc++ for targets without Bstatic/dynamic.

2022-12-22 Thread Iain Sandoe



> On 22 Dec 2022, at 21:15, Jason Merrill  wrote:
> 
> On 12/4/22 11:30, Iain Sandoe wrote:
>> This fixes a long-standing problem on Darwin where we cannot independently set
>> -static-libstdc++ because the flag gets stripped by the g++ driver.
>> This patch is essentially the same as the one used for the 'D' driver and has
>> been in local use for some time.  It has also been tested on Linux.
>> OK for master?
>> backports?
>> thanks
>> Iain
>> -- >8 --
>> The current implementation for swapping between the static and shared c++
>> runtimes relies on the static linker supporting Bstatic/dynamic which is
>> not available for every target (Darwin's linker does not support this).
>> Specs substitution (%s) is an alternative solution for this (which is what
>> Darwin uses for Fortran, D and Objective-C).  However, specs substitution
>> requires that the '-static-libstdc++' be preserved in the driver's command
>> line.  The patch here arranges for this to be done when the configuration
>> determines that linker support for Bstatic/dynamic is missing.
> 
> Would it work to define LIBSTDCXX_STATIC instead?

Not without modifying the build of libstdc++.  When Darwin’s linker sees a
convenience library with the same name as a shared one, it will pick the
shared one, so we would have to modify the build of libstdc++ to make the
library named libstdc++-static.a or so (that was essentially what the Apple
gcc-4.2.1 impl. did AFAIR).

>  If not, the patch is OK.

Thanks.

> 
> Really there should be a way for lang_specific_driver to mark a flag as 
> "validated" rather than prune it.

Yes - especially since we now already have three drivers (c++, d, gm2) that
need to do the same stuff for handling libstdc++.
Iain

> 
>> Signed-off-by: Iain Sandoe 
>> gcc/cp/ChangeLog:
>>  * g++spec.cc (lang_specific_driver): Preserve -static-libstdc++ in
>>  the driver command line for targets without -Bstatic/dynamic support
>>  in their static linker.
>> ---
>>  gcc/cp/g++spec.cc | 5 +
>>  1 file changed, 5 insertions(+)
>> diff --git a/gcc/cp/g++spec.cc b/gcc/cp/g++spec.cc
>> index 3d3b042dd56..f95d7965355 100644
>> --- a/gcc/cp/g++spec.cc
>> +++ b/gcc/cp/g++spec.cc
>> @@ -234,7 +234,12 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options,
>>  case OPT_static_libstdc__:
>>library = library >= 0 ? 2 : library;
>> +#ifdef HAVE_LD_STATIC_DYNAMIC
>> +  /* Remove -static-libstdc++ from the command only if target supports
>> + LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
>> + back-end target can use outfile substitution.  */
>>args[i] |= SKIPOPT;
>> +#endif
>>break;
>>  case OPT_stdlib_:
> 



Re: [PATCH] c++, driver: Fix -static-libstdc++ for targets without Bstatic/dynamic.

2022-12-22 Thread Jason Merrill via Gcc-patches

On 12/4/22 11:30, Iain Sandoe wrote:

This fixes a long-standing problem on Darwin where we cannot independently set
-static-libstdc++ because the flag gets stripped by the g++ driver.

This patch is essentially the same as the one used for the 'D' driver and has
been in local use for some time.  It has also been tested on Linux.

OK for master?
backports?
thanks
Iain

-- >8 --

The current implementation for swapping between the static and shared c++
runtimes relies on the static linker supporting Bstatic/dynamic which is
not available for every target (Darwin's linker does not support this).

Specs substitution (%s) is an alternative solution for this (which is what
Darwin uses for Fortran, D and Objective-C).  However, specs substitution
requires that the '-static-libstdc++' be preserved in the driver's command
line.  The patch here arranges for this to be done when the configuration
determines that linker support for Bstatic/dynamic is missing.


Would it work to define LIBSTDCXX_STATIC instead?  If not, the patch is OK.

Really there should be a way for lang_specific_driver to mark a flag as 
"validated" rather than prune it.



Signed-off-by: Iain Sandoe 

gcc/cp/ChangeLog:

* g++spec.cc (lang_specific_driver): Preserve -static-libstdc++ in
the driver command line for targets without -Bstatic/dynamic support
in their static linker.
---
  gcc/cp/g++spec.cc | 5 +
  1 file changed, 5 insertions(+)

diff --git a/gcc/cp/g++spec.cc b/gcc/cp/g++spec.cc
index 3d3b042dd56..f95d7965355 100644
--- a/gcc/cp/g++spec.cc
+++ b/gcc/cp/g++spec.cc
@@ -234,7 +234,12 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options,
  
  	case OPT_static_libstdc__:

  library = library >= 0 ? 2 : library;
+#ifdef HAVE_LD_STATIC_DYNAMIC
+ /* Remove -static-libstdc++ from the command only if target supports
+LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
+back-end target can use outfile substitution.  */
  args[i] |= SKIPOPT;
+#endif
  break;
  
  	case OPT_stdlib_:
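
The effect of the #ifdef can be modelled as follows.  This is a sketch under assumptions: the real driver works on cl_decoded_option arrays and HAVE_LD_STATIC_DYNAMIC is a configure-time macro, whereas here it is a runtime bool, and ToyDriverResult and toy_lang_specific_driver are made-up names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type: the arguments that survive, plus whether the
// static C++ runtime was requested.
struct ToyDriverResult
{
  std::vector<std::string> kept;
  bool want_static_libstdcxx = false;
};

// Toy version of the option scan.  The flag is always noted, but it is
// pruned from the command line only when the linker can switch between
// static and dynamic linking itself.
ToyDriverResult
toy_lang_specific_driver (const std::vector<std::string> &args,
			  bool have_ld_static_dynamic)
{
  ToyDriverResult res;
  for (const std::string &arg : args)
    {
      if (arg == "-static-libstdc++")
	{
	  res.want_static_libstdcxx = true;
	  if (have_ld_static_dynamic)
	    continue;  // pruned, as with args[i] |= SKIPOPT
	}
      res.kept.push_back (arg);
    }
  return res;
}
```

On a target without Bstatic/dynamic support the flag stays in the kept list, which is what lets a later specs substitution see it.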




[PATCH] Fortran: check for invalid uses of statement functions arguments [PR69604]

2022-12-22 Thread Harald Anlauf via Gcc-patches
Dear all,

the attached patch adds a check for statement function bodies for
invalid uses of dummy arguments.  This fixes an ICE-on-invalid.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 794af0d00b7086c9f0493f3a1aaac644e1fd50f6 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 22 Dec 2022 22:03:31 +0100
Subject: [PATCH] Fortran: check for invalid uses of statement functions
 arguments [PR69604]

gcc/fortran/ChangeLog:

	PR fortran/69604
	* match.cc (chk_stmt_fcn_body): New function.  Check for invalid uses
	of statement functions arguments.
	(gfc_match_st_function): Use above.

gcc/testsuite/ChangeLog:

	PR fortran/69604
	* gfortran.dg/statement_function_4.f90: New test.
---
 gcc/fortran/match.cc  | 27 +++
 .../gfortran.dg/statement_function_4.f90  | 10 +++
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/statement_function_4.f90

diff --git a/gcc/fortran/match.cc b/gcc/fortran/match.cc
index 89fb115c0f6..3d346788416 100644
--- a/gcc/fortran/match.cc
+++ b/gcc/fortran/match.cc
@@ -5915,6 +5915,30 @@ recursive_stmt_fcn (gfc_expr *e, gfc_symbol *sym)
 }


+/* Check for invalid uses of statement function dummy arguments in body.  */
+
+static bool
+chk_stmt_fcn_body (gfc_expr *e, gfc_symbol *sym, int *f ATTRIBUTE_UNUSED)
+{
+  gfc_formal_arglist *formal;
+
+  if (e == NULL || e->symtree == NULL || e->expr_type != EXPR_FUNCTION)
+return false;
+
+  for (formal = sym->formal; formal; formal = formal->next)
+{
+  if (formal->sym == e->symtree->n.sym)
+	{
+	  gfc_error ("Invalid use of statement function argument at %L",
+		 &e->where);
+	  return true;
+	}
+}
+
+  return false;
+}
+
+
 /* Match a statement function declaration.  It is so easy to match
non-statement function statements with a MATCH_ERROR as opposed to
MATCH_NO that we suppress error message in most cases.  */
@@ -5983,6 +6007,9 @@ gfc_match_st_function (void)
   return MATCH_ERROR;
 }

+  if (gfc_traverse_expr (expr, sym, chk_stmt_fcn_body, 0))
+return MATCH_ERROR;
+
   sym->value = expr;

   if ((gfc_current_state () == COMP_FUNCTION
diff --git a/gcc/testsuite/gfortran.dg/statement_function_4.f90 b/gcc/testsuite/gfortran.dg/statement_function_4.f90
new file mode 100644
index 000..6ce5951b53a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/statement_function_4.f90
@@ -0,0 +1,10 @@
+! { dg-do compile }
+! PR fortran/69604
+! Contributed by G.Steinmetz
+
+program p
+  x(n) = 1 + n(2.0) ! { dg-error "Invalid use of statement function argument" }
+  y(k) = k()! { dg-error "Invalid use of statement function argument" }
+  z(m) = m  ! { dg-warning "Statement function" }
+  print *, x(n)
+end
--
2.35.3
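
For those not familiar with gfc_traverse_expr, the idea of the new check can be sketched outside the compiler like this.  ToyExpr and uses_dummy_as_function are made-up names; the real code walks gfc_expr trees and compares symbols rather than strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, much-simplified expression node.
struct ToyExpr
{
  enum Kind { Constant, Variable, FunctionCall, Op };
  Kind kind;
  std::string name;
  std::vector<ToyExpr> children;
};

// Walk the statement function body and report whether any function
// reference names one of the dummy arguments.
bool
uses_dummy_as_function (const ToyExpr &e,
			const std::vector<std::string> &formals)
{
  if (e.kind == ToyExpr::FunctionCall
      && std::find (formals.begin (), formals.end (), e.name)
	 != formals.end ())
    return true;
  for (const ToyExpr &child : e.children)
    if (uses_dummy_as_function (child, formals))
      return true;
  return false;
}
```

For a body like 1 + n(2.0) with dummy n the walk reports a hit, whereas a plain variable reference to the dummy does not.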



Re: [PATCH] c++: template friend with variadic constraints [PR108066]

2022-12-22 Thread Jason Merrill via Gcc-patches

On 12/22/22 12:34, Patrick Palka wrote:

On Thu, 15 Dec 2022, Jason Merrill wrote:


On 12/15/22 14:31, Patrick Palka wrote:

On Thu, 15 Dec 2022, Patrick Palka wrote:


On Thu, 15 Dec 2022, Jason Merrill wrote:


On 12/12/22 12:20, Patrick Palka wrote:

When instantiating a constrained hidden template friend, we need to
substitute into its constraints for sake of declaration matching.


Hmm, we shouldn't need to do declaration matching when there's only a
single declaration.


Oops, indeed.  It looks like in this case we're not calling
maybe_substitute_reqs_for during declaration matching, but rather from
tsubst_friend_function and specifically for the non-trailing requirements:

if (TREE_CODE (new_friend) == TEMPLATE_DECL)
  {
DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P (new_friend) = false;
DECL_USE_TEMPLATE (DECL_TEMPLATE_RESULT (new_friend)) = 0;
DECL_SAVED_TREE (DECL_TEMPLATE_RESULT (new_friend))
  = DECL_SAVED_TREE (DECL_TEMPLATE_RESULT (decl));

/* Substitute TEMPLATE_PARMS_CONSTRAINTS so that parameter levels will
   match in decls_match.  */
tree parms = DECL_TEMPLATE_PARMS (new_friend);
tree treqs = TEMPLATE_PARMS_CONSTRAINTS (parms);
treqs = maybe_substitute_reqs_for (treqs, new_friend);
if (treqs != TEMPLATE_PARMS_CONSTRAINTS (parms))
  {
TEMPLATE_PARMS_CONSTRAINTS (parms) = treqs;
/* As well as each TEMPLATE_PARM_CONSTRAINTS.  */
tsubst_each_template_parm_constraints (parms, args,
   tf_warning_or_error);
  }
  }

I'll adjust the commit message appropriately.




For
this substitution we use a full argument vector whose outer levels
correspond to the class's arguments and innermost level corresponds to
the template's level-lowered generic arguments.

But for A::f here, for which the relevant argument vector is
{{int}, {Us...}}, this substitution triggers the assert in
use_pack_expansion_extra_args_p since one argument is a pack expansion
and the other isn't.

And for A::f, for which the relevant argument vector is
{{int, int}, {Us...}}, the use_pack_expansion_extra_args_p assert would
also trigger but we first get a bogus "mismatched argument pack lengths"
error from tsubst_pack_expansion.

These might ultimately be bugs in tsubst_pack_expansion, but it seems
we can work around them by tweaking the constraint substitution in
maybe_substitute_reqs_for to only use the friend's outer arguments in
the case of a template friend.


Yes, this is how we normally substitute a member template during class
instantiation: with the class' template args, not its own.  The assert
seems to be enforcing that.


Ah, makes sense.




This should be otherwise equivalent to
substituting using the full arguments, since the template's innermost
arguments are just its generic arguments with level=1.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/12?

PR c++/108066
PR c++/108067

gcc/cp/ChangeLog:

* constraint.cc (maybe_substitute_reqs_for): For a template
friend, substitute using only its outer arguments.  Remove
dead uses_template_parms test.


I don't see any removal?


Oops, I reverted that change before posting the patch but forgot to adjust
the ChangeLog as well.  Removing the uses_template_parms test seems to be
safe but it's not necessary to fix the bug.




gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-friend12.C: New test.
---
gcc/cp/constraint.cc  |  8 +++
.../g++.dg/cpp2a/concepts-friend12.C  | 22 +++
2 files changed, 30 insertions(+)
create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-friend12.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index d4cd92ec3b4..f9d1009c9fe 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -1352,6 +1352,14 @@ maybe_substitute_reqs_for (tree reqs, const_tree decl)
  tree tmpl = DECL_TI_TEMPLATE (decl);
  tree gargs = generic_targs_for (tmpl);
  processing_template_decl_sentinel s;
+  if (PRIMARY_TEMPLATE_P (tmpl))
+   {
+ if (TEMPLATE_ARGS_DEPTH (gargs) == 1)
+   return reqs;
+ ++processing_template_decl;
+ gargs = copy_node (gargs);
+ --TREE_VEC_LENGTH (gargs);


Maybe instead of messing with TEMPLATE_ARGS_DEPTH we want to grab the
targs for DECL_FRIEND_CONTEXT instead of decl itself?


IIUC DECL_FRIEND_CONTEXT wouldn't give us the right args if the template
friend is declared inside a partial specialization:

template
concept C = __is_same(T, U);

template
struct A;

template
struct A {
  template
requires (__is_same(T, U))
  friend void f() { };
};

template struct A;

The DECL_FRIEND_CONTEXT of f is A, whose TYPE_TI_ARGS are {int*},
relative to the primary template instead of the partial sp

[PATCH] regression tests for 103770 fixed on trunk

2022-12-22 Thread Martin Uecker via Gcc-patches


This adds regression tests for an ICE on valid code that
seems gone on trunk, but the cause is still unclear.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103770



regression tests for PR103770

This adds tests from bugzilla for PR103770 and duplicates.

testsuite/gcc.dg/
* pr103770.c: New test.
* pr103859.c: New test.
* pr105065.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr103770.c b/gcc/testsuite/gcc.dg/pr103770.c
new file mode 100644
index 000..f7867d1284c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103770.c
@@ -0,0 +1,27 @@
+/* PR middle-end/103770 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+struct struct_s {
+   void* ptr;
+   void* ptr2;
+   void* ptr3;
+};
+
+struct struct_s struct_create(int N, const long vla[N]);
+
+void fun(int N)
+{
+   long vla[N];
+   struct struct_s st = struct_create(N, vla);
+}
+
+
+extern _Complex float g(int N, int dims[N]);
+
+void f(void)
+{
+   int dims[1];
+   _Complex float val = g(1, dims);
+}
+
diff --git a/gcc/testsuite/gcc.dg/pr103859.c b/gcc/testsuite/gcc.dg/pr103859.c
new file mode 100644
index 000..c58be5c15af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103859.c
@@ -0,0 +1,23 @@
+/* PR middle-end/103859 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+typedef struct dcmplx dcmplx;
+
+struct dcmplx {
+  double re;
+  double im;
+};
+
+dcmplx horner(int n, dcmplx p[n], dcmplx x);
+
+int main(void)
+{
+  int i, n;
+  dcmplx x[n + 1], f[n + 1];
+
+  horner(n + 1, f, x[i]);
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.dg/pr105065.c b/gcc/testsuite/gcc.dg/pr105065.c
new file mode 100644
index 000..da46d2bb4d8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr105065.c
@@ -0,0 +1,16 @@
+/* PR middle-end/105065 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+typedef struct
+{
+   char filler[17];
+} big_struct;
+
+big_struct dummy(int size, char array[size]);
+
+int main()
+{
+   dummy(0, 0);
+}
+




Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-22 Thread Joseph Myers
On Thu, 22 Dec 2022, Segher Boessenkool wrote:

> Hi!
> 
> On Wed, Dec 21, 2022 at 09:40:24PM +, Joseph Myers wrote:
> > On Wed, 21 Dec 2022, Segher Boessenkool wrote:
> > > > --- a/gcc/tree.cc
> > > > +++ b/gcc/tree.cc
> > > > @@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char)
> > > >if (!targetm.floatn_mode (n, extended).exists (&mode))
> > > > continue;
> > > >int precision = GET_MODE_PRECISION (mode);
> > > > -  /* Work around the rs6000 KFmode having precision 113 not
> > > > -128.  */
> > > 
> > > It has precision 126 now fwiw.
> > > 
> > > Joseph: what do you think about this patch?  Is the workaround it
> > > removes still useful in any way, do we need to do that some other way if
> > > we remove this?
> 
> You didn't address these questions.  We don't see negative effects from
> removing this workaround, but it isn't clear (to me) what problems were
> there that caused you to do this workaround.  Do you remember maybe?  Or
> can we just delete it and try to forget such worries :-)

The purpose was to ensure that _Float128's TYPE_PRECISION was at least as 
large as that of long double, in the case where they both have binary128 
format.  I think at that time, in GCC 7, it was possible for _Float128 to 
be KFmode and long double to be TFmode, with those being different modes 
with the same format.

In my view, it would be best not to have different modes with the same 
format - not simply ensure types with the same format have the same mode, 
but avoid multiple modes with the same format existing in the compiler at 
all.  That is, TFmode should be the same mode as one of KFmode and IFmode 
(one name should be defined as a macro for the other name, or something 
similar).  If you don't have different modes with the same format, many of 
the problems go away.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] phiopt: Adjust instead of reset phires range

2022-12-22 Thread Aldy Hernandez via Gcc-patches
On Thu, Dec 22, 2022 at 1:54 PM Jakub Jelinek  wrote:
>
> On Thu, Dec 22, 2022 at 01:09:21PM +0100, Aldy Hernandez wrote:
> >  INTEGER_CST singleton and
> > > union that into the SSA_NAMEs range and then do set_range_info
> > > with the altered range I guess.
> > >
> >
> > Note that set_range_info is an intersect operation. It should really be
> > called update_range_info. Up to now, there were no users that wanted to
> > clobber old ranges completely.
>
> Thanks.
> That would be then (I've committed the previous patch, also for reasons of
> backporting) following incremental patch.
>
> For the just committed testcase, it does the right thing,
> # RANGE [irange] int [-INF, -1][1, +INF]
> # iftmp.2_9 = PHI 
> is before the range (using -fdump-tree-all-alias) and r below is
> [irange] int [-INF, -1][1, +INF],
> unioned with carg of 0 into VARYING.
> If I try attached testcase though (which just uses signed char d instead of
> int d to give more interesting range info), then I see:
> # RANGE [irange] int [-128, -1][1, 127]
> # iftmp.2_10 = PHI 
> but strangely r I get from range_of_expr is
> [irange] int [-128, 127]
> rather than the expected [irange] int [-128, -1][1, 127].
> Sure, it is later unioned with 0, so it doesn't change anything, but I
> wonder what is the difference.  Note, this is before actually replacing
> the phi arg 8(5) with iftmp.3_11(5).
> At that point bb4 is:
>  [local count: 966367640]:
> # RANGE [irange] int [-128, 127]
> # iftmp.3_11 = PHI 
> if (iftmp.3_11 != 0)
>   goto ; [56.25%]
> else
>   goto ; [43.75%]
> and bb 5 is empty forwarder, so [-128, -1][1, 127] is actually correct.
> Either iftmp.3_11 is non-zero, then iftmp.2_10 is that value and its range, or
> it is zero and then iftmp.2_10 is 8, so [-128, -1][1, 127] U [8, 8], but
> more importantly SSA_NAME_RANGE_INFO should be at least according to what
> is printed be without 0.
>
> 2022-12-22  Jakub Jelinek  
> Aldy Hernandez  
>
> * tree-ssa-phiopt.cc (value_replacement): Instead of resetting
> phires range info, union it with oarg.
>
> --- gcc/tree-ssa-phiopt.cc.jj   2022-12-22 12:52:36.588469821 +0100
> +++ gcc/tree-ssa-phiopt.cc  2022-12-22 13:11:51.145060050 +0100
> @@ -1492,11 +1492,25 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
> break;
>   }
>   if (equal_p)
> -   /* After the optimization PHI result can have value
> -  which it couldn't have previously.
> -  We could instead of resetting it union the range
> -  info with oarg.  */
> -   reset_flow_sensitive_info (gimple_phi_result (phi));
> +   {
> + tree phires = gimple_phi_result (phi);
> + if (SSA_NAME_RANGE_INFO (phires))
> +   {
> + /* After the optimization PHI result can have value
> +which it couldn't have previously.  */
> + value_range r;

I haven't looked at your problem above, but have you tried using
int_range_max (or even int_range<2>) instead of value_range above?

value_range is deprecated and uses the legacy anti-range business,
which has a really hard time representing complex ranges, as well as
union/intersecting them.

> + if (get_global_range_query ()->range_of_expr (r, phires,
> +   phi))
> +   {
> + int_range<2> tmp (carg, carg);
> + r.union_ (tmp);

Here you are taking the legacy value_range and unioning into it.
That's bound to lose precision.

Ideally you should use int_range_max for intermediate calculations.
Then set_range_info() will take care of squishing things down into
whatever we allow into a global range (I think it's a 6-sub range
object ??).
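
As a rough illustration of the precision point (toy C, not GCC's irange
API; every name below is made up for the sketch): a range kept as a list
of sub-ranges absorbs 8 without changing and only becomes [-128, 127]
once 0 is unioned in, whereas anything squashed down to a single pair
already includes 0.  That single-pair result happens to match the
[-128, 127] reported above, but whether that is the actual cause was not
checked here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Toy model only: a range is a sorted list of disjoint, non-adjacent
   closed intervals.  None of these names exist in GCC.  */
#define TOY_MAX_PAIRS 8

struct toy_pair { long lo, hi; };
struct toy_range { size_t n; struct toy_pair p[TOY_MAX_PAIRS]; };

/* Union the interval [lo, hi] into R, merging every sub-range that
   overlaps it or is directly adjacent to it.  */
void toy_union(struct toy_range *r, long lo, long hi)
{
  struct toy_range out = { 0 };
  size_t i = 0;

  /* Keep lo - 1 and hi + 1 below free of overflow.  */
  assert(lo <= hi && lo > LONG_MIN && hi < LONG_MAX);

  /* Sub-ranges entirely below [lo, hi] are copied unchanged.  */
  while (i < r->n && r->p[i].hi < lo - 1)
    out.p[out.n++] = r->p[i++];

  /* Sub-ranges touching [lo, hi] are folded into it.  */
  while (i < r->n && r->p[i].lo <= hi + 1)
    {
      if (r->p[i].lo < lo)
        lo = r->p[i].lo;
      if (r->p[i].hi > hi)
        hi = r->p[i].hi;
      i++;
    }
  assert(out.n < TOY_MAX_PAIRS);
  out.p[out.n].lo = lo;
  out.p[out.n].hi = hi;
  out.n++;

  /* Sub-ranges entirely above are copied unchanged.  */
  while (i < r->n)
    {
      assert(out.n < TOY_MAX_PAIRS);
      out.p[out.n++] = r->p[i++];
    }
  *r = out;
}

/* What a single-pair representation can keep: the convex hull.  */
struct toy_pair toy_hull(const struct toy_range *r)
{
  assert(r->n > 0);
  struct toy_pair h = { r->p[0].lo, r->p[r->n - 1].hi };
  return h;
}
```

Starting from [-128, -1][1, 127], toy_union with [8, 8] leaves both
sub-ranges intact, while toy_hull of the same range is [-128, 127].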

Note, that if "r" can contain non integer/pointers (i.e. floats), you
should use:

  // Range of .
  Value_Range r ();

The goal is for Value_Range to become value_range, and for it to be
used for anything not explicitly an integer/pointer.  Thus the camel
case for this release.

Aldy

> + reset_flow_sensitive_info (phires);
> + set_range_info (phires, r);
> +   }
> + else
> +   reset_flow_sensitive_info (phires);
> +   }
> +   }
>   if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
> {
>   imm_use_iterator imm_iter;
>
>
> Jakub



Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-22 Thread Qing Zhao via Gcc-patches



> On Dec 22, 2022, at 12:56 PM, Alexander Monakov  wrote:
> 
> 
> On Thu, 22 Dec 2022, Jose E. Marchesi via Gcc-patches wrote:
> 
>> The first instruction scheduler pass reorders instructions in the TRY
>> block in a way `b=true' gets executed before the call to the function
>> `f'.  This optimization is wrong, because `main' calls setjmp and `f'
>> is known to call longjmp.
>> 
>> As discussed in BZ 57067, the root cause for this is the fact that
>> setjmp is not properly modeled in RTL, and therefore the backend
>> passes have no normalized way to handle this situation.
>> 
>> As Alexander Monakov noted in the BZ, many RTL passes refuse to touch
>> functions that call setjmp.  This includes for example gcse,
>> store_motion and cprop.  This patch adds the sched1 pass to that list.
>> 
>> Note that the other instruction scheduling passes are still allowed to
>> run on these functions, since they reorder instructions within basic
>> blocks, and therefore they cannot cross function calls.
>> 
>> This doesn't fix the fundamental issue, but at least assures that
>> sched1 won't perform invalid transformations in correct C programs.
> 
> I think scheduling across calls in the pre-RA scheduler is simply an oversight,
> we do not look at dataflow information and with 50% chance risk extending
> lifetime of a pseudoregister across a call, causing higher register pressure at
> the point of the call, and potentially an extra spill.

I am a little confused.  Do you mean that the pre-RA scheduler currently does
not look at the dataflow information at all when scheduling insns across calls?

Qing


> 
> Therefore I would suggest to indeed solve the root cause, with (untested):
> 
> diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
> index 948aa0c3b..343fe2bfa 100644
> --- a/gcc/sched-deps.cc
> +++ b/gcc/sched-deps.cc
> @@ -3688,7 +3688,13 @@ deps_analyze_insn (class deps_desc *deps, rtx_insn *insn)
> 
>   CANT_MOVE (insn) = 1;
> 
> -  if (find_reg_note (insn, REG_SETJMP, NULL))
> +  if (!reload_completed)
> +   {
> + /* Do not schedule across calls, this is prone to extending lifetime
> +of a pseudo and causing extra spill later on.  */
> + reg_pending_barrier = MOVE_BARRIER;
> +   }
> +  else if (find_reg_note (insn, REG_SETJMP, NULL))
> {
>   /* This is setjmp.  Assume that all registers, not just
>  hard registers, may be clobbered by this call.  */
> 
> Alexander



[Ping^1] [PATCH] c++, driver: Fix -static-libstdc++ for targets without Bstatic/dynamic.

2022-12-22 Thread Iain Sandoe
Hi,
this has become more important since it seems I can no longer link a working
gnat1 without it.
Thanks,
Iain

> On 4 Dec 2022, at 16:30, Iain Sandoe via Gcc-patches wrote:
> 
> This fixes a long-standing problem on Darwin where we cannot independently set
> -static-libstdc++ because the flag gets stripped by the g++ driver.
> 
> This patch is essentially the same as the one used for the 'D' driver and has
> been in local use for some time.  It has also been tested on Linux.
> 
> OK for master?
> backports?
> thanks
> Iain
> 
> -- >8 --
> 
> The current implementation for swapping between the static and shared c++
> runtimes relies on the static linker supporting Bstatic/dynamic which is
> not available for every target (Darwin's linker does not support this).
> 
> Specs substitution (%s) is an alternative solution for this (which is what
> Darwin uses for Fortran, D and Objective-C).  However, specs substitution
> requires that the '-static-libstdc++' be preserved in the driver's command
> line.  The patch here arranges for this to be done when the configuration
> determines that linker support for Bstatic/dynamic is missing.
> 
> Signed-off-by: Iain Sandoe 
> 
> gcc/cp/ChangeLog:
> 
>   * g++spec.cc (lang_specific_driver): Preserve -static-libstdc++ in
>   the driver command line for targets without -Bstatic/dynamic support
>   in their static linker.
> ---
> gcc/cp/g++spec.cc | 5 +
> 1 file changed, 5 insertions(+)
> 
> diff --git a/gcc/cp/g++spec.cc b/gcc/cp/g++spec.cc
> index 3d3b042dd56..f95d7965355 100644
> --- a/gcc/cp/g++spec.cc
> +++ b/gcc/cp/g++spec.cc
> @@ -234,7 +234,12 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options,
> 
>   case OPT_static_libstdc__:
> library = library >= 0 ? 2 : library;
> +#ifdef HAVE_LD_STATIC_DYNAMIC
> +   /* Remove -static-libstdc++ from the command only if target supports
> +  LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
> +  back-end target can use outfile substitution.  */
> args[i] |= SKIPOPT;
> +#endif
> break;
> 
>   case OPT_stdlib_:
> -- 
> 2.37.1 (Apple Git-137.1)
> 
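
A minimal sketch of the behaviour the patch describes, as a toy and not
the g++ driver itself: TOY_HAVE_LD_STATIC_DYNAMIC, TOY_SKIPOPT and
toy_lang_specific_driver are hypothetical stand-ins.  The option is
dropped from the command line only when the (toy) configuration macro
says the linker can switch between static and dynamic linking; otherwise
it is passed through so that later spec processing can still see it.

```c
#include <string.h>

#define TOY_SKIPOPT 1u

/* Copy ARGV into OUT, leaving out options marked TOY_SKIPOPT.
   Returns the number of arguments kept.  Build with
   -DTOY_HAVE_LD_STATIC_DYNAMIC to model a linker that supports
   -Bstatic/-Bdynamic.  */
int toy_lang_specific_driver(int argc, const char **argv, const char **out)
{
  int n = 0;

  for (int i = 0; i < argc; i++)
    {
      unsigned flags = 0;

      if (strcmp(argv[i], "-static-libstdc++") == 0)
        {
#ifdef TOY_HAVE_LD_STATIC_DYNAMIC
          /* The driver handles the request itself, so hide the option.  */
          flags |= TOY_SKIPOPT;
#endif
        }

      if (!(flags & TOY_SKIPOPT))
        out[n++] = argv[i];
    }
  return n;
}
```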



Re: [PATCH] rs6000: Fix some issues related to Power10 fusion [PR104024]

2022-12-22 Thread Segher Boessenkool
On Wed, Dec 21, 2022 at 11:41:58AM +0800, Kewen.Lin wrote:
> on 2022/12/20 21:19, Segher Boessenkool wrote:
> > Sure, I understand that.  What I don't like is the generator program is
> > much too big and unstructured already, and this doesn't help at all; it
> > makes it quite a bit worse even.

> >> Good point, and I just noticed that we should check tune setting instead
> >> of TARGET_POWER10 here?  Something like:
> >>
> >> if (!(rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION))
> >>   {
> >> if (processor_target_table[tune_index].processor == PROCESSOR_POWER10)
> >>   rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
> >> else
> >>   rs6000_isa_flags &= ~OPTION_MASK_P10_FUSION;
> >>   }
> > 
> > Yeah that looks better :-)
> 
> I'm going to test this and commit it first.  :)

Thanks!

> > Maybe you can restructure the Perl code a bit in a first patch, and then
> > add the insn condition?  If you're not comfortable with Perl, I'll deal
> > with it, just update the patch.
> 
> OK, I'll give it a try, TBH I just fixed the place for insn condition, didn't
> look into this script, with a quick look, I'm going to factor out the main
> body from the multiple level loop, do you have some suggestions on which other
> candidates to be restructured?

Anything that makes the code easier to understand, basically.

This stuff is by nature pretty hard to read, but making the code shorter
and/or less nested should make it easier to understand.  You will need
to have fewer local variables per function than there are total now,
that will help.

Btw, this script isn't so big at all, but the patches are hard to review
without converting this to a side-by-side comparison first.  There must
be some way to improve that, that is what I'm looking for :-)


Segher


Re: [PATCH] loading float member of parameter stored via int registers

2022-12-22 Thread Segher Boessenkool
On Thu, Dec 22, 2022 at 11:28:01AM +, Richard Biener wrote:
> On Thu, 22 Dec 2022, Jiufu Guo wrote:
> > To reduce risk, I'm just drafting straightforward patches for
> > special cases currently, Like:
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608081.html
> > and this patch.
> 
> Heh, yes - though I'm not fond of special-casing things.  RTL
> expansion is already full of special cases :/

And many of those are not useful at all (would be done by later passes),
or are actively harmful.  Not to mention that expand is currently one of
the most impregnable and undebuggable RTL passes.

But there are also many things done during expand that although they
should be done somewhat later, aren't actually done later at all
currently.  So that needs fixing.

Maybe things should go via an intermediate step, where all the decisions
can be made, and then later we just have to translate the "low Gimple"
or "RTL-Gimple" ("Rimple"?) to RTL.  A format that is looser in many
ways than either RTL or Gimple.  A bit like Generic in that way.


Segher


Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2022-12-22 Thread Segher Boessenkool
Hi!

On Wed, Dec 21, 2022 at 09:40:24PM +, Joseph Myers wrote:
> On Wed, 21 Dec 2022, Segher Boessenkool wrote:
> > > --- a/gcc/tree.cc
> > > +++ b/gcc/tree.cc
> > > @@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char)
> > >if (!targetm.floatn_mode (n, extended).exists (&mode))
> > >   continue;
> > >int precision = GET_MODE_PRECISION (mode);
> > > -  /* Work around the rs6000 KFmode having precision 113 not
> > > -  128.  */
> > 
> > It has precision 126 now fwiw.
> > 
> > Joseph: what do you think about this patch?  Is the workaround it
> > removes still useful in any way, do we need to do that some other way if
> > we remove this?

You didn't address these questions.  We don't see negative effects from
removing this workaround, but it isn't clear (to me) what problems were
there that caused you to do this workaround.  Do you remember maybe?  Or
can we just delete it and try to forget such worries :-)

> I think it's best for the TYPE_PRECISION, for any type with the binary128 
> format, to be 128 (not 126).

Well, but why?  Of course it looks nicer, and it is a gross workaround
to have different precisions for the different 128-bit FP modes, more so
if two modes are really the same, but in none of the ways floating point
precision is defined would it be 128 for any 128-bit mode.

> It's necessary that _Float128, _Float64x and long double all have the same 
> TYPE_PRECISION when they have the same (binary128) format,

Yes, agreed.  Or even if it would be not necessary it is the only sane
thing to do.


Segher


Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-22 Thread Alexander Monakov via Gcc-patches


On Thu, 22 Dec 2022, Jose E. Marchesi via Gcc-patches wrote:

> The first instruction scheduler pass reorders instructions in the TRY
> block in a way `b=true' gets executed before the call to the function
> `f'.  This optimization is wrong, because `main' calls setjmp and `f'
> is known to call longjmp.
> 
> As discussed in BZ 57067, the root cause for this is the fact that
> setjmp is not properly modeled in RTL, and therefore the backend
> passes have no normalized way to handle this situation.
> 
> As Alexander Monakov noted in the BZ, many RTL passes refuse to touch
> functions that call setjmp.  This includes for example gcse,
> store_motion and cprop.  This patch adds the sched1 pass to that list.
> 
> Note that the other instruction scheduling passes are still allowed to
> run on these functions, since they reorder instructions within basic
> blocks, and therefore they cannot cross function calls.
> 
> This doesn't fix the fundamental issue, but at least assures that
> sched1 won't perform invalid transformations in correct C programs.
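
For readers without the BZ at hand, a hypothetical reduction of the
situation described above (not the actual test case, which is not shown
in this thread; `try_block' and `env' are made-up names).  The store to
`b' sits after a call that never returns normally, so once the longjmp
lands `b' must still be false.  Hoisting the store above the call would
make the function return true.

```c
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf env;

/* Stand-in for the `f' of the description: it never returns normally.  */
void f(void)
{
  longjmp(env, 1);
}

/* Returns the value of b observed after the longjmp.  In the abstract
   machine b is not modified between setjmp and longjmp, so its value
   is still the initial false.  */
bool try_block(void)
{
  bool b = false;

  if (setjmp(env) == 0)
    {
      f();        /* control leaves here via longjmp */
      b = true;   /* never reached; must not move above the call */
    }
  return b;
}
```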

I think scheduling across calls in the pre-RA scheduler is simply an oversight,
we do not look at dataflow information and with 50% chance risk extending
lifetime of a pseudoregister across a call, causing higher register pressure at
the point of the call, and potentially an extra spill.

Therefore I would suggest to indeed solve the root cause, with (untested):

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index 948aa0c3b..343fe2bfa 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -3688,7 +3688,13 @@ deps_analyze_insn (class deps_desc *deps, rtx_insn *insn)

   CANT_MOVE (insn) = 1;

-  if (find_reg_note (insn, REG_SETJMP, NULL))
+  if (!reload_completed)
+   {
+ /* Do not schedule across calls, this is prone to extending lifetime
+of a pseudo and causing extra spill later on.  */
+ reg_pending_barrier = MOVE_BARRIER;
+   }
+  else if (find_reg_note (insn, REG_SETJMP, NULL))
 {
   /* This is setjmp.  Assume that all registers, not just
  hard registers, may be clobbered by this call.  */

Alexander


[C PATCH] (for STAGE 1) UBSan instrumentation for assignment of VM types

2022-12-22 Thread Martin Uecker via Gcc-patches


Here is a first patch to add UBSan instrumentation to
assignment, return, initialization of pointers
to variably modified types. This is based on the
other patch I just sent. Separating these should make
reviewing easier.

Here, I did not add tests for function arguments as
this is more complicated, but this will follow...




c: UBSan instrumentation for assignment of VM types

This adds instrumentation that checks that corresponding
size expressions in variably modified types evaluate to
the same value in assignment, function return, and
initialization.
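
A small sketch of the kind of code this is aimed at, with the check
written out by hand.  vm_bounds_match and checked_init are hypothetical
helpers for illustration only; the patch itself emits a call to a UBSan
handler (or a trap) rather than returning an error value.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written stand-in for the instrumented comparison: nonzero when
   the two element counts agree.  */
int vm_bounds_match(size_t as, size_t bs)
{
  return as == bs;
}

/* Sum the n elements of the array that p points to.  */
int sum_vla(int n, int (*p)[n])
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += (*p)[i];
  return s;
}

/* Initialize a pointer to int[n] from the address of an int[m] object.
   The initialization is only well-defined when n == m, so bail out with
   -1 instead of performing it when the bounds disagree.  */
int checked_init(int n, int m)
{
  assert(n > 0 && m > 0);

  int a[m];
  for (int i = 0; i < m; i++)
    a[i] = i;

  if (!vm_bounds_match((size_t) n, (size_t) m))
    return -1;

  int (*p)[n] = &a;   /* accepted at compile time for any n and m */
  return sum_vla(n, p);
}
```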

gcc/c-family/
* c-ubsan.cc (ubsan_instrument_vm_assign): New.

gcc/c/
* c-typeck.cc (comptypes_check_enum_int_instr,
comptypes_check_enum_int): New interface for
instrumentation.
(comptypes_internal,convert_for_assignment):
Add instrumentation.
(comp_target_types_instr,comp_target_types): Add
new interface for instrumentation.

gcc/testsuite/gcc.dg/ubsan/
* vm-bounds-1.c: New test.
* vm-bounds-2.c: New test.

 

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 360ba82250c..7de4e6e7057 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -334,6 +334,48 @@ ubsan_instrument_vla (location_t loc, tree size)
   return t;
 }
 
+/* Instrument assignment of variably modified types.  */
+
+tree
+ubsan_instrument_vm_assign (location_t loc, tree a, tree b)
+{
+  tree t, tt;
+
+  gcc_assert (TREE_CODE (a) == ARRAY_TYPE);
+  gcc_assert (TREE_CODE (b) == ARRAY_TYPE);
+
+  tree as = TYPE_MAX_VALUE (TYPE_DOMAIN (a));
+  tree bs = TYPE_MAX_VALUE (TYPE_DOMAIN (b));
+
+  as = fold_build2 (PLUS_EXPR, sizetype, as, size_one_node);
+  bs = fold_build2 (PLUS_EXPR, sizetype, bs, size_one_node);
+
+  t = build2 (NE_EXPR, boolean_type_node, as, bs);
+  if (flag_sanitize_trap & SANITIZE_VLA)
+tt = build_call_expr_loc (loc, builtin_decl_explicit (BUILT_IN_TRAP), 0);
+  else
+{
+  tree data = ubsan_create_data ("__ubsan_vm_data", 1, &loc,
+ubsan_type_descriptor (a, UBSAN_PRINT_ARRAY),
+ubsan_type_descriptor (b, UBSAN_PRINT_ARRAY),
+ubsan_type_descriptor (sizetype),
+NULL_TREE, NULL_TREE);
+  data = build_fold_addr_expr_loc (loc, data);
+  enum built_in_function bcode
+   = (flag_sanitize_recover & SANITIZE_VLA)
+ ? BUILT_IN_UBSAN_HANDLE_VM_BOUNDS_MISMATCH
+ : BUILT_IN_UBSAN_HANDLE_VM_BOUNDS_MISMATCH_ABORT;
+  tt = builtin_decl_explicit (bcode);
+  tt = build_call_expr_loc (loc, tt, 3, data,
+   ubsan_encode_value (as),
+   ubsan_encode_value (bs));
+}
+  t = build3 (COND_EXPR, void_type_node, t, tt, void_node);
+
+  return t;
+}
+
+
 /* Instrument missing return in C++ functions returning non-void.  */
 
 tree
diff --git a/gcc/c-family/c-ubsan.h b/gcc/c-family/c-ubsan.h
index 2f31ba36df4..327313c6684 100644
--- a/gcc/c-family/c-ubsan.h
+++ b/gcc/c-family/c-ubsan.h
@@ -26,6 +26,7 @@ extern tree ubsan_instrument_shift (location_t, enum tree_code, tree, tree);
 extern tree ubsan_instrument_vla (location_t, tree);
 extern tree ubsan_instrument_return (location_t);
 extern tree ubsan_instrument_bounds (location_t, tree, tree *, bool);
+extern tree ubsan_instrument_vm_assign (location_t, tree, tree);
 extern bool ubsan_array_ref_instrumented_p (const_tree);
 extern void ubsan_maybe_instrument_array_ref (tree *, bool);
 extern void ubsan_maybe_instrument_reference (tree *);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index ebc9ba88afe..a58b96083e9 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -93,6 +93,7 @@ static tree qualify_type (tree, tree);
 struct comptypes_data;
 static int tagged_types_compatible_p (const_tree, const_tree, struct comptypes_data *);
 static int comp_target_types (location_t, tree, tree);
+static int comp_target_types_instr (location_t, tree, tree, tree *);
 static int function_types_compatible_p (const_tree, const_tree, struct comptypes_data *);
 static int type_lists_compatible_p (const_tree, const_tree, struct comptypes_data *);
 static tree lookup_field (tree, tree);
@@ -1053,6 +1054,9 @@ struct comptypes_data {
 
   bool enum_and_int_p;
   bool different_types_p;
+
+  location_t loc;
+  tree instrument_expr;
 };
 
 /* Return 1 if TYPE1 and TYPE2 are compatible types for assignment
@@ -1075,20 +1079,35 @@ comptypes (tree type1, tree type2)
 /* Like comptypes, but if it returns non-zero because enum and int are
compatible, it sets *ENUM_AND_INT_P to true.  */
 
-int
-comptypes_check_enum_int (tree type1, tree type2, bool *enum_and_int_p)
+static int
+comptypes_check_enum_int_instr (tree type1, tree type2, bool *enum_and_int_p, location_t loc, tree *instrument_expr)
 {
   int val;
 
   struct comptypes_data data = { };
+
+  data.loc = loc;
+
+  if (NULL != instrument_expr)

[PATCH 9/8] middle-end: Allow build_popcount_expr to use an IFN

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
x86_64-pc-linux-gnu - ok to merge?

gcc/ChangeLog:

* tree-ssa-loop-niter.cc (build_popcount_expr): Add IFN support.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/pr86544.C: Add .POPCOUNT to tree scan regex.
* gcc.dg/tree-ssa/popcount.c: Likewise.
* gcc.dg/tree-ssa/popcount2.c: Likewise.
* gcc.dg/tree-ssa/popcount3.c: Likewise.
* gcc.target/aarch64/popcount4.c: Likewise.
* gcc.target/i386/pr95771.c: Likewise, and...
* gcc.target/i386/pr95771-2.c: ...split int128 test from above,
since this would emit just a single IFN if a TI optab is added.

---
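
For reference, a minimal sketch of the loop idiom these tests exercise,
assuming a compiler that provides the GCC-style __builtin_popcountll;
the function names are illustrative only.  The loop clears the lowest
set bit on every iteration, so its trip count is the population count,
which is why the dump may show either the builtin or the .POPCOUNT
internal function in its place.

```c
/* Loop form, as in the corge test below but on unsigned long long.  */
int popcount_loop(unsigned long long x)
{
  int i = 0;

  while (x)
    {
      x &= x - 1;   /* clear the lowest set bit */
      ++i;
    }
  return i;
}

/* What the loop can be replaced with.  */
int popcount_builtin(unsigned long long x)
{
  return __builtin_popcountll(x);
}
```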

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr86544.C b/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
index ef438916a8019320564f444ace08e2f4b4190684..50befb36bac75de1cfa282e38358278b3288bd1c 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
@@ -12,5 +12,5 @@ int PopCount (long b) {
 return c;
 }
 
-/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "if" 0 "phiopt4" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
index b4694109411a4631697463519acbe7d9df65bf6e..efd906a0f5447f0beb3752eded3756999b02e6e6 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
@@ -39,4 +39,4 @@ void PopCount3 (long b1) {
   }
 }
 
-/* { dg-final { scan-tree-dump-times "__builtin_popcount" 3 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 3 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
index ef73e345573de721833e98e89c252640a55f7c60..ae38a329bd4d868a762300d3218d68864c0fc4be 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
@@ -26,4 +26,4 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
index ef438916a8019320564f444ace08e2f4b4190684..50befb36bac75de1cfa282e38358278b3288bd1c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
@@ -12,5 +12,5 @@ int PopCount (long b) {
 return c;
 }
 
-/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "if" 0 "phiopt4" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/popcount4.c b/gcc/testsuite/gcc.target/aarch64/popcount4.c
index ee55b2e335223053ca024e95b7a13aa4af32550e..8aa15ff018d4b5fc6bb59e52af20d5c33cea2ee0 100644
--- a/gcc/testsuite/gcc.target/aarch64/popcount4.c
+++ b/gcc/testsuite/gcc.target/aarch64/popcount4.c
@@ -11,4 +11,4 @@ int PopCount (long b) {
 return c;
 }
 
-/* { dg-final { scan-tree-dump-times "__builtin_popcount" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 0 "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr95771-2.c b/gcc/testsuite/gcc.target/i386/pr95771-2.c
new file mode 100644
index ..1db9dc94d0b66477667624012221d6844c141a26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr95771-2.c
@@ -0,0 +1,17 @@
+/* PR tree-optimization/95771 */
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-O2 -mpopcnt -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump " = __builtin_popcount| = \\.POPCOUNT" "optimized" } } */
+
+int
+corge (unsigned __int128 x)
+{
+  int i = 0;
+  while (x)
+{
+  x &= x - 1;
+  ++i;
+}
+  return i;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr95771.c b/gcc/testsuite/gcc.target/i386/pr95771.c
index d7b67017800b705b9854f561916c20901ea76803..d41be445f4a68613a082b8956fea3ceaf33d7e0f 100644
--- a/gcc/testsuite/gcc.target/i386/pr95771.c
+++ b/gcc/testsuite/gcc.target/i386/pr95771.c
@@ -1,8 +1,7 @@
 /* PR tree-optimization/95771 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -mpopcnt -fdump-tree-optimized" } */
-/* { dg-final { scan-tree-dump-times " = __builtin_popcount" 6 "optimized" { target int128 } } } */
-/* { dg-final { scan-tree-dump-times " = __builtin_popcount" 4 "optimized" { target { ! int128 } } } } */
+/* { dg-final { scan-tree-dump-times " = __builtin_popcount| = \\.POPCOUNT" 4 "optimized" } } */
 
 int
 foo (unsigned char x)
@@ -51,17 +50,3 @@ qux (unsigned long long x)
 }
   return i;
 }
-
-#ifdef __SIZEOF_INT128__
-int
-corge (unsigned __int

[PATCH 6/8 v2] docs: Add popcount, clz and ctz target attributes

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Updated to reflect Sphinx revert; I'll commit this once the
cltz_complement patch is merged.

gcc/ChangeLog:

* doc/sourcebuild.texi: Add missing target attributes.

---

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index ffe69d6fcb9c46cf97ba570e85b56e586a0c9b99..1036b185ee289bbf7883bd14956a41da9a6d677b 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2511,6 +2511,24 @@ Target supports the execution of @code{amx-fp16} instructions.
 @item cell_hw
 Test system can execute AltiVec and Cell PPU instructions.
 
+@item clz
+Target supports a clz optab on int.
+
+@item clzl
+Target supports a clz optab on long.
+
+@item clzll
+Target supports a clz optab on long long.
+
+@item ctz
+Target supports a ctz optab on int.
+
+@item ctzl
+Target supports a ctz optab on long.
+
+@item ctzll
+Target supports a ctz optab on long long.
+
 @item cmpccxadd
 Target supports the execution of @code{cmpccxadd} instructions.
 
@@ -2532,6 +2550,15 @@ Target does not require strict alignment.
 @item pie_copyreloc
 The x86-64 target linker supports PIE with copy reloc.
 
+@item popcount
+Target supports a popcount optab on int.
+
+@item popcountl
+Target supports a popcount optab on long.
+
+@item popcountll
+Target supports a popcount optab on long long.
+
 @item prefetchi
 Target supports the execution of @code{prefetchi} instructions.
 


[PATCH 5/8 v2] middle-end: Add cltz_complement idiom recognition

2022-12-22 Thread Andrew Carlotti via Gcc-patches
On Thu, Nov 24, 2022 at 11:41:31AM +0100, Richard Biener wrote:
> Note we do have CTZ and CLZ
> optabs and internal functions - in case there's a HImode CLZ this
> could be elided.  More general you can avoid using the __builtin_
> functions with their fixed types in favor of using IFN_C[TL]Z which
> are type agnostic (but require optab support - you should be able
> to check this via direct_internal_fn_supported_p).

IFN support added. I've also renamed the defined_at_zero parameter to
define_at_zero, since this is a request for the expression to define it,
rather than a guarantee that it is already defined.

New patch below, bootstrapped and regression tested on
aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu - ok to merge?

---

This recognises patterns of the form:
while (n) { n >>= 1; }

This patch results in improved (but still suboptimal) codegen:

foo (unsigned int b) {
int c = 0;

while (b) {
b >>= 1;
c++;
}

return c;
}

foo:
.LFB11:
.cfi_startproc
cbz w0, .L3
clz w1, w0
tst x0, 1
mov w0, 32
sub w0, w0, w1
csel w0, w0, wzr, ne
ret

The conditional is unnecessary. phiopt could recognise a redundant csel
(using cond_removal_in_builtin_zero_pattern) when one of the inputs is a
clz call, but it cannot recognise the redundancy when the input is (e.g.)
(32 - clz).

I could perhaps extend this function to recognise this pattern in a later
patch, if this is a good place to recognise more patterns.
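
A small sketch of the equivalence being recognised, assuming a
GCC-compatible compiler for __builtin_clz; the function names are
illustrative only.  For nonzero b the shift loop runs PREC - clz (b)
times.  Zero has to be handled separately because __builtin_clz (0) is
undefined, which corresponds to the initial cbz in the assembly above.

```c
#include <limits.h>

#define PREC ((int) (sizeof (unsigned int) * CHAR_BIT))

/* The loop from the example: counts shifts until b becomes zero.  */
int count_shift_right(unsigned int b)
{
  int c = 0;

  while (b)
    {
      b >>= 1;
      c++;
    }
  return c;
}

/* Closed form: position of the highest set bit plus one, or 0 for 0.  */
int clz_complement(unsigned int b)
{
  return b ? PREC - __builtin_clz(b) : 0;
}
```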

gcc/ChangeLog:

PR tree-optimization/94793
* tree-scalar-evolution.cc (expression_expensive_p): Add checks
for c[lt]z optabs.
* tree-ssa-loop-niter.cc (build_cltz_expr): New.
(number_of_iterations_cltz_complement): New.
(number_of_iterations_bitcount): Add call to the above.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_clz)
(check_effective_target_clzl, check_effective_target_clzll)
(check_effective_target_ctz, check_effective_target_ctzl)
(check_effective_target_ctzll): New.
* gcc.dg/tree-ssa/cltz-complement-max.c: New test.
* gcc.dg/tree-ssa/clz-complement-char.c: New test.
* gcc.dg/tree-ssa/clz-complement-int.c: New test.
* gcc.dg/tree-ssa/clz-complement-long-long.c: New test.
* gcc.dg/tree-ssa/clz-complement-long.c: New test.
* gcc.dg/tree-ssa/ctz-complement-char.c: New test.
* gcc.dg/tree-ssa/ctz-complement-int.c: New test.
* gcc.dg/tree-ssa/ctz-complement-long-long.c: New test.
* gcc.dg/tree-ssa/ctz-complement-long.c: New test.

---

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cltz-complement-max.c b/gcc/testsuite/gcc.dg/tree-ssa/cltz-complement-max.c
new file mode 100644
index ..1a29ca52e42e50822e4e3213b2cb008b766d0318
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cltz-complement-max.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-loop-optimize -fdump-tree-optimized" } */
+
+#define PREC (__CHAR_BIT__)
+
+int clz_complement_count1 (unsigned char b) {
+int c = 0;
+
+while (b) {
+   b >>= 1;
+   c++;
+}
+if (c <= PREC)
+  return 0;
+else
+  return 34567;
+}
+
+int clz_complement_count2 (unsigned char b) {
+int c = 0;
+
+while (b) {
+   b >>= 1;
+   c++;
+}
+if (c <= PREC - 1)
+  return 0;
+else
+  return 76543;
+}
+
+int ctz_complement_count1 (unsigned char b) {
+int c = 0;
+
+while (b) {
+   b <<= 1;
+   c++;
+}
+if (c <= PREC)
+  return 0;
+else
+  return 23456;
+}
+
+int ctz_complement_count2 (unsigned char b) {
+int c = 0;
+
+while (b) {
+   b <<= 1;
+   c++;
+}
+if (c <= PREC - 1)
+  return 0;
+else
+  return 65432;
+}
+/* { dg-final { scan-tree-dump-times "34567" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "76543" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "23456" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "65432" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/clz-complement-char.c b/gcc/testsuite/gcc.dg/tree-ssa/clz-complement-char.c
new file mode 100644
index ..2ebe8fabcaf0ce88f3a6a46e9ba4ba79b7d3672e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/clz-complement-char.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target clz } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#define PREC (__CHAR_BIT__)
+
+int
+__attribute__ ((noinline, noclone))
+foo (unsigned char b) {
+int c = 0;
+
+while (b) {
+   b >>= 1;
+   c++;
+}
+
+return c;
+}
+
+int main()
+{
+  if (foo(0) != 0)
+__builtin_abort ();
+  if (foo(5) != 3)
+__builtin_abort ();
+  if (foo(255) != 8)
+__builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree

[C PATCH] (for STAGE 1) Reorganize comptypes and related functions

2022-12-22 Thread Martin Uecker via Gcc-patches




Because I want to add another argument to comptypes
and co. for UBSan instrumentation, and this is
starting to become a bit unwieldy, here is a patch to
reorganize and simplify this a bit. This can wait
until stage 1. (The cache can be simplified further
by allocating it on the stack, but this can be done
later.)



c: Reorganize comptypes and related functions.

Move common arguments and a global variable (the
cache for type compatibility) used by comptypes
and children into a single structure and pass a
pointer to it.

gcc/c/ 
* c/c-typeck.cc: (comptypes,comptypes_check_enum_int,
comptypes_check_different_types,comptypes_internal,
tagged_types_tu_compatible_p,function_types_compatible_p,
type_lists_compatible_p): Introduce a structure argument.
(alloc_tagged_tu_seen_cache,free_all_tagged_tu_seen_up_to):
Reorganize cache and remove tu from function names.
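
The refactoring pattern described in this commit message can be sketched in plain C, independent of GCC's tree API. Everything in the sketch is a hypothetical stand-in: types are plain integers, and the names `cmp_ctx`, `seen_pair`, `compare_internal`, `types_compatible` and `types_compatible_check_enum_int` are illustrative only and are not the functions touched by the patch. It only shows the shape of the change: two `bool *` out-parameters and a file-scope cache pointer become fields of one structure that is passed by pointer.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical cache entry: remembers a pair that has been compared.  */
struct seen_pair {
  struct seen_pair *next;
  int t1, t2;
};

/* Hypothetical context: replaces two bool* out-parameters and a
   file-scope cache pointer with one structure.  */
struct cmp_ctx {
  struct seen_pair *seen_base;
  bool enum_and_int_p;
  bool different_types_p;
};

/* Toy comparison: types are ints; ids >= 100 play the role of enums
   that are compatible with plain int (id 0).  */
static int
compare_internal (int t1, int t2, struct cmp_ctx *ctx)
{
  struct seen_pair *p = malloc (sizeof *p);
  if (p)
    {
      p->next = ctx->seen_base;
      p->t1 = t1;
      p->t2 = t2;
      ctx->seen_base = p;
    }
  if (t1 == t2)
    return 1;
  ctx->different_types_p = true;
  if ((t1 >= 100 && t2 == 0) || (t2 >= 100 && t1 == 0))
    {
      ctx->enum_and_int_p = true;
      return 1;
    }
  return 0;
}

/* Release every cache entry owned by one comparison.  */
static void
free_seen (struct seen_pair *p)
{
  while (p)
    {
      struct seen_pair *next = p->next;
      free (p);
      p = next;
    }
}

/* Entry point that ignores the extra flags.  */
int
types_compatible (int t1, int t2)
{
  struct cmp_ctx ctx = { 0 };
  int val = compare_internal (t1, t2, &ctx);
  free_seen (ctx.seen_base);
  return val;
}

/* Entry point that reports one flag back to the caller.  */
int
types_compatible_check_enum_int (int t1, int t2, bool *enum_and_int_p)
{
  struct cmp_ctx ctx = { 0 };
  int val = compare_internal (t1, t2, &ctx);
  *enum_and_int_p = ctx.enum_and_int_p;
  free_seen (ctx.seen_base);
  return val;
}
```

With this shape, adding a further flag means adding one field to the structure rather than one more parameter to every function in the call chain.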



diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 47fa36f4ec8..ebc9ba88afe 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -90,12 +90,11 @@ static bool require_constexpr_value;
 
 static bool null_pointer_constant_p (const_tree);
 static tree qualify_type (tree, tree);
-static int tagged_types_tu_compatible_p (const_tree, const_tree, bool *,
-bool *);
+struct comptypes_data;
+static int tagged_types_compatible_p (const_tree, const_tree, struct 
comptypes_data *);
 static int comp_target_types (location_t, tree, tree);
-static int function_types_compatible_p (const_tree, const_tree, bool *,
-   bool *);
-static int type_lists_compatible_p (const_tree, const_tree, bool *, bool *);
+static int function_types_compatible_p (const_tree, const_tree, struct 
comptypes_data *);
+static int type_lists_compatible_p (const_tree, const_tree, struct 
comptypes_data *);
 static tree lookup_field (tree, tree);
 static int convert_arguments (location_t, vec, tree,
  vec *, vec *, tree,
@@ -125,7 +124,7 @@ static tree find_init_member (tree, struct obstack *);
 static void readonly_warning (tree, enum lvalue_use);
 static int lvalue_or_else (location_t, const_tree, enum lvalue_use);
 static void record_maybe_used_decl (tree);
-static int comptypes_internal (const_tree, const_tree, bool *, bool *);
+static int comptypes_internal (const_tree, const_tree, struct comptypes_data 
*data);
 
 /* Return true if EXP is a null pointer constant, false otherwise.  */
 
@@ -189,17 +188,16 @@ remove_c_maybe_const_expr (tree expr)
 
 /* This is a cache to hold if two types are compatible or not.  */
 
-struct tagged_tu_seen_cache {
-  const struct tagged_tu_seen_cache * next;
+struct tagged_seen_cache {
+  const struct tagged_seen_cache * next;
   const_tree t1;
   const_tree t2;
-  /* The return value of tagged_types_tu_compatible_p if we had seen
+  /* The return value of tagged_types_compatible_p if we had seen
  these two types already.  */
   int val;
 };
 
-static const struct tagged_tu_seen_cache * tagged_tu_seen_base;
-static void free_all_tagged_tu_seen_up_to (const struct tagged_tu_seen_cache 
*);
+static void free_all_tagged_seen (const struct tagged_seen_cache *);
 
 /* Do `exp = require_complete_type (loc, exp);' to make sure exp
does not have an incomplete type.  (That includes void types.)
@@ -1049,6 +1047,14 @@ common_type (tree t1, tree t2)
   return c_common_type (t1, t2);
 }
 
+struct comptypes_data {
+
+  const struct tagged_seen_cache *seen_base;
+
+  bool enum_and_int_p;
+  bool different_types_p;
+};
+
 /* Return 1 if TYPE1 and TYPE2 are compatible types for assignment
or various other operations.  Return 2 if they are compatible
but a warning may be needed if you use them together.  */
@@ -1056,11 +1062,12 @@ common_type (tree t1, tree t2)
 int
 comptypes (tree type1, tree type2)
 {
-  const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
   int val;
 
-  val = comptypes_internal (type1, type2, NULL, NULL);
-  free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
+  struct comptypes_data data = { };
+  val = comptypes_internal (type1, type2, &data);
+
+  free_all_tagged_seen (data.seen_base);
 
   return val;
 }
@@ -1071,11 +1078,13 @@ comptypes (tree type1, tree type2)
 int
 comptypes_check_enum_int (tree type1, tree type2, bool *enum_and_int_p)
 {
-  const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
   int val;
 
-  val = comptypes_internal (type1, type2, enum_and_int_p, NULL);
-  free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
+  struct comptypes_data data = { };
+  val = comptypes_internal (type1, type2, &data);
+  *enum_and_int_p = data.enum_and_int_p;
+
+  free_all_tagged_seen (data.seen_base);
 
   return val;
 }
@@ -1087,31 +1096,33 @@ int
 comptypes_check_different_types (tree type1, tree type2,
 bool *different_types_p)
 {
-  const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
   int val;

RE: [PATCH][X86_64] Separate znver4 insn reservations from older znvers

2022-12-22 Thread Joshi, Tejas Sanjay via Gcc-patches
[Public]

Hello,

I have addressed all your comments in this revision of the patch, please find 
attached and inlined.

* I have updated all the latencies with Agner's measurements.
* Incorrect pipelines, loads/stores are addressed.
* The double-pumped AVX512 insns take one cycle for the first 256-bit half and
the next cycle for the remaining 256-bit half in the same pipeline, thus pipe*2.
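
As a rough illustration of the "pipe*2" arithmetic in the last bullet, the occupancy of one pipe can be computed as the vector width divided by the datapath width, rounded up. This is only a sketch: the 256-bit datapath is the assumption stated in the bullet, and the function name `pipe_occupancy_cycles` is hypothetical, not something from znver4.md.

```c
/* Cycles for which one vector operation occupies a single pipe,
   assuming (as stated above) a 256-bit wide datapath.  */
unsigned
pipe_occupancy_cycles (unsigned vector_bits)
{
  const unsigned datapath_bits = 256;

  /* Round up: anything wider than the datapath is split into halves
     that go through the same pipe on consecutive cycles.  */
  return (vector_bits + datapath_bits - 1) / datapath_bits;
}
```

Under that assumption a 512-bit operation occupies its pipe for two cycles, while 128-bit and 256-bit operations occupy it for one.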

Is this ok for trunk?

Thanks and Regards,
Tejas

gcc/ChangeLog:

* gcc/common/config/i386/i386-common.cc (processor_alias_table):
Use CPU_ZNVER4 for znver4.
* config/i386/i386.md: Add znver4.md.
* config/i386/znver4.md: New.

Change-Id: Iea39c1c01d4992cf7ac476bd6de65887910bbcbe
---
 gcc/common/config/i386/i386-common.cc |2 +-
 gcc/config/i386/i386.md   |1 +
 gcc/config/i386/znver4.md | 1068 +
 3 files changed, 1070 insertions(+), 1 deletion(-)
 create mode 100644 gcc/config/i386/znver4.md

diff --git a/gcc/common/config/i386/i386-common.cc 
b/gcc/common/config/i386/i386-common.cc
index 660a977b68b..c7adea57683 100644
--- a/gcc/common/config/i386/i386-common.cc
+++ b/gcc/common/config/i386/i386-common.cc
@@ -2215,7 +2215,7 @@ const pta processor_alias_table[] =
   {"znver3", PROCESSOR_ZNVER3, CPU_ZNVER3,
 PTA_ZNVER3,
 M_CPU_SUBTYPE (AMDFAM19H_ZNVER3), P_PROC_AVX2},
-  {"znver4", PROCESSOR_ZNVER4, CPU_ZNVER3,
+  {"znver4", PROCESSOR_ZNVER4, CPU_ZNVER4,
 PTA_ZNVER4,
 M_CPU_SUBTYPE (AMDFAM19H_ZNVER4), P_PROC_AVX512F},
   {"btver1", PROCESSOR_BTVER1, CPU_GENERIC,
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 9451883396c..3a88f16a21a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1319,6 +1319,7 @@
 (include "bdver3.md")
 (include "btver2.md")
 (include "znver.md")
+(include "znver4.md")
 (include "geode.md")
 (include "atom.md")
 (include "slm.md")
diff --git a/gcc/config/i386/znver4.md b/gcc/config/i386/znver4.md
new file mode 100644
index 000..d0b239822a8
--- /dev/null
+++ b/gcc/config/i386/znver4.md
@@ -0,0 +1,1068 @@
+;; Copyright (C) 2012-2022 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+;;
+
+
+(define_attr "znver4_decode" "direct,vector,double"
+  (const_string "direct"))
+
+;; AMD znver4 Scheduling
+;; Modeling automatons for zen decoders, integer execution pipes,
+;; AGU pipes, branch, floating point execution and fp store units.
+(define_automaton "znver4, znver4_ieu, znver4_idiv, znver4_fdiv, znver4_agu, 
znver4_fpu, znver4_fp_store")
+
+;; Decoders unit has 4 decoders and all of them can decode fast path
+;; and vector type instructions.
+(define_cpu_unit "znver4-decode0" "znver4")
+(define_cpu_unit "znver4-decode1" "znver4")
+(define_cpu_unit "znver4-decode2" "znver4")
+(define_cpu_unit "znver4-decode3" "znver4")
+
+;; Currently blocking all decoders for vector path instructions as
+;; they are dispatched separetely as microcode sequence.
+(define_reservation "znver4-vector" 
"znver4-decode0+znver4-decode1+znver4-decode2+znver4-decode3")
+
+;; Direct instructions can be issued to any of the four decoders.
+(define_reservation "znver4-direct" 
"znver4-decode0|znver4-decode1|znver4-decode2|znver4-decode3")
+
+;; Fix me: Need to revisit this later to simulate fast path double behavior.
+(define_reservation "znver4-double" "znver4-direct")
+
+
+;; Integer unit 4 ALU pipes.
+(define_cpu_unit "znver4-ieu0" "znver4_ieu")
+(define_cpu_unit "znver4-ieu1" "znver4_ieu")
+(define_cpu_unit "znver4-ieu2" "znver4_ieu")
+(define_cpu_unit "znver4-ieu3" "znver4_ieu")
+;; Znver4 has an additional branch unit.
+(define_cpu_unit "znver4-bru0" "znver4_ieu")
+(define_reservation "znver4-ieu" 
"znver4-ieu0|znver4-ieu1|znver4-ieu2|znver4-ieu3")
+
+;; 3 AGU pipes in znver4
+(define_cpu_unit "znver4-agu0" "znver4_agu")
+(define_cpu_unit "znver4-agu1" "znver4_agu")
+(define_cpu_unit "znver4-agu2" "znver4_agu")
+(define_reservation "znver4-agu-reserve" "znver4-agu0|znver4-agu1|znver4-agu2")
+
+;; Load is 4 cycles. We do not model reservation of load unit.
+(define_reservation "znver4-load" "znver4-agu-reserve")
+(define_reservation "znver4-store" "znver4-agu-reserve")
+
+;; vectorpath (microcoded) instructions are single issue instructions.
+;; So, they occupy all the integer units.
+(define_reservation "znver4-ivector" "znver4-ieu0+znver4-ieu1
+ 

Re: [PATCH] c++: template friend with variadic constraints [PR108066]

2022-12-22 Thread Patrick Palka via Gcc-patches
On Thu, 15 Dec 2022, Jason Merrill wrote:

> On 12/15/22 14:31, Patrick Palka wrote:
> > On Thu, 15 Dec 2022, Patrick Palka wrote:
> > 
> > > On Thu, 15 Dec 2022, Jason Merrill wrote:
> > > 
> > > > On 12/12/22 12:20, Patrick Palka wrote:
> > > > > When instantiating a constrained hidden template friend, we need to
> > > > > substitute into its constraints for sake of declaration matching.
> > > > 
> > > > Hmm, we shouldn't need to do declaration matching when there's only a
> > > > single
> > > > declaration.
> > > 
> > > Oops, indeed.  It looks like in this case we're not calling
> > > maybe_substitute_reqs_for during declaration matching, but rather from
> > > tsubst_friend_function and specifically for the non-trailing requirements:
> > > 
> > >if (TREE_CODE (new_friend) == TEMPLATE_DECL)
> > >  {
> > >DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P (new_friend) = false;
> > >DECL_USE_TEMPLATE (DECL_TEMPLATE_RESULT (new_friend)) = 0;
> > >DECL_SAVED_TREE (DECL_TEMPLATE_RESULT (new_friend))
> > >  = DECL_SAVED_TREE (DECL_TEMPLATE_RESULT (decl));
> > > 
> > >/* Substitute TEMPLATE_PARMS_CONSTRAINTS so that parameter levels
> > > will
> > >   match in decls_match.  */
> > >tree parms = DECL_TEMPLATE_PARMS (new_friend);
> > >tree treqs = TEMPLATE_PARMS_CONSTRAINTS (parms);
> > >treqs = maybe_substitute_reqs_for (treqs, new_friend);
> > >if (treqs != TEMPLATE_PARMS_CONSTRAINTS (parms))
> > >  {
> > >TEMPLATE_PARMS_CONSTRAINTS (parms) = treqs;
> > >/* As well as each TEMPLATE_PARM_CONSTRAINTS.  */
> > >tsubst_each_template_parm_constraints (parms, args,
> > >   tf_warning_or_error);
> > >  }
> > >  }
> > > 
> > > I'll adjust the commit message appropriately.
> > > 
> > > > 
> > > > > For
> > > > > this substitution we use a full argument vector whose outer levels
> > > > > correspond to the class's arguments and innermost level corresponds to
> > > > > the template's level-lowered generic arguments.
> > > > > 
> > > > > But for A::f here, for which the relevant argument vector is
> > > > > {{int}, {Us...}}, this substitution triggers the assert in
> > > > > use_pack_expansion_extra_args_p since one argument is a pack expansion
> > > > > and the other isn't.
> > > > > 
> > > > > And for A::f, for which the relevant argument vector is
> > > > > {{int, int}, {Us...}}, the use_pack_expansion_extra_args_p assert
> > > > > would
> > > > > also trigger but we first get a bogus "mismatched argument pack
> > > > > lengths"
> > > > > error from tsubst_pack_expansion.
> > > > > 
> > > > > These might ultimately be bugs in tsubst_pack_expansion, but it seems
> > > > > we can work around them by tweaking the constraint substitution in
> > > > > maybe_substitute_reqs_for to only use the friend's outer arguments in
> > > > > the case of a template friend.
> > > > 
> > > > Yes, this is how we normally substitute a member template during class
> > > > instantiation: with the class' template args, not its own.  The assert
> > > > seems
> > > > to be enforcing that.
> > > 
> > > > Ah, makes sense.
> > > 
> > > > 
> > > > > This should be otherwise equivalent to
> > > > > substituting using the full arguments, since the template's innermost
> > > > > arguments are just its generic arguments with level=1.
> > > > > 
> > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> > > > > for
> > > > > trunk/12?
> > > > > 
> > > > >   PR c++/108066
> > > > >   PR c++/108067
> > > > > 
> > > > > gcc/cp/ChangeLog:
> > > > > 
> > > > >   * constraint.cc (maybe_substitute_reqs_for): For a template
> > > > >   friend, substitute using only its outer arguments.  Remove
> > > > >   dead uses_template_parms test.
> > > > 
> > > > I don't see any removal?
> > > 
> > > Oops, I reverted that change before posting the patch but forgot to adjust
> > > the
> > > ChangeLog as well.  Removing the uses_template_parms test seems to be safe
> > > but
> > > it's not necessary to fix the bug.
> > > 
> > > > 
> > > > > gcc/testsuite/ChangeLog:
> > > > > 
> > > > >   * g++.dg/cpp2a/concepts-friend12.C: New test.
> > > > > ---
> > > > >gcc/cp/constraint.cc  |  8 +++
> > > > >.../g++.dg/cpp2a/concepts-friend12.C  | 22
> > > > > +++
> > > > >2 files changed, 30 insertions(+)
> > > > >create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-friend12.C
> > > > > 
> > > > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > > > > index d4cd92ec3b4..f9d1009c9fe 100644
> > > > > --- a/gcc/cp/constraint.cc
> > > > > +++ b/gcc/cp/constraint.cc
> > > > > @@ -1352,6 +1352,14 @@ maybe_substitute_reqs_for (tree reqs,
> > > > > const_tree
> > > > > decl)
> > > > >  tree tmpl = DECL_TI_TEMPLATE (decl);
> > > > >  tree gargs = generic_targs_for (tmpl);
> > > > > 

[PATCH V2] Disable sched1 in functions that call setjmp

2022-12-22 Thread Jose E. Marchesi via Gcc-patches
When the following testcase is built with -fschedule-insns in either
x86_64 or aarch64:

  #include <stdio.h>
  #include <setjmp.h>
  #include <stdbool.h>

  jmp_buf ex_buf__;

   #define TRY do{ if( !setjmp(ex_buf__) ){
   #define CATCH } else {
   #define ETRY } }while(0)
   #define THROW longjmp(ex_buf__, 1)

  int f(int x)
  {
int arr[] = {1,2,6,8,9,10};
int lo=0;
int hi=5;

while(lo<=hi) {
  int mid=(lo+hi)/2;

  if(arr[mid]==x) {
THROW;
  } else if(arr[mid]<x) {
lo=mid+1;
  } else if(arr[mid]>x) {
hi=mid-1;
  }
}

return -1;
  }

  int
  main(int argc, char** argv)
  {
int a=2;
bool b=false;

TRY
{
 a=f(a);
 b=true;
}
CATCH
{
 printf("a : %d\n",a);
 printf("Got Exception!\n");
}
ETRY;

if(b) {
  printf("b is true!\n");
}
return 0;
  }

The first instruction scheduler pass reorders instructions in the TRY
block in such a way that `b=true' gets executed before the call to the
function `f'.  This optimization is wrong, because `main' calls setjmp
and `f' is known to call longjmp.
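
The ordering requirement described in the paragraph above can be reduced to a few lines. This is a minimal sketch, not the testcase from the bug report: `lookup` and `store_survives_throw` are hypothetical names, and the sketch only states what a correct compiler must preserve. It does not by itself reproduce the miscompilation.

```c
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf env;

/* Hypothetical callee: "throws" via longjmp for one particular input.  */
static int
lookup (int x)
{
  if (x == 2)
    longjmp (env, 1);
  return -1;
}

/* Returns the value of `done' as seen after the protected region.
   `done' is written only after lookup() returns normally, so it is
   not modified between setjmp and longjmp and must still be false
   when lookup() throws.  */
bool
store_survives_throw (int x)
{
  bool done = false;

  if (setjmp (env) == 0)
    {
      lookup (x);
      done = true;   /* must not be hoisted above the call */
    }
  return done;
}
```

If the store to `done` were moved above the call to `lookup`, the throwing path would observe `true`, which is the same kind of wrong result as `b=true' being executed early in the testcase.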

As discussed in BZ 57067, the root cause for this is the fact that
setjmp is not properly modeled in RTL, and therefore the backend
passes have no normalized way to handle this situation.

As Alexander Monakov noted in the BZ, many RTL passes refuse to touch
functions that call setjmp.  This includes for example gcse,
store_motion and cprop.  This patch adds the sched1 pass to that list.

Note that the other instruction scheduling passes are still allowed to
run on these functions, since they reorder instructions within basic
blocks, and therefore they cannot cross function calls.

This doesn't fix the fundamental issue, but at least ensures that
sched1 won't perform invalid transformations in correct C programs.
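
The effect of the change on the pass gate can be modelled as a simple predicate. This is a toy model under stated assumptions: `fn_info` is a hypothetical stand-in for the per-function state GCC keeps in `cfun`, `sched1_gate` is an illustrative name, and the `dbg_cnt (sched_func)` debug counter from the real gate is deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-function state kept in cfun.  */
struct fn_info {
  bool calls_setjmp;
};

/* Toy model of the gate: sched1 runs only when optimizing, when
   -fschedule-insns is in effect, and when the current function does
   not call setjmp.  The debug counter is omitted.  */
bool
sched1_gate (int optimize, bool flag_schedule_insns,
             const struct fn_info *fn)
{
  return optimize > 0 && flag_schedule_insns && !fn->calls_setjmp;
}
```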

Regtested on aarch64-linux-gnu.

gcc/ChangeLog:

PR rtl-optimization/57067
* sched-rgn.cc (pass_sched::gate): Disable pass if current
function calls setjmp.
---
 gcc/sched-rgn.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc
index 420c45dffb4..c536d0b8dea 100644
--- a/gcc/sched-rgn.cc
+++ b/gcc/sched-rgn.cc
@@ -3847,7 +3847,8 @@ bool
 pass_sched::gate (function *)
 {
 #ifdef INSN_SCHEDULING
-  return optimize > 0 && flag_schedule_insns && dbg_cnt (sched_func);
+  return optimize > 0 && flag_schedule_insns
+&& !cfun->calls_setjmp && dbg_cnt (sched_func);
 #else
   return 0;
 #endif
-- 
2.30.2



[PING] Re: [PATCH 2/2] Corrected pr25521.c target matching.

2022-12-22 Thread Cupertino Miranda via Gcc-patches


Cupertino Miranda via Gcc-patches writes:

> gentle ping
>
> Cupertino Miranda writes:
>
>>> On 12/2/22 10:52, Cupertino Miranda via Gcc-patches wrote:
 This commit is a follow up of bugzilla #107181.
 The commit /a0aafbc/ changed the default implementation of the
 SELECT_SECTION hook in order to match clang/llvm behaviour w.r.t the
 placement of `const volatile' objects.
 However, the following targets use target-specific selection functions
 and they choke on the testcase pr25521.c:
   *rx - target sets its const variables as '.section C,"a",@progbits'.
>>> That's presumably a constant section.  We should instead twiddle the test to
>>> recognize that section.
>>
>> Although @progbits is indeed a constant section, I believe it is
>> more interesting to detect if the `rx' starts selecting more
>> standard sections instead of the current @progbits.
>> That was the reason why I opted to XFAIL instead of PASSing it.
>> Can I keep it as such ?
>>
>>>
   *powerpc - its 32bit version is eager to allocate globals in .sdata
  sections.
 Normally, one can expect for the variable to be allocated in .srodata,
 however, in case of powerpc-*-* or powerpc64-*-* (with -m32)
 'targetm.have_srodata_section == false' and the code in
 categorize_decl_for_section(varasm.cc), forces it to allocate in .sdata.
/* If the target uses small data sections, select it.  */
else if (targetm.in_small_data_p (decl))
  {
if (ret == SECCAT_BSS)
ret = SECCAT_SBSS;
 else if (targetm.have_srodata_section && ret == SECCAT_RODATA)
ret = SECCAT_SRODATA;
else
ret = SECCAT_SDATA;
  }
>>> I'd just skip the test for 32bit ppc.  There should be suitable 
>>> effective-target
>>> tests you can use.
>>>
>>> jeff
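
The small-data decision quoted in this thread can be restated as a standalone function, which makes the powerpc 32-bit case easy to check. This is a simplified sketch: the enum below is a toy that only mirrors the names in the quoted snippet, and `small_data_category` with its two boolean parameters is a hypothetical replacement for the `targetm` hooks.

```c
#include <stdbool.h>

/* Toy enum mirroring the category names in the quoted snippet.  */
enum section_category {
  SECCAT_RODATA,
  SECCAT_DATA,
  SECCAT_BSS,
  SECCAT_SRODATA,
  SECCAT_SDATA,
  SECCAT_SBSS
};

/* Given the category chosen so far, apply the small-data adjustment.
   IN_SMALL_DATA and HAVE_SRODATA_SECTION stand in for the target
   hooks used in the quoted code.  */
enum section_category
small_data_category (enum section_category ret, bool in_small_data,
                     bool have_srodata_section)
{
  if (!in_small_data)
    return ret;
  if (ret == SECCAT_BSS)
    return SECCAT_SBSS;
  if (have_srodata_section && ret == SECCAT_RODATA)
    return SECCAT_SRODATA;
  return SECCAT_SDATA;
}
```

For a target without a small read-only data section, a small read-only object falls through to the last branch and ends up in the small data category, which matches the behaviour described for 32-bit powerpc.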


[PING] Re: [PATCH 1/2] select .rodata for const volatile variables.

2022-12-22 Thread Cupertino Miranda via Gcc-patches


Cupertino Miranda via Gcc-patches writes:

> gentle ping
>
> Cupertino Miranda writes:
>
>> Hi Jeff,
>>
>> First of all thanks for your quick review.
>> Apologies for the delay replying, the message got lost in my inbox.
>>
>>> On 12/2/22 10:52, Cupertino Miranda via Gcc-patches wrote:
 Changed target code to select .rodata section for 'const volatile'
 defined variables.
 This change is in the context of bugzilla #107181.
 gcc/ChangeLog:
v850.c(v850_select_section): Changed function.
>>> I'm not sure this is safe/correct.  ISTM that you need to look at the 
>>> underlying
>>> TREE_TYPE to check for const-volatile rather than TREE_SIDE_EFFECTS.
>>
>> I believe this was asked by Jose when he first sent the generic patches.
>> Please notice my change is influenced by his original patch that does
>> the same and was approved.
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599348.html
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602374.html
>>
>>>
>>> Of secondary importance is the ChangeLog.  Just saying "Changed function"
>>> provides no real information.  Something like this would be better:
>>>
>>> * config/v850/v850.c (v850_select_section): Put const volatile
>>> objects into read-only sections.
>>>
>>>
>>> Jeff
>>>
>>>
>>>
>>>
 ---
   gcc/config/v850/v850.cc | 1 -
   1 file changed, 1 deletion(-)
 diff --git a/gcc/config/v850/v850.cc b/gcc/config/v850/v850.cc
 index c7d432990ab..e66893fede4 100644
 --- a/gcc/config/v850/v850.cc
 +++ b/gcc/config/v850/v850.cc
 @@ -2865,7 +2865,6 @@ v850_select_section (tree exp,
   {
 int is_const;
 if (!TREE_READONLY (exp)
 -|| TREE_SIDE_EFFECTS (exp)
  || !DECL_INITIAL (exp)
  || (DECL_INITIAL (exp) != error_mark_node
  && !TREE_CONSTANT (DECL_INITIAL (exp
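
The effect of dropping the TREE_SIDE_EFFECTS test can be illustrated with a flattened model of the condition. This is a sketch under assumptions: `decl_flags` is a hypothetical structure, the `error_mark_node` special case is folded into `initializer_constant`, and it is assumed (the hunk does not show it) that the condition being true means the object is treated as non-constant.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the properties tested in the hunk.  */
struct decl_flags {
  bool readonly;              /* TREE_READONLY */
  bool side_effects;          /* TREE_SIDE_EFFECTS, set for volatile */
  bool has_initializer;       /* DECL_INITIAL is non-null */
  bool initializer_constant;  /* initializer is a constant */
};

/* Condition before the patch: volatile objects are never constant.  */
bool
is_const_before (const struct decl_flags *d)
{
  return d->readonly && !d->side_effects
         && d->has_initializer && d->initializer_constant;
}

/* Condition after the patch: volatility no longer matters.  */
bool
is_const_after (const struct decl_flags *d)
{
  return d->readonly && d->has_initializer && d->initializer_constant;
}
```

In this model an initialized `const volatile` object is non-constant before the change and constant after it, while everything else is classified the same way.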


Re: [committed] docs: Fix peephole paragraph ordering

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Patches were attached to the wrong emails; this patch was actually:

On Thu, Dec 22, 2022 at 05:06:13PM +, Andrew Carlotti via Gcc-patches wrote:
> The documentation for the DONE and FAIL macros was incorrectly inserted
> between example code, and a remark attached to that example.
> 
> Committed as obvious.
> 
> gcc/ChangeLog:
> 
>   * doc/md.texi: Move example code remark next to its code block.
> 
> ---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 
cc28f868fc85b5148450548a54d69a39ecc4f03a..c1d3ae2060d800bbaa9751fcf841d7417af1e37d
 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -9321,6 +9321,11 @@ so here's a silly made-up example:
   "")
 @end smallexample
 
+@noindent
+If we had not added the @code{(match_dup 4)} in the middle of the input
+sequence, it might have been the case that the register we chose at the
+beginning of the sequence is killed by the first or second @code{set}.
+
 There are two special macros defined for use in the preparation statements:
 @code{DONE} and @code{FAIL}.  Use them with a following semicolon,
 as a statement.
@@ -9348,11 +9353,6 @@ If the preparation falls through (invokes neither 
@code{DONE} nor
 @code{FAIL}), then the @code{define_peephole2} uses the replacement
 template.
 
-@noindent
-If we had not added the @code{(match_dup 4)} in the middle of the input
-sequence, it might have been the case that the register we chose at the
-beginning of the sequence is killed by the first or second @code{set}.
-
 @end ifset
 @ifset INTERNALS
 @node Insn Attributes


Re: [committed] docs: Link to correct section for constraint modifiers

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Patches were attached to the wrong emails; this patch was actually:

On Thu, Dec 22, 2022 at 05:05:51PM +, Andrew Carlotti via Gcc-patches wrote:
> Committed as obvious.
> 
> gcc/ChangeLog:
> 
>   * doc/md.texi: Fix incorrect pxref.
> 
> ---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 
482e86f15d8b312c67d4962510ce879fb5cbc541..78dc6d720700ca409677e44a34a60d4b7fceb046
 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -1511,7 +1511,7 @@ operand 1 (meaning it must match operand 0), and 
@samp{dKs} for operand
 2.  The second alternative has @samp{d} (data register) for operand 0,
 @samp{0} for operand 1, and @samp{dmKs} for operand 2.  The @samp{=} and
 @samp{%} in the constraints apply to all the alternatives; their
-meaning is explained in the next section (@pxref{Class Preferences}).
+meaning is explained in a later section (@pxref{Modifiers}).
 
 If all the operands fit any one alternative, the instruction is valid.
 Otherwise, for each alternative, the compiler counts how many instructions


[PATCH 12/15 V5] arm: implement bti injection

2022-12-22 Thread Andrea Corallo via Gcc-patches
Richard Earnshaw  writes:

> On 14/12/2022 17:00, Richard Earnshaw via Gcc-patches wrote:
>> On 14/12/2022 16:40, Andrea Corallo via Gcc-patches wrote:
>>> Hi Richard,
>>>
>>> thanks for reviewing.
>>>
>>> Richard Earnshaw  writes:
>>>
 On 28/10/2022 17:40, Andrea Corallo via Gcc-patches wrote:
> Hi all,
> please find attached the third iteration of this patch addressing
> review comments.
> Thanks
>     Andrea
>

 @@ -23374,12 +23374,6 @@ output_probe_stack_range (rtx reg1, rtx reg2)
     return "";
   }

 -static bool
 -aarch_bti_enabled ()
 -{
 -  return false;
 -}
 -
   /* Generate the prologue instructions for entry into an ARM or Thumb-2
  function.  */
   void
 @@ -32992,6 +32986,61 @@ arm_current_function_pac_enabled_p (void)
     && !crtl->is_leaf));
   }

 +/* Return TRUE if Branch Target Identification Mechanism is
 enabled.  */
 +bool
 +aarch_bti_enabled (void)
 +{
 +  return aarch_enable_bti == 1;
 +}

 See comment in earlier patch about the location of this function
 moving.   Can aarch_enable_bti take values other than 0 and 1?
>>>
>>> Yes default is 2.
>> It shouldn't be by this point, because, hopefully you've gone
>> through the equivalent of this hunk (from aarch64) somewhere in
>> arm_override_options:
>>     if (aarch_enable_bti == 2)
>>   {
>>   #ifdef TARGET_ENABLE_BTI
>>     aarch_enable_bti = 1;
>>   #else
>>     aarch_enable_bti = 0;
>>   #endif
>>   }
>> And after this point the '2' should never be seen again.  We use
>> this trick to permit the user to force a default that differs from
>> the configuration.
>> However, I don't see a hunk to do this in patch 3, so perhaps that
>> needs updating to fix this.
>
> I've just remembered that the above is to support a configure-time
> option of the compiler to enable branch protection.  But perhaps we
> don't want to have that in AArch32, in which case it would be better
> not to have the default be 2 anyway, just default to off (0).
>
> R.

Done in 1/15 (needs approval again now).
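
For readers following the default-value discussion quoted above, the tri-state scheme it refers to can be sketched as follows. The names `BTI_OFF`, `BTI_ON`, `BTI_UNSET`, `resolve_bti_default` and `bti_enabled` are hypothetical, and `configured_default_on` stands in for the configure-time `TARGET_ENABLE_BTI` macro. The value 2 means "not set on the command line" and is resolved once during option processing.

```c
/* 2 means "not set on the command line", per the discussion above.  */
enum { BTI_OFF = 0, BTI_ON = 1, BTI_UNSET = 2 };

/* Resolve the tri-state option once, during option processing.
   CONFIGURED_DEFAULT_ON models a configure-time default.  */
int
resolve_bti_default (int requested, int configured_default_on)
{
  if (requested == BTI_UNSET)
    return configured_default_on ? BTI_ON : BTI_OFF;
  return requested;
}

/* After resolution only 0 and 1 remain, so this test is exact.  */
int
bti_enabled (int resolved)
{
  return resolved == BTI_ON;
}
```

If the option simply defaults to 0, as suggested for AArch32 in the quoted reply, the resolution step is unnecessary and the `== 1` comparison is always well defined.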

>>
>>> [...]
>>>
 +  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) ==
 UNSPEC_BTI_NOP;

 I'm not sure where this crept in, but UNSPEC and UNSPEC_VOLATILE have
 separate enums in the backend, so UNSPEC_BTI_NOP should really be
 VUNSPEC_BTI_NOP and defined in the enum "unspecv".
>>>
>>> Done
>>>
 +aarch_pac_insn_p (rtx x)
 +{
 +  if (!x || !INSN_P (x))
 +    return false;
 +
 +  rtx pat = PATTERN (x);
 +
 +  if (GET_CODE (pat) == SET)
 +    {
 +  rtx tmp = XEXP (pat, 1);
 +  if (tmp
 +  && GET_CODE (tmp) == UNSPEC
 +  && (XINT (tmp, 1) == UNSPEC_PAC_NOP
 +  || XINT (tmp, 1) == UNSPEC_PACBTI_NOP))
 +    return true;
 +    }
 +

 This will also need updating (see review on earlier patch) because
 PACBTI needs to be unspec_volatile, while PAC doesn't.
>>>
>>> Done
>>>
 +/* The following two functions are for code compatibility with aarch64
 +   code, this even if in arm we have only one bti instruction.  */
 +

 I'd just write
   /* Target specific mapping for aarch_gen_bti_c and
   aarch_gen_bti_j. For Arm, both of these map to a simple BTI
 instruction.  */
>>>
>>> Done
>>>

 @@ -162,6 +162,7 @@ (define_c_enum "unspec" [
     UNSPEC_PAC_NOP    ; Represents PAC signing LR
     UNSPEC_PACBTI_NOP    ; Represents PAC signing LR + valid landing pad
     UNSPEC_AUT_NOP    ; Represents PAC verifying LR
 +  UNSPEC_BTI_NOP    ; Represent BTI
   ])

 BTI is an unspec volatile, so this should be in the "vunspec" enum and
 renamed accordingly (see above).
>>>
>>> Done.
>>>
>>> Please find attached the updated version of this patch.
>>>
>>> BR
>>>
>>>    Andrea
>>>
>> Apart from that, this is OK.
>> R.

Cool, attached the updated patch.

I also added some error handling so that the bti pass is not run if the
selected -march does not support bti.

BR

  Andrea

>From afd54e771268733b7f1f4945c9b2cdabe1d6a6e5 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 7 Apr 2022 11:51:56 +0200
Subject: [PATCH] [PATCH 12/15] arm: implement bti injection

Hi all,

this patch enables the Branch Target Identification mechanism of
Armv8.1-M [1].

This is achieved by using the bti pass made common with AArch64.

The pass iterates through the instructions and adds the necessary BTI
instructions at the beginning of every function and at every landing
pad targeted by indirect jumps.
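
The placement rule stated in the paragraph above can be modelled on a toy instruction stream. This is only a sketch of the rule, not of the pass itself: `toy_insn` and `mark_bti_sites` are hypothetical, and the real pass works on RTL insns rather than on an array of flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction record.  */
struct toy_insn {
  bool is_function_entry;
  bool is_label;
  bool indirect_jump_target;
};

/* Set NEEDS_BTI[i] for every position that needs a BTI landing pad:
   each function entry and each label that an indirect jump may reach.
   Returns the number of positions marked.  */
size_t
mark_bti_sites (const struct toy_insn *insns, size_t n, bool *needs_bti)
{
  size_t count = 0;

  for (size_t i = 0; i < n; i++)
    {
      needs_bti[i] = insns[i].is_function_entry
                     || (insns[i].is_label
                         && insns[i].indirect_jump_target);
      if (needs_bti[i])
        count++;
    }
  return count;
}
```

Labels that are reached only by direct branches are left alone in this model, which is the reason the rule singles out targets of indirect jumps.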

Best Regards

  Andrea

[1]


gcc/ChangeLog

2022-04-07  Andrea Corallo  

* config.gcc (arm*-*-*): Add 'aarch-bti-insert.o' object.
* config/arm/arm-protos.h: Upd

[PATCH] Disable sched1 in functions that call setjmp

2022-12-22 Thread Jose E. Marchesi via Gcc-patches
When the following testcase is built with -fschedule-insns in either
x86_64 or aarch64:



jmp_buf ex_buf__;

int f(int x)
{
  int arr[] = {1,2,6,8,9,10};
  int lo=0;
  int hi=5;

  while(lo<=hi) {
int mid=(lo+hi)/2;

if(arr[mid]==x) {
  THROW;
} else if(arr[mid]<x) {
  lo=mid+1;
} else if(arr[mid]>x) {
  hi=mid-1;
}
  }

  return -1;
}

int
main(int argc, char** argv)
{
  int a=2;
  bool b=false;

  TRY
  {
   a=f(a);
   b=true;
  }
  CATCH
  {
   printf("a : %d\n",a);
   printf("Got Exception!\n");
  }
  ETRY;

  if(b) {
printf("b is true!\n");
  }
  return 0;
}


The first instruction scheduler pass reorders instructions in the TRY
block in such a way that `b=true' gets executed before the call to the
function `f'.  This optimization is wrong, because `main' calls setjmp
and `f' is known to call longjmp.

As discussed in BZ 57067, the root cause for this is the fact that
setjmp is not properly modeled in RTL, and therefore the backend
passes have no normalized way to handle this situation.

As Alexander Monakov noted in the BZ, many RTL passes refuse to touch
functions that call setjmp.  This includes for example gcse,
store_motion and cprop.  This patch adds the sched1 pass to that list.

Note that the other instruction scheduling passes are still allowed to
run on these functions, since they reorder instructions within basic
blocks, and therefore they cannot cross function calls.

This doesn't fix the fundamental issue, but at least ensures that
sched1 won't perform invalid transformations in correct C programs.

Regtested on aarch64-linux-gnu.

gcc/ChangeLog:

PR rtl-optimization/57067
* sched-rgn.cc (pass_sched::gate): Disable pass if current
function calls setjmp.
---
 gcc/sched-rgn.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc
index 420c45dffb4..c536d0b8dea 100644
--- a/gcc/sched-rgn.cc
+++ b/gcc/sched-rgn.cc
@@ -3847,7 +3847,8 @@ bool
 pass_sched::gate (function *)
 {
 #ifdef INSN_SCHEDULING
-  return optimize > 0 && flag_schedule_insns && dbg_cnt (sched_func);
+  return optimize > 0 && flag_schedule_insns
+&& !cfun->calls_setjmp && dbg_cnt (sched_func);
 #else
   return 0;
 #endif
-- 
2.30.2



[committed] docs: Fix peephole paragraph ordering

2022-12-22 Thread Andrew Carlotti via Gcc-patches
The documentation for the DONE and FAIL macros was incorrectly inserted
between example code, and a remark attached to that example.

Committed as obvious.

gcc/ChangeLog:

* doc/md.texi: Move example code remark next to its code block.

---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 
482e86f15d8b312c67d4962510ce879fb5cbc541..78dc6d720700ca409677e44a34a60d4b7fceb046
 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -1511,7 +1511,7 @@ operand 1 (meaning it must match operand 0), and 
@samp{dKs} for operand
 2.  The second alternative has @samp{d} (data register) for operand 0,
 @samp{0} for operand 1, and @samp{dmKs} for operand 2.  The @samp{=} and
 @samp{%} in the constraints apply to all the alternatives; their
-meaning is explained in the next section (@pxref{Class Preferences}).
+meaning is explained in a later section (@pxref{Modifiers}).
 
 If all the operands fit any one alternative, the instruction is valid.
 Otherwise, for each alternative, the compiler counts how many instructions


[committed] docs: Fix inconsistent example predicate name

2022-12-22 Thread Andrew Carlotti via Gcc-patches
It is unclear why the example C function was renamed to
`commutative_integer_operator` as part of ec8e098d in 2004, while the
text and the example md were both left as `commutative_operator`. The
latter name appears to be more accurate, so revert the 2004 change.

Committed as obvious.

gcc/ChangeLog:

* doc/md.texi: Fix inconsistent example name.

---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 78dc6d720700ca409677e44a34a60d4b7fceb046..cc28f868fc85b5148450548a54d69a39ecc4f03a 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -377,7 +377,7 @@ commutative arithmetic operators of RTL and whose mode is @var{mode}:
 
 @smallexample
 int
-commutative_integer_operator (x, mode)
+commutative_operator (x, mode)
  rtx x;
  machine_mode mode;
 @{


[committed] docs: Link to correct section for constraint modifiers

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Committed as obvious.

gcc/ChangeLog:

* doc/md.texi: Fix incorrect pxref.

---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index cc28f868fc85b5148450548a54d69a39ecc4f03a..c1d3ae2060d800bbaa9751fcf841d7417af1e37d 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -9321,6 +9321,11 @@ so here's a silly made-up example:
   "")
 @end smallexample
 
+@noindent
+If we had not added the @code{(match_dup 4)} in the middle of the input
+sequence, it might have been the case that the register we chose at the
+beginning of the sequence is killed by the first or second @code{set}.
+
 There are two special macros defined for use in the preparation statements:
 @code{DONE} and @code{FAIL}.  Use them with a following semicolon,
 as a statement.
@@ -9348,11 +9353,6 @@ If the preparation falls through (invokes neither @code{DONE} nor
 @code{FAIL}), then the @code{define_peephole2} uses the replacement
 template.
 
-@noindent
-If we had not added the @code{(match_dup 4)} in the middle of the input
-sequence, it might have been the case that the register we chose at the
-beginning of the sequence is killed by the first or second @code{set}.
-
 @end ifset
 @ifset INTERNALS
 @node Insn Attributes


[PATCH 1/15 V2] arm: Make mbranch-protection opts parsing common to AArch32/64

2022-12-22 Thread Andrea Corallo via Gcc-patches
Hi all,

Respinning this as a rebase was necessary.  It also now sets
'aarch_enable_bti' to zero by default for arm, as suggested during the
review of 12/15.

Best Regards

  Andrea


From 6c765818542cc7b40701e8adae2cbe077d5982cc Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Mon, 6 Dec 2021 11:34:35 +0100
Subject: [PATCH] [PATCH 1/15] arm: Make mbranch-protection opts parsing common
 to AArch32/64

Hi all,

This change refactors all the mbranch-protection option parsing code and
types to make it common to both AArch32 and AArch64 backends.

This change also pulls in some supporting types from AArch64 to make
it common (aarch_parse_opt_result).

The significant changes in this patch are the movement of all branch
protection parsing routines from aarch64.cc to aarch-common.cc and
supporting data types and static data structures.

This patch also pre-declares variables and types required in the
aarch32 back-end for moved variables for function sign scope and key
to prepare for the impending series of patches that support parsing
the feature mbranch-protection in the aarch32 back-end.

gcc/ChangeLog:

* common/config/aarch64/aarch64-common.cc: Include aarch-common.h.
(all_architectures): Fix comment.
(aarch64_parse_extension): Rename return type, enum value names.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Rename
factored out aarch_ra_sign_scope and aarch_ra_sign_key variables.
Also rename corresponding enum values.
* config/aarch64/aarch64-opts.h (aarch64_function_type): Factor
out aarch64_function_type and move it to common code as
aarch_function_type in aarch-common.h.
* config/aarch64/aarch64-protos.h: Include common types header,
move out types aarch64_parse_opt_result and aarch64_key_type to
aarch-common.h
* config/aarch64/aarch64.cc: Move mbranch-protection parsing types
and functions out into aarch-common.h and aarch-common.cc.  Fix up
all the name changes resulting from the move.
* config/aarch64/aarch64.md: Fix up aarch64_ra_sign_key type name change
and enum value.
* config/aarch64/aarch64.opt: Include aarch-common.h to import
type move.  Fix up name changes from factoring out common code and
data.
* config/arm/aarch-common-protos.h: Export factored out routines to both
backends.
* config/arm/aarch-common.cc: Include newly factored out types.
Move all mbranch-protection code and data structures from
aarch64.cc.
* config/arm/aarch-common.h: New header that declares types shared
between aarch32 and aarch64 backends.
* config/arm/arm-protos.h: Declare types and variables that are
made common to aarch64 and aarch32 backends - aarch_ra_sign_key,
aarch_ra_sign_scope and aarch_enable_bti.

Co-Authored-By: Tejas Belagod  
---
 gcc/common/config/aarch64/aarch64-common.cc |  13 +-
 gcc/config/aarch64/aarch64-c.cc |   8 +-
 gcc/config/aarch64/aarch64-opts.h   |  10 -
 gcc/config/aarch64/aarch64-protos.h |  21 +-
 gcc/config/aarch64/aarch64.cc   | 360 +---
 gcc/config/aarch64/aarch64.md   |   2 +-
 gcc/config/aarch64/aarch64.opt  |  15 +-
 gcc/config/arm/aarch-common-protos.h|   6 +
 gcc/config/arm/aarch-common.cc  | 185 ++
 gcc/config/arm/aarch-common.h   |  73 
 gcc/config/arm/arm-protos.h |   2 +
 gcc/config/arm/arm.cc   |   7 +
 gcc/config/arm/arm.opt  |   9 +
 13 files changed, 390 insertions(+), 321 deletions(-)
 create mode 100644 gcc/config/arm/aarch-common.h

diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
index 61007839d35..18b0b72c012 100644
--- a/gcc/common/config/aarch64/aarch64-common.cc
+++ b/gcc/common/config/aarch64/aarch64-common.cc
@@ -31,6 +31,7 @@
 #include "flags.h"
 #include "diagnostic.h"
 #include "config/aarch64/aarch64-feature-deps.h"
+#include "config/arm/aarch-common.h"
 
 #ifdef  TARGET_BIG_ENDIAN_DEFAULT
 #undef  TARGET_DEFAULT_TARGET_FLAGS
@@ -191,13 +192,13 @@ static constexpr arch_to_arch_name all_architectures[] =
 
 /* Parse the architecture extension string STR and update ISA_FLAGS
with the architecture features turned on or off.  Return a
-   aarch64_parse_opt_result describing the result.
+   aarch_parse_opt_result describing the result.
When the STR string contains an invalid extension,
a copy of the string is created and stored to INVALID_EXTENSION.  */
 
-enum aarch64_parse_opt_result
+enum aarch_parse_opt_result
 aarch64_parse_extension (const char *str, aarch64_feature_flags *isa_flags,
-std::string *invalid_extension)
+ std::string *invalid_extension)
 {
   /* The extension string is parsed left to right.  */
   cons

Re: gcc-13/changes.html: Mention -fstrict-flex-arrays and its impact

2022-12-22 Thread Qing Zhao via Gcc-patches


> On Dec 22, 2022, at 2:09 AM, Richard Biener  wrote:
> 
> On Wed, 21 Dec 2022, Qing Zhao wrote:
> 
>> Hi, Richard,
>> 
>> Thanks a lot for your comments.
>> 
>>> On Dec 21, 2022, at 2:12 AM, Richard Biener  wrote:
>>> 
>>> On Tue, 20 Dec 2022, Qing Zhao wrote:
>>> 
>>>> Hi,
>>>>
>>>> This is the patch for mentioning -fstrict-flex-arrays and -Warray-bounds=2
>>>> changes in gcc-13/changes.html.
>>>>
>>>> Let me know if you have any comment or suggestions.
>>>
>>> Some copy editing below
>>>
>>>> Thanks.
>>>>
>>>> Qing.
>>>>
>>>> ===
>>>> From c022076169b4f1990b91f7daf4cc52c6c5535228 Mon Sep 17 00:00:00 2001
>>>> From: Qing Zhao
>>>> Date: Tue, 20 Dec 2022 16:13:04 +
>>>> Subject: [PATCH] gcc-13/changes: Mention -fstrict-flex-arrays and its
>>>> impact.
>>>>
>>>> ---
>>>> htdocs/gcc-13/changes.html | 15 +++
>>>> 1 file changed, 15 insertions(+)
>>>>
>>>> diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
>>>> index 689178f9..47b3d40f 100644
>>>> --- a/htdocs/gcc-13/changes.html
>>>> +++ b/htdocs/gcc-13/changes.html
>>>> @@ -39,6 +39,10 @@ a work-in-progress.
>>>> Legacy debug info compression option -gz=zlib-gnu was
>>>> removed
>>>>  and the option is ignored right now.
>>>> New debug info compression option value -gz=zstd has
>>>> been added.
>>>> +-Warray-bounds=2 will no longer issue warnings for
>>>> out of bounds
>>>> +  accesses to trailing struct members of one-element array type
>>>> anymore. Please
>>>> +  add -fstrict-flex-arrays=level to control how the
>>>> compiler treat
>>>> +  trailing arrays of structures as flexible array members.
>>> 
>>> "Instead it diagnoses accesses to trailing arrays according to 
>>> -fstrict-flex-arrays."
>> 
>> Okay.
>>> 
 
 
 
>>>> @@ -409,6 +413,17 @@ a work-in-progress.
>>>> Other significant improvements
>>>>
>>>>
>>>> +Treating trailing arrays as flexible array
>>>> members
>>>> +
>>>> +
>>>> + GCC can now control when to treat the trailing array of a structure
>>>> as a
>>>> + flexible array member for the purpose of accessing the elements of
>>>> such
>>>> + an array. By default, all trailing arrays of structures are treated
>>>> as
>>> 
>>> all trailing arrays in aggregates are treated
>> Okay.
>>> 
>>>> + flexible array members. Use the new command-line option
>>>> + -fstrict-flex-array=level to control how GCC treats the
>>>> trailing
>>>> + array of a structure as a flexible array member at different levels.
>>> 
>>> -fstrict-flex-arrays to control which trailing array
>>> members are treated as flexible arrays.
>> 
>> Okay.
>> 
>>> 
>>> I've also just now noticed that there's now a flag_strict_flex_arrays
>>> check in the middle-end (in array bound diagnostics) but this option
>>> isn't streamed or handled with LTO.  I think you want to replace that
>>> with the appropriate DECL_NOT_FLEXARRAY check.
>> 
>> We need to know the level value of the strict_flex_arrays on the struct 
>> field to issue proper warnings at different levels. DECL_NOT_FLEXARRAY 
>> does not include such info. So, what should I do? Streaming the 
>> flag_strict_flex_arrays with LTO?
> 
> But you do
> 
>  if (compref)
>{
>  /* Try to determine special array member type for this 
> COMPONENT_REF.  */
>  sam = component_ref_sam_type (arg);
>  /* Get the level of strict_flex_array for this array field.  */
>  tree afield_decl = TREE_OPERAND (arg, 1);
>  strict_flex_array_level = strict_flex_array_level_of (afield_decl);
> 
> I see that function doesn't look at DECL_NOT_FLEXARRAY but just
> checks attributes (those are streamed in LTO).

Yes, it checks both flag_strict_flex_arrays and the attributes.

There are two places in the middle end that call “strict_flex_array_level_of”:
one inside “array_bounds_checker::check_array_ref”, the other inside
“component_ref_size”.
Shall we check the DECL_NOT_FLEXARRAY field instead of calling
“strict_flex_array_level_of” in both places?

> 
> OK, so I suppose the diagnostic itself would become just less precise
> as in "trailing array %qT should not be used as a flexible array member"
> without the "for level N and above" part of the diagnostic?

Yes, that might be the major impact.

If we only check DECL_NOT_FLEXARRAY, we will lose that information. Does that
matter?

> 
>>> We might also want
>>> to see how inlining accesses from TUs with different -fstrict-flex-arrays
>>> setting behaves when accessing the same structure (and whether we might
>>> want to issue an ODR style diagnostic there).
> 
> This mixing also means streaming -fstrict-flex-arrays won't be of much
> help in general.

Then in such a situation, i.e., different -fstrict-flex-arrays levels for the
same structure from different TUs, how should we handle it?
> 
>> Yes, good point, I will check on this part.
>> 
>> BTW, a stupid question: what does ODR mean?
> 
> It's the One-Definition-Rule (of C++).  Basic

Re: [PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-22 Thread Patrick Palka via Gcc-patches
On Wed, 21 Dec 2022, Jason Merrill wrote:

> On 12/21/22 09:52, Patrick Palka wrote:
> > Here during ahead of time checking of C{}, we indirectly call get_nsdmi
> > for C::m from finish_compound_literal, which in turn calls
> > break_out_target_exprs for C::m's (non-templated) initializer, during
> > which we end up building a call to A::~A and checking expr_noexcept_p
> > for it (from build_vec_delete_1).  But this is all done with
> > processing_template_decl set, so the built A::~A call is templated
> > (whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
> > expr_noexcept_p doesn't expect and we crash.
> > 
> > In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding an
> > expr_noexcept_p call with !processing_template_decl, which works here
> > too.  But it seems to me since the initializer we obtain in get_nsdmi is
> > always non-templated, it should be calling break_out_target_exprs with
> > processing_template_decl cleared since otherwise the function might end
> > up mixing templated and non-templated trees.
> > 
> > I'm not sure about this though, perhaps this is not the best fix here.
> > Alternatively, when processing_template_decl we could make get_nsdmi
> > avoid calling break_out_target_exprs at all or something.  Additionally,
> > perhaps break_out_target_exprs should be a no-op more generally when
> > processing_template_decl since we shouldn't see any TARGET_EXPRs inside
> > a template?
> 
> Hmm.
> 
> Any time we would call break_out_target_exprs we're dealing with non-dependent
> expressions; if we're in a template, we're building up an initializer or a
> call that we'll soon throw away, just for the purpose of checking or type
> computation.
> 
> Furthermore, as you say, the argument is always a non-template tree, whether
> in get_nsdmi or convert_default_arg.  So having processing_template_decl
> cleared would be correct.
> 
> I don't think we can get away with not calling break_out_target_exprs at all
> in a template; if nothing else, we would lose immediate invocation expansion.
> However, we could probably skip the bot_manip tree walk, which should avoid
> the problem.
> 
> Either way we end up returning non-template trees, as we do now, and callers
> have to deal with transient CONSTRUCTORs containing such (as we do in
> massage_init_elt).

Ah I see, makes sense.

> 
> Does convert_default_arg not run into the same problem, e.g. when calling
> 
>   void g(B = {0});

In practice it seems not, because we don't call convert_default_arg
when processing_template_decl is set (verified with an assert to
that effect).  In build_over_call for example we exit early when
processing_template_decl is set, and return a templated CALL_EXPR
that doesn't include default arguments at all.  A consequence of
this is that we don't reject ahead of time a call that would use
an ill-formed dependent default argument, e.g.

  template<class T>
  void g(B = T{0});

  template<class T>
  void f() {
    g<T>();
  }

since the default argument instantiation would be the responsibility
of convert_default_arg.

Thinking hypothetically here, if we do in the future want to include default
arguments in the templated form of a CALL_EXPR, we'd probably have to
instantiate them with processing_template_decl set so that the result is
templated.  And we'd subsequently want to call break_out_target_exprs on
the result also with processing_template_decl set IIUC, to perform
immediate invocation expansion.  This seems to be a potential use case
for being able to call break_out_target_exprs on templated trees, and so
unconditionally clearing p_t_d from break_out_target_exprs might not be
future proof.

In light of this, shall we go with the original approach to clear
processing_template_decl directly from get_nsdmi?

> 
> ?
> 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu.
> > 
> > PR c++/108116
> > 
> > gcc/cp/ChangeLog:
> > 
> > * init.cc (get_nsdmi): Clear processing_template_decl before
> > processing the non-templated initializer.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp0x/nsdmi-template24.C: New test.
> > ---
> >   gcc/cp/init.cc|  8 ++-
> >   gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C | 22 +++
> >   2 files changed, 29 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp0x/nsdmi-template24.C
> > 
> > diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
> > index 73e6547c076..c4345ebdaea 100644
> > --- a/gcc/cp/init.cc
> > +++ b/gcc/cp/init.cc
> > @@ -561,7 +561,8 @@ perform_target_ctor (tree init)
> > return init;
> >   }
> >   -/* Return the non-static data initializer for FIELD_DECL MEMBER.  */
> > +/* Return the non-static data initializer for FIELD_DECL MEMBER.
> > +   The initializer returned is always non-templated.  */
> > static GTY((cache)) decl_tree_cache_map *nsdmi_inst;
> >   @@ -670,6 +671,11 @@ get_nsdmi (tree member, bool in_ctor, tsubst_flags_t
> > complain)
> > current_class_pt

Re: [PATCH] bootstrap/106482 - document minimal GCC version

2022-12-22 Thread Jakub Jelinek via Gcc-patches
On Thu, Dec 22, 2022 at 03:54:03PM +0100, Richard Biener wrote:
> There's no explicit mention of what GCC compiler supports C++11
> and the cross compiler build requirement mentions GCC 4.8 but not
> GCC 4.8.3 which is the earliest known version to not run into
> C++11 implementation bugs.  The following adds explicit wording.
> 
> OK for trunk?
> 
> Thanks,
> Richard.
> 
>   PR bootstrap/106482
>   * doc/install.texi (ISO C++11 Compiler): Document GCC version
>   known to work.

LGTM, thanks.

Jakub



[PATCH] bootstrap/106482 - document minimal GCC version

2022-12-22 Thread Richard Biener via Gcc-patches
There's no explicit mention of what GCC compiler supports C++11
and the cross compiler build requirement mentions GCC 4.8 but not
GCC 4.8.3 which is the earliest known version to not run into
C++11 implementation bugs.  The following adds explicit wording.

OK for trunk?

Thanks,
Richard.

PR bootstrap/106482
* doc/install.texi (ISO C++11 Compiler): Document GCC version
known to work.
---
 gcc/doc/install.texi | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 5c0214b4e62..fc3a3cba552 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -227,7 +227,9 @@ described below.
 @heading Tools/packages necessary for building GCC
 @table @asis
 @item ISO C++11 compiler
-Necessary to bootstrap GCC.
+Necessary to bootstrap GCC.  GCC 4.8.3 or newer has sufficient
+support for used C++11 features, with earlier GCC versions you
+might run into implementation bugs.
 
 Versions of GCC prior to 11 also allow bootstrapping with an ISO C++98
 compiler, versions of GCC prior to 4.8 also allow bootstrapping with a
@@ -236,7 +238,7 @@ bootstrapping with a traditional (K&R) C compiler.
 
 To build all languages in a cross-compiler or other configuration where
 3-stage bootstrap is not performed, you need to start with an existing
-GCC binary (version 4.8 or later) because source code for language
+GCC binary (version 4.8.3 or later) because source code for language
 frontends other than C might use GCC extensions.
 
 @item C standard library and headers
-- 
2.35.3


[PATCH] testsuite/107809 - fix vect-recurr testcases

2022-12-22 Thread Richard Biener via Gcc-patches
This adds a missing effective target check for the permute that
recurrence vectorization requires.

Tested on x86_64-unknown-linux-gnu, pushed.

PR testsuite/107809
* gcc.dg/vect/vect-recurr-1.c: Require vect_perm.
* gcc.dg/vect/vect-recurr-2.c: Likewise.
* gcc.dg/vect/vect-recurr-3.c: Likewise.
* gcc.dg/vect/vect-recurr-4.c: Likewise.
* gcc.dg/vect/vect-recurr-5.c: Likewise.
* gcc.dg/vect/vect-recurr-6.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/vect-recurr-1.c | 1 +
 gcc/testsuite/gcc.dg/vect/vect-recurr-2.c | 1 +
 gcc/testsuite/gcc.dg/vect/vect-recurr-3.c | 1 +
 gcc/testsuite/gcc.dg/vect/vect-recurr-4.c | 1 +
 gcc/testsuite/gcc.dg/vect/vect-recurr-5.c | 1 +
 gcc/testsuite/gcc.dg/vect/vect-recurr-6.c | 1 +
 6 files changed, 6 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/vect/vect-recurr-1.c b/gcc/testsuite/gcc.dg/vect/vect-recurr-1.c
index 6eb59fdf854..64de22a1db4 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-recurr-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-recurr-1.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_perm } */
 
 #include "tree-vect.h"
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-recurr-2.c b/gcc/testsuite/gcc.dg/vect/vect-recurr-2.c
index 97efaaa38bc..086b48d9087 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-recurr-2.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-recurr-2.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_perm } */
 
 #include "tree-vect.h"
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-recurr-3.c b/gcc/testsuite/gcc.dg/vect/vect-recurr-3.c
index 621a5d8a257..3389736ead9 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-recurr-3.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-recurr-3.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_perm } */
 
 #include "tree-vect.h"
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-recurr-4.c b/gcc/testsuite/gcc.dg/vect/vect-recurr-4.c
index f6dbc494a62..c0b73cd8f33 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-recurr-4.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-recurr-4.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_perm } */
 
 #include "tree-vect.h"
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-recurr-5.c b/gcc/testsuite/gcc.dg/vect/vect-recurr-5.c
index 19c56df9e83..7327883cc31 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-recurr-5.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-recurr-5.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_perm } */
 
 #include "tree-vect.h"
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-recurr-6.c b/gcc/testsuite/gcc.dg/vect/vect-recurr-6.c
index e7712680853..f678b326f10 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-recurr-6.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-recurr-6.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_perm } */
 
 #include "tree-vect.h"
 
-- 
2.35.3


[PATCH] phiopt: Adjust instead of reset phires range

2022-12-22 Thread Jakub Jelinek via Gcc-patches
On Thu, Dec 22, 2022 at 01:09:21PM +0100, Aldy Hernandez wrote:
>  INTEGER_CST singleton and
> > union that into the SSA_NAMEs range and then do set_range_info
> > with the altered range I guess.
> >
> 
> Note that set_range_info is an intersect operation. It should really be
> called update_range_info. Up to now, there were no users that wanted to
> clobber old ranges completely.

Thanks.
That would then be the following incremental patch (I've committed the
previous patch, also for reasons of backporting).

For the just committed testcase, it does the right thing,
# RANGE [irange] int [-INF, -1][1, +INF]
# iftmp.2_9 = PHI 
is before the range (using -fdump-tree-all-alias) and r below is
[irange] int [-INF, -1][1, +INF],
unioned with carg of 0 into VARYING.
If I try attached testcase though (which just uses signed char d instead of
int d to give more interesting range info), then I see:
# RANGE [irange] int [-128, -1][1, 127]
# iftmp.2_10 = PHI 
but strangely r I get from range_of_expr is
[irange] int [-128, 127]
rather than the expected [irange] int [-128, -1][1, 127].
Sure, it is later unioned with 0, so it doesn't change anything, but I
wonder what is the difference.  Note, this is before actually replacing
the phi arg 8(5) with iftmp.3_11(5).
At that point bb4 is:
 [local count: 966367640]:
# RANGE [irange] int [-128, 127]
# iftmp.3_11 = PHI 
if (iftmp.3_11 != 0)
  goto ; [56.25%]
else
  goto ; [43.75%]
and bb 5 is empty forwarder, so [-128, -1][1, 127] is actually correct.
Either iftmp.3_11 is non-zero, then iftmp.2_10 is that value and its range, or
it is zero and then iftmp.2_10 is 8, so [-128, -1][1, 127] U [8, 8], but
more importantly SSA_NAME_RANGE_INFO should, at least according to what
is printed, be without 0.
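
The union/reset/set sequence used by the patch below can be sketched
with a toy model.  This is not GCC's range API: ToyRange, ToySsaName
and widen_with_constant are hypothetical names, and the only behaviour
borrowed from the thread is that set_range_info intersects with the
range already recorded, so a wider range can only be stored after a
reset.

```cpp
#include <algorithm>
#include <climits>
#include <utility>
#include <vector>

// Toy range: sorted, disjoint [lo, hi] pairs of int values held in long long.
using ToyRange = std::vector<std::pair<long long, long long>>;

// Sort the pairs and merge the ones that overlap or touch.
ToyRange normalize(ToyRange r) {
  std::sort(r.begin(), r.end());
  ToyRange out;
  for (const auto &p : r) {
    if (!out.empty() && p.first <= out.back().second + 1)
      out.back().second = std::max(out.back().second, p.second);
    else
      out.push_back(p);
  }
  return out;
}

ToyRange range_union(ToyRange a, const ToyRange &b) {
  a.insert(a.end(), b.begin(), b.end());
  return normalize(std::move(a));
}

ToyRange range_intersect(const ToyRange &a, const ToyRange &b) {
  ToyRange out;
  for (const auto &x : a)
    for (const auto &y : b) {
      long long lo = std::max(x.first, y.first);
      long long hi = std::min(x.second, y.second);
      if (lo <= hi)
        out.push_back({lo, hi});
    }
  return normalize(std::move(out));
}

// Toy stand-in for an SSA name carrying global range info.
struct ToySsaName {
  ToyRange info;
  ToySsaName() { reset(); }
  // Forget everything: the range becomes the full int range (VARYING).
  void reset() { info.clear(); info.push_back({INT_MIN, INT_MAX}); }
  // Models the intersect behaviour described in the thread: only narrows.
  void set_range_info(const ToyRange &r) { info = range_intersect(info, r); }
};

// Union the recorded range with the constant c and store the result,
// optionally resetting first, as the patch does.
ToyRange widen_with_constant(ToySsaName &name, long long c, bool reset_first) {
  ToyRange single;
  single.push_back({c, c});
  ToyRange r = range_union(name.info, single);
  if (reset_first)
    name.reset();
  name.set_range_info(r);
  return name.info;
}
```

In this model, starting from [-128, -1][1, 127] and adding 0 yields
[-128, 127] only when the reset happens first; without it, the
intersect keeps the old range and the 0 is lost.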

2022-12-22  Jakub Jelinek  
Aldy Hernandez  

* tree-ssa-phiopt.cc (value_replacement): Instead of resetting
phires range info, union it with oarg.

--- gcc/tree-ssa-phiopt.cc.jj   2022-12-22 12:52:36.588469821 +0100
+++ gcc/tree-ssa-phiopt.cc  2022-12-22 13:11:51.145060050 +0100
@@ -1492,11 +1492,25 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
break;
  }
  if (equal_p)
-   /* After the optimization PHI result can have value
-  which it couldn't have previously.
-  We could instead of resetting it union the range
-  info with oarg.  */
-   reset_flow_sensitive_info (gimple_phi_result (phi));
+   {
+ tree phires = gimple_phi_result (phi);
+ if (SSA_NAME_RANGE_INFO (phires))
+   {
+ /* After the optimization PHI result can have value
+which it couldn't have previously.  */
+ value_range r;
+ if (get_global_range_query ()->range_of_expr (r, phires,
+   phi))
+   {
+ int_range<2> tmp (carg, carg);
+ r.union_ (tmp);
+ reset_flow_sensitive_info (phires);
+ set_range_info (phires, r);
+   }
+ else
+   reset_flow_sensitive_info (phires);
+   }
+   }
  if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
{
  imm_use_iterator imm_iter;


Jakub
// PR tree-optimization/108166
// { dg-do run }

bool a, b;
signed char d;
int c;

const int &
foo (const int &f, const int &g)
{
  return !f ? f : g;
}

__attribute__((noipa)) void
bar (int)
{
}

int
main ()
{
  c = foo (b, 0) > ((b ? d : b) ?: 8);
  a = b ? d : b;
  bar (a);
  if (a != 0)
__builtin_abort ();
}


Re: Adding a new thread model to GCC

2022-12-22 Thread i.nixman--- via Gcc-patches

On 2022-12-22 12:21, Jonathan Yong wrote:

hello,


On 12/16/22 19:20, Eric Botcazou wrote:

The libgcc parts look reasonable to me, but I can't approve them.
Maybe Jonathan Yong can approve those parts as mingw-w64 target
maintainer, or maybe a libgcc approver can do so.


OK.


The libstdc++ parts are OK for trunk. IIUC they could go in
separately, they just wouldn't be very much use without the libgcc
changes.


Sure thing.



Ping, need help to commit it?


yes, it would be great if we can merge the patch into gcc-13!

I've tested it on gcc-12-branch and gcc-master for i686/x86_64 windows, 
with msvcrt and ucrt runtime - works as it should!


Eric ^^^



best!


Re: Adding a new thread model to GCC

2022-12-22 Thread Jonathan Yong via Gcc-patches

On 12/16/22 19:20, Eric Botcazou wrote:

The libgcc parts look reasonable to me, but I can't approve them.
Maybe Jonathan Yong can approve those parts as mingw-w64 target
maintainer, or maybe a libgcc approver can do so.


OK.


The libstdc++ parts are OK for trunk. IIUC they could go in
separately, they just wouldn't be very much use without the libgcc
changes.


Sure thing.



Ping, need help to commit it?



Re: [PATCH] ipa: silent -Wodr notes with -w

2022-12-22 Thread Martin Liška
PING^2

On 12/9/22 09:27, Martin Liška wrote:
> PING^1
> 
> On 12/2/22 12:27, Martin Liška wrote:
>> If -w is used, warn_odr properly sets *warned = false and
>> so it should be preserved when calling warn_types_mismatch.
>>
>> Noticed that during a LTO reduction where I used -w.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>>  * ipa-devirt.cc (odr_types_equivalent_p): Respect *warned
>>  value if set.
>> ---
>>  gcc/ipa-devirt.cc | 12 ++++++------
>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/gcc/ipa-devirt.cc b/gcc/ipa-devirt.cc
>> index 265d07bb354..bcdc50c5bd7 100644
>> --- a/gcc/ipa-devirt.cc
>> +++ b/gcc/ipa-devirt.cc
>> @@ -1300,7 +1300,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
>> bool *warned,
>>warn_odr (t1, t2, NULL, NULL, warn, warned,
>>  G_("it is defined as a pointer to different type "
>> "in another translation unit"));
>> -  if (warn && warned)
>> +  if (warn && (warned == NULL || *warned))
>>  warn_types_mismatch (TREE_TYPE (t1), TREE_TYPE (t2),
>>   loc1, loc2);
>>return false;
>> @@ -1315,7 +1315,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
>> bool *warned,
>>warn_odr (t1, t2, NULL, NULL, warn, warned,
>>  G_("a different type is defined "
>> "in another translation unit"));
>> -  if (warn && warned)
>> +  if (warn && (warned == NULL || *warned))
>>  warn_types_mismatch (TREE_TYPE (t1), TREE_TYPE (t2), loc1, loc2);
>>return false;
>>  }
>> @@ -1333,7 +1333,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
>> bool *warned,
>>  warn_odr (t1, t2, NULL, NULL, warn, warned,
>>G_("a different type is defined in another "
>>   "translation unit"));
>> -if (warn && warned)
>> +if (warn && (warned == NULL || *warned))
>>warn_types_mismatch (TREE_TYPE (t1), TREE_TYPE (t2), loc1, loc2);
>>}
>>  gcc_assert (TYPE_STRING_FLAG (t1) == TYPE_STRING_FLAG (t2));
>> @@ -1375,7 +1375,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
>> bool *warned,
>>warn_odr (t1, t2, NULL, NULL, warn, warned,
>>  G_("has different return value "
>> "in another translation unit"));
>> -  if (warn && warned)
>> +  if (warn && (warned == NULL || *warned))
>>  warn_types_mismatch (TREE_TYPE (t1), TREE_TYPE (t2), loc1, loc2);
>>return false;
>>  }
>> @@ -1398,7 +1398,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
>> bool *warned,
>>warn_odr (t1, t2, NULL, NULL, warn, warned,
>>  G_("has different parameters in another "
>> "translation unit"));
>> -  if (warn && warned)
>> +  if (warn && (warned == NULL || *warned))
>>  warn_types_mismatch (TREE_VALUE (parms1),
>>   TREE_VALUE (parms2), loc1, loc2);
>>return false;
>> @@ -1484,7 +1484,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
>> bool *warned,
>>  warn_odr (t1, t2, f1, f2, warn, warned,
>>G_("a field of same name but different type "
>>   "is defined in another translation unit"));
>> -if (warn && warned)
>> +if (warn && (warned == NULL || *warned))
>>warn_types_mismatch (TREE_TYPE (f1), TREE_TYPE (f2), 
>> loc1, loc2);
>>  return false;
>>}
> 
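
The difference between the old and the new guard can be shown with a
small stand-alone sketch.  The helpers toy_warn, old_guard, new_guard
and note_printed are hypothetical; the only behaviour taken from the
patch description is that warn_odr stores false into *warned when the
warning is suppressed (for example by -w).

```cpp
// Hypothetical stand-in for warn_odr: records in *warned whether a
// diagnostic was really emitted (it is not when warnings are suppressed).
static void toy_warn(bool suppressed, bool *warned) {
  if (warned)
    *warned = !suppressed;
}

// Old condition: only tests that the pointer is non-null.
static bool old_guard(bool warn, bool *warned) {
  return warn && warned;
}

// New condition: also respects the value the pointer refers to.
static bool new_guard(bool warn, bool *warned) {
  return warn && (warned == nullptr || *warned);
}

// Returns whether the follow-up type-mismatch note would be printed.
bool note_printed(bool suppressed, bool use_new_guard) {
  bool warned = false;
  toy_warn(suppressed, &warned);
  return use_new_guard ? new_guard(true, &warned)
                       : old_guard(true, &warned);
}
```

With the old guard the note is printed even though the warning itself
was suppressed; with the new guard it is only printed when the warning
was, or when the caller passed no warned pointer at all.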



Re: [PATCH] phiopt: Drop SSA_NAME_RANGE_INFO in maybe equal case [PR108166]

2022-12-22 Thread Aldy Hernandez via Gcc-patches
On Thu, Dec 22, 2022, 12:33 Richard Biener  wrote:

> On Thu, 22 Dec 2022, Jakub Jelinek wrote:
>
> > Hi!
> >
> > The following place in value_replacement is after proving that
> > x == cst1 ? cst2 : x
> > phi result is only used in a comparison with constant which doesn't
> > care if it compares cst1 or cst2 and replaces it with x.
> > The testcase is miscompiled because we have after the replacement
> > incorrect range info for the phi result, we would need to
> > effectively union the phi result range with cst1 (oarg in the code)
> > because previously that constant might be missing in the range, but
> > newly it can appear (we've just verified that the single use stmt
> > of the phi result doesn't care about that value in particular).
> >
> > The following patch just resets the info, bootstrapped/regtested
> > on x86_64-linux and i686-linux, ok for trunk?
>
> OK.
>
> > Aldy/Andrew, how would one instead union the SSA_NAME_RANGE_INFO
> > with some INTEGER_CST and store it back into SSA_NAME_RANGE_INFO
> > (including adjusting non-zero bits and the like)?
>
> There's no get_range_info on SSA_NAMEs (anymore?) but you can
> construct a value_range from the


This is my fault. When we added get_global_range_query, I removed
get_range_info. I should've left that entry point. I'll look into adding it
next cycle for readability's sake.


> INTEGER_CST singleton and
> union that into the SSA_NAMEs range and then do set_range_info
> with the altered range I guess.
>

Note that set_range_info is an intersect operation. It should really be
called update_range_info. Up to now, there were no users that wanted to
clobber old ranges completely.

Aldy


> Richard.
>
> > 2022-12-22  Jakub Jelinek  
> >
> >   PR tree-optimization/108166
> >   * tree-ssa-phiopt.cc (value_replacement): For the maybe_equal_p
> >   case turned into equal_p reset SSA_NAME_RANGE_INFO of phi result.
> >
> >   * g++.dg/torture/pr108166.C: New test.
> >
> > --- gcc/tree-ssa-phiopt.cc.jj 2022-10-28 11:00:53.970243821 +0200
> > > +++ gcc/tree-ssa-phiopt.cc 2022-12-21 14:27:58.118326548 +0100
> > @@ -1491,6 +1491,12 @@ value_replacement (basic_block cond_bb,
> > default:
> >   break;
> > }
> > +   if (equal_p)
> > + /* After the optimization PHI result can have value
> > +which it couldn't have previously.
> > +We could instead of resetting it union the range
> > +info with oarg.  */
> > + reset_flow_sensitive_info (gimple_phi_result (phi));
> > if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
> >   {
> > imm_use_iterator imm_iter;
> > > --- gcc/testsuite/g++.dg/torture/pr108166.C.jj 2022-12-21 14:31:02.638661322 +0100
> > > +++ gcc/testsuite/g++.dg/torture/pr108166.C 2022-12-21 14:30:45.441909725 +0100
> > @@ -0,0 +1,26 @@
> > +// PR tree-optimization/108166
> > +// { dg-do run }
> > +
> > +bool a, b;
> > +int d, c;
> > +
> > +const int &
> > +foo (const int &f, const int &g)
> > +{
> > +  return !f ? f : g;
> > +}
> > +
> > +__attribute__((noipa)) void
> > +bar (int)
> > +{
> > +}
> > +
> > +int
> > +main ()
> > +{
> > +  c = foo (b, 0) > ((b ? d : b) ?: 8);
> > +  a = b ? d : b;
> > +  bar (a);
> > +  if (a != 0)
> > > +    __builtin_abort ();
> > +}
> >
> >   Jakub
> >
> >
>
> --
> Richard Biener 
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> HRB 36809 (AG Nuernberg)
>
>


Re: [PATCH] phiopt: Drop SSA_NAME_RANGE_INFO in maybe equal case [PR108166]

2022-12-22 Thread Richard Biener via Gcc-patches
On Thu, 22 Dec 2022, Jakub Jelinek wrote:

> Hi!
> 
> The following place in value_replacement is after proving that
> x == cst1 ? cst2 : x
> phi result is only used in a comparison with constant which doesn't
> care if it compares cst1 or cst2 and replaces it with x.
> The testcase is miscompiled because we have after the replacement
> incorrect range info for the phi result, we would need to
> effectively union the phi result range with cst1 (oarg in the code)
> because previously that constant might be missing in the range, but
> newly it can appear (we've just verified that the single use stmt
> of the phi result doesn't care about that value in particular).
> 
> The following patch just resets the info, bootstrapped/regtested
> on x86_64-linux and i686-linux, ok for trunk?

OK.

> Aldy/Andrew, how would one instead union the SSA_NAME_RANGE_INFO
> with some INTEGER_CST and store it back into SSA_NAME_RANGE_INFO
> (including adjusting non-zero bits and the like)?

There's no get_range_info on SSA_NAMEs (anymore?) but you can
construct a value_range from the INTEGER_CST singleton and
union that into the SSA_NAMEs range and then do set_range_info
with the altered range I guess.

Richard.

> 2022-12-22  Jakub Jelinek  
> 
>   PR tree-optimization/108166
>   * tree-ssa-phiopt.cc (value_replacement): For the maybe_equal_p
>   case turned into equal_p reset SSA_NAME_RANGE_INFO of phi result.
> 
>   * g++.dg/torture/pr108166.C: New test.
> 
> --- gcc/tree-ssa-phiopt.cc.jj 2022-10-28 11:00:53.970243821 +0200
> +++ gcc/tree-ssa-phiopt.cc 2022-12-21 14:27:58.118326548 +0100
> @@ -1491,6 +1491,12 @@ value_replacement (basic_block cond_bb,
> default:
>   break;
> }
> +   if (equal_p)
> + /* After the optimization PHI result can have value
> +which it couldn't have previously.
> +We could instead of resetting it union the range
> +info with oarg.  */
> + reset_flow_sensitive_info (gimple_phi_result (phi));
> if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
>   {
> imm_use_iterator imm_iter;
> --- gcc/testsuite/g++.dg/torture/pr108166.C.jj 2022-12-21 14:31:02.638661322 +0100
> +++ gcc/testsuite/g++.dg/torture/pr108166.C 2022-12-21 14:30:45.441909725 +0100
> @@ -0,0 +1,26 @@
> +// PR tree-optimization/108166
> +// { dg-do run }
> +
> +bool a, b;
> +int d, c;
> +
> +const int &
> +foo (const int &f, const int &g)
> +{
> +  return !f ? f : g;
> +}
> +
> +__attribute__((noipa)) void
> +bar (int)
> +{
> +}
> +
> +int
> +main ()
> +{
> +  c = foo (b, 0) > ((b ? d : b) ?: 8);
> +  a = b ? d : b;
> +  bar (a);
> +  if (a != 0)
> +    __builtin_abort ();
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH] cse: Fix up CSE const_anchor handling [PR108193]

2022-12-22 Thread Richard Biener via Gcc-patches
On Thu, 22 Dec 2022, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs on aarch64, because insert_const_anchor
> inserts invalid CONST_INT into the CSE tables - 0x80000000 for SImode.
> The second hunk of the patch fixes that, the first one is to avoid
> triggering undefined behavior at compile time during compute_const_anchors
> computations - performing those additions and subtractions in
> HOST_WIDE_INT means it can overflow for certain constants.
> 
> Bootstrapped/regtested on aarch64-linux (which does have nonzero
> target.const_anchor) and x86_64-linux and i686-linux (which have it
> zero).  Ok for trunk?
> 
> 2022-12-22  Jakub Jelinek  
> 
>   PR rtl-optimization/108193
>   * cse.cc (compute_const_anchors): Change n type to
>   unsigned HOST_WIDE_INT, adjust comparison against it to avoid
>   warnings.  Formatting fix.
>   (insert_const_anchor): Use gen_int_mode instead of GEN_INT.
> 
>   * gfortran.dg/pr108193.f90: New test.
> 
> --- gcc/cse.cc.jj 2022-06-28 13:03:30.699692752 +0200
> +++ gcc/cse.cc 2022-12-21 12:58:24.277945065 +0100
> @@ -1169,14 +1169,14 @@ compute_const_anchors (rtx cst,
>  HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs,
>  HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs)
>  {
> -  HOST_WIDE_INT n = INTVAL (cst);
> +  unsigned HOST_WIDE_INT n = INTVAL (cst);

UINTVAL?

Otherwise OK.

Thanks,
Richard.
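
For readers following along, the two hunks can be pictured with a small standalone sketch.  This is not the cse.cc code: `Anchors`, `compute_anchors` and `sign_extend_to_width` are hypothetical names, and the anchor size used in any example is arbitrary.  It only illustrates doing the rounding in an unsigned type and then sign-extending a value to a narrow width, which is roughly what canonical CONST_INT form asks for.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result record; the real function uses out-parameters.
struct Anchors {
  int64_t lower_base, lower_offs;
  int64_t upper_base, upper_offs;
};

// Round VALUE down and up to a multiple of ANCHOR (assumed to be a power of
// two), doing the arithmetic in an unsigned type so wrap-around is defined.
// Returns false when VALUE is already a multiple of ANCHOR.
inline bool compute_anchors (int64_t value, uint64_t anchor, Anchors *out)
{
  uint64_t n = static_cast<uint64_t> (value);
  uint64_t lower = n & ~(anchor - 1);
  if (lower == n)
    return false;
  uint64_t upper = (n + (anchor - 1)) & ~(anchor - 1);
  out->lower_base = static_cast<int64_t> (lower);
  out->upper_base = static_cast<int64_t> (upper);
  out->lower_offs = static_cast<int64_t> (n - lower);
  out->upper_offs = static_cast<int64_t> (n - upper);
  return true;
}

// Sign-extend the low BITS bits of VALUE (1 <= BITS <= 64).
inline int64_t sign_extend_to_width (int64_t value, unsigned bits)
{
  if (bits >= 64)
    return value;
  uint64_t mask = (uint64_t (1) << bits) - 1;
  uint64_t low = static_cast<uint64_t> (value) & mask;
  uint64_t sign = uint64_t (1) << (bits - 1);
  return static_cast<int64_t> ((low ^ sign) - sign);
}
```

With a 32-bit width, an upper base of 0x80000000 comes back from `sign_extend_to_width` as -2147483648, which is the kind of adjustment the second hunk delegates to gen_int_mode.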

>  
>*lower_base = n & ~(targetm.const_anchor - 1);
> -  if (*lower_base == n)
> +  if ((unsigned HOST_WIDE_INT) *lower_base == n)
>  return false;
>  
> -  *upper_base =
> -(n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1);
> +  *upper_base = ((n + (targetm.const_anchor - 1))
> +  & ~(targetm.const_anchor - 1));
>*upper_offs = n - *upper_base;
>*lower_offs = n - *lower_base;
>return true;
> @@ -1193,7 +1193,7 @@ insert_const_anchor (HOST_WIDE_INT ancho
>rtx anchor_exp;
>rtx exp;
>  
> -  anchor_exp = GEN_INT (anchor);
> +  anchor_exp = gen_int_mode (anchor, mode);
>hash = HASH (anchor_exp, mode);
>elt = lookup (anchor_exp, hash, mode);
>if (!elt)
> --- gcc/testsuite/gfortran.dg/pr108193.f90.jj 2022-12-21 13:02:21.925513332 +0100
> +++ gcc/testsuite/gfortran.dg/pr108193.f90 2022-12-21 13:01:38.595139040 +0100
> @@ -0,0 +1,24 @@
> +! PR rtl-optimization/108193
> +! { dg-do compile { target pthread } }
> +! { dg-options "-O2 -fsplit-loops -ftree-parallelize-loops=2 -fno-tree-dominator-opts" }
> +
> +subroutine foo (n, r)
> +  implicit none
> +  integer :: i, j, n
> +  real :: s, r(*)
> +
> +  s = 0.0
> +
> +  do j = 1, 2
> + do i = j, n
> +s = r(i)
> + end do
> +  end do
> +
> +  do i = 1, n
> + do j = i, n
> +s = s + 1
> + end do
> + r(i) = s
> +  end do
> +end subroutine foo
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH] loading float member of parameter stored via int registers

2022-12-22 Thread Richard Biener via Gcc-patches
On Thu, 22 Dec 2022, Jiufu Guo wrote:

> 
> Hi,
> 
> Richard Biener  writes:
> 
> > On Thu, 22 Dec 2022, guojiufu wrote:
> >
> >> Hi,
> >> 
> >> On 2022-12-21 15:30, Richard Biener wrote:
> >> > On Wed, 21 Dec 2022, Jiufu Guo wrote:
> >> > 
> >> >> Hi,
> >> >> 
> >> >> This patch is fixing an issue about parameter accessing if the
> >> >> parameter is struct type and passed through integer registers, and
> >> >> there is floating member is accessed. Like below code:
> >> >> 
> >> >> typedef struct DF {double a[4]; long l; } DF;
> >> >> double foo_df (DF arg){return arg.a[3];}
> >> >> 
> >> >> On ppc64le, with trunk gcc, "std 6,-24(1) ; lfd 1,-24(1)" is
> >> >> generated.  While instruction "mtvsrd 1, 6" would be enough for
> >> >> this case.
> >> > 
> >> > So why do we end up spilling for PPC?
> >> 
> >> Good question! According to GCC source code (in function.cc/expr.cc),
> >> it is common behavior: using "word_mode" to store the parameter to stack,
> >> And using the field's mode (e.g. float mode) to load from the stack.
> >> But with some tries, I fail to construct cases on many platforms.
> >> So, I convert the fix to a target hook and implemented the rs6000 part
> >> first.
> >> 
> >> > 
> >> > struct X { int i; float f; };
> >> > 
> >> > float foo (struct X x)
> >> > {
> >> >   return x.f;
> >> > }
> >> > 
> >> > does pass the structure in $RDI on x86_64 and we manage (with
> >> > optimization, with -O0 we spill) to generate
> >> > 
> >> > shrq$32, %rdi
> >> > movd%edi, %xmm0
> >> > 
> >> > and RTL expansion generates
> >> > 
> >> > (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> >> > (insn 2 4 3 2 (set (reg/v:DI 83 [ x ])
> >> > (reg:DI 5 di [ x ])) "t.c":4:1 -1
> >> >  (nil))
> >> > (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
> >> > (insn 6 3 7 2 (parallel [
> >> > (set (reg:DI 85)
> >> > (ashiftrt:DI (reg/v:DI 83 [ x ])
> >> > (const_int 32 [0x20])))
> >> > (clobber (reg:CC 17 flags))
> >> > ]) "t.c":5:11 -1
> >> >  (nil))
> >> > (insn 7 6 8 2 (set (reg:SI 86)
> >> > (subreg:SI (reg:DI 85) 0)) "t.c":5:11 -1
> >> >  (nil))
> >> > 
> >> > I would imagine that for the ppc case we only see the subreg here
> >> > which should be even easier to optimize.
> >> > 
> >> > So how's this not fixable by providing proper patterns / subreg
> >> > capabilities?  Looking a bit at the RTL we have the issue might
> >> > be that nothing seems to handle CSE of
> >> > 
> >> 
> >> This case is also related to 'parameter on struct', PR89310 is
> >> just for this case. On trunk, it is fixed.
> >> One difference: the parameter is in DImode, and passed via an
> >> integer register for "{int i; float f;}".
> >> But for "{double a[4]; long l;}", the parameter is in BLKmode,
> >> and stored to stack during the argument setup.
> >
> > OK, so this would be another case for "heuristics" to use
> > sth different than word_mode for storing, but of course
> > the arguments are in integer registers and using different
> > modes can for example prohibit store-multiple instruction use.
> >
> > As I said in the related thread an RTL expansion time "SRA"
> > with the incoming argument assignment in mind could make
> > more optimal decisions for these kind of special-cases.
> 
> Thanks a lot for your comments!
> 
> Yeap! Using SRA-like analysis during expansion for parameter
> and returns (and may also some field accessing) would be a
> generic improvement for this kind of issue (PR101926 collected
> a lot of them).
> While we may still need some work for various ABIs and different
> targets, to analyze where the 'struct field' come from
> (int/float/vector/.. registers, or stack) and how the struct
> need to be handled (keep in pseudo or store in the stack).
> This may indicate a mount of changes for param_setup code.
> 
> To reduce risk, I'm just draft straightforward patches for
> special cases currently, Like:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608081.html
> and this patch.

Heh, yes - though I'm not fond of special-casing things.  RTL
expansion is already full of special cases :/

So basically what I'd do is SRA analysis as if the aggregates
were initialized by copies from the argument registers (and
stack slots) and then try to avoid expanding the aggregate
PARM_DECL itself but aim at fully scalarizing that with only
the copies from the argument space remaining.

> >
> >> > (note 8 0 5 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> >> > (insn 5 8 7 2 (set (mem/c:DI (plus:DI (reg/f:DI 110 sfp)
> >> > (const_int 56 [0x38])) [2 arg+24 S8 A64])
> >> > (reg:DI 6 6)) "t.c":2:23 679 {*movdi_internal64}
> >> >  (expr_list:REG_DEAD (reg:DI 6 6)
> >> > (nil)))
> >> > (note 7 5 10 2 NOTE_INSN_FUNCTION_BEG)
> >> > (note 10 7 15 2 NOTE_INSN_DELETED)
> >> > (insn 15 10 16 2 (set (reg/i:DF 33 1)
> >> > (mem/c:DF (plus:DI (reg/f:DI 110 sfp)
> >> > (const_int 56 [0x38])

[PATCH] tree-optimization/107451 - SLP load vectorization issue

2022-12-22 Thread Richard Biener via Gcc-patches
When vectorizing SLP loads with permutations we can access excess
elements when the load vector type is bigger than the group size
and the vectorization factor covers less groups than necessary
to fill it.  Since we know the code will only access up to
group_size * VF elements in the unpermuted vector we can simply
fill the rest of the vector with whatever we want.  For simplicity
this patch chooses to repeat the last group.
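
The idea can be sketched outside the vectorizer as follows.  This is an illustration under assumptions, not the tree-vect-stmts.cc code: `lane_offsets` and its parameters are hypothetical, offsets are counted in elements, and the vectorization factor is taken to be a compile-time known integer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element offsets read to fill NUNITS lanes from groups of GROUP_SIZE
// consecutive elements whose starts are STRIDE elements apart.  Only VF
// groups are really needed; lanes beyond that re-read the last needed group
// instead of stepping to a group the scalar loop would never touch.
inline std::vector<std::size_t>
lane_offsets (std::size_t nunits, std::size_t group_size,
              std::size_t stride, std::size_t vf)
{
  std::vector<std::size_t> offs;
  std::size_t base = 0, group_el = 0, n_groups = 0;
  for (std::size_t lane = 0; lane < nunits; ++lane)
    {
      offs.push_back (base + group_el);
      if (++group_el == group_size)
        {
          ++n_groups;
          if (n_groups < vf)
            base += stride;   // advance only while more groups are needed
          group_el = 0;
        }
    }
  return offs;
}
```

For four lanes, a group size of two and a vectorization factor of one, every lane stays inside the first group regardless of the stride, which matches the shape of the new testcase (a two-element array and a very large inc_x).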

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/107451
* tree-vect-stmts.cc (vectorizable_load): Avoid loading
SLP group members from group numbers in excess of the
vectorization factor.

* gcc.dg/torture/pr107451.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr107451.c | 27 +
 gcc/tree-vect-stmts.cc  | 20 --
 2 files changed, 41 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr107451.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr107451.c b/gcc/testsuite/gcc.dg/torture/pr107451.c
new file mode 100644
index 00000000000..a17574c6896
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr107451.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-vectorize -fno-vect-cost-model" } */
+/* { dg-additional-options "-mavx2" { target avx2_runtime } } */
+
+double getdot(int n, const double *x, int inc_x, const double *y)
+{
+  int i, ix = 0;
+  double dot[4] = { 0.0, 0.0, 0.0, 0.0 } ;
+
+  for(i = 0; i < n; i++) {
+  dot[0] += x[ix]   * y[ix]   ;
+  dot[1] += x[ix+1] * y[ix+1] ;
+  dot[2] += x[ix]   * y[ix+1] ;
+  dot[3] += x[ix+1] * y[ix]   ;
+  ix += inc_x ;
+  }
+
+  return dot[0] + dot[1] + dot[2] + dot[3];
+}
+
+int main()
+{
+  double x[2] = {0, 0}, y[2] = {0, 0};
+  if (getdot(1, x, 4096*4096, y) != 0.)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 5485da58b38..8f8deaf82bc 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9235,6 +9235,7 @@ vectorizable_load (vec_info *vinfo,
   unsigned int group_el = 0;
   unsigned HOST_WIDE_INT
elsz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+  unsigned int n_groups = 0;
   for (j = 0; j < ncopies; j++)
{
  if (nloads > 1)
@@ -9256,12 +9257,19 @@ vectorizable_load (vec_info *vinfo,
  if (! slp
  || group_el == group_size)
{
- tree newoff = copy_ssa_name (running_off);
- gimple *incr = gimple_build_assign (newoff, POINTER_PLUS_EXPR,
- running_off, stride_step);
- vect_finish_stmt_generation (vinfo, stmt_info, incr, gsi);
-
- running_off = newoff;
+ n_groups++;
+ /* When doing SLP make sure to not load elements from
+the next vector iteration, those will not be accessed
+so just use the last element again.  See PR107451.  */
+ if (!slp || known_lt (n_groups, vf))
+   {
+ tree newoff = copy_ssa_name (running_off);
+ gimple *incr
+   = gimple_build_assign (newoff, POINTER_PLUS_EXPR,
+  running_off, stride_step);
+ vect_finish_stmt_generation (vinfo, stmt_info, incr, gsi);
+ running_off = newoff;
+   }
  group_el = 0;
}
}
-- 
2.35.3


Re: [PATCH] phiopt: Drop SSA_NAME_RANGE_INFO in maybe equal case [PR108166]

2022-12-22 Thread Aldy Hernandez via Gcc-patches
On Thu, Dec 22, 2022 at 11:30 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The following place in value_replacement is after proving that
> x == cst1 ? cst2 : x
> phi result is only used in a comparison with constant which doesn't
> care if it compares cst1 or cst2 and replaces it with x.
> The testcase is miscompiled because we have after the replacement
> incorrect range info for the phi result, we would need to
> effectively union the phi result range with cst1 (oarg in the code)
> because previously that constant might be missing in the range, but
> newly it can appear (we've just verified that the single use stmt
> of the phi result doesn't care about that value in particular).
>
> The following patch just resets the info, bootstrapped/regtested
> on x86_64-linux and i686-linux, ok for trunk?
>
> Aldy/Andrew, how would one instead union the SSA_NAME_RANGE_INFO
> with some INTEGER_CST and store it back into SSA_NAME_RANGE_INFO
> (including adjusting non-zero bits and the like)?

if (get_global_range_query ()->range_of_expr (r, <name>, <stmt>)) {
  int_range<2> tmp (<cst>, <cst>);
  r.union_ (tmp);
  set_range_info (<name>, r);
}

Note that the <stmt> is unused, as the global range doesn't have
context.  But it is good form to pass it since we could decide at a
later time to replace get_global_range_query() with a context-aware
get_range_query(), or a ranger instance.  But <stmt> is not strictly
necessary.

Hmmm, set_range_info() is an intersect operation, because we always
update what's already there, never replace it.  If you want to replace
the global range, throwing away whatever range we had there, you may
want to call this first:

/* Reset all flow sensitive data on NAME such as range-info, nonzero
   bits and alignment.  */

void
reset_flow_sensitive_info (tree name)
{
}
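
To make the intersect-versus-replace point concrete, here is a toy model.  It is only a sketch under assumptions: `Interval`, `intersect`, `hull` and `ToyGlobalRange` are hypothetical names and not the irange API; real ranges can hold several sub-ranges plus nonzero bits, and the convex hull below merely stands in for a union with a singleton.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical closed interval [lo, hi].
struct Interval {
  long lo, hi;
  bool operator== (const Interval &o) const
  { return lo == o.lo && hi == o.hi; }
};

// Intersection; the toy model requires the inputs to overlap.
inline Interval intersect (Interval a, Interval b)
{
  Interval r{std::max (a.lo, b.lo), std::min (a.hi, b.hi)};
  assert (r.lo <= r.hi && "toy model: inputs must overlap");
  return r;
}

// Convex hull, standing in for a union with a singleton constant.
inline Interval hull (Interval a, Interval b)
{
  return {std::min (a.lo, b.lo), std::max (a.hi, b.hi)};
}

// Toy stand-in for the global range slot of one name.
struct ToyGlobalRange {
  bool has_info = false;
  Interval stored{0, 0};

  // Models "update" semantics: new information can only narrow.
  void set (Interval r)
  {
    stored = has_info ? intersect (stored, r) : r;
    has_info = true;
  }

  // Models throwing the old information away.
  void reset () { has_info = false; }
};
```

In this model, storing a widened range on top of an existing one leaves the old, narrower range in place; the widening only sticks if the slot is reset first.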

Aldy

>
> 2022-12-22  Jakub Jelinek  
>
> PR tree-optimization/108166
> * tree-ssa-phiopt.cc (value_replacement): For the maybe_equal_p
> case turned into equal_p reset SSA_NAME_RANGE_INFO of phi result.
>
> * g++.dg/torture/pr108166.C: New test.
>
> --- gcc/tree-ssa-phiopt.cc.jj   2022-10-28 11:00:53.970243821 +0200
> +++ gcc/tree-ssa-phiopt.cc  2022-12-21 14:27:58.118326548 +0100
> @@ -1491,6 +1491,12 @@ value_replacement (basic_block cond_bb,
>   default:
> break;
>   }
> + if (equal_p)
> +   /* After the optimization PHI result can have value
> +  which it couldn't have previously.
> +  We could instead of resetting it union the range
> +  info with oarg.  */
> +   reset_flow_sensitive_info (gimple_phi_result (phi));
>   if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
> {
>   imm_use_iterator imm_iter;
> --- gcc/testsuite/g++.dg/torture/pr108166.C.jj  2022-12-21 14:31:02.638661322 +0100
> +++ gcc/testsuite/g++.dg/torture/pr108166.C 2022-12-21 14:30:45.441909725 +0100
> @@ -0,0 +1,26 @@
> +// PR tree-optimization/108166
> +// { dg-do run }
> +
> +bool a, b;
> +int d, c;
> +
> +const int &
> +foo (const int &f, const int &g)
> +{
> +  return !f ? f : g;
> +}
> +
> +__attribute__((noipa)) void
> +bar (int)
> +{
> +}
> +
> +int
> +main ()
> +{
> +  c = foo (b, 0) > ((b ? d : b) ?: 8);
> +  a = b ? d : b;
> +  bar (a);
> +  if (a != 0)
> +    __builtin_abort ();
> +}
>
> Jakub
>



[PATCH 3/3] contrib: Add dg-out-generator.pl

2022-12-22 Thread Arsen Arsenović via Gcc-patches
This script is a helper used to generate dg-output lines from an existing
program output conveniently.  It takes care of escaping Tcl and ARE stuff.
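
For illustration only, the escaping step can be sketched in C++ (the script itself is Perl, shown in the diff below).  `escape_are` and `to_dg_output` are hypothetical names, and the sketch leaves out the masking of command-line arguments with `.*`.

```cpp
#include <cassert>
#include <string>

// Backslash-escape the characters listed in the script's character class:
// [ ] * + ? { } ( ) and backslash.
inline std::string escape_are (const std::string &line)
{
  const std::string special = "[]*+?{}()\\";
  std::string out;
  for (char c : line)
    {
      if (special.find (c) != std::string::npos)
        out += '\\';
      out += c;
    }
  return out;
}

// Turn one line of program output into a dg-output directive, replacing a
// trailing newline with a pattern that tolerates \n, \r\n and \r.
inline std::string to_dg_output (std::string line)
{
  bool had_newline = !line.empty () && line.back () == '\n';
  if (had_newline)
    line.pop_back ();
  std::string body = escape_are (line);
  if (had_newline)
    body += "(\\n|\\r\\n|\\r)*";
  return "// { dg-output {" + body + "} }";
}
```

Because every brace in the payload ends up escaped, the result can sit inside a Tcl braced word without closing it early, which is the property the comments in the script rely on.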

contrib/ChangeLog:

* dg-out-generator.pl: New file.
---
I updated this file to include the proper copyright header, after dkm notified
me that I got it wrong on IRC ;D

 contrib/dg-out-generator.pl | 79 +
 1 file changed, 79 insertions(+)
 create mode 100755 contrib/dg-out-generator.pl

diff --git a/contrib/dg-out-generator.pl b/contrib/dg-out-generator.pl
new file mode 100755
index 00000000000..1e9247165b2
--- /dev/null
+++ b/contrib/dg-out-generator.pl
@@ -0,0 +1,79 @@
+#!/usr/bin/env perl
+#
+# Copyright (C) 2022 Free Software Foundation, Inc.
+# Contributed by Arsen Arsenović.
+#
+# This script is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+
+# This script reads program output on STDIN, and out of it produces a block of
+# dg-output lines that can be yanked at the end of a file.  It will escape
+# special ARE and Tcl constructs automatically.
+#
+# Each argument passed on the standard input is treated as a string to be
+# replaced by ``.*'' in the final result.  This is intended to mask out build
+# paths, filenames, etc.
+#
+# Usage example:
+
+# $ g++-13 -fcontracts -o test \
+#  'g++.dg/contracts/contracts-access1.C' && \
+#   ./test |& dg-out-generator.pl 'g++.dg/contracts/contracts-access1.C'
+# // { dg-output {contract violation in function Base::b at .*:11: pub > 0(\n|\r\n|\r)*} }
+# // { dg-output {\[level:default, role:default, continuation mode:never\](\n|\r\n|\r)*} }
+# // { dg-output {terminate called without an active exception(\n|\r\n|\r)*} }
+
+# You can now freely dump the above into your testcase.
+
+use strict;
+use warnings;
+use POSIX 'floor';
+
+my $escapees = '(' . join ('|', map { quotemeta } @ARGV) . ')';
+
+sub gboundary($)
+{
+  my $str = shift;
+  my $sz = 10.0;
+  for (;;)
+{
+  my $bnd = join '', (map chr 64 + rand 27, 1 .. floor $sz);
+  return $bnd unless index ($str, $bnd) >= 0;
+  $sz += 0.1;
+}
+}
+
+while (<STDIN>)
+  {
+# Escape our escapees.
+my $boundary;
+if (@ARGV) {
+  # Checking this is necessary to avoid a spurious .* between all
+  # characters if no arguments are passed.
+  $boundary = gboundary $_;
+  s/$escapees/$boundary/g;
+}
+
+# Quote stuff special in Tcl ARE.  This step also effectively nulls any
+# concern about escaping.  As long as all curly braces are escaped, the
+# string will, when passing through the braces rule of Tcl, be identical to
+# the input.
+s/([[\]*+?{}()\\])/\\$1/g;
+
+# Newlines should be more tolerant.
+s/\n$/(\\n|\\r\\n|\\r)*/;
+
+# Then split out the boundary, replacing it with .*.
+s/$boundary/.*/g if defined $boundary;
+
+# Then, let's print it in a dg-output block.  If you'd prefer /* keep in
+# mind that if your string contains */ it could terminate the comment
+# early.  Maybe add an extra s!\*/!*()/!g or something.
+print "// { dg-output {$_} }\n";
+  }
+
+# File Local Vars:
+# indent-tabs-mode: nil
+# End:
-- 
2.39.0



[PATCH 2/3] contracts: Update testsuite against new default viol. handler format

2022-12-22 Thread Arsen Arsenović via Gcc-patches
This change was almost entirely mechanical.  Save for two files which had very
short matches, these changes were produced by two seds and a Perl script, for
the more involved cases.  The latter will be added in a subsequent commit.  The
former are as follows:

sed -E -i "/dg-output/s/default std::handle_contract_violation called: \
(\S+) (\S+) (\S+(<[A-Za-z0-9, ]*)?>?)\
/contract violation in function \3 at \1:\2: /" *.C
sed -i '/dg-output/s/  */ /g'

Whichever files still failed after the above changes were checked out,
re-run with their output extracted, and passed through dg-out-generator.pl.
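
As a rough illustration of what the first sed does, here is the same field reordering with std::regex.  This is a simplified sketch, not a port: `convert_dg_output` is a hypothetical name, and the function-name field is assumed to be a single run of non-space characters, so template argument lists containing spaces are not handled.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rewrite one old-style dg-output pattern into the new message layout:
// "<file> <line> <function> " becomes "<function> at <file>:<line>: ".
inline std::string convert_dg_output (const std::string &line)
{
  static const std::regex old_format (
    R"(default std::handle_contract_violation called: (\S+) (\S+) (\S+) )");
  return std::regex_replace (line, old_format,
                             "contract violation in function $3 at $1:$2: ");
}
```

Lines that do not contain the old prefix pass through unchanged.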

gcc/testsuite/ChangeLog:

* g++.dg/contracts/contracts-access1.C: Convert to new default
violation handler.
* g++.dg/contracts/contracts-config1.C: Ditto.
* g++.dg/contracts/contracts-constexpr1.C: Ditto.
* g++.dg/contracts/contracts-ctor-dtor1.C: Ditto.
* g++.dg/contracts/contracts-deduced2.C: Ditto.
* g++.dg/contracts/contracts-friend1.C: Ditto.
* g++.dg/contracts/contracts-multiline1.C: Ditto.
* g++.dg/contracts/contracts-post3.C: Ditto.
* g++.dg/contracts/contracts-pre10.C: Ditto.
* g++.dg/contracts/contracts-pre2.C: Ditto.
* g++.dg/contracts/contracts-pre2a2.C: Ditto.
* g++.dg/contracts/contracts-pre3.C: Ditto.
* g++.dg/contracts/contracts-pre4.C: Ditto.
* g++.dg/contracts/contracts-pre5.C: Ditto.
* g++.dg/contracts/contracts-pre7.C: Ditto.
* g++.dg/contracts/contracts-pre9.C: Ditto.
* g++.dg/contracts/contracts-redecl3.C: Ditto.
* g++.dg/contracts/contracts-redecl4.C: Ditto.
* g++.dg/contracts/contracts-redecl6.C: Ditto.
* g++.dg/contracts/contracts-redecl7.C: Ditto.
* g++.dg/contracts/contracts-tmpl-spec1.C: Ditto.
* g++.dg/contracts/contracts-tmpl-spec2.C: Ditto.
* g++.dg/contracts/contracts-tmpl-spec3.C: Ditto.
* g++.dg/contracts/contracts10.C: Ditto.
* g++.dg/contracts/contracts19.C: Ditto.
* g++.dg/contracts/contracts25.C: Ditto.
* g++.dg/contracts/contracts3.C: Ditto.
* g++.dg/contracts/contracts35.C: Ditto.
* g++.dg/contracts/contracts5.C: Ditto.
* g++.dg/contracts/contracts7.C: Ditto.
* g++.dg/contracts/contracts9.C: Ditto.
---
 .../g++.dg/contracts/contracts-access1.C  |  36 +--
 .../g++.dg/contracts/contracts-config1.C  |  30 ++-
 .../g++.dg/contracts/contracts-constexpr1.C   |  16 +-
 .../g++.dg/contracts/contracts-ctor-dtor1.C   |  96 
 .../g++.dg/contracts/contracts-deduced2.C |  20 +-
 .../g++.dg/contracts/contracts-friend1.C  |  10 +-
 .../g++.dg/contracts/contracts-multiline1.C   |   2 +-
 .../g++.dg/contracts/contracts-post3.C|   2 +-
 .../g++.dg/contracts/contracts-pre10.C| 122 ++
 .../g++.dg/contracts/contracts-pre2.C |  36 +--
 .../g++.dg/contracts/contracts-pre2a2.C   |   6 +-
 .../g++.dg/contracts/contracts-pre3.C | 156 ++--
 .../g++.dg/contracts/contracts-pre4.C |  12 +-
 .../g++.dg/contracts/contracts-pre5.C |  24 +-
 .../g++.dg/contracts/contracts-pre7.C |  24 +-
 .../g++.dg/contracts/contracts-pre9.C |  24 +-
 .../g++.dg/contracts/contracts-redecl3.C  |  36 +--
 .../g++.dg/contracts/contracts-redecl4.C  |  24 +-
 .../g++.dg/contracts/contracts-redecl6.C  |  36 +--
 .../g++.dg/contracts/contracts-redecl7.C  |  18 +-
 .../g++.dg/contracts/contracts-tmpl-spec1.C   |  26 +-
 .../g++.dg/contracts/contracts-tmpl-spec2.C   | 230 +++---
 .../g++.dg/contracts/contracts-tmpl-spec3.C   |  27 +-
 gcc/testsuite/g++.dg/contracts/contracts10.C  |  16 +-
 gcc/testsuite/g++.dg/contracts/contracts19.C  |   4 +-
 gcc/testsuite/g++.dg/contracts/contracts25.C  |   8 +-
 gcc/testsuite/g++.dg/contracts/contracts3.C   |   2 +-
 gcc/testsuite/g++.dg/contracts/contracts35.C  |  16 +-
 gcc/testsuite/g++.dg/contracts/contracts5.C   |   2 +-
 gcc/testsuite/g++.dg/contracts/contracts7.C   |   2 +-
 gcc/testsuite/g++.dg/contracts/contracts9.C   |  24 +-
 31 files changed, 594 insertions(+), 493 deletions(-)

diff --git a/gcc/testsuite/g++.dg/contracts/contracts-access1.C b/gcc/testsuite/g++.dg/contracts/contracts-access1.C
index a3a29821017..414b29a1613 100644
--- a/gcc/testsuite/g++.dg/contracts/contracts-access1.C
+++ b/gcc/testsuite/g++.dg/contracts/contracts-access1.C
@@ -107,22 +107,22 @@ int main()
   return 0;
 }
 
-// { dg-output "default std::handle_contract_violation called: .*.C 11 Base::b .*(\n|\r\n|\r)*" }
-// { dg-output "default std::handle_contract_violation called: .*.C 12 Base::b .*(\n|\r\n|\r)*" }
-// { dg-output "default std::handle_contract_violation called: .*.C 13 Base::b .*(\n|\r\n|\r)*" }
-// { dg-output "default std::handle_contract_violation called: .*.C 26 Child::fun .*(\n|\r\n|\r)*" }
-// { dg-output "default std::handle_contract_violation called: .*.C 27 Child::fun .*(\n|\r\n|\r)*"

[PATCH 1/3] libstdc++: Improve output of default contract violation handler [PR107792]

2022-12-22 Thread Arsen Arsenović via Gcc-patches
From: Jonathan Wakely 

Make the output more readable. Don't output anything unless verbose
termination is enabled at configure-time.
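
The resulting message layout can be sketched with a plain-data stand-in.  This is an assumption-laden illustration, not the libstdc++ code: `ToyViolation` and `format_violation` are hypothetical, the continuation mode is reduced to a bool, and the text is returned as a string instead of being written to std::cerr.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical plain-data stand-in for the violation object.
struct ToyViolation {
  std::string function, file, comment, level, role;
  unsigned line;
  bool continues;   // false models the default "never continue" mode
};

inline std::string format_violation (const ToyViolation &v)
{
  std::ostringstream os;
  os << "contract violation in function " << v.function
     << " at " << v.file << ':' << v.line << ": " << v.comment;

  // Non-default attributes go on a second line inside brackets.
  const char *delimiter = "\n[";
  if (v.level != "default")
    { os << delimiter << "level:" << v.level; delimiter = ", "; }
  if (v.role != "default")
    { os << delimiter << "role:" << v.role; delimiter = ", "; }
  if (v.continues)
    { os << delimiter << "continue:on"; delimiter = ", "; }

  // The bracket is closed only if at least one attribute opened it.
  if (delimiter[0] == ',')
    os << ']';
  os << '\n';
  return os.str ();
}
```

The delimiter doubles as state: it starts as the opening bracket and turns into a comma once anything has been printed, so no separate flag is needed to decide whether to close the bracket.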

libstdc++-v3/ChangeLog:

PR libstdc++/107792
PR libstdc++/107778
* src/experimental/contract.cc (handle_contract_violation): Make
output more readable.
---
Heh, wouldn't be me if I forgot nothing.  Sorry about that.

How's this?

 libstdc++-v3/src/experimental/contract.cc | 50 ++-
 1 file changed, 39 insertions(+), 11 deletions(-)

diff --git a/libstdc++-v3/src/experimental/contract.cc b/libstdc++-v3/src/experimental/contract.cc
index c8d2697eddc..2d41a6326cf 100644
--- a/libstdc++-v3/src/experimental/contract.cc
+++ b/libstdc++-v3/src/experimental/contract.cc
@@ -1,4 +1,5 @@
 // -*- C++ -*- std::experimental::contract_violation and friends
+
 // Copyright (C) 2019-2022 Free Software Foundation, Inc.
 //
 // This file is part of GCC.
@@ -23,19 +24,46 @@
 // <http://www.gnu.org/licenses/>.
 
 #include <experimental/contract>
-#include <iostream>
+#if _GLIBCXX_HOSTED && _GLIBCXX_VERBOSE
+# include <iostream>
+#endif
 
 __attribute__ ((weak)) void
 handle_contract_violation (const std::experimental::contract_violation &violation)
 {
-  std::cerr << "default std::handle_contract_violation called: \n"
-<< " " << violation.file_name()
-<< " " << violation.line_number()
-<< " " << violation.function_name()
-<< " " << violation.comment()
-<< " " << violation.assertion_level()
-<< " " << violation.assertion_role()
-<< " " << (int)violation.continuation_mode()
-<< std::endl;
+#if _GLIBCXX_HOSTED && _GLIBCXX_VERBOSE
+  bool level_default_p = violation.assertion_level() == "default";
+  bool role_default_p = violation.assertion_role() == "default";
+  bool cont_mode_default_p = violation.continuation_mode()
+== std::experimental::contract_violation_continuation_mode::never_continue;
+
+  const char* modes[]{ "off", "on" }; // Must match enumerators in header.
+  std::cerr << "contract violation in function " << violation.function_name()
+<< " at " << violation.file_name() << ':' << violation.line_number()
+<< ": " << violation.comment();
+
+  const char* delimiter = "\n[";
+
+  if (!level_default_p)
+{
+  std::cerr << delimiter << "level:" << violation.assertion_level();
+  delimiter = ", ";
+}
+  if (!role_default_p)
+{
+  std::cerr << delimiter << "role:" << violation.assertion_role();
+  delimiter = ", ";
+}
+  if (!cont_mode_default_p)
+{
+  std::cerr << delimiter << "continue:"
+   << modes[(int)violation.continuation_mode() & 1];
+  delimiter = ", ";
+}
+
+  if (delimiter[0] == ',')
+std::cerr << ']';
+
+  std::cerr << std::endl;
+#endif
 }
-
-- 
2.39.0



Re: [PATCH] Backport gcc-12: jobserver FIFO support

2022-12-22 Thread Martin Liška
On 12/12/22 13:05, Martin Liška wrote:
> On 12/12/22 12:42, Jakub Jelinek wrote:
>> On Mon, Dec 12, 2022 at 12:39:36PM +0100, Martin Liška wrote:
>>>> I'm fine with backporting the whole
>>>> series to GCC 12 but I wonder if earlier still maintained versions are also
>>>> affected (noting that the series also addresses WPA streaming which is
>>>> not part of the "troubles" here).
>>>
>>> Yes, they are also affected.
>>>
>>> So ready for all active branches?
>>
>> Can't you backport instead a minimal fix that will not change other stuff?
> 
> Yes, then it would be exactly the following 2 patches:
> 
> 1270ccda70ca09f7d4fe76b5156dca8992bd77a6
> 53e3b2bf16a486c15c20991c6095f7be09012b55

I'm going to push that to all release branches.

Cheers,
Martin

> 
> Martin
> 
>>
>>  Jakub
>>
> 



[PATCH] c, c++, cgraphunit: Prevent duplicated -Wunused-value warnings [PR108079]

2022-12-22 Thread Jakub Jelinek via Gcc-patches
Hi!

On the following testcase, we warn with -Wunused-value twice, once
in the FEs and later on cgraphunit again with slightly different
wording.

The following patch fixes that by registering a warning suppression in the
FEs when we warn and not warning in cgraphunit anymore if that happened.
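
The pattern is simple enough to model in a few lines.  The following is a toy sketch, not the GCC diagnostics machinery: `ToyDiagnostics` and its members are hypothetical, and declarations and options are reduced to strings.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy diagnostics engine; every name here is hypothetical.
struct ToyDiagnostics {
  std::set<std::pair<std::string, std::string>> suppressed; // (decl, option)
  std::vector<std::string> emitted;

  bool suppressed_p (const std::string &decl, const std::string &opt) const
  { return suppressed.count (std::make_pair (decl, opt)) != 0; }

  void suppress (const std::string &decl, const std::string &opt)
  { suppressed.insert (std::make_pair (decl, opt)); }

  // Front-end stage: warn, then record that this decl was diagnosed.
  void front_end_check (const std::string &decl)
  {
    emitted.push_back ("unused variable '" + decl + "'");
    suppress (decl, "-Wunused-variable");
  }

  // Later stage: stay quiet if an earlier stage already warned.
  void late_check (const std::string &decl)
  {
    if (!suppressed_p (decl, "-Wunused-variable"))
      emitted.push_back ("'" + decl + "' defined but not used");
  }
};
```

A declaration that went through the front-end stage yields exactly one message; one that only reaches the late stage still gets the late wording.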

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-12-22  Jakub Jelinek  

PR c/108079
gcc/
* cgraphunit.cc (check_global_declaration): Don't warn for unused
variables which have OPT_Wunused_variable warning suppressed.
gcc/c/
* c-decl.cc (pop_scope): Suppress OPT_Wunused_variable warning
after diagnosing it.
gcc/cp/
* decl.cc (poplevel): Suppress OPT_Wunused_variable warning
after diagnosing it.
gcc/testsuite/
* c-c++-common/Wunused-var-18.c: New test.

--- gcc/cgraphunit.cc.jj2022-10-18 10:38:48.0 +0200
+++ gcc/cgraphunit.cc   2022-12-21 15:14:34.687939477 +0100
@@ -1122,6 +1122,7 @@ check_global_declaration (symtab_node *s
   && (TREE_CODE (decl) != FUNCTION_DECL
  || (!DECL_STATIC_CONSTRUCTOR (decl)
  && !DECL_STATIC_DESTRUCTOR (decl)))
+  && (! VAR_P (decl) || !warning_suppressed_p (decl, OPT_Wunused_variable))
   /* Otherwise, ask the language.  */
   && lang_hooks.decls.warn_unused_global (decl))
 warning_at (DECL_SOURCE_LOCATION (decl),
--- gcc/c/c-decl.cc.jj  2022-12-19 11:08:31.500766238 +0100
+++ gcc/c/c-decl.cc 2022-12-21 14:52:40.251919370 +0100
@@ -1310,7 +1310,10 @@ pop_scope (void)
  && scope != external_scope)
{
  if (!TREE_USED (p))
-   warning (OPT_Wunused_variable, "unused variable %q+D", p);
+   {
+ warning (OPT_Wunused_variable, "unused variable %q+D", p);
+ suppress_warning (p, OPT_Wunused_variable);
+   }
  else if (DECL_CONTEXT (p) == current_function_decl)
warning_at (DECL_SOURCE_LOCATION (p),
OPT_Wunused_but_set_variable,
--- gcc/cp/decl.cc.jj   2022-12-21 09:03:45.437566855 +0100
+++ gcc/cp/decl.cc  2022-12-21 14:51:07.043265263 +0100
@@ -693,6 +693,7 @@ poplevel (int keep, int reverse, int fun
else
  warning_at (DECL_SOURCE_LOCATION (decl),
  OPT_Wunused_variable, "unused variable %qD", decl);
+   suppress_warning (decl, OPT_Wunused_variable);
  }
else if (DECL_CONTEXT (decl) == current_function_decl
 // For -Wunused-but-set-variable leave references alone.
--- gcc/testsuite/c-c++-common/Wunused-var-18.c.jj  2022-12-21 15:28:03.112273963 +0100
+++ gcc/testsuite/c-c++-common/Wunused-var-18.c 2022-12-21 15:27:05.246107581 +0100
@@ -0,0 +1,10 @@
+/* PR c/108079 */
+/* { dg-do compile } */
+/* { dg-options "-Wunused-variable" } */
+
+int
+main ()
+{
+  static int x;/* { dg-warning "unused variable 'x'" } */
+   /* { dg-bogus "'x' defined but not used" "" { target *-*-* } .-1 } */
+}

Jakub
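
As a rough illustration of the "diagnose once, then record a suppression"
idea described above, here is a minimal self-contained sketch.  It is not
GCC code: ToyDiagnostics, front_end_pop_scope and late_check_global are
hypothetical names that only model the roles played by the front-end
warning, suppress_warning and warning_suppressed_p in the patch.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for per-declaration warning suppression (hypothetical type).
struct ToyDiagnostics {
  std::set<std::pair<std::string, std::string>> suppressed;  // (decl, option)
  std::vector<std::string> emitted;                          // messages printed so far

  bool warning_suppressed(const std::string& decl, const std::string& opt) const {
    return suppressed.count({decl, opt}) != 0;
  }
  void suppress_warning(const std::string& decl, const std::string& opt) {
    suppressed.insert({decl, opt});
  }
};

// Front-end stage: diagnose the unused variable, then remember that we did.
void front_end_pop_scope(ToyDiagnostics& d, const std::string& decl) {
  d.emitted.push_back("unused variable '" + decl + "'");
  d.suppress_warning(decl, "-Wunused-variable");
}

// Later stage: stay quiet for declarations that were already diagnosed.
void late_check_global(ToyDiagnostics& d, const std::string& decl) {
  if (d.warning_suppressed(decl, "-Wunused-variable"))
    return;
  d.emitted.push_back("'" + decl + "' defined but not used");
}
```

Calling front_end_pop_scope and then late_check_global for the same name
leaves a single message in `emitted`, which mirrors what the new testcase
checks with dg-warning plus dg-bogus.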



[PATCH] phiopt: Drop SSA_NAME_RANGE_INFO in maybe equal case [PR108166]

2022-12-22 Thread Jakub Jelinek via Gcc-patches
Hi!

The following place in value_replacement runs after we have proven that the
x == cst1 ? cst2 : x
phi result is only used in a comparison with a constant which doesn't
care whether it compares cst1 or cst2, and it replaces the phi result with x.
The testcase is miscompiled because after the replacement the phi result
has incorrect range info.  We would need to effectively union the phi
result range with cst1 (oarg in the code), because previously that constant
might have been missing from the range, but now it can appear (we've just
verified that the single use stmt of the phi result doesn't care about
that value in particular).

The following patch just resets the info.  Bootstrapped/regtested
on x86_64-linux and i686-linux, ok for trunk?

Aldy/Andrew, how would one instead union the SSA_NAME_RANGE_INFO
with some INTEGER_CST and store it back into SSA_NAME_RANGE_INFO
(including adjusting non-zero bits and the like)?

2022-12-22  Jakub Jelinek  

PR tree-optimization/108166
* tree-ssa-phiopt.cc (value_replacement): For the maybe_equal_p
case turned into equal_p reset SSA_NAME_RANGE_INFO of phi result.

* g++.dg/torture/pr108166.C: New test.

--- gcc/tree-ssa-phiopt.cc.jj   2022-10-28 11:00:53.970243821 +0200
+++ gcc/tree-ssa-phiopt.cc  2022-12-21 14:27:58.118326548 +0100
@@ -1491,6 +1491,12 @@ value_replacement (basic_block cond_bb,
  default:
break;
  }
+ if (equal_p)
+   /* After the optimization PHI result can have value
+  which it couldn't have previously.
+  We could instead of resetting it union the range
+  info with oarg.  */
+   reset_flow_sensitive_info (gimple_phi_result (phi));
  if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
{
  imm_use_iterator imm_iter;
--- gcc/testsuite/g++.dg/torture/pr108166.C.jj  2022-12-21 14:31:02.638661322 +0100
+++ gcc/testsuite/g++.dg/torture/pr108166.C 2022-12-21 14:30:45.441909725 +0100
@@ -0,0 +1,26 @@
+// PR tree-optimization/108166
+// { dg-do run }
+
+bool a, b;
+int d, c;
+
+const int &
+foo (const int &f, const int &g)
+{
+  return !f ? f : g;
+}
+
+__attribute__((noipa)) void
+bar (int)
+{
+}
+
+int
+main ()
+{
+  c = foo (b, 0) > ((b ? d : b) ?: 8);
+  a = b ? d : b;
+  bar (a);
+  if (a != 0)
+__builtin_abort ();
+}

Jakub
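
To make the range argument above concrete, here is a small sketch that
models recorded range info as a closed integer interval.  It is only an
illustration under that assumption: ToyRange, reset_range and
union_with_constant are hypothetical names, not GCC's range API, and the
question of how to do the union on SSA_NAME_RANGE_INFO itself stays open.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Toy closed interval [lo, hi] standing in for recorded range info.
struct ToyRange {
  int lo;
  int hi;
  bool contains(int v) const { return lo <= v && v <= hi; }
};

// Conservative option, analogous in spirit to resetting the info:
// forget everything and allow any value.
ToyRange reset_range() { return {INT_MIN, INT_MAX}; }

// More precise option: widen the interval just enough to include cst.
ToyRange union_with_constant(ToyRange r, int cst) {
  return {std::min(r.lo, cst), std::max(r.hi, cst)};
}
```

For example, if x is known to be in [0, 10] and the phi is
x == 0 ? 8 : x, the phi result range can be [1, 10].  Once the phi result
is replaced by x, the value 0 can appear, so the old range is wrong;
union_with_constant({1, 10}, 0) gives [0, 10], while reset_range() simply
drops the information.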



[PATCH] cse: Fix up CSE const_anchor handling [PR108193]

2022-12-22 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs on aarch64, because insert_const_anchor
inserts invalid CONST_INT into the CSE tables - 0x80000000 for SImode.
The second hunk of the patch fixes that, the first one is to avoid
triggering undefined behavior at compile time during compute_const_anchors
computations - performing those additions and subtractions in
HOST_WIDE_INT means it can overflow for certain constants.

Bootstrapped/regtested on aarch64-linux (which does have nonzero
target.const_anchor) and x86_64-linux and i686-linux (which have it
zero).  Ok for trunk?

2022-12-22  Jakub Jelinek  

PR rtl-optimization/108193
* cse.cc (compute_const_anchors): Change n type to
unsigned HOST_WIDE_INT, adjust comparison against it to avoid
warnings.  Formatting fix.
(insert_const_anchor): Use gen_int_mode instead of GEN_INT.

* gfortran.dg/pr108193.f90: New test.

--- gcc/cse.cc.jj   2022-06-28 13:03:30.699692752 +0200
+++ gcc/cse.cc  2022-12-21 12:58:24.277945065 +0100
@@ -1169,14 +1169,14 @@ compute_const_anchors (rtx cst,
   HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs,
   HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs)
 {
-  HOST_WIDE_INT n = INTVAL (cst);
+  unsigned HOST_WIDE_INT n = INTVAL (cst);
 
   *lower_base = n & ~(targetm.const_anchor - 1);
-  if (*lower_base == n)
+  if ((unsigned HOST_WIDE_INT) *lower_base == n)
 return false;
 
-  *upper_base =
-(n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1);
+  *upper_base = ((n + (targetm.const_anchor - 1))
+& ~(targetm.const_anchor - 1));
   *upper_offs = n - *upper_base;
   *lower_offs = n - *lower_base;
   return true;
@@ -1193,7 +1193,7 @@ insert_const_anchor (HOST_WIDE_INT ancho
   rtx anchor_exp;
   rtx exp;
 
-  anchor_exp = GEN_INT (anchor);
+  anchor_exp = gen_int_mode (anchor, mode);
   hash = HASH (anchor_exp, mode);
   elt = lookup (anchor_exp, hash, mode);
   if (!elt)
--- gcc/testsuite/gfortran.dg/pr108193.f90.jj   2022-12-21 13:02:21.925513332 +0100
+++ gcc/testsuite/gfortran.dg/pr108193.f90  2022-12-21 13:01:38.595139040 +0100
@@ -0,0 +1,24 @@
+! PR rtl-optimization/108193
+! { dg-do compile { target pthread } }
+! { dg-options "-O2 -fsplit-loops -ftree-parallelize-loops=2 -fno-tree-dominator-opts" }
+
+subroutine foo (n, r)
+  implicit none
+  integer :: i, j, n
+  real :: s, r(*)
+
+  s = 0.0
+
+  do j = 1, 2
+ do i = j, n
+s = r(i)
+ end do
+  end do
+
+  do i = 1, n
+ do j = i, n
+s = s + 1
+ end do
+ r(i) = s
+  end do
+end subroutine foo

Jakub
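
The two points of the patch can be sketched outside of GCC as follows.
This is a simplified model, not the code from cse.cc: ToyAnchors,
toy_compute_anchors and toy_sign_extend_32 are hypothetical names, the
`anchor` parameter stands in for targetm.const_anchor and is assumed to be
a power of two, and int64_t/uint64_t stand in for HOST_WIDE_INT and its
unsigned variant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two anchors around a constant.
struct ToyAnchors {
  int64_t lower_base, lower_offs, upper_base, upper_offs;
};

// Round the constant down and up to multiples of `anchor`, doing all the
// arithmetic in uint64_t so that wraparound is well defined.
bool toy_compute_anchors(int64_t cst, uint64_t anchor, ToyAnchors& out) {
  assert(anchor != 0 && (anchor & (anchor - 1)) == 0);  // power of two
  uint64_t n = static_cast<uint64_t>(cst);
  uint64_t lower = n & ~(anchor - 1);
  if (lower == n)
    return false;  // already a multiple of the anchor
  uint64_t upper = (n + (anchor - 1)) & ~(anchor - 1);
  // Conversions back to signed assume two's complement (guaranteed since C++20).
  out.lower_base = static_cast<int64_t>(lower);
  out.upper_base = static_cast<int64_t>(upper);
  out.lower_offs = static_cast<int64_t>(n - lower);
  out.upper_offs = static_cast<int64_t>(n - upper);
  return true;
}

// Sign-extend the low 32 bits: the canonical form of a 32-bit constant
// when it is held in a 64-bit host integer.
int64_t toy_sign_extend_32(int64_t v) {
  return static_cast<int64_t>(static_cast<int32_t>(static_cast<uint32_t>(v)));
}
```

With anchor 0x1000, the constant 0x12345 yields lower base 0x12000
(offset 0x345) and upper base 0x13000 (offset -0xcbb).  The second helper
shows why a mode-aware constructor matters: a 32-bit value with the top
bit set has to be stored sign-extended, which is what the switch from
GEN_INT to gen_int_mode takes care of in the patch.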



[committed] libstdc++: Define and use variable templates in

2022-12-22 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This defines a variable template for the internal __is_duration helper
trait, defines a new __is_time_point_v variable template (to be used in
a subsequent commit), and adds explicit specializations of the standard
chrono::treat_as_floating_point trait for common types.

A fast path is added to chrono::duration_cast for the no-op case where
no conversion is needed.

Finally, some SFINAE constraints are simplified by using the
__enable_if_t alias, or by using variable templates.
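
A user-level sketch of the two techniques mentioned above (a variable
template for a trait, and a fast path for the no-op conversion) might look
like this.  It assumes C++17, and the names is_duration_v and
toy_duration_cast are hypothetical; this is not the libstdc++
implementation, only an analogue built on the public std::chrono API.

```cpp
#include <cassert>
#include <chrono>
#include <type_traits>

// Variable template trait: false in general, true for any std::chrono::duration.
template<typename T>
  inline constexpr bool is_duration_v = false;
template<typename Rep, typename Period>
  inline constexpr bool is_duration_v<std::chrono::duration<Rep, Period>> = true;

// Wrapper that skips the conversion when source and destination types match.
template<typename ToDur, typename Rep, typename Period>
  constexpr std::enable_if_t<is_duration_v<ToDur>, ToDur>
  toy_duration_cast(const std::chrono::duration<Rep, Period>& d)
  {
    if constexpr (std::is_same_v<ToDur, std::chrono::duration<Rep, Period>>)
      return d;  // no-op case: nothing to convert
    else
      return std::chrono::duration_cast<ToDur>(d);
  }
```

The SFINAE constraint reads is_duration_v<ToDur> instead of spelling out a
class template and its ::value member, which is the kind of simplification
the commit message describes.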

libstdc++-v3/ChangeLog:

* include/bits/chrono.h (__is_duration_v, __is_time_point_v):
New variable templates.
(duration_cast): Add simplified definition for noconv case.
(treat_as_floating_point_v): Add explicit specializations.
(duration::operator%=, floor, ceil, round): Simplify SFINAE
constraints.
---
 libstdc++-v3/include/bits/chrono.h | 62 ++
 1 file changed, 45 insertions(+), 17 deletions(-)

diff --git a/libstdc++-v3/include/bits/chrono.h b/libstdc++-v3/include/bits/chrono.h
index 22c0be3fbe6..56751d1c3a0 100644
--- a/libstdc++-v3/include/bits/chrono.h
+++ b/libstdc++-v3/include/bits/chrono.h
@@ -244,6 +244,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using __disable_if_is_duration
= typename enable_if::value, _Tp>::type;
 
+#if __cpp_variable_templates
+template
+  inline constexpr bool __is_duration_v = false;
+template
+  inline constexpr bool __is_duration_v> = true;
+template
+  inline constexpr bool __is_time_point_v = false;
+template
+  inline constexpr bool __is_time_point_v> = true;
+#endif
+
 /// @endcond
 
 /** Convert a `duration` to type `ToDur`.
@@ -261,13 +272,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr __enable_if_is_duration<_ToDur>
   duration_cast(const duration<_Rep, _Period>& __d)
   {
-   typedef typename _ToDur::period __to_period;
-   typedef typename _ToDur::rep__to_rep;
-   typedef ratio_divide<_Period, __to_period>  __cf;
-   typedef typename common_type<__to_rep, _Rep, intmax_t>::type __cr;
-   typedef  __duration_cast_impl<_ToDur, __cf, __cr,
- __cf::num == 1, __cf::den == 1> __dc;
-   return __dc::__cast(__d);
+#if __cpp_inline_variables && __cpp_if_constexpr
+   if constexpr (is_same_v<_ToDur, duration<_Rep, _Period>>)
+ return __d;
+   else
+#endif
+   {
+ using __to_period = typename _ToDur::period;
+ using __to_rep = typename _ToDur::rep;
+ using __cf = ratio_divide<_Period, __to_period>;
+ using __cr = typename common_type<__to_rep, _Rep, intmax_t>::type;
+ using __dc = __duration_cast_impl<_ToDur, __cf, __cr,
+   __cf::num == 1, __cf::den == 1>;
+ return __dc::__cast(__d);
+   }
   }
 
 /** Trait indicating whether to treat a type as a floating-point type.
@@ -290,6 +308,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template 
   inline constexpr bool treat_as_floating_point_v =
treat_as_floating_point<_Rep>::value;
+
+template<>
+  inline constexpr bool treat_as_floating_point_v = false;
+template<>
+  inline constexpr bool treat_as_floating_point_v = false;
+template<>
+  inline constexpr bool treat_as_floating_point_v = false;
+template<>
+  inline constexpr bool treat_as_floating_point_v = true;
+template<>
+  inline constexpr bool treat_as_floating_point_v = true;
+template<>
+  inline constexpr bool treat_as_floating_point_v = true;
 #endif // C++17
 
 #if __cplusplus > 201703L
@@ -632,8 +663,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
// DR 934.
template
  _GLIBCXX17_CONSTEXPR
- typename enable_if::value,
-duration&>::type
+ __enable_if_t::value, duration&>
  operator%=(const rep& __rhs)
  {
__r %= __rhs;
@@ -642,8 +672,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
template
  _GLIBCXX17_CONSTEXPR
- typename enable_if::value,
-duration&>::type
+ __enable_if_t::value, duration&>
  operator%=(const duration& __d)
  {
__r %= __d.count();
@@ -1019,7 +1048,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  */
 template
   [[nodiscard]] constexpr
-  enable_if_t<__is_duration<_ToDur>::value, time_point<_Clock, _ToDur>>
+  enable_if_t<__is_duration_v<_ToDur>, time_point<_Clock, _ToDur>>
   floor(const time_point<_Clock, _Dur>& __tp)
   {
return time_point<_Clock, _ToDur>{
@@ -1040,7 +1069,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  */
 template
   [[nodiscard]] constexpr
-  enable_if_t<__is_duration<_ToDur>::value, time_point<_Clock, _ToDur>>
+  enable_if_t<__is_duration_v<_ToDur>, time_point<_C

[committed] libstdc++: Add [[nodiscard]] in

2022-12-22 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/std/chrono: Use nodiscard attribute.
---
 libstdc++-v3/include/std/chrono | 46 +
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index 4c5fbfaeb83..33653f8efb1 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -128,11 +128,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using time_point= chrono::time_point;
   static constexpr bool is_steady = false;
 
+  [[nodiscard]]
   static time_point
   now()
   { return from_sys(system_clock::now()); }
 
   template
+   [[nodiscard]]
static sys_time>
to_sys(const utc_time<_Duration>& __t)
{
@@ -145,6 +147,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 
   template
+   [[nodiscard]]
static utc_time>
from_sys(const sys_time<_Duration>& __t)
{
@@ -171,11 +174,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static constexpr bool is_steady = false; // XXX true for CLOCK_TAI?
 
   // TODO move into lib, use CLOCK_TAI on linux, add extension point.
+  [[nodiscard]]
   static time_point
   now()
   { return from_utc(utc_clock::now()); }
 
   template
+   [[nodiscard]]
static utc_time>
to_utc(const tai_time<_Duration>& __t)
{
@@ -184,6 +189,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 
   template
+   [[nodiscard]]
static tai_time>
from_utc(const utc_time<_Duration>& __t)
{
@@ -208,11 +214,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static constexpr bool is_steady = false; // XXX
 
   // TODO move into lib, add extension point.
+  [[nodiscard]]
   static time_point
   now()
   { return from_utc(utc_clock::now()); }
 
   template
+   [[nodiscard]]
static utc_time>
to_utc(const gps_time<_Duration>& __t)
{
@@ -221,6 +229,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 
   template
+   [[nodiscard]]
static gps_time>
from_utc(const utc_time<_Duration>& __t)
{
@@ -394,6 +403,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 /// Convert a time point to a different clock.
 template
+  [[nodiscard]]
   inline auto
   clock_cast(const time_point<_SourceClock, _Duration>& __t)
   requires __detail::__clock_convs<_DestClock, _SourceClock, _Duration>
@@ -2620,6 +2630,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   leap_second(const leap_second&) = default;
   leap_second& operator=(const leap_second&) = default;
 
+  [[nodiscard]]
   constexpr sys_seconds
   date() const noexcept
   {
@@ -2628,6 +2639,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return sys_seconds(-_M_s);
   }
 
+  [[nodiscard]]
   constexpr seconds
   value() const noexcept
   {
@@ -2638,71 +2650,71 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // This can be defaulted because the database will never contain two
   // leap_second objects with the same date but different signs.
-  friend constexpr bool
+  [[nodiscard]] friend constexpr bool
   operator==(const leap_second&, const leap_second&) noexcept = default;
 
-  friend constexpr strong_ordering
+  [[nodiscard]] friend constexpr strong_ordering
   operator<=>(const leap_second& __x, const leap_second& __y) noexcept
   { return __x.date() <=> __y.date(); }
 
   template
-   friend constexpr bool
+   [[nodiscard]] friend constexpr bool
operator==(const leap_second& __x,
   const sys_time<_Duration>& __y) noexcept
{ return __x.date() == __y; }
 
   template
-   friend constexpr bool
+   [[nodiscard]] friend constexpr bool
operator<(const leap_second& __x,
  const sys_time<_Duration>& __y) noexcept
{ return __x.date() < __y; }
 
   template
-   friend constexpr bool
+   [[nodiscard]] friend constexpr bool
operator<(const sys_time<_Duration>& __x,
  const leap_second& __y) noexcept
{ return __x < __y.date(); }
 
   template
-   friend constexpr bool
+   [[nodiscard]] friend constexpr bool
operator>(const leap_second& __x,
  const sys_time<_Duration>& __y) noexcept
{ return __y < __x.date(); }
 
   template
-   friend constexpr bool
+   [[nodiscard]] friend constexpr bool
operator>(const sys_time<_Duration>& __x,
  const leap_second& __y) noexcept
{ return __y.date() < __x; }
 
   template
-   friend constexpr bool
+   [[nodiscard]] friend constexpr bool
operator<=(const leap_second& __x,
- const sys_time<_Duration>& __y) noexcept
+  const sys_time<_Duration>& __y) noexcept
{ return !(__y < __x.date()); }
 
 

Re: [PATCH] loading float member of parameter stored via int registers

2022-12-22 Thread Jiufu Guo via Gcc-patches


Hi,

Richard Biener  writes:

> On Thu, 22 Dec 2022, guojiufu wrote:
>
>> Hi,
>> 
>> On 2022-12-21 15:30, Richard Biener wrote:
>> > On Wed, 21 Dec 2022, Jiufu Guo wrote:
>> > 
>> >> Hi,
>> >> 
>> >> This patch fixes an issue with parameter access when the
>> >> parameter has struct type, is passed through integer registers, and
>> >> a floating-point member is accessed. Like the code below:
>> >> 
>> >> typedef struct DF {double a[4]; long l; } DF;
>> >> double foo_df (DF arg){return arg.a[3];}
>> >> 
>> >> On ppc64le, with trunk gcc, "std 6,-24(1) ; lfd 1,-24(1)" is
>> >> generated.  While instruction "mtvsrd 1, 6" would be enough for
>> >> this case.
>> > 
>> > So why do we end up spilling for PPC?
>> 
>> Good question! According to the GCC source code (in function.cc/expr.cc),
>> it is common behavior: "word_mode" is used to store the parameter to the stack,
>> and the field's mode (e.g. float mode) is used to load from the stack.
>> But after some attempts, I failed to construct such cases on many platforms.
>> So, I converted the fix to a target hook and implemented the rs6000 part
>> first.
>> 
>> > 
>> > struct X { int i; float f; };
>> > 
>> > float foo (struct X x)
>> > {
>> >   return x.f;
>> > }
>> > 
>> > does pass the structure in $RDI on x86_64 and we manage (with
>> > optimization, with -O0 we spill) to generate
>> > 
>> > shrq$32, %rdi
>> > movd%edi, %xmm0
>> > 
>> > and RTL expansion generates
>> > 
>> > (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>> > (insn 2 4 3 2 (set (reg/v:DI 83 [ x ])
>> > (reg:DI 5 di [ x ])) "t.c":4:1 -1
>> >  (nil))
>> > (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>> > (insn 6 3 7 2 (parallel [
>> > (set (reg:DI 85)
>> > (ashiftrt:DI (reg/v:DI 83 [ x ])
>> > (const_int 32 [0x20])))
>> > (clobber (reg:CC 17 flags))
>> > ]) "t.c":5:11 -1
>> >  (nil))
>> > (insn 7 6 8 2 (set (reg:SI 86)
>> > (subreg:SI (reg:DI 85) 0)) "t.c":5:11 -1
>> >  (nil))
>> > 
>> > I would imagine that for the ppc case we only see the subreg here
>> > which should be even easier to optimize.
>> > 
>> > So how's this not fixable by providing proper patterns / subreg
>> > capabilities?  Looking a bit at the RTL we have the issue might
>> > be that nothing seems to handle CSE of
>> > 
>> 
>> This case is also related to 'parameter on struct', PR89310 is
>> just for this case. On trunk, it is fixed.
>> One difference: the parameter is in DImode, and passed via an
>> integer register for "{int i; float f;}".
>> But for "{double a[4]; long l;}", the parameter is in BLKmode,
>> and stored to stack during the argument setup.
>
> OK, so this would be another case for "heuristics" to use
> sth different than word_mode for storing, but of course
> the arguments are in integer registers and using different
> modes can for example prohibit store-multiple instruction use.
>
> As I said in the related thread an RTL expansion time "SRA"
> with the incoming argument assignment in mind could make
> more optimal decisions for these kind of special-cases.

Thanks a lot for your comments!

Yes! Using SRA-like analysis during expansion for parameters
and return values (and maybe also some field accesses) would be a
generic improvement for this kind of issue (PR101926 collects
a lot of them).
We may still need some work for the various ABIs and different
targets, to analyze where the 'struct field' comes from
(int/float/vector/.. registers, or the stack) and how the struct
needs to be handled (kept in a pseudo or stored to the stack).
This may imply a fair amount of changes to the param_setup code.

To reduce risk, I'm currently just drafting straightforward patches
for special cases, like:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608081.html
and this patch.
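
At the value level, the store in word_mode followed by a load in the
field's floating-point mode, as discussed in this thread for the DF
example, is just a reinterpretation of the same 64 bits.  The sketch below
models that in portable C++; toy_word_of and toy_bits_to_double are
hypothetical helper names, and the code assumes a 64-bit double.  It says
nothing about which instructions a target should emit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The struct from the example in this thread.
struct DF { double a[4]; long l; };

// Models one 64-bit word of the argument as it would arrive in an
// integer register: copy 8 bytes starting at word `index`.
uint64_t toy_word_of(const DF& arg, unsigned index) {
  assert((index + 1) * sizeof(uint64_t) <= sizeof(DF));
  uint64_t w;
  std::memcpy(&w, reinterpret_cast<const unsigned char*>(&arg)
                    + index * sizeof w, sizeof w);
  return w;
}

// Reinterprets the 64 bits as a double without changing them.
double toy_bits_to_double(uint64_t w) {
  static_assert(sizeof(double) == sizeof(uint64_t), "assumes 64-bit double");
  double d;
  std::memcpy(&d, &w, sizeof d);
  return d;
}
```

Reading arg.a[3] is then toy_bits_to_double(toy_word_of(arg, 3)): no
memory round trip is needed conceptually, only a move of the same bits
from the integer side to the floating-point side.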

>
>> > (note 8 0 5 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>> > (insn 5 8 7 2 (set (mem/c:DI (plus:DI (reg/f:DI 110 sfp)
>> > (const_int 56 [0x38])) [2 arg+24 S8 A64])
>> > (reg:DI 6 6)) "t.c":2:23 679 {*movdi_internal64}
>> >  (expr_list:REG_DEAD (reg:DI 6 6)
>> > (nil)))
>> > (note 7 5 10 2 NOTE_INSN_FUNCTION_BEG)
>> > (note 10 7 15 2 NOTE_INSN_DELETED)
>> > (insn 15 10 16 2 (set (reg/i:DF 33 1)
>> > (mem/c:DF (plus:DI (reg/f:DI 110 sfp)
>> > (const_int 56 [0x38])) [1 arg.a[3]+0 S8 A64])) "t.c":2:40
>> > 576 {*movdf_hardfloat64}
>> >  (nil))
>> > 
>> > Possibly because the store and load happen in a different mode?  Can
>> > you see why CSE doesn't handle this (producing a subreg)?  On
>> 
>> Yes, exactly! For "{double a[4]; long l;}", the store and load
>> use different modes, so CSE does not optimize it.  This
>> patch makes the store and load use the same mode (DImode), and then
>> leverages CSE to handle it.
>
> So can we instead fix CSE to consider replacing insn 15 above with
>
>  (insn 15 (set (reg/i:DF 33 1)
>(subreg:DF (reg/f:DI 6 6)))
>

Thanks for your suggestion! I will

[PATCH] Compare DECL_NOT_FLEXARRAY for LTO tree merging

2022-12-22 Thread Richard Biener via Gcc-patches
This was missing.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

gcc/lto/
* lto-common.cc (compare_tree_sccs_1): Compare DECL_NOT_FLEXARRAY.
---
 gcc/lto/lto-common.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc
index 125064ba47e..958b417ee79 100644
--- a/gcc/lto/lto-common.cc
+++ b/gcc/lto/lto-common.cc
@@ -1189,6 +1189,7 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map)
  compare_values (DECL_FIELD_ABI_IGNORED);
  compare_values (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD);
  compare_values (DECL_OFFSET_ALIGN);
+ compare_values (DECL_NOT_FLEXARRAY);
}
   else if (code == VAR_DECL)
{
-- 
2.35.3


Re: [PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-22 Thread Uros Bizjak via Gcc-patches
On Thu, Dec 22, 2022 at 5:40 AM Hongtao Liu  wrote:
>
> On Thu, Dec 22, 2022 at 6:46 AM Jakub Jelinek  wrote:
> >
> > On Wed, Dec 21, 2022 at 02:43:43PM -0800, H.J. Lu wrote:
> > > > > > > > >  Target RejectNegative
> > > > > > > > >  Set 80387 floating-point precision to 80-bit.
> > > > > > > > >
> > > > > > > > > +mdaz-ftz
> > > > > > > > > +Target
> > > > > > > >
> > > > > > > > s/Target/Driver/
> > > > > > > Changed to Driver and got an error like: cc1: error: command-line
> > > > > > > option ‘-mdaz-ftz’ is valid for the driver but not for C.
> > > > > > Hi Jakub:
> > > > > >   I didn't find a good solution to handle this error after changing
> > > > > > *Target* to *Driver*, Could you give some hints how to solve this
> > > > > > problem?
> > > > > > Or is it ok for you to mark this as *Target*(there won't be any save
> > > > > > and restore in cfun since there's no variable defined here.)
> > > > >
> > > > > Since all -m* options are passed to cc1, -mdaz-ftz can't be marked
> > > > > as Driver.  We need to give it a different name to mark it as Driver.
> > > >
> > > > It is ok like that.
> > > >
> > > > Jakub
> > > >
> > >
> > > The GCC driver handles -mno-XXX automatically for -mXXX.  Using
> > > a different name requires handling the negation.  Or we can do something
> > > like this to check for CL_DRIVER before passing it to cc1.
> >
> > I meant I'm ok with -m{,no-}daz-ftz option being Target rather than Driver.
> >
> Thanks.
> Uros, is the patch OK for you?

The original patch is then OK.

Thanks,
Uros.