Re: [pushed][PATCH] LoongArch: Improve lasx_xvpermi_q_ insn pattern

2024-01-05 Thread chenglulu
Pushed to r14-6968. On 2024/1/5 at 3:37 PM, Jiahao Xu wrote: For instruction xvpermi.q, unused bits in operands[3] need to be set to 0 to avoid causing undefined behavior on LA464. gcc/ChangeLog: * config/loongarch/lasx.md: Set the unused bits in operand[3] to 0. gcc/testsuite/ChangeLog:

[Committed] RISC-V: Update MAX_SEW for available vsevl info[VSETVL PASS]

2024-01-05 Thread Juzhe-Zhong
This patch fixes a bug in the VSETVL PASS in the following situation: Ignore curr info since prev info available with it: prev_info: VALID (insn 8, bb 2) Demand fields: demand_ratio_and_ge_sew demand_avl SEW=16, VLMUL=mf4, RATIO=64, MAX_SEW=64 TAIL_POLICY=agnostic,

[PATCH v3 8/8] libstdc++: Optimize std::is_unbounded_array compilation performance

2024-01-05 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_unbounded_array by dispatching to the new __is_unbounded_array built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_unbounded_array_v): Use __is_unbounded_array built-in trait. Signed-off-by: Ken Matsui

[PATCH v3 1/8] c++: Implement __is_const built-in trait

2024-01-05 Thread Ken Matsui
This patch implements built-in trait for std::is_const. gcc/cp/ChangeLog: * cp-trait.def: Define __is_const. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

[PATCH v3 6/8] libstdc++: Optimize std::is_pointer compilation performance

2024-01-05 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_pointer by dispatching to the new __is_pointer built-in trait. libstdc++-v3/ChangeLog: * include/bits/cpp_type_traits.h (__is_pointer): Use __is_pointer built-in trait. Optimize its implementation. *

[PATCH v3 5/8] c++: Implement __is_pointer built-in trait

2024-01-05 Thread Ken Matsui
This patch implements built-in trait for std::is_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __is_pointer. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_POINTER. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

[PATCH v3 7/8] c++: Implement __is_unbounded_array built-in trait

2024-01-05 Thread Ken Matsui
This patch implements built-in trait for std::is_unbounded_array. gcc/cp/ChangeLog: * cp-trait.def: Define __is_unbounded_array. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_UNBOUNDED_ARRAY. * semantics.cc (trait_expr_value): Likewise.

[PATCH v3 4/8] libstdc++: Optimize std::is_volatile compilation performance

2024-01-05 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_volatile by dispatching to the new __is_volatile built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_volatile): Use __is_volatile built-in trait. (is_volatile_v): Likewise. Signed-off-by: Ken

[PATCH v3 3/8] c++: Implement __is_volatile built-in trait

2024-01-05 Thread Ken Matsui
This patch implements built-in trait for std::is_volatile. gcc/cp/ChangeLog: * cp-trait.def: Define __is_volatile. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_VOLATILE. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

[PATCH v3 2/8] libstdc++: Optimize std::is_const compilation performance

2024-01-05 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_const by dispatching to the new __is_const built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_const): Use __is_const built-in trait. (is_const_v): Likewise. Signed-off-by: Ken Matsui ---

[PATCH v3 0/8] Optimize more type traits

2024-01-05 Thread Ken Matsui
Changes in v3: - Rebased on top of master. - Fixed __is_pointer in cpp_type_traits.h. Changes in v2: - Removed testsuite_tr1.h includes from the testcases. --- This patch series implements __is_const, __is_volatile, __is_pointer, and __is_unbounded_array built-in traits, which were isolated

Re: [PATCH v2 6/8] libstdc++: Optimize std::is_pointer compilation performance

2024-01-05 Thread Ken Matsui
On Thu, Jan 4, 2024 at 2:13 PM Jonathan Wakely wrote: > > On Sat, 23 Dec 2023 at 22:07, Ken Matsui wrote: > > > > This patch optimizes the compilation performance of std::is_pointer > > by dispatching to the new __is_pointer built-in trait. > > > > libstdc++-v3/ChangeLog: > > > > *

Re: [PATCH v2 6/8] libstdc++: Optimize std::is_pointer compilation performance

2024-01-05 Thread Ken Matsui
On Thu, Jan 4, 2024 at 2:16 PM Patrick Palka wrote: > > On Thu, 4 Jan 2024, Patrick Palka wrote: > > > On Sat, 23 Dec 2023, Ken Matsui wrote: > > > > > This patch optimizes the compilation performance of std::is_pointer > > > by dispatching to the new __is_pointer built-in trait. > > > > > >

[Committed V2] RISC-V: Teach liveness computation loop invariant shift amount

2024-01-05 Thread Juzhe-Zhong
1). We not only have vashl_optab,vashr_optab,vlshr_optab which vectorize shift with vector shift amount, that is, vectorization of 'a[i] >> x[i]', the shift amount is loop variant. 2). But also, we have ashl_optab, ashr_optab, lshr_optab which can vectorize shift with scalar shift amount, that
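The two shift forms this message distinguishes can be written as plain loops. This is only an illustrative sketch; the function names are made up here and do not come from the patch:

```cpp
#include <cassert>
#include <cstddef>

// Shift amount varies per element: 'a[i] >> x[i]' (loop-variant amount).
void shift_by_vector(int* a, const int* x, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    a[i] = a[i] >> x[i];
}

// Shift amount is one scalar for every element: 'a[i] >> x'
// (loop-invariant amount).
void shift_by_scalar(int* a, int x, std::size_t n) {
  assert(x >= 0);  // shifting by a negative amount is undefined
  for (std::size_t i = 0; i < n; ++i)
    a[i] = a[i] >> x;
}
```

In the second loop the shift amount can stay in a scalar register for the whole loop, which is the case the subject line refers to.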

Re: Re: [PATCH] RISC-V: Teach liveness computation loop invariant shift amount[Dynamic LMUL]

2024-01-05 Thread 钟居哲
Thanks Robin. is_gimple_constant makes more sense. Committed after addressing your comments. juzhe.zh...@rivai.ai From: Robin Dapp Date: 2024-01-05 17:54 To: Juzhe-Zhong; gcc-patches CC: rdapp.gcc; kito.cheng; kito.cheng; jeffreyalaw Subject: Re: [PATCH] RISC-V: Teach liveness computation

[Committed V2] RISC-V: Allow simplification non-vlmax with len = NUNITS reg to reg move

2024-01-05 Thread Juzhe-Zhong
V2: Address comments from Robin. While working on fixing a bug, I noticed that the following code has a redundant move: #include "riscv_vector.h" void f (float x, float y, void *out) { float f[4] = { x, x, x, y }; vfloat32m1_t v = __riscv_vle32_v_f32m1 (f, 4); __riscv_vse32_v_f32m1 (out, v, 4); }

Re: [committed] RISC-V: Clean up testsuite for multi-lib testing [NFC]

2024-01-05 Thread 钟居哲
Hi, kito. This patch causes the following regression FAILs: FAIL: gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c (test for excess errors) FAIL: gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c (test for excess errors) FAIL:

[COMMITTED] Regenerate libgomp/configure for copyright year update

2024-01-05 Thread Mark Wielaard
commit a945c346f57ba40fc80c14ac59be0d43624e559d updated libgomp/plugin/configfrag.ac but didn't regenerate/update libgomp/configure which includes that configfrag. libgomp/Changelog: * configure: Regenerate. --- libgomp/configure | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

Re: Re: [committed] RISC-V: Add crypto vector builtin function.

2024-01-05 Thread 钟居哲
Thanks Jeff. Yeah, I agree we are not doing anything terribly wrong, but Palmer requested a revert of the vector-crypto patch, so I reverted it (actually, I asked Li Pan to revert it). Actually, Wang Feng has fixed the issue: https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641903.html It's just a pretty

Repost [PATCH 6/6] PowerPC: Add support for 1,024 bit DMR registers.

2024-01-05 Thread Michael Meissner
This patch is a preliminary patch to add the full 1,024 bit dense math register (DMRs) for -mcpu=future. The MMA 512-bit accumulators map onto the top of the DMR register. This patch only adds the new 1,024 bit register support. It does not add support for any instructions that need 1,024 bit

Repost [PATCH 5/6] PowerPC: Switch to dense math names for all MMA operations.

2024-01-05 Thread Michael Meissner
This patch changes the assembler instruction names for MMA instructions from the original name used in power10 to the new name when used with the dense math system. I.e. xvf64gerpp becomes dmxvf64gerpp. The assembler will emit the same bits for either spelling. The patches have been tested on

Repost [PATCH 4/6] PowerPC: Make MMA insns support DMR registers.

2024-01-05 Thread Michael Meissner
This patch changes the MMA instructions to use either FPR registers (-mcpu=power10) or DMRs (-mcpu=future). In this patch, the existing MMA instruction names are used. A macro (__PPC_DMR__) is defined if the MMA instructions use the DMRs. The patches have been tested on both little and big

Repost [PATCH 3/6] PowerPC: Add support for accumulators in DMR registers.

2024-01-05 Thread Michael Meissner
The MMA subsystem added the notion of accumulator registers as an optional feature of ISA 3.1 (power10). In ISA 3.1, these accumulators overlapped with the traditional floating point registers 0..31, but logically the accumulator registers were separate from the FPR registers. In ISA 3.1, it was

Repost [PATCH 2/6] PowerPC: Make -mcpu=future enable -mblock-ops-vector-pair.

2024-01-05 Thread Michael Meissner
This patch re-enables generating load and store vector pair instructions when doing certain memory copy operations when -mcpu=future is used. During power10 development, it was determined that using store vector pair instructions was problematic in a few cases, so we disabled generating load

Repost [PATCH 1/6] Add -mcpu=future

2024-01-05 Thread Michael Meissner
This patch implements support for a potential future PowerPC cpu. Features added with -mcpu=future, may or may not be added to new PowerPC processors. This patch adds support for the -mcpu=future option. If you use -mcpu=future, the macro __ARCH_PWR_FUTURE__ is defined, and the assembler

Repost [PATCH 0/6] PowerPC Future patches

2024-01-05 Thread Michael Meissner
I posted these patches on October 18th, 2023, and I never received any feedback on the changes. What changes do I need to make with these patches to get them into GCC 14? This patch is very preliminary support for a potential new feature to the PowerPC that extends the current power10 MMA

[PATCH] PR target/112886, Add %S to print_operand for vector pair support

2024-01-05 Thread Michael Meissner
In looking at support for load vector pair and store vector pair for the PowerPC in GCC, I noticed that we were missing a print_operand output modifier if you are dealing with vector pairs to print the 2nd register in the vector pair. If the instruction inside of the asm used the Altivec

[PATCH] Fortran: bogus warnings with REPEAT intrinsic and -Wconversion-extra [PR96724]

2024-01-05 Thread Harald Anlauf
Dear all, the attached patch picks up a submission by Jose that was never reviewed: https://gcc.gnu.org/pipermail/fortran/2020-August/054902.html The original patch was unnecessarily complex, as it could invoke more conversions than necessary. I chose to only convert to the essential - and

[PATCH] c++: reference variable as default targ [PR101463]

2024-01-05 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- Here during default template argument substitution we wrongly consider the (substituted) default arguments v and vt as value-dependent[1] which ultimately leads to deduction failure for the calls. The bogus

Re: [PATCH] libgccjit: Add support for setting the comment ident

2024-01-05 Thread David Malcolm
On Fri, 2024-01-05 at 12:09 -0500, Antoni Boucher wrote: > Hi. > This patch adds support for setting the comment ident (analogous to > #ident "comment" in C). > Thanks for the review. Thanks for the patch. This may sound like a silly question, but what does #ident do and what is it used for?

[patch] omp_target_is_accessible (was: [patch] libgomp.texi: Document omp_display_env)

2024-01-05 Thread Tobias Burnus
Hi all, updated patch attached - which also fixes some additional issues and adds omp_target_is_accessible. On 03.01.24 23:35, Sandra Loosemore wrote: On 1/3/24 11:31, Tobias Burnus wrote: [...] I'm not sure about the usability issues, except I think it's generally easier to change an

[PATCH v2 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-05 Thread Andre Vieira
Respin after comments on first version. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 2f5ca79ed8d..4f164c54740 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -65,8 +65,8 @@ extern void arm_emit_speculation_barrier_function (void);

[PATCH v2 1/2] arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns

2024-01-05 Thread Andre Vieira
Respin of first version to address comments and make it buildable on its own. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index a9c2752c0ea..f0b01b7461f 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -2375,6 +2375,21 @@ extern int making_const_table; else if

[PATCH v2 0/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-05 Thread Andre Vieira
Hi, Resending series version 2 addressing comments on first version, also moved parts of the first patch to the second so it can be built without the second patch. Andre Vieira (2): arm: Add define_attr to to create a mapping between MVE predicated and unpredicated insns arm: Add support

Re: [PATCH] Add a late-combine pass [PR106594]

2024-01-05 Thread Richard Sandiford
Jeff Law writes: > On 10/24/23 12:49, Richard Sandiford wrote: >> This patch adds a combine pass that runs late in the pipeline. >> There are two instances: one between combine and split1, and one >> after postreload. > So have you done any investigation on cases caught by your new pass >

[PATCH] libgccjit: Add support for setting the comment ident

2024-01-05 Thread Antoni Boucher
Hi. This patch adds support for setting the comment ident (analogous to #ident "comment" in C). Thanks for the review. From 1af4e77540001cce8c30e86040c1da785e435810 Mon Sep 17 00:00:00 2001 From: Antoni Boucher Date: Fri, 27 Oct 2023 17:36:03 -0400 Subject: [PATCH] libgccjit: Add support for

[PATCH] c++: address of NTTP object as targ [PR113242]

2024-01-05 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 13? -- >8 -- invalid_tparm_referent_p was rejecting using the address of a class NTTP object as a template argument, but this should be fine. PR c++/113242 gcc/cp/ChangeLog: * pt.cc

[PATCH] aarch64: Rework uxtl->zip optimisation [PR113196]

2024-01-05 Thread Richard Sandiford
g:f26f92b534f9 implemented unsigned extensions using ZIPs rather than UXTL{,2}, since the former has a higher throughput than the latter on many cores. The optimisation worked by lowering directly to ZIP during expand, so that the zero input could be hoisted and shared. However, changing to ZIP

[PATCH] Keep track of the FUNCTION_BEG note

2024-01-05 Thread Richard Sandiford
function.cc emits a NOTE_FUNCTION_BEG after all arguments have been copied to pseudos. It then records this note in parm_birth_insn. Various other pieces of code use this insn as a convenient place to insert things at the start of the function. However, cfgexpand later changes parm_birth_insn as

[pushed] aarch64: Extend VECT_COMPARE_COSTS to !SVE [PR113104]

2024-01-05 Thread Richard Sandiford
When SVE is enabled, we try vectorising with multiple different SVE and Advanced SIMD approaches and use the cost model to pick the best one. Until now, we've not done that for Advanced SIMD, since "the first mode that works should always be the best". The testcase is a counterexample. Each

Re: [RFA] [V3] new pass for sign/zero extension elimination

2024-01-05 Thread Jeff Law
On 1/4/24 13:44, Xi Ruoyao wrote: I have successfully bootstrapped and regtested the patch on loongarch64- linux-gnu. The test cases in the patch (intended for RISC-V) also work on LoongArch per my manual testing. I find myself wondering if we should create some kind of target-supports

Re: [committed] RISC-V: Add crypto vector builtin function.

2024-01-05 Thread Jeff Law
On 1/4/24 20:24, Palmer Dabbelt wrote: On Thu, 04 Jan 2024 19:17:21 PST (-0800), juzhe.zh...@rivai.ai wrote: Hi, Wang Feng. Your patch has some ICEs: FAIL: gcc.target/riscv/rvv/base/zvbc-intrinsic.c (internal compiler error: RTL check: expected code 'const_int', have 'reg' in

Re: [EXTERNAL] Re: Fw: [RFC] Either fix or disable SME feature for `aarch64-w64-mingw32` target?

2024-01-05 Thread Radek Barton
Hello, Andrew. Thank you for your input. I've updated the "fixing" patch according to your feedback. Please let me know if I understood it correctly. Radek From: Andrew Pinski Sent: Thursday, January 4, 2024 8:11 PM To: Radek Barton ; Andrew Pinski (QUIC)

[committed] libstdc++: Avoid overflow when appending to std::filesystem::path

2024-01-05 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8 -- This prevents a std::filesystem::path from exceeding INT_MAX/4 components (which is unlikely to ever be a problem except on 16-bit targets). That limit ensures that the capacity*1.5 calculation doesn't overflow. We should also check that we don't
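The reasoning about the limit can be sketched as follows. This is not the committed libstdc++ code; `max_components` and `next_capacity` are hypothetical names used only to show why capping the size at INT_MAX/4 keeps a 1.5x growth step within range of an int:

```cpp
#include <cassert>
#include <climits>
#include <stdexcept>

// Hypothetical cap on the number of components, mirroring the INT_MAX/4
// limit mentioned in the commit message.
constexpr int max_components = INT_MAX / 4;

// Hypothetical helper: pick a new capacity that holds `needed` components,
// growing the current capacity by a factor of 1.5 where possible.
int next_capacity(int current, int needed) {
  assert(current >= 0 && current <= max_components);
  if (needed > max_components)
    throw std::length_error("too many path components");
  // current <= INT_MAX/4, so current + current/2 cannot overflow an int.
  int grown = current + current / 2;
  if (grown > max_components)
    grown = max_components;
  return grown > needed ? grown : needed;
}
```

Rejecting oversized requests before computing the growth step means the arithmetic itself never needs an overflow check.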

Re: [middle-end PATCH take #2] Only call targetm.truly_noop_truncation for truncations.

2024-01-05 Thread Richard Sandiford
"Roger Sayle" writes: > Very many thanks (and a Happy New Year) to the pre-commit > patch testing folks at linaro.org. Their testing has revealed that > although my patch is clean on x86_64, it triggers some problems > on aarch64 and arm. The issue (with the previous version of my > patch) is

[committed] libstdc++: Do not use __is_convertible unconditionally [PR113241]

2024-01-05 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk, backport to gcc-13 needed too. -- >8 -- The new __is_convertible built-in should only be used after checking that it's supported. libstdc++-v3/ChangeLog: PR libstdc++/113241 * include/std/type_traits (is_convertible_v): Guard use of
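A guarded use of a built-in trait might look like the sketch below. It is an illustration only, not the libstdc++ change: the `sketch` namespace is made up, and the fallback to `std::is_convertible` is an assumption about what a guard could fall back to.

```cpp
#include <type_traits>

// Compilers without __has_builtin are treated as having no built-in traits.
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

namespace sketch {
#if __has_builtin(__is_convertible)
// Built-in available: ask the compiler directly.
template <typename From, typename To>
constexpr bool is_convertible_v = __is_convertible(From, To);
#else
// Built-in unavailable: fall back to the class-template trait.
template <typename From, typename To>
constexpr bool is_convertible_v = std::is_convertible<From, To>::value;
#endif
}  // namespace sketch

static_assert(sketch::is_convertible_v<int, long>, "int converts to long");
static_assert(!sketch::is_convertible_v<void*, int>,
              "void* does not convert to int");
```

Either branch gives the same answers, so code using the variable template does not need to know which one was selected.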

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 20:45 +0800, chenglulu wrote: > > On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote: > > On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote: > > > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote: > > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote: > > > > > On Fri, 2024-01-05 at 11:40 +0800, Lulu

Re: [PATCH] libstdc++: Fix testsuite with -Wformat

2024-01-05 Thread Jonathan Wakely
On 06/12/23 15:34 +0100, Gwenole Beauchesne wrote: Tested on x86_64-pc-linux-gnu with --enable-languages=c,c++ and additional -Wformat to CXXFLAGS. Please CC the libstd...@gcc.gnu.org list on all libstdc++ patches, as documented at https://gcc.gnu.org/lists.html Otherwise I won't see the

Re: [PATCH] RISC-V: Allow simplification non-vlmax with len = NUNITS reg to reg move

2024-01-05 Thread Robin Dapp
> +/* Return true if it is a whole register-register move. */ > +bool > +whole_reg_to_reg_move_p (rtx *ops, machine_mode mode) > +{ > + if (register_operand (ops[0], mode) > + && register_operand (ops[3], mode) > + && satisfies_constraint_vu (ops[2]) > + && satisfies_constraint_Wc1

RE: [PATCH]middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

2024-01-05 Thread Tamar Christina
> On Fri, 2024-01-05 at 11:02 +, Tamar Christina wrote: > > Ok, so something like: > > > > > > ([istarget loongarch*-*-*] && > > > > ([check_effective_target_loongarch_sx] || > > > > [check_effective_target_hard_float])) > > ? > > We don't need "[check_effective_target_loongarch_sx] ||"

Re: [PATCH 1/4] LoongArch: Handle ISA evolution switches along with other options

2024-01-05 Thread Yang Yujie
On Fri, Jan 05, 2024 at 08:12:08PM +0800, Xi Ruoyao wrote: > On Fri, 2024-01-05 at 14:55 +0800, Yang Yujie wrote: > > +#define ISA_HAS_FRECIPE \ > > +  (la_target.isa.evolution & OPTION_MASK_ISA_FRECIPE) > > +#define ISA_HAS_DIV32 \ > > +  (la_target.isa.evolution & OPTION_MASK_ISA_DIV32) > >

Re: Patch: Remove unneeded double operation in libstdc++-v3/src/c++17/fs_path.cc

2024-01-05 Thread Jonathan Wakely
On Fri, 5 Jan 2024 at 13:00, Martin Küttler wrote: > > > >>This is a small change to libstdc++ which does not change any behavior. > > > > Please CC the libstd...@gcc.gnu.org list on all libstdc++ patches, as > > documented at https://gcc.gnu.org/lists.html > > Acknowledged. Sorry. > > >>This

Re: Patch: Remove unneeded double operation in libstdc++-v3/src/c++17/fs_path.cc

2024-01-05 Thread Martin Küttler
>>This is a small change to libstdc++ which does not change any behavior. > > Please CC the libstd...@gcc.gnu.org list on all libstdc++ patches, as > documented at https://gcc.gnu.org/lists.html Acknowledged. Sorry. >>This change has two, imho positive, implications: >> >> - The implicit

[PATCH] LoongArch: Implement option save/restore

2024-01-05 Thread Yang Yujie
LTO option streaming and target attributes both require per-function target configuration, which is achieved via option save/restore. We implement TARGET_OPTION_{SAVE,RESTORE} to switch the la_target context in addition to other automatically maintained option states (via the "Save" option

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread chenglulu
On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote: On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote: On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote: On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote: On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote: bool loongarch_explicit_relocs_p (enum loongarch_symbol_type

Re: Patch: Remove unneeded double operation in libstdc++-v3/src/c++17/fs_path.cc

2024-01-05 Thread Jonathan Wakely
On 18/12/23 09:36 +0100, Martin Küttler wrote: This is a small change to libstdc++ which does not change any behavior. Please CC the libstd...@gcc.gnu.org list on all libstdc++ patches, as documented at https://gcc.gnu.org/lists.html Otherwise I won't see the patches unless I happen to glance

Re: [pushed][PATCH] LoongArch: Fixed the problem of incorrect judgment of the immediate field of the [x]vld/[x]vst instruction.

2024-01-05 Thread chenglulu
Pushed to r14-6955. On 2024/1/4 at 10:37 AM, Lulu Cheng wrote: The [x]vld/[x]vst directive is defined as follows: [x]vld/[x]vst {x/v}d, rj, si12 When not modified, the immediate field of [x]vld/[x]vst is between 10 and 14 bits depending on the type. However, in loongarch_valid_offset_p, the

Re: [pushed][PATCH v3] LoongArch: testsuite:Added support for vector object detection.

2024-01-05 Thread chenglulu
Pushed to r14-6954. On 2024/1/5 at 2:05 PM, chenxiaolong wrote: - Change the default vectorization "-mlasx" option to "-mlsx" because there are many non-aligned memory accesses when using 256-bit vectorization. - The following detection procedure is added to the target-supports.exp file:

Re: [pushed][PATCH v2 0/7] LoongArch:Enable testing for common

2024-01-05 Thread chenglulu
Pushed 2-7 to r14-6955...r14-6961. On 2024/1/5 at 11:43 AM, chenxiaolong wrote: v1->v2: On the basis of v1, the reason of the analysis problem is described in detail. When using binutils, which does not support vectorization, and the gcc compiler toolchain, which does support vectorization, the

Re: [PATCH] aarch64: Further fix for throwing insns in ldp/stp pass [PR113217]

2024-01-05 Thread Richard Sandiford
Alex Coplan writes: > As the PR shows, the fix in > r14-6916-g057dc349021660c40699fb5c98fd9cac8e168653 was not complete. > That fix was enough to stop us trying to move throwing accesses above > nondebug insns, but due to this code in try_fuse_pair: > > // Placement strategy: push loads down

Re: [PATCH 1/4] LoongArch: Handle ISA evolution switches along with other options

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 14:55 +0800, Yang Yujie wrote: > +#define ISA_HAS_FRECIPE \ > +  (la_target.isa.evolution & OPTION_MASK_ISA_FRECIPE) > +#define ISA_HAS_DIV32 \ > +  (la_target.isa.evolution & OPTION_MASK_ISA_DIV32) > +#define ISA_HAS_LAM_BH \ > +  (la_target.isa.evolution &

Re: [PATCH v3 3/3] aarch64: Add explicit checks for implicit LSE/LSE2 requirements.

2024-01-05 Thread Richard Sandiford
Richard Sandiford writes: > Victor Do Nascimento writes: >> At present, evaluation of both `has_lse2(hwcap)' and >> `has_lse128(hwcap)' may require issuing an `mrs' instruction to query >> a system register. This instruction, when issued from user-space >> results in a trap by the kernel which

Re: [PATCH v3 3/3] aarch64: Add explicit checks for implicit LSE/LSE2 requirements.

2024-01-05 Thread Richard Sandiford
Victor Do Nascimento writes: > At present, evaluation of both `has_lse2(hwcap)' and > `has_lse128(hwcap)' may require issuing an `mrs' instruction to query > a system register. This instruction, when issued from user-space > results in a trap by the kernel which then returns the value read in >

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote: > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote: > > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote: > > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote: > > > > bool > > > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type) > >

[PATCH] aarch64: Further fix for throwing insns in ldp/stp pass [PR113217]

2024-01-05 Thread Alex Coplan
As the PR shows, the fix in r14-6916-g057dc349021660c40699fb5c98fd9cac8e168653 was not complete. That fix was enough to stop us trying to move throwing accesses above nondebug insns, but due to this code in try_fuse_pair: // Placement strategy: push loads down and pull stores up, this should

Re: [PATCH]middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 11:02 +, Tamar Christina wrote: > Ok, so something like: > > > > ([istarget loongarch*-*-*] && > > > ([check_effective_target_loongarch_sx] ||  > > > [check_effective_target_hard_float])) > ? We don't need "[check_effective_target_loongarch_sx] ||" because SIMD requires

Re: [PATCH v3 2/3] libatomic: Enable LSE128 128-bit atomics for armv9.4-a

2024-01-05 Thread Richard Sandiford
Victor Do Nascimento writes: > The armv9.4-a architectural revision adds three new atomic operations > associated with the LSE128 feature: > > * LDCLRP - Atomic AND NOT (bitclear) of a location with 128-bit > value held in a pair of registers, with original data loaded into > the same 2

Re: [PATCH v3 1/3] libatomic: atomic_16.S: Improve ENTRY, END and ALIAS macro interface

2024-01-05 Thread Richard Sandiford
Victor Do Nascimento writes: > The introduction of further architectural-feature dependent ifuncs > for AArch64 makes hard-coding ifunc `_i' suffixes to functions > cumbersome to work with. It is awkward to remember which ifunc maps > onto which arch feature and makes the code harder to maintain

RE: [PATCH]middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

2024-01-05 Thread Tamar Christina
> -Original Message- > From: Xi Ruoyao > Sent: Thursday, January 4, 2024 10:39 PM > To: Palmer Dabbelt ; Tamar Christina > > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; Jeff Law > > Subject: Re: [PATCH]middle-end: Don't apply copysign optimization if target > does > not

Re: [PATCH v3] AArch64: Cleanup memset expansion

2024-01-05 Thread Richard Sandiford
Wilco Dijkstra writes: > v3: rebased to latest trunk > > Cleanup memset implementation. Similar to memcpy/memmove, use an offset and > bytes throughout. Simplify the complex calculations when optimizing for size > by using a fixed limit. > > Passes regress & bootstrap. > > gcc/ChangeLog: >

[committed] libstdc++: Fix std::char_traits::move [PR113200]

2024-01-05 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. Backports needed too. -- >8 -- The current constexpr implementation of std::char_traits::move relies on being able to compare the pointer parameters, which is not allowed for unrelated pointers. We can use __builtin_constant_p to determine whether it's safe

Ping #2 [patch,avr] PR112944: Support .rodata in RAM for AVR64* and AVR128* devices

2024-01-05 Thread Georg-Johann Lay
Ping #2 https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640140.html FYI, Binutils https://sourceware.org/PR31124 is upstream. Johann This is a patch that locates .rodata in flash for some AVR devices that can support it. All new functionality depends on Binutils PR31124 and is

[committed] libstdc++: Remove UB from month and weekday additions and subtractions.

2024-01-05 Thread Jonathan Wakely
From: Cassio Neri Tested x86_64-linux. Pushed to trunk. This seems suitable for backporting too, at least to gcc-13. -- >8 -- The following invoke signed integer overflow (UB) [1]: month + months{MAX} // where MAX is the maximum value of months::rep month + months{MIN} // where MIN
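One way to avoid this kind of overflow is to reduce the offset modulo 12 before adding it. The sketch below illustrates that idea only; `add_months` is a hypothetical helper operating on plain integers, not the libstdc++ implementation or the technique used in the commit:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: add a signed offset of any magnitude to a month
// number in [1, 12] and return the resulting month number in [1, 12].
unsigned add_months(unsigned month, std::int64_t offset) {
  assert(month >= 1 && month <= 12);
  // Reduce the offset first; the result of % is in [-11, 11], so nothing
  // below can overflow even for the minimum or maximum offset.
  std::int64_t r = offset % 12;
  if (r < 0)
    r += 12;  // now in [0, 11]
  // (month - 1) is in [0, 11], so the sum is in [0, 22].
  std::int64_t zero_based = (static_cast<std::int64_t>(month) - 1 + r) % 12;
  return static_cast<unsigned>(zero_based) + 1;
}
```

Because the reduction happens before the addition, the extreme offsets named in the message are handled the same way as small ones.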

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote: > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote: > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote: > > > bool > > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type) > > > { > > > + /* Instructions pcalau12i, addi.d, lu32i.d and

[committed] libstdc++: Use if-constexpr in std::__try_use_facet [PR113099]

2024-01-05 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8 -- As noted in the PR, we can use if-constexpr for the explicit instantiation definitions that are compiled with -std=gnu++11. We just need to disable the -Wc++17-extensions diagnostics. libstdc++-v3/ChangeLog: PR libstdc++/113099

Re: Ping^3: [PATCH] Add a late-combine pass [PR106594]

2024-01-05 Thread YunQiang Su
I have tested this patch on mips64el: No regression.

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread chenglulu
On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote: On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote: bool loongarch_explicit_relocs_p (enum loongarch_symbol_type type) { + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent + so that the linker can infer the PC of pcalau12i

Re: [PATCH] RISC-V: Teach liveness computation loop invariant shift amount[Dynamic LMUL]

2024-01-05 Thread Robin Dapp
> 1). We not only have vashl_optab,vashr_optab,vlshr_optab which vectorize > shift with vector shift amount, > that is, vectorization of 'a[i] >> x[i]', the shift amount is loop variant. > 2). But also, we have ashl_optab, ashr_optab, lshr_optab which can vectorize > shift with scalar shift

Re: [PATCH] RISC-V: Fix avl-type operand index error for ZVBC

2024-01-05 Thread juzhe.zh...@rivai.ai
LGTM. juzhe.zh...@rivai.ai From: Feng Wang Date: 2024-01-05 17:23 To: gcc-patches CC: kito.cheng; jeffreyalaw; juzhe.zhong; Feng Wang Subject: [PATCH] RISC-V: Fix avl-type operand index error for ZVBC This patch fixes the rtl-checking error for crypto vector. The root cause is the avl-type

[PATCH] RISC-V: Fix avl-type operand index error for ZVBC

2024-01-05 Thread Feng Wang
This patch fixes the rtl-checking error for crypto vector. The root cause is that the avl-type index of the zvbc instructions is wrong; it should be operand[8], not operand[5]. gcc/ChangeLog: * config/riscv/vector.md: Modify avl_type operand index of zvbc ins. --- gcc/config/riscv/vector.md | 4 ++-- 1 file

Re: Re: [PATCH v7 1/2] RISC-V: Add crypto vector builtin function.

2024-01-05 Thread Feng Wang
2024-01-05 16:55 juzhe.zhong wrote: >--- a/gcc/config/riscv/vector.md >+++ b/gcc/config/riscv/vector.md >@@ -864,9 +864,9 @@ >  vnclip,vicmp,vfalu,vfmul,vfminmax,vfdiv,vfwalu,vfwmul,\ >  vfsgnj,vfcmp,vslideup,vslidedown,vislide1up,\ > 

Re: [PATCH v7 1/2] RISC-V: Add crypto vector builtin function.

2024-01-05 Thread juzhe.zh...@rivai.ai
--- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -864,9 +864,9 @@ vnclip,vicmp,vfalu,vfmul,vfminmax,vfdiv,vfwalu,vfwmul,\ vfsgnj,vfcmp,vslideup,vslidedown,vislide1up,\ vislide1down,vfslide1up,vfslide1down,vgather,viwmuladd,vfwmuladd,\ -

[PATCH v7 1/2] RISC-V: Add crypto vector builtin function.

2024-01-05 Thread Feng Wang
Patch v7:Fix avl_type operand index of zvbc ins. Patch v6:Remove unused code. Patch v5:Rebase. Patch v4:Merge crypto vector function.def into vector. Patch v3:Define a shape for vaesz and merge vector-crypto-types.def into riscv-vector-builtins-types.def. Patch v2:Optimize function_shape

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread chenglulu
On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote: On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote: bool loongarch_explicit_relocs_p (enum loongarch_symbol_type type) { + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent + so that the linker can infer the PC of pcalau12i

[committed] RISC-V: Clean up unused variable [NFC]

2024-01-05 Thread Kito Cheng
gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_load_store): Remove `value`. (expand_cond_len_op): Ditto. (expand_gather_scatter): Ditto. (expand_lanes_load_store): Ditto. (expand_fold_extract_last): Ditto. --- gcc/config/riscv/riscv-v.cc | 5

[committed] RISC-V: Clean up testsuite for multi-lib testing [NFC]

2024-01-05 Thread Kito Cheng
- Drop unnecessary including for stdlib.h and math.h - Drop assert.h / assert, use __builtin_abort instead. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/shift-scalar-template.h: Use __builtin_abort instead of assert. *

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote: > bool > loongarch_explicit_relocs_p (enum loongarch_symbol_type type) > { > + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent > + so that the linker can infer the PC of pcalau12i to apply relocations > + to