[PATCH][v2] Fold GATHER_SCATTER_*_P into vect_memory_access_type

2025-08-12 Thread Richard Biener
The following splits up VMAT_GATHER_SCATTER into VMAT_GATHER_SCATTER_LEGACY, VMAT_GATHER_SCATTER_IFN and VMAT_GATHER_SCATTER_EMULATED. The main motivation is to reduce the uses of (full) gs_info, but it also makes the kind representable by a single entry rather than the ifn and decl tristate. The

Re: [PATCH v3] x86-64: Remove redundant TLS calls

2025-08-12 Thread Hongtao Liu
On Tue, Aug 12, 2025 at 10:02 PM H.J. Lu wrote: > > On Tue, Aug 12, 2025 at 06:47:54AM -0700, H.J. Lu wrote: > > On Mon, Aug 11, 2025 at 11:13 PM Hongtao Liu wrote: > > > > > > On Mon, Aug 4, 2025 at 11:33 PM H.J. Lu wrote: > > > > > > > > On Mon, Aug 04, 2025 at 02:57:39PM +0800, Hongtao Liu wr

Re: [PATCH] Use x86 GFNI for vectorized constant byte shifts/rotates

2025-08-12 Thread Hongtao Liu
On Tue, Aug 5, 2025 at 8:49 AM Andi Kleen wrote: > > From: Andi Kleen > > The GFNI AVX gf2p8affineqb instruction can be used to implement > vectorized byte shifts or rotates. This patch uses them to implement > shift and rotate patterns to allow the vectorizer to use them. > Previously AVX couldn

Re: [PATCH] Use x86 GFNI for vectorized constant byte shifts/rotates

2025-08-12 Thread Andi Kleen
> > It might be reasonable to tweak the costs per CPU however, I haven't > > done that. > > > > BTW for rotate the wins are much higher because there are no native > > instructions for it. > For ashl/lshr, the original implementation only takes 2 > instructions(vpsllw/vpsrlw + vpand), and for ashr
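
For readers following the cost discussion, here is a scalar sketch of the "wide shift plus mask" idea behind the vpsllw + vpand sequence mentioned above. It models one 16-bit lane holding two bytes. The function name is hypothetical and this is not the code GCC emits, only an illustration of why two operations suffice for a logical left shift of bytes.

```cpp
#include <cstdint>

// Hypothetical scalar model of one 16-bit lane: shift both bytes left by n
// (0 <= n <= 7) using a single 16-bit shift followed by a mask.
std::uint16_t shl_bytes_in_word(std::uint16_t w, unsigned n)
{
    // Step 1: shift the whole 16-bit lane (the role played by vpsllw).
    std::uint16_t shifted = static_cast<std::uint16_t>(w << n);
    // Step 2: clear the bits that crossed from the low byte into the
    // high byte (the role played by vpand with a per-byte mask).
    std::uint16_t byte_mask = static_cast<std::uint16_t>((0xFFu << n) & 0xFFu);
    std::uint16_t lane_mask = static_cast<std::uint16_t>(byte_mask * 0x0101u);
    return static_cast<std::uint16_t>(shifted & lane_mask);
}
```

The same masking trick does not carry over to rotates, which is consistent with the remark that rotates gain more from a single-instruction implementation.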

Re: [PATCH 0/7 v3] Add Xandes vender extension support.

2025-08-12 Thread Jeff Law
On 8/12/25 4:09 AM, Kito Cheng wrote: This patchset LGTM except 4/7, you can go ahead to commit 1/7~3/7 if you are OK with that :) Note the CI testing is flagging errors on xandesperf-2.c. So Kuan-Lin needs to sort that out and post a v4 assuming something needs adjusting: https://github.

Re:[pushed] [PATCH] LoongArch: Define hook TARGET_COMPUTE_PRESSURE_CLASSES[PR120476].

2025-08-12 Thread Lulu Cheng
Pushed to r16-3177 r15-10223 and r14-11951. On 2025/8/12 at 2:44 PM, Lulu Cheng wrote: The rtx cost value defined by the target backend affects the calculation of register pressure classes in the IRA, thus affecting scheduling. This may cause program performance degradation. For example, OpenSSL 3.5.1

Re: [PATCH v3 5/5] forwprop: Copy prop aggregates into args

2025-08-12 Thread Andrew Pinski
On Thu, Aug 7, 2025 at 4:36 AM Richard Biener wrote: > > On Wed, Aug 6, 2025 at 7:34 PM Andrew Pinski wrote: > > > > This implements the simple copy prop of aggregates into > > arguments of function calls. This can reduce the number of copies > > done. Just like removing of an extra copy in gener

Re: [PATCH] Use x86 GFNI for vectorized constant byte shifts/rotates

2025-08-12 Thread Hongtao Liu
On Wed, Aug 13, 2025 at 1:40 AM Andi Kleen wrote: > > > > > The latter takes 5 cycles, the former takes 3 cycles. > > It's pipelined however. > > > > > Do you have any microbenchmark or real workloads to show your > > optimization is better? > > Keep in mind it only uses one port vs two. > > Yes I

[committed] cobol: Implement faster zoned decimal to binary conversion.

2025-08-12 Thread Robert Dubner
From: Robert Dubner Date: Tue, 12 Aug 2025 22:13:59 -0400 Subject: [PATCH] cobol: Implement faster zoned decimal to binary conversion. Replace " value *= 10; value += digit" routines with a new one that does two digits at a time and avoids __int128 calculations until they are necessary. These ch
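
A minimal sketch of the two-digits-at-a-time idea described above, under stated assumptions: the helper name is hypothetical, sign handling is omitted, and each byte is assumed to carry its digit in the low nibble (true for both ASCII and EBCDIC zoned digits). It is not the routine from the patch. It accumulates in 64 bits for as long as that is safe and only widens to 128 bits when more digits remain.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not the libgcobol routine: convert `len` unsigned
// zoned-decimal bytes to binary. unsigned __int128 is a GCC/Clang extension.
unsigned __int128 zoned_to_binary(const unsigned char *p, std::size_t len)
{
    // Up to 18 decimal digits always fit in 64 bits (10^18 < 2^64).
    std::size_t narrow = len < 18 ? len : 18;
    std::uint64_t lo = 0;
    std::size_t i = 0;
    if (narrow & 1)                     // odd count: consume one digit first
        lo = p[i++] & 0x0F;
    for (; i < narrow; i += 2)          // then two digits per multiply
        lo = lo * 100 + (p[i] & 0x0F) * 10 + (p[i + 1] & 0x0F);

    unsigned __int128 value = lo;       // widen only if digits remain
    for (; i < len; ++i)
        value = value * 10 + (p[i] & 0x0F);
    return value;
}
```

Each pass of the paired loop replaces two multiply-by-10 steps with one multiply-by-100, and short fields never touch the 128-bit path.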

[pushed: r16-3172] testsuite: fix jit.dg/test-error-impossible-must-tail-call.c [PR119783]

2025-08-12 Thread David Malcolm
I added this test back in r7-934-g15c671a79ca66d, but it looks like r15-2125-g81824596361cf4 changed the error message. Tested on x86_64-pc-linux-gnu. Pushed to trunk as r16-3172-gf622df9af2e7c1. gcc/testsuite/ChangeLog: PR testsuite/119783 jit.dg/test-error-impossible-must-tail-c

[pushed: r16-3171] jit: don't use &vect[0] in libgccjit++.h [PR121516]

2025-08-12 Thread David Malcolm
Tested on x86_64-pc-linux-gnu; fixes jit.dg/test-asm.cc. Pushed to trunk as r16-3171-gd6d1fa0039e68e. gcc/jit/ChangeLog: PR jit/121516 * libgccjit++.h (context::new_struct_type): Replace use of &fields[0] with fields.data (). (context::new_function): Likewise for pa
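
As background for the ChangeLog entry above, a small sketch of why fields.data () is the safer spelling. The names below are hypothetical stand-ins, not the libgccjit API. For an empty std::vector, &v[0] indexes a nonexistent element, which is undefined behavior and is caught by hardened standard-library builds, whereas v.data() is always valid to call.

```cpp
#include <vector>

// Hypothetical C-style callee that takes a count and a pointer, in the
// style of a C API wrapped by a C++ header.
int sum_ints(int n, const int *elems)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += elems[i];
    return total;
}

// Hypothetical wrapper: pass the vector's storage to the C-style callee.
int sum_vector(const std::vector<int> &v)
{
    // v.data() is well-defined even when v is empty; &v[0] is not.
    return sum_ints(static_cast<int>(v.size()), v.data());
}
```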

Re: [PATCH] c++: Update DECL_TLS_MODEL after processing a TLS variable

2025-08-12 Thread H.J. Lu
On Tue, Aug 12, 2025 at 4:19 PM Jason Merrill wrote: > > On 8/1/25 4:56 AM, H.J. Lu wrote: > > Set a tentative TLS model in grokvardecl and update DECL_TLS_MODEL with > > the default TLS access model after a TLS variable has been fully processed > > if the default TLS access model is stronger. > >

RE: [PATCH v2] x86: Convert integer constant to mode of move

2025-08-12 Thread Liu, Hongtao
> -Original Message- > From: H.J. Lu > Sent: Tuesday, August 12, 2025 8:19 PM > To: gcc-patches@gcc.gnu.org > Cc: ubiz...@gmail.com; Liu, Hongtao > Subject: [PATCH v2] x86: Convert integer constant to mode of move > > For > > (set (reg/v:DI 106 [ k ]) > (const_int 30 [0x

Re: [PATCH] c++: Update DECL_TLS_MODEL after processing a TLS variable

2025-08-12 Thread Jason Merrill
On 8/1/25 4:56 AM, H.J. Lu wrote: Set a tentative TLS model in grokvardecl and update DECL_TLS_MODEL with the default TLS access model after a TLS variable has been fully processed if the default TLS access model is stronger. gcc/cp/ PR c++/107393 * decl.cc (grokvardecl): Add a

[PATCH v8 4/6] btf: generate and output DECL_TAG and TYPE_TAG records

2025-08-12 Thread David Faust
Support the btf_decl_tag and btf_type_tag attributes in BTF by creating and emitting BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records, respectively, for them. Some care is required when -gprune-btf is in effect to avoid emitting decl or type tags for declarations or types which have been pruned and

[PATCH v8 6/6] bpf: add tests for CO-RE and BTF tag interaction

2025-08-12 Thread David Faust
Add a couple of tests to ensure that BTF type/decl tags do not interfere with generation of BPF CO-RE relocations. gcc/testsuite/ * gcc.target/bpf/core-btf-tag-1.c: New test. * gcc.target/bpf/core-btf-tag-2.c: New test. --- gcc/testsuite/gcc.target/bpf/core-btf-tag-1.c | 23 ++

[PATCH v8 2/6] dwarf: create annotation DIEs for btf tags

2025-08-12 Thread David Faust
The btf_decl_tag and btf_type_tag attributes provide a means to annotate declarations and types respectively with arbitrary user provided strings. These strings are recorded in debug information for post-compilation uses, and despite the name they are meant to be recorded in DWARF as well as BTF.

[PATCH v8 5/6] doc: document btf_type_tag and btf_decl_tag attributes

2025-08-12 Thread David Faust
gcc/ * doc/extend.texi (Common Function Attributes) (Common Variable Attributes): Document btf_decl_tag attribute. (Common Type Attributes): Document btf_type_tag attribute. --- gcc/doc/extend.texi | 79 + 1 file changed, 79 inser

[PATCH v8 3/6] ctf: translate annotation DIEs to internal ctf

2025-08-12 Thread David Faust
Translate DW_TAG_GNU_annotation DIEs created for C attributes btf_decl_tag and btf_type_tag into an in-memory representation in the CTF/BTF container. They will be output in BTF as BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records. The new CTF kinds used to represent these annotations, CTF_K_DECL_T

[PATCH v8 0/6] c, dwarf, btf: Add btf_decl_tag and btf_type_tag C attributes

2025-08-12 Thread David Faust
[v7: https://gcc.gnu.org/pipermail/gcc-patches/2025-August/692127.html Review Status: - Patch 2,3 have been OK'd in v6 and are unchanged since then. - Patch 4 was OK'd with nits in v6, fixed in and unchanged from v7. - Patch 6 adds two small BPF-specific sanity checks that I consider obvious

[PATCH v8 1/6] c-family: add btf_type_tag and btf_decl_tag attributes

2025-08-12 Thread David Faust
Add two new c-family attributes, "btf_type_tag" and "btf_decl_tag" along with attribute handlers for them. These attributes may be used to annotate types or declarations respectively with arbitrary strings, which will be recorded in DWARF and/or BTF information. Both attributes accept exactly one

[PATCHv5] libstdc++: Add generate_canonical impl (P0952, LWG2524) [PR119739]

2025-08-12 Thread Nathan Myers
Changes in v5: * Static-assert movable RNG object correctly. * Add a more comprehensive test gencanon.cc Changes in v4: * Static-assert that arg is floating-point, coercible from bigger unsigned. * Static-assert that arg satisfies uniform_random_bit_generator, movable. * #include uniform_int_

Re: [PATCH 0/7 v3] Add Xandes vender extension support.

2025-08-12 Thread Palmer Dabbelt
On Tue, 12 Aug 2025 01:18:35 PDT (-0700), ru...@andestech.com wrote: Changes since v2: [PATCH 1/7] Moved andes test cases to subdir gcc.target/riscv/xandes. [PATCH 2/7] Moved andes test cases to subdir gcc.target/riscv/xandes. Replaced "equality_operator" with "any_eq". Added code_attr "

Re: [PATCH] RISC-V: Expand const_vector with 2 elts per pattern.

2025-08-12 Thread Jeff Law
On 8/4/25 3:09 AM, Robin Dapp wrote: Hi, In PR121334 we are asked to expand a const_vector of size 4 with poly_int elements.  It has 2 elts per pattern so is neither a const_vector_duplicate nor a const_vector_stepped. We don't allow this kind of constant in legitimate_constant_p but expr ap

Re: [PATCH] csky: use quotes when referring to cpus and archs [PR90160]

2025-08-12 Thread Jeff Law
On 8/12/25 2:56 PM, Andrew Pinski wrote: On Tue, Nov 26, 2024 at 12:54 PM Jeff Law wrote: On 11/26/24 11:43 AM, Florian Weimer wrote: * Jeff Law: On 11/26/24 9:06 AM, David Malcolm wrote: OK for trunk? (caveat: not properly tested) gcc/ChangeLog: PR translation/90160 * conf

[to-be-committed][RISC-V][PR target/121113] Handle HFmode in various insn reservations

2025-08-12 Thread Jeff Law
So this is a minor bug in a few DFA descriptions such as the Xiangshan and a couple of the SiFive descriptions. While Xiangshan covers every insn type, some of the reservations check the mode of the operation. Concretely the fdiv/fsqrt unit reservations vary based on the mode. They handled

Re: [PATCH] csky: use quotes when referring to cpus and archs [PR90160]

2025-08-12 Thread Palmer Dabbelt
On Tue, 12 Aug 2025 13:56:09 PDT (-0700), pins...@gmail.com wrote: On Tue, Nov 26, 2024 at 12:54 PM Jeff Law wrote: On 11/26/24 11:43 AM, Florian Weimer wrote: > * Jeff Law: > >> On 11/26/24 9:06 AM, David Malcolm wrote: >>> OK for trunk? (caveat: not properly tested) >>> gcc/ChangeLog: >>>

Re: [PATCH] csky: use quotes when referring to cpus and archs [PR90160]

2025-08-12 Thread Andrew Pinski
On Tue, Nov 26, 2024 at 12:54 PM Jeff Law wrote: > > > > On 11/26/24 11:43 AM, Florian Weimer wrote: > > * Jeff Law: > > > >> On 11/26/24 9:06 AM, David Malcolm wrote: > >>> OK for trunk? (caveat: not properly tested) > >>> gcc/ChangeLog: > >>> PR translation/90160 > >>> * config/csky/csk

Re: [Patch, fortran] PR89092 - Host-associated generic used instead of use-associated TBP in call

2025-08-12 Thread Jerry D
On 8/12/25 10:11 AM, Paul Richard Thomas wrote: The attached patch is utterly trivial. The only useful attribute that FooPrivate possesses  to detect that it has been declared is 'subroutine'. This was missed in the attribute test in resolve.cc(was_declared). Adding it fixes the problem and reg

Re: [PATCH] Use x86 GFNI for vectorized constant byte shifts/rotates

2025-08-12 Thread Andi Kleen
> > The latter takes 5 cycles, the former takes 3 cycles. It's pipelined however. > > Do you have any microbenchmark or real workloads to show your > optimization is better? Keep in mind it only uses one port vs two. Yes I ran it on Arrow lake and saw wins on both Pcore and Ecore according to

[Patch, fortran] PR89092 - Host-associated generic used instead of use-associated TBP in call

2025-08-12 Thread Paul Richard Thomas
The attached patch is utterly trivial. The only useful attribute that FooPrivate possesses to detect that it has been declared is 'subroutine'. This was missed in the attribute test in resolve.cc(was_declared). Adding it fixes the problem and regtests on FC42/x86_64. OK for mainline and some judi

[PATCH] arm_mve: Use inline asm for lsll and asrl MVE primitives

2025-08-12 Thread Keith Packard
The C shift operators do not precisely match the associated ARM instructions: shifts of negative values or by negative amounts are undefined behavior in C, and GCC may substitute alternate instruction sequences when it can determine that the application is using UB. Replace C shift operators with

[PATCH v6] c++: P2036R3 - Change scope of lambda trailing-return-type [PR102610]

2025-08-12 Thread Marek Polacek
On Sun, Aug 10, 2025 at 02:20:22PM -0700, Jason Merrill wrote: > On 8/8/25 11:37 AM, Marek Polacek wrote: > > On Tue, Aug 05, 2025 at 02:54:01PM -0700, Jason Merrill wrote: > > > On 8/4/25 4:53 PM, Marek Polacek wrote: > > > > > > Now that even dummy lambdas have an operator(), I had to tweak > > >

Re: [PATCH] RISC-V: Expand const_vector with 2 elts per pattern.

2025-08-12 Thread Palmer Dabbelt
On Sun, 10 Aug 2025 07:29:25 PDT (-0700), jeffreya...@gmail.com wrote: On 8/4/25 3:09 AM, Robin Dapp wrote: Hi, In PR121334 we are asked to expand a const_vector of size 4 with poly_int elements.  It has 2 elts per pattern so is neither a const_vector_duplicate nor a const_vector_stepped. We

Re: [PATCH v3] x86-64: Remove redundant TLS calls

2025-08-12 Thread H.J. Lu
On Tue, Aug 12, 2025 at 06:47:54AM -0700, H.J. Lu wrote: > On Mon, Aug 11, 2025 at 11:13 PM Hongtao Liu wrote: > > > > On Mon, Aug 4, 2025 at 11:33 PM H.J. Lu wrote: > > > > > > On Mon, Aug 04, 2025 at 02:57:39PM +0800, Hongtao Liu wrote: > > > > > > > > + rtx_insn *before = nullptr; > > > > > >

Re: [PATCH v3] x86-64: Remove redundant TLS calls

2025-08-12 Thread H.J. Lu
On Mon, Aug 11, 2025 at 11:13 PM Hongtao Liu wrote: > > On Mon, Aug 4, 2025 at 11:33 PM H.J. Lu wrote: > > > > On Mon, Aug 04, 2025 at 02:57:39PM +0800, Hongtao Liu wrote: > > > > > > > + rtx_insn *before = nullptr; > > > > > > > + rtx_insn *after = nullptr; > > > > > > > + if (insn == BB_HEAD

Re: [PATCH] Add ia64*-*-* to the list of obsolete targets

2025-08-12 Thread Richard Biener
On Tue, 12 Aug 2025, Frank Scheiner wrote: > Dear Richard, > > On 12.08.25 08:46, Richard Biener wrote: > > On Mon, 11 Aug 2025, Frank Scheiner wrote: > >> On 11.08.25 12:59, Richard Biener wrote: > >>> On Mon, 11 Aug 2025, Sam James wrote: > Frank Scheiner writes: > > On 11.08.25 09:49

[PATCH v2] x86: Convert integer constant to mode of move

2025-08-12 Thread H.J. Lu
For (set (reg/v:DI 106 [ k ]) (const_int 30 [0xb2d05e00])) ... (set (reg:V4SI 115 [ _13 ]) (vec_duplicate:V4SI (subreg:SI (reg/v:DI 106 [ k ]) 0))) ... (set (reg:V2SI 118 [ _9 ]) (vec_duplicate:V2SI (subreg:SI (reg/v:DI 106 [ k ]) 0))) we should generate (set (reg:SI 125)

Re: [PATCH 1/2] Match: Support SAT_TRUNC variant NARROW_CLIP

2025-08-12 Thread Richard Biener
On Mon, Aug 11, 2025 at 11:01 PM Edwin Lu wrote: > > On Fri, Aug 8, 2025 at 3:23 AM Richard Biener > wrote: > > > > On Wed, Aug 6, 2025 at 7:04 AM Edwin Lu wrote: > > > > > > This patch tries to add support for a variant of SAT_TRUNC where > > > negative numbers are clipped to 0 instead of NARRO

Re: [PATCH] Add ia64*-*-* to the list of obsolete targets

2025-08-12 Thread Frank Scheiner
Dear Richard, On 12.08.25 08:46, Richard Biener wrote: > On Mon, 11 Aug 2025, Frank Scheiner wrote: >> On 11.08.25 12:59, Richard Biener wrote: >>> On Mon, 11 Aug 2025, Sam James wrote: Frank Scheiner writes: > On 11.08.25 09:49, Richard Biener wrote: >> On Sun, 10 Aug 2025, Jeff Law

[PATCH] Fold GATHER_SCATTER_*_P into vect_memory_access_type

2025-08-12 Thread Richard Biener
The following splits up VMAT_GATHER_SCATTER into VMAT_GATHER_SCATTER_LEGACY, VMAT_GATHER_SCATTER_IFN and VMAT_GATHER_SCATTER_EMULATED. The main motivation is to reduce the uses of (full) gs_info, but it also makes the kind representable by a single entry rather than the ifn and decl tristate. The

[PATCH] testsuite: Fix asm-hard-reg-error-3.c for arm [PR121511]

2025-08-12 Thread Stefan Schulze Frielinghaus
From: Stefan Schulze Frielinghaus This test is about register pairs. On arm a long long is accepted in thumb mode in any register 0-6 whereas in arm mode this is restricted to even register pairs. Thus, in order to trigger the error even if gcc is configured with --with-mode=thumb, add option -

Re:[pushed] [PATCH v2] LoongArch: macro instead enum for base abi type

2025-08-12 Thread Lulu Cheng
Pushed to r16-3165 r15-10220 and r14-11949. On 2025/8/8 at 4:22 PM, mengqinggang wrote: An enum can't be used in #if. In an #if expression, identifiers that are not macros are all treated as the number zero. This patch may fix https://sourceware.org/bugzilla/show_bug.cgi?id=32776. gcc/ChangeLo
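
The preprocessor behavior referred to above can be seen in a few lines. The identifiers here are hypothetical and are not the LoongArch ABI names. An enumerator is invisible to the preprocessor, so it evaluates as 0 inside #if, while a macro with the same value compares as expected.

```cpp
// Hypothetical names for illustration only.
enum abi_kind { ABI_ENUM_LP64D = 3 };
#define ABI_MACRO_LP64D 3

// The enumerator is not a macro, so the preprocessor treats it as 0
// and this condition is false.
#if ABI_ENUM_LP64D == 3
constexpr bool enum_matches = true;
#else
constexpr bool enum_matches = false;
#endif

// The macro expands to 3 before evaluation, so this condition is true.
#if ABI_MACRO_LP64D == 3
constexpr bool macro_matches = true;
#else
constexpr bool macro_matches = false;
#endif
```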

[PATCH] Cleanup SLP decision during loop analysis

2025-08-12 Thread Richard Biener
The following refactors the now misleading slp_done_for_suggested_uf and slp states kept during vectorizer loop analysis. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. * tree-vect-loop.cc (vect_analyze_loop_2): Change slp_done_for_suggested_uf to a boolean s

Re: [PATCH 0/7 v3] Add Xandes vender extension support.

2025-08-12 Thread Kito Cheng
This patchset LGTM except 4/7, you can go ahead to commit 1/7~3/7 if you are OK with that :) On Tue, Aug 12, 2025 at 4:18 PM Kuan-Lin Chen wrote: > > Changes since v2: > [PATCH 1/7] > Moved andes test cases to subdir gcc.target/riscv/xandes. > > [PATCH 2/7] > Moved andes test cases to subdir

Re: [PATCH 4/7 v3] RISC-V: Add support for the XAndesvbfhcvt ISA extension.

2025-08-12 Thread Kito Cheng
I would say no to this one since it seems apparently not right due to its lack of correct vsetvli info, either drop from this patch set or define those as static inline asm. On Tue, Aug 12, 2025 at 4:18 PM Kuan-Lin Chen wrote: > > This patch add support for XAndesvbfhcvt ISA extension. > This ext

Re: RISC-V GCC Patchwork Sync-Up ?

2025-08-12 Thread Kito Cheng
Hi Umesh: I've added you to the meeting invitation, you should be able to see that in your google calendar :) On Tue, Aug 12, 2025 at 5:30 PM Umesh Kalappa wrote: > > Hi all, > > Does the "RISC-V GCC Patchwork Sync-Up Meeting" is happening ,if so > > Please send us the calendar link ,try to goog

RISC-V GCC Patchwork Sync-Up ?

2025-08-12 Thread Umesh Kalappa
Hi all, Is the "RISC-V GCC Patchwork Sync-Up Meeting" happening? If so, please send us the calendar link; we tried to Google it, but with no luck. Here at MIPS, we have developed the RISCV extension core and have a few GCC changes to push back to trunk, and would like to be part of the meeting t

Re: [PATCH] fwprop: Don't propagate asms [PR121253]

2025-08-12 Thread Richard Biener
On Tue, Aug 12, 2025 at 10:14 AM Richard Sandiford wrote: > > For the reasons explained in the comment, fwprop shouldn't even > try to propagate an asm definition. > > Tested on aarch64-linux-gnu. Bordering on obvious, but just in case: > OK to install? OK. Richard. > Richard > > > gcc/ >

Re: [PATCH] Use x86 GFNI for vectorized constant byte shifts/rotates

2025-08-12 Thread Hongtao Liu
On Tue, Aug 5, 2025 at 8:49 AM Andi Kleen wrote: > > From: Andi Kleen > > The GFNI AVX gf2p8affineqb instruction can be used to implement > vectorized byte shifts or rotates. This patch uses them to implement > shift and rotate patterns to allow the vectorizer to use them. > Previously AVX couldn

[PATCH 7/7 v3] RISC-V: Add support for the XAndesvdot ISA extension.

2025-08-12 Thread Kuan-Lin Chen
This extension defines vector instructions to calculate the signed/unsigned dot product of four SEW/4-bit data and accumulate the result into a SEW-bit element for all elements in a vector register. gcc/ChangeLog: * config/riscv/andes-vector-builtins-bases.cc (nds_vd4dot): New class.

[PATCH 1/7 v3] RISC-V: Add basic XAndes vendor extension support.

2025-08-12 Thread Kuan-Lin Chen
This patch adds basic support for the following XAndes ISA extensions: XANDESPERF XANDESBFHCVT XANDESVBFHCVT XANDESVSINTLOAD XANDESVPACKFPH XANDESVDOT gcc/ChangeLog: * config/riscv/riscv-ext.def: Include riscv-ext-andes.def. * config/riscv/riscv-ext.opt (riscv_xandes_subext): New

[PATCH 6/7 v3] RISC-V: Add support for the XAndesvpackfph ISA extension.

2025-08-12 Thread Kuan-Lin Chen
This extension defines vector instructions to extract a pair of FP16 data from a floating-point register. Multiply the top FP16 data with the FP16 elements and add the result with the bottom FP16 data. gcc/ChangeLog: * common/config/riscv/riscv-common.cc: Turn on VECTOR_ELEN_FP_16

[PATCH 5/7 v3] RISC-V: Add support for the XAndesvsintload ISA extension.

2025-08-12 Thread Kuan-Lin Chen
This extension defines vector load instructions to move sign-extended or zero-extended INT4 data into 8-bit vector register elements. gcc/ChangeLog: * config/riscv/andes-vector-builtins-bases.cc (nds_nibbleload): New class. * config/riscv/andes-vector-builtins-bases.h (nds

[PATCH 2/7 v3] RISC-V: Add support for the XAndesperf ISA extension.

2025-08-12 Thread Kuan-Lin Chen
This patch adds support for the XAndesperf ISA extension. The 32-bit AndeStar V5 extension includes branch instructions, load effective address instructions, and string processing instructions for performance improvement. New INSN patterns are added into the new file andes.md as a separated vendor e

[PATCH 0/7 v3] Add Xandes vender extension support.

2025-08-12 Thread Kuan-Lin Chen
Changes since v2: [PATCH 1/7] Moved andes test cases to subdir gcc.target/riscv/xandes. [PATCH 2/7] Moved andes test cases to subdir gcc.target/riscv/xandes. Replaced "equality_operator" with "any_eq". Added code_attr "cs" for "any_eq". Fixed comment for "*nds_bfoz4". Modified builtins

[PATCH 4/7 v3] RISC-V: Add support for the XAndesvbfhcvt ISA extension.

2025-08-12 Thread Kuan-Lin Chen
This patch adds support for the XAndesvbfhcvt ISA extension. This extension defines instructions to perform vector floating-point conversion between the BFLOAT16 floating-point data and the IEEE-754 32-bit single-precision floating-point (SP) data in a vector register. gcc/ChangeLog: * common/

[PATCH 3/7 v3] RISC-V: Add support for the XAndesbfhcvt ISA extension.

2025-08-12 Thread Kuan-Lin Chen
This extension defines instructions to perform scalar floating-point conversion between the BFLOAT16 floating-point data and the IEEE-754 32-bit single-precision floating-point (SP) data in a scalar floating point register. gcc/ChangeLog: * config/riscv/andes.def: Add nds_fcvt_s_bf16 and

[PATCH] tree-optimization/121509 - failure to detect unvectorizable loop

2025-08-12 Thread Richard Biener
With the hybrid stmt detection no longer working as a gate-keeper to detect unhandled stmts we have to, and can, detect those earlier. The appropriate place is vect_mark_stmts_to_be_vectorized where for trivially relevant PHIs we can stop analyzing when the PHI wasn't classified as a known def duri

[PATCH] tree-optimization/121514 - ICE with recent VN improvement

2025-08-12 Thread Richard Biener
When inserting a compensation stmt during VN we are making sure to register the result for the original stmt into the hashtable so VN iteration has the chance to converge and we avoid inserting another copy each time. But the implementation doesn't work for non-SSA name values, and is also not nec

[PATCH] fwprop: Don't propagate asms [PR121253]

2025-08-12 Thread Richard Sandiford
For the reasons explained in the comment, fwprop shouldn't even try to propagate an asm definition. Tested on aarch64-linux-gnu. Bordering on obvious, but just in case: OK to install? Richard gcc/ PR rtl-optimization/121253 * fwprop.cc (forward_propagate_into): Don't propagate

Re: [PATCH] forwprop: Fix non-call exceptions some more with copy prop for aggregates [PR121494]

2025-08-12 Thread Richard Biener
On Tue, Aug 12, 2025 at 8:59 AM Andrew Pinski wrote: > > From: Andrew Pinski > > Note this conflicts with my not yet approved patch for copy prop for > aggregates into > function arguments (I will get back to that soon). > > So the problem here is that I assumed if: > *a = decl1; > would not cau

Re: [PATCH] LoongArch: Don't set movgr2cf cost for LA664 [PR120476]

2025-08-12 Thread Xi Ruoyao
Dropped in favor of https://gcc.gnu.org/pipermail/gcc-patches/2025-August/692394.html. On Tue, 2025-08-05 at 17:34 +0800, Xi Ruoyao wrote: > Despite LA664 has 1-cycle movgr2cf in real, it seems setting the correct > value in the cost model has puzzled the register allocator and severely > impacted

[PATCH 3/3] Do not set STMT_VINFO_VECTYPE for non-dataref stmts

2025-08-12 Thread Richard Biener
Now that all STMT_VINFO_VECTYPE uses from vectorizable_* have been purged there's no longer a need to have STMT_VINFO_VECTYPE set. We still rely on it being present on data-ref stmts and there it can differ between different SLP instances when doing BB vectorization. The following removes the setti

[PATCH 2/3] Pass down vector type to avoid STMT_VINFO_VECTYPE on reduc-info

2025-08-12 Thread Richard Biener
The following passes down the vector type to functions instead of querying it from the reduc-info stmt-info. Bootstrapped and tested on x86_64-unknown-linux-gnu, cross-tested aarch64-linux-gnu, pushed. * tree-vect-loop.cc (get_initial_defs_for_reduction): Get vector type as argume

[PATCH 1/3] Do not use STMT_VINFO_VECTYPE in vectorizable_reduction

2025-08-12 Thread Richard Biener
There's one use of STMT_VINFO_VECTYPE in vectorizable_reduction where I'm only 99% sure which SLP_TREE_VECTYPE to replace it with (vectorizable_reduction needs a lot of post-only-SLP TLC). The following replaces it with the hopefully appropriate one. Bootstrapped and tested on x86_64-unknown-linu