The following splits up VMAT_GATHER_SCATTER into
VMAT_GATHER_SCATTER_LEGACY, VMAT_GATHER_SCATTER_IFN and
VMAT_GATHER_SCATTER_EMULATED. The main motivation is to reduce
the uses of (full) gs_info, but it also makes the kind representable
by a single entry rather than the ifn and decl tristate.
The
On Tue, Aug 12, 2025 at 10:02 PM H.J. Lu wrote:
>
> On Tue, Aug 12, 2025 at 06:47:54AM -0700, H.J. Lu wrote:
> > On Mon, Aug 11, 2025 at 11:13 PM Hongtao Liu wrote:
> > >
> > > On Mon, Aug 4, 2025 at 11:33 PM H.J. Lu wrote:
> > > >
> > > > On Mon, Aug 04, 2025 at 02:57:39PM +0800, Hongtao Liu wr
On Tue, Aug 5, 2025 at 8:49 AM Andi Kleen wrote:
>
> From: Andi Kleen
>
> The GFNI AVX gf2p8affineqb instruction can be used to implement
> vectorized byte shifts or rotates. This patch uses them to implement
> shift and rotate patterns to allow the vectorizer to use them.
> Previously AVX couldn
> > It might be reasonable to tweak the costs per CPU however, I haven't
> > done that.
> >
> > BTW for rotate the wins are much higher because there are no native
> > instructions for it.
> For ashl/lshr, the original implementation only takes 2
> instructions(vpsllw/vpsrlw + vpand), and for ashr
On 8/12/25 4:09 AM, Kito Cheng wrote:
This patchset LGTM except 4/7, you can go ahead to commit 1/7~3/7 if
you are OK with that :)
Note the CI testing is flagging errors on xandesperf-2.c. So Kuan-Lin
needs to sort that out and post a v4 assuming something needs adjusting:
https://github.
Pushed to r16-3177 r15-10223 and r14-11951.
On 2025/8/12 at 2:44 PM, Lulu Cheng wrote:
The rtx cost value defined by the target backend affects the
calculation of register pressure classes in the IRA, thus affecting
scheduling. This may cause program performance degradation.
For example, OpenSSL 3.5.1
On Thu, Aug 7, 2025 at 4:36 AM Richard Biener
wrote:
>
> On Wed, Aug 6, 2025 at 7:34 PM Andrew Pinski wrote:
> >
> > This implements the simple copy prop of aggregates into
> > arguments of function calls. This can reduce the number of copies
> > done. Just like removing of an extra copy in gener
On Wed, Aug 13, 2025 at 1:40 AM Andi Kleen wrote:
>
> >
> > The latter takes 5 cycles, the former takes 3 cycles.
>
> It's pipelined however.
>
> >
> > Do you have any microbenchmark or real workloads to show your
> > optimization is better?
>
> Keep in mind it only uses one port vs two.
>
> Yes I
From: Robert Dubner
Date: Tue, 12 Aug 2025 22:13:59 -0400
Subject: [PATCH] cobol: Implement faster zoned decimal to binary
conversion.
Replace "value *= 10; value += digit" routines with a new one that does
two digits at a time and avoids __int128 calculations until they are
necessary.
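As a rough illustration of the idea described above (not the code from the patch), the sketch below folds two digits per step and stays in 64-bit arithmetic for the first 18 digits, widening to __int128 only for longer inputs. The function name, the 18-digit cutoff, and the use of `& 0x0F` to strip the zone nibble are assumptions made for this example; sign handling is left out, and `unsigned __int128` is a GCC/Clang extension on 64-bit targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the libgcobol routine: convert LEN unsigned
   zoned-decimal digits to binary.  Masking with 0x0F drops the zone
   nibble (assumption: the low nibble holds the digit value).  */
static unsigned __int128
zoned_to_binary (const unsigned char *digits, size_t len)
{
  /* Any 18-digit decimal number is below 2^64, so the leading digits
     can be accumulated without 128-bit arithmetic.  */
  size_t narrow_len = len < 18 ? len : 18;
  uint64_t low = 0;
  size_t i = 0;

  /* An odd count leaves one leading digit to fold in on its own.  */
  if (narrow_len & 1)
    low = digits[i++] & 0x0F;

  /* Two digits per iteration: one multiply by 100 instead of two by 10.  */
  for (; i < narrow_len; i += 2)
    low = low * 100 + (uint64_t) ((digits[i] & 0x0F) * 10
                                  + (digits[i + 1] & 0x0F));

  /* Only inputs longer than 18 digits reach the 128-bit path.  */
  unsigned __int128 value = low;
  for (; i + 1 < len; i += 2)
    value = value * 100 + (unsigned) ((digits[i] & 0x0F) * 10
                                      + (digits[i + 1] & 0x0F));
  if (i < len)
    value = value * 10 + (unsigned) (digits[i] & 0x0F);
  return value;
}
```

For inputs of 18 digits or fewer the two wide loops never execute, so the common case costs only 64-bit multiplies.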
These ch
I added this test back in r7-934-g15c671a79ca66d, but it looks like
r15-2125-g81824596361cf4 changed the error message.
Tested on x86_64-pc-linux-gnu.
Pushed to trunk as r16-3172-gf622df9af2e7c1.
gcc/testsuite/ChangeLog:
PR testsuite/119783
jit.dg/test-error-impossible-must-tail-c
Tested on x86_64-pc-linux-gnu; fixes jit.dg/test-asm.cc.
Pushed to trunk as r16-3171-gd6d1fa0039e68e.
gcc/jit/ChangeLog:
PR jit/121516
* libgccjit++.h (context::new_struct_type): Replace use of
&fields[0] with fields.data ().
(context::new_function): Likewise for pa
On Tue, Aug 12, 2025 at 4:19 PM Jason Merrill wrote:
>
> On 8/1/25 4:56 AM, H.J. Lu wrote:
> > Set a tentative TLS model in grokvardecl and update DECL_TLS_MODEL with
> > the default TLS access model after a TLS variable has been fully processed
> > if the default TLS access model is stronger.
> >
> -Original Message-
> From: H.J. Lu
> Sent: Tuesday, August 12, 2025 8:19 PM
> To: gcc-patches@gcc.gnu.org
> Cc: ubiz...@gmail.com; Liu, Hongtao
> Subject: [PATCH v2] x86: Convert integer constant to mode of move
>
> For
>
> (set (reg/v:DI 106 [ k ])
> (const_int 30 [0x
On 8/1/25 4:56 AM, H.J. Lu wrote:
Set a tentative TLS model in grokvardecl and update DECL_TLS_MODEL with
the default TLS access model after a TLS variable has been fully processed
if the default TLS access model is stronger.
gcc/cp/
PR c++/107393
* decl.cc (grokvardecl): Add a
Support the btf_decl_tag and btf_type_tag attributes in BTF by creating
and emitting BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records,
respectively, for them.
Some care is required when -gprune-btf is in effect to avoid emitting
decl or type tags for declarations or types which have been pruned and
Add a couple of tests to ensure that BTF type/decl tags do not interfere
with generation of BPF CO-RE relocations.
gcc/testsuite/
* gcc.target/bpf/core-btf-tag-1.c: New test.
* gcc.target/bpf/core-btf-tag-2.c: New test.
---
gcc/testsuite/gcc.target/bpf/core-btf-tag-1.c | 23 ++
The btf_decl_tag and btf_type_tag attributes provide a means to annotate
declarations and types respectively with arbitrary user provided
strings. These strings are recorded in debug information for
post-compilation uses, and despite the name they are meant to be
recorded in DWARF as well as BTF.
gcc/
* doc/extend.texi (Common Function Attributes)
(Common Variable Attributes): Document btf_decl_tag attribute.
(Common Type Attributes): Document btf_type_tag attribute.
---
gcc/doc/extend.texi | 79 +
1 file changed, 79 inser
Translate DW_TAG_GNU_annotation DIEs created for C attributes
btf_decl_tag and btf_type_tag into an in-memory representation in the
CTF/BTF container. They will be output in BTF as BTF_KIND_DECL_TAG and
BTF_KIND_TYPE_TAG records.
The new CTF kinds used to represent these annotations, CTF_K_DECL_T
[v7: https://gcc.gnu.org/pipermail/gcc-patches/2025-August/692127.html
Review Status:
- Patch 2,3 have been OK'd in v6 and are unchanged since then.
- Patch 4 was OK'd with nits in v6, fixed in and unchanged from v7.
- Patch 6 adds two small BPF-specific sanity checks that I consider
obvious
Add two new c-family attributes, "btf_type_tag" and "btf_decl_tag"
along with attribute handlers for them. These attributes may be
used to annotate types or declarations respectively with arbitrary
strings, which will be recorded in DWARF and/or BTF information.
Both attributes accept exactly one
Changes in v5:
* Static-assert movable RNG object correctly.
* Add a more comprehensive test gencanon.cc
Changes in v4:
* Static-assert that arg is floating-point, coercible from bigger unsigned.
* Static-assert that arg satisfies uniform_random_bit_generator, movable.
* #include uniform_int_
On Tue, 12 Aug 2025 01:18:35 PDT (-0700), ru...@andestech.com wrote:
Changes since v2:
[PATCH 1/7]
Moved andes test cases to subdir gcc.target/riscv/xandes.
[PATCH 2/7]
Moved andes test cases to subdir gcc.target/riscv/xandes.
Replaced "equality_operator" with "any_eq".
Added code_attr "
On 8/4/25 3:09 AM, Robin Dapp wrote:
Hi,
In PR121334 we are asked to expand a const_vector of size 4 with
poly_int elements. It has 2 elts per pattern so is neither a
const_vector_duplicate nor a const_vector_stepped.
We don't allow this kind of constant in legitimate_constant_p but expr
ap
On 8/12/25 2:56 PM, Andrew Pinski wrote:
On Tue, Nov 26, 2024 at 12:54 PM Jeff Law wrote:
On 11/26/24 11:43 AM, Florian Weimer wrote:
* Jeff Law:
On 11/26/24 9:06 AM, David Malcolm wrote:
OK for trunk? (caveat: not properly tested)
gcc/ChangeLog:
PR translation/90160
* conf
So this is a minor bug in a few DFA descriptions such as the Xiangshan
and a couple of the SiFive descriptions.
While Xiangshan covers every insn type, some of the reservations check
the mode of the operation. Concretely the fdiv/fsqrt unit reservations
vary based on the mode. They handled
On Tue, 12 Aug 2025 13:56:09 PDT (-0700), pins...@gmail.com wrote:
On Tue, Nov 26, 2024 at 12:54 PM Jeff Law wrote:
On 11/26/24 11:43 AM, Florian Weimer wrote:
> * Jeff Law:
>
>> On 11/26/24 9:06 AM, David Malcolm wrote:
>>> OK for trunk? (caveat: not properly tested)
>>> gcc/ChangeLog:
>>>
On Tue, Nov 26, 2024 at 12:54 PM Jeff Law wrote:
>
>
>
> On 11/26/24 11:43 AM, Florian Weimer wrote:
> > * Jeff Law:
> >
> >> On 11/26/24 9:06 AM, David Malcolm wrote:
> >>> OK for trunk? (caveat: not properly tested)
> >>> gcc/ChangeLog:
> >>> PR translation/90160
> >>> * config/csky/csk
On 8/12/25 10:11 AM, Paul Richard Thomas wrote:
The attached patch is utterly trivial. The only useful attribute that FooPrivate
possesses to detect that it has been declared is 'subroutine'. This was missed
in the attribute test in resolve.cc(was_declared). Adding it fixes the problem
and reg
>
> The latter takes 5 cycles, the former takes 3 cycles.
It's pipelined however.
>
> Do you have any microbenchmark or real workloads to show your
> optimization is better?
Keep in mind it only uses one port vs two.
Yes, I ran it on Arrow Lake and saw wins on both P-cores and E-cores
according to
The attached patch is utterly trivial. The only useful attribute
that FooPrivate possesses to detect that it has been declared is
'subroutine'. This was missed in the attribute test in
resolve.cc(was_declared). Adding it fixes the problem and regtests on
FC42/x86_64.
OK for mainline and some judi
The C shift operators do not precisely match the associated ARM
instructions: shifts of negative values or by negative amounts are
undefined behavior in C, and GCC may substitute alternate instruction
sequences when it can determine that the application is using UB.
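To illustrate the concern above, here is a minimal sketch of shift helpers whose results are defined for every input. These are hypothetical helpers, not the replacements used by the patch, and the policy for out-of-range counts is an assumption chosen for the example. The final cast in `shl_s32` relies on the modular unsigned-to-signed conversion that GCC documents.

```c
#include <stdint.h>

/* Hypothetical helper: left shift that is defined for negative values
   and for any count.  Assumed policy: counts outside 0..31 yield 0.  */
static int32_t
shl_s32 (int32_t value, int count)
{
  if (count < 0 || count > 31)
    return 0;
  /* Shift the bit pattern as unsigned, where overflow simply wraps.  */
  return (int32_t) ((uint32_t) value << count);
}

/* Hypothetical helper: arithmetic right shift that never applies >> to
   a negative operand.  Assumed policy: out-of-range counts saturate to
   the sign (0 or -1).  */
static int32_t
asr_s32 (int32_t value, int count)
{
  if (count < 0 || count > 31)
    return value < 0 ? -1 : 0;
  if (value >= 0)
    return value >> count;
  /* ~value is non-negative here, so the shift is fully defined.  */
  return ~(~value >> count);
}
```

With plain `value << count`, a compiler may assume `value` is non-negative and `count` is in range; the helpers make no such assumption available.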
Replace C shift operators with
On Sun, Aug 10, 2025 at 02:20:22PM -0700, Jason Merrill wrote:
> On 8/8/25 11:37 AM, Marek Polacek wrote:
> > On Tue, Aug 05, 2025 at 02:54:01PM -0700, Jason Merrill wrote:
> > > On 8/4/25 4:53 PM, Marek Polacek wrote:
> > > > > > Now that even dummy lambdas have an operator(), I had to tweak
> > >
On Sun, 10 Aug 2025 07:29:25 PDT (-0700), jeffreya...@gmail.com wrote:
On 8/4/25 3:09 AM, Robin Dapp wrote:
Hi,
In PR121334 we are asked to expand a const_vector of size 4 with
poly_int elements. It has 2 elts per pattern so is neither a
const_vector_duplicate nor a const_vector_stepped.
We
On Tue, Aug 12, 2025 at 06:47:54AM -0700, H.J. Lu wrote:
> On Mon, Aug 11, 2025 at 11:13 PM Hongtao Liu wrote:
> >
> > On Mon, Aug 4, 2025 at 11:33 PM H.J. Lu wrote:
> > >
> > > On Mon, Aug 04, 2025 at 02:57:39PM +0800, Hongtao Liu wrote:
> > > > > > > > + rtx_insn *before = nullptr;
> > > > > >
On Mon, Aug 11, 2025 at 11:13 PM Hongtao Liu wrote:
>
> On Mon, Aug 4, 2025 at 11:33 PM H.J. Lu wrote:
> >
> > On Mon, Aug 04, 2025 at 02:57:39PM +0800, Hongtao Liu wrote:
> > > > > > > + rtx_insn *before = nullptr;
> > > > > > > + rtx_insn *after = nullptr;
> > > > > > > + if (insn == BB_HEAD
On Tue, 12 Aug 2025, Frank Scheiner wrote:
> Dear Richard,
>
> On 12.08.25 08:46, Richard Biener wrote:
> > On Mon, 11 Aug 2025, Frank Scheiner wrote:
> >> On 11.08.25 12:59, Richard Biener wrote:
> >>> On Mon, 11 Aug 2025, Sam James wrote:
> Frank Scheiner writes:
> > On 11.08.25 09:49
For
(set (reg/v:DI 106 [ k ])
(const_int 3000000000 [0xb2d05e00]))
...
(set (reg:V4SI 115 [ _13 ])
(vec_duplicate:V4SI (subreg:SI (reg/v:DI 106 [ k ]) 0)))
...
(set (reg:V2SI 118 [ _9 ])
(vec_duplicate:V2SI (subreg:SI (reg/v:DI 106 [ k ]) 0)))
we should generate
(set (reg:SI 125)
On Mon, Aug 11, 2025 at 11:01 PM Edwin Lu wrote:
>
> On Fri, Aug 8, 2025 at 3:23 AM Richard Biener
> wrote:
> >
> > On Wed, Aug 6, 2025 at 7:04 AM Edwin Lu wrote:
> > >
> > > This patch tries to add support for a variant of SAT_TRUNC where
> > > negative numbers are clipped to 0 instead of NARRO
Dear Richard,
On 12.08.25 08:46, Richard Biener wrote:
> On Mon, 11 Aug 2025, Frank Scheiner wrote:
>> On 11.08.25 12:59, Richard Biener wrote:
>>> On Mon, 11 Aug 2025, Sam James wrote:
Frank Scheiner writes:
> On 11.08.25 09:49, Richard Biener wrote:
>> On Sun, 10 Aug 2025, Jeff Law
The following splits up VMAT_GATHER_SCATTER into
VMAT_GATHER_SCATTER_LEGACY, VMAT_GATHER_SCATTER_IFN and
VMAT_GATHER_SCATTER_EMULATED. The main motivation is to reduce
the uses of (full) gs_info, but it also makes the kind representable
by a single entry rather than the ifn and decl tristate.
The
From: Stefan Schulze Frielinghaus
This test is about register pairs. On arm a long long is accepted in
thumb mode in any register 0-6 whereas in arm mode this is restricted to
even register pairs. Thus, in order to trigger the error even if gcc is
configured with --with-mode=thumb, add option -
Pushed to r16-3165 r15-10220 and r14-11949.
On 2025/8/8 at 4:22 PM, mengqinggang wrote:
An enum can't be used in #if.
In an #if expression, identifiers that are not macros
are all treated as the number zero.
This patch may fix https://sourceware.org/bugzilla/show_bug.cgi?id=32776.
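The preprocessor behavior described above can be seen in a small sketch. The enumerator and macro names are hypothetical and are not taken from the patch.

```c
/* Hypothetical names, for illustration only.  */
enum { FEATURE_LEVEL = 2 };
#define FEATURE_LEVEL_MACRO 2

/* FEATURE_LEVEL is an enumerator, not a macro, so the preprocessor
   replaces it with 0 and this test is false.  */
#if FEATURE_LEVEL >= 2
#define ENUM_TEST_TAKEN 1
#else
#define ENUM_TEST_TAKEN 0
#endif

/* A macro is expanded before the expression is evaluated, so this
   test is true.  */
#if FEATURE_LEVEL_MACRO >= 2
#define MACRO_TEST_TAKEN 1
#else
#define MACRO_TEST_TAKEN 0
#endif

static int enum_test_taken (void) { return ENUM_TEST_TAKEN; }
static int macro_test_taken (void) { return MACRO_TEST_TAKEN; }
```

Compiling with -Wundef makes GCC warn about the first test, which is one way to catch this kind of mistake.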
gcc/ChangeLo
The following refactors the now misleading slp_done_for_suggested_uf
and slp states kept during vectorizer loop analysis.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
* tree-vect-loop.cc (vect_analyze_loop_2): Change
slp_done_for_suggested_uf to a boolean
s
This patchset LGTM except 4/7, you can go ahead to commit 1/7~3/7 if
you are OK with that :)
On Tue, Aug 12, 2025 at 4:18 PM Kuan-Lin Chen wrote:
>
> Changes since v2:
> [PATCH 1/7]
> Moved andes test cases to subdir gcc.target/riscv/xandes.
>
> [PATCH 2/7]
> Moved andes test cases to subdir
I would say no to this one, since it does not look right: it lacks
correct vsetvli info. Either drop it from this patch set or
define those as static inline asm.
On Tue, Aug 12, 2025 at 4:18 PM Kuan-Lin Chen wrote:
>
> This patch adds support for XAndesvbfhcvt ISA extension.
> This ext
Hi Umesh:
I've added you to the meeting invitation, you should be able to see
that in your google calendar :)
On Tue, Aug 12, 2025 at 5:30 PM Umesh Kalappa wrote:
>
> Hi all,
>
> Is the "RISC-V GCC Patchwork Sync-Up Meeting" happening? If so,
>
> please send us the calendar link. I tried to goog
Hi all,
Is the "RISC-V GCC Patchwork Sync-Up Meeting" happening? If so,
please send us the calendar link. I tried to google it, but with no
luck.
Here at MIPS, we have developed the RISC-V extension core and have a few GCC
changes to push back to trunk, and would like to be part of the meeting t
On Tue, Aug 12, 2025 at 10:14 AM Richard Sandiford
wrote:
>
> For the reasons explained in the comment, fwprop shouldn't even
> try to propagate an asm definition.
>
> Tested on aarch64-linux-gnu. Bordering on obvious, but just in case:
> OK to install?
OK.
Richard.
> Richard
>
>
> gcc/
>
On Tue, Aug 5, 2025 at 8:49 AM Andi Kleen wrote:
>
> From: Andi Kleen
>
> The GFNI AVX gf2p8affineqb instruction can be used to implement
> vectorized byte shifts or rotates. This patch uses them to implement
> shift and rotate patterns to allow the vectorizer to use them.
> Previously AVX couldn
This extension defines vector instructions to calculate the signed/unsigned
dot product of four SEW/4-bit data and accumulate the result into a SEW-bit
element for all elements in a vector register.
gcc/ChangeLog:
* config/riscv/andes-vector-builtins-bases.cc (nds_vd4dot): New class.
This patch adds basic support for the following XAndes ISA extensions:
XANDESPERF
XANDESBFHCVT
XANDESVBFHCVT
XANDESVSINTLOAD
XANDESVPACKFPH
XANDESVDOT
gcc/ChangeLog:
* config/riscv/riscv-ext.def: Include riscv-ext-andes.def.
* config/riscv/riscv-ext.opt (riscv_xandes_subext): New
This extension defines vector instructions that extract a pair of FP16 data from
a floating-point register, multiply the top FP16 data by the FP16 elements,
and add the bottom FP16 data to the result.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc:
Turn on VECTOR_ELEN_FP_16
This extension defines vector load instructions to move sign-extended or
zero-extended INT4 data into 8-bit vector register elements.
gcc/ChangeLog:
* config/riscv/andes-vector-builtins-bases.cc
(nds_nibbleload): New class.
* config/riscv/andes-vector-builtins-bases.h (nds
This patch adds support for the XAndesperf ISA extension.
The 32-bit AndeStar V5 extension includes branch instructions,
load effective address instructions, and string processing
instructions for performance improvement.
New INSN patterns are added to the new file andes.md
as a separate vendor e
Changes since v2:
[PATCH 1/7]
Moved andes test cases to subdir gcc.target/riscv/xandes.
[PATCH 2/7]
Moved andes test cases to subdir gcc.target/riscv/xandes.
Replaced "equality_operator" with "any_eq".
Added code_attr "cs" for "any_eq".
Fixed comment for "*nds_bfoz4".
Modified builtins
This patch adds support for XAndesvbfhcvt ISA extension.
This extension defines instructions to perform vector floating-point
conversion between the BFLOAT16 floating-point data and the IEEE-754 32-bit
single-precision floating-point (SP) data in a vector register.
gcc/ChangeLog:
* common/
This extension defines instructions to perform scalar floating-point
conversion between the BFLOAT16 floating-point data and the IEEE-754
32-bit single-precision floating-point (SP) data in a scalar
floating point register.
gcc/ChangeLog:
* config/riscv/andes.def: Add nds_fcvt_s_bf16 and
With the hybrid stmt detection no longer working as a gate-keeper
to detect unhandled stmts we have to, and can, detect those earlier.
The appropriate place is vect_mark_stmts_to_be_vectorized where
for trivially relevant PHIs we can stop analyzing when the PHI
wasn't classified as a known def duri
When inserting a compensation stmt during VN we are making sure to
register the result for the original stmt into the hashtable so
VN iteration has the chance to converge and we avoid inserting
another copy each time. But the implementation doesn't work for
non-SSA name values, and is also not nec
For the reasons explained in the comment, fwprop shouldn't even
try to propagate an asm definition.
Tested on aarch64-linux-gnu. Bordering on obvious, but just in case:
OK to install?
Richard
gcc/
PR rtl-optimization/121253
* fwprop.cc (forward_propagate_into): Don't propagate
On Tue, Aug 12, 2025 at 8:59 AM Andrew Pinski
wrote:
>
> From: Andrew Pinski
>
> Note this conflicts with my not yet approved patch for copy prop for
> aggregates into
> function arguments (I will get back to that soon).
>
> So the problem here is that I assumed if:
> *a = decl1;
> would not cau
Dropped in favor of
https://gcc.gnu.org/pipermail/gcc-patches/2025-August/692394.html.
On Tue, 2025-08-05 at 17:34 +0800, Xi Ruoyao wrote:
> Although LA664 really has a 1-cycle movgr2cf, it seems setting the correct
> value in the cost model has puzzled the register allocator and severely
> impacted
Now that all STMT_VINFO_VECTYPE uses from vectorizable_* have been
purged there's no longer a need to have STMT_VINFO_VECTYPE set.
We still rely on it being present on data-ref stmts and there it
can differ between different SLP instances when doing BB vectorization.
The following removes the setti
The following passes down the vector type to functions instead of
querying it from the reduc-info stmt-info.
Bootstrapped and tested on x86_64-unknown-linux-gnu, cross-tested
aarch64-linux-gnu, pushed.
* tree-vect-loop.cc (get_initial_defs_for_reduction):
Get vector type as argume
There's one use of STMT_VINFO_VECTYPE in vectorizable_reduction
where I'm only 99% sure which SLP_TREE_VECTYPE to replace it with
(vectorizable_reduction needs a lot of post-only-SLP TLC). The
following replaces it with the hopefully appropriate one.
Bootstrapped and tested on x86_64-unknown-linu