RE: [PATCH] AArch64: Add isfinite expander [PR 66462]

2025-08-27 Thread Tamar Christina
> -Original Message- > From: Wilco Dijkstra > Sent: Wednesday, August 27, 2025 6:32 PM > To: GCC Patches > Cc: Kyrylo Tkachov ; Alex Coplan > ; Tamar Christina ; Andrew > Pinski ; Alice Carlotti > Subject: [PATCH] AArch64: Add isfinite expander [PR 66462] > > > Add an expander for isfi

RE: [PATCH v2] AArch64: Add isinf expander [PR 66462]

2025-08-27 Thread Tamar Christina
> -Original Message- > From: Wilco Dijkstra > Sent: Wednesday, August 27, 2025 6:04 PM > To: Kyrylo Tkachov > Cc: GCC Patches ; Alex Coplan > ; Tamar Christina ; Andrew > Pinski ; Alice Carlotti > Subject: [PATCH v2] AArch64: Add isinf expander [PR 66462] > > v2: Add testcase > > Add a

[PATCH] Fix _Decimal128 arithmetic error under FE_UPWARD.

2025-08-27 Thread liuhongt
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ready to push to trunk. libgcc/config/libbid/ChangeLog: PR target/120691 * bid128_div.c: Fix _Decimal128 arithmetic error under FE_UPWARD. * bid128_rem.c: Ditto. * bid128_sqrt.c: Ditto. * bid64_

[PATCH] i386: default to -mtls-dialect=gnu2 if appropriate

2025-08-27 Thread Sam James
GNU2 TLS descriptors were introduced in 2006 (r0-73091-g5bf5a10b1ccacf) but were only opt-in with -mtls-dialect=gnu2. They are more efficient and it's time to enable them by default. Builds on the --with-tls= machinery from r16-3355-g96a291c4bb0b8a. We achieve this for GNU/Linux IA-32/X86-64 targ

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vnmsac.vv to vnmsac.vx on GR2VR cost

2025-08-27 Thread pan2 . li
From: Pan Li This patch combines vec_duplicate + vnmsac.vv into vnmsac.vx, as in the example code below. The related pattern depends on the cost of vec_duplicate from GR2VR: late-combine takes action if the cost of GR2VR is zero, and rejects the combination if
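As a rough illustration of the arithmetic behind vnmsac.vx (vd[i] = vd[i] - x * vs2[i] with a scalar x), here is a scalar reference sketch. The function name and the choice of 32-bit unsigned elements are assumptions made for this example; it is not taken from the patch or its tests, and it does not claim that this exact loop is what the vectorizer matches.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference for the multiply-subtract that
// vnmsac.vx performs per element: vd[i] -= x * vs2[i].
// Unsigned elements are used so that overflow wraps instead of being UB.
inline void nmsac_scalar(std::uint32_t *vd, const std::uint32_t *vs2,
                         std::uint32_t x, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        vd[i] -= x * vs2[i];   // x is the same scalar for every element
}
```

With a separate vec_duplicate, x would first be broadcast into a vector register; the .vx form takes it directly from a general-purpose register.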

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vnmsac.vv unsigned combine with GR2VR cost 0, 1 and 15

2025-08-27 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vnmsac.vvm combine to vnmsac.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check for vnmsac.vx. * gcc.target/riscv/rvv/autovec/vx_

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vnmsac.vv signed combine with GR2VR cost 0, 1 and 15

2025-08-27 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vnmsac.vvm combine to vnmsac.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check for vnmsac.vx. * gcc.target/riscv/rvv/autovec/vx_

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vnmsac.vv to vnmsac.vx on GR2VR cost

2025-08-27 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vnmsac.vv into vnmsac.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR like 1, 2, 15 in test. From: | ... | vmv.v.x | L1: | vnmsac.vv | J L1 | ... To: | ... | L1: |

Re: [PATCH] x86-64: Better compare source operands of *tls_dynamic_gnu2_call_64_di

2025-08-27 Thread H.J. Lu
On Tue, Aug 26, 2025 at 10:24 PM Hongtao Liu wrote: > > On Wed, Aug 27, 2025 at 6:32 AM H.J. Lu wrote: > > > > Source operands of 2 *tls_dynamic_gnu2_call_64_di patterns in > > > > (insn 10 9 11 3 (set (reg:DI 100) > > (unspec:DI [ > > (symbol_ref:DI ("caml_state") [flags

[PATCH v2] x86-64: Improve source operand check for TLS_CALL

2025-08-27 Thread H.J. Lu
Source operands of 2 TLS_CALL patterns in (insn 10 9 11 3 (set (reg:DI 100) (unspec:DI [ (symbol_ref:DI ("caml_state") [flags 0x10] ) ] UNSPEC_TLSDESC)) "x.c":7:16 1674 {*tls_dynamic_gnu2_lea_64_di} (nil)) (insn 11 10 12 3 (parallel [ (set (reg

[committed] Remove xfail marker on RISC-V test

2025-08-27 Thread Jeff Law
So yet another testsuite hygiene patch. This time turning XPASS -> PASS. My tester treats those cases the same so I didn't get notified that nozicond-2.c was passing after some recent changes. This removes the xfail marker on that test and thus the test is expected to pass now. Pushing to

[patch][v2][gcn] gcc/configure.ac + install.texi - changes to detect HAVE_AS_LEB128 [PR119367]

2025-08-27 Thread Tobias Burnus
Ok, admittedly, the following patch [v2] is cleaner and the assumption that the llvm-mc assembler is used is also inside the backend assembler spec. OK for mainline? * * * Note: I tried to see whether this patch plus the comment 14 patch makes a difference. While the testcase itself (and in our

Re: [PATCH] xtensa: Rewrite bswapsi2_internal with compact syntax

2025-08-27 Thread Max Filippov
Hi Suwa-san, On Wed, Aug 27, 2025 at 12:18 AM Takayuki 'January June' Suwa wrote: > > Also, the omission of the instruction that sets the shift amount register > (SAR) to 8 is now more efficient: it is omitted if there was a previous > BSWAP rtx in the same BB, but not omitted if no BSWAP is foun

[PUSHED] ifcvt: fix factor_out_operators (again) [PR121695]

2025-08-27 Thread Andrew Pinski
r16-2648-gaebbc90d8c7c70 had a copy and pasto where the second statement was supposed to be setting the operand 1 of the phi but it was setting operand 0 instead. This fixes the typo. Pushed as obvious after a quick build test for x86_64-linux-gnu. PR tree-optimization/121695 gcc/ChangeLog:

Re: [patch][gcn] gcc/configure.ac + install.texi - changes to detect HAVE_AS_LEB128 [PR119367]

2025-08-27 Thread Jakub Jelinek
On Wed, Aug 27, 2025 at 07:45:14PM +0200, Tobias Burnus wrote: > PR debug/119367 > * configure.ac (check_leb128_asflags): For gcn, use "--filetype=obj > --arch=amdgcn", if supported. > * configure: Regenerate. > * doc/install.texi (amdgcn-*-*): Also add llvm-objdump to

[PATCH] configure: Add readelf fallback for HAVE_AS_ULEB128 test [PR119367]

2025-08-27 Thread Jakub Jelinek
Hi! The following patch adds a readelf fallback if neither objdump nor otool exists. All of GNU binutils readelf, eu-readelf and llvm-readelf can handle it with those options. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2025-08-27 Jakub Jelinek PR debug/119367

[COMMITTED] RISC-V: testsuite: Fix vf_vfmul and vf_vfrdiv

2025-08-27 Thread Paul-Antoine Arras
This is a minor fix to previous patches of mine: r16-3393-gf864fc36fe0db4 "Add pattern for vector-scalar single-width floating-point multiply" r16-3395-g7c2ab5865cacc4 "Add pattern for reverse floating-point divide" I'll go ahead and commit it as obvious. -- PA commit 4f1f484a06ea94ab1484445c7

[PATCH] dwarf2out: Use DW_LNS_advance_pc instead of DW_LNS_fixed_advance_pc if possible [PR119367]

2025-08-27 Thread Jakub Jelinek
Hi! In the usual case we use .loc directives and don't emit the line table manually. And assembler usually uses DW_LNS_advance_pc which has uleb128 argument and in most cases will have just a single byte operand. But if we do emit it for whatever reason (old or buggy assembler or -gno-as-loc{,vie
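To see why a uleb128 operand is usually a single byte, here is a minimal sketch of standard ULEB128 encoding. The function name is hypothetical and this is not the code in dwarf2out; it only illustrates the encoding that DW_LNS_advance_pc uses for its operand, in contrast to the fixed two-byte operand of DW_LNS_fixed_advance_pc.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: encode an unsigned value as ULEB128.
// Each output byte carries 7 payload bits; the top bit is set
// when more bytes follow.
inline std::vector<std::uint8_t> encode_uleb128(std::uint64_t value)
{
    std::vector<std::uint8_t> out;
    do {
        std::uint8_t byte = static_cast<std::uint8_t>(value & 0x7f);
        value >>= 7;
        if (value != 0)
            byte |= 0x80;          // continuation bit
        out.push_back(byte);
    } while (value != 0);
    return out;
}
```

Any advance below 128 fits in one byte, so small PC deltas cost one operand byte instead of two.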

[PATCH] c++, v3: Fix ICE with parameter uses in expansion stmts [PR121575]

2025-08-27 Thread Jakub Jelinek
On Wed, Aug 27, 2025 at 01:45:38PM +0200, Jason Merrill wrote: > > would be wrong. Guess > > if (DECL_CONTEXT (t) > > && !uses_template_parms (DECL_CONTEXT (t))) > > RETURN (t); > > would fix these ICEs, shall I go with that > > Sounds good. The following pass

[PATCH] c++, v2: Fix auto return type deduction with expansion statements [PR121583]

2025-08-27 Thread Jakub Jelinek
On Mon, Aug 25, 2025 at 04:00:37PM -0400, Jason Merrill wrote: > On 8/25/25 12:02 PM, Jakub Jelinek wrote: > > On Mon, Aug 25, 2025 at 11:55:24AM -0400, Patrick Palka wrote: > > > > The following patch fixes that by testing DECL_TEMPLATE_INFO, dunno > > > > what else would be more appropriate for t

[PATCH] RISC-V: Add pattern for vector-scalar floating-point min

2025-08-27 Thread Paul-Antoine Arras
This pattern enables the combine pass (or late-combine, depending on the case) to merge a vec_duplicate into an smin RTL instruction. Before this patch, we have two instructions, e.g.: vfmv.v.f v2,fa0 vfmin.vv v1,v1,v2 After, we get only one: vfmin.vf v1,v1,fa0 gcc/Change

[patch][gcn] gcc/configure.ac + install.texi - changes to detect HAVE_AS_LEB128 [PR119367]

2025-08-27 Thread Tobias Burnus
PR119367 is about an overflow related to debug views, if those are not supported directly by the assembler. The problem is that dw2_asm_output_delta yields a 2-byte variable, which isn't sufficient. However, as the draft patch of comment 14 shows, dw2_asm_output_delta_uleb128 can be used to avoid

[PATCH] AArch64: Add isfinite expander [PR 66462]

2025-08-27 Thread Wilco Dijkstra
Add an expander for isfinite using integer arithmetic. This is typically faster and avoids generating spurious exceptions on signaling NaNs. This fixes part of PR66462. int isfinite1 (float x) { return __builtin_isfinite (x); } Before: fabs s0, s0 mov w0, 2139095039

[PATCH v2] AArch64: Add isinf expander [PR 66462]

2025-08-27 Thread Wilco Dijkstra
v2: Add testcase Add an expander for isinf using integer arithmetic. This is typically faster and avoids generating spurious exceptions on signaling NaNs. This fixes part of PR66462. int isinf1 (float x) { return __builtin_isinf (x); } Before: fabs s0, s0 mov w0, 213909

[PATCH v2] optab: Add optab for isnan [PR 101852]

2025-08-27 Thread Wilco Dijkstra
ping Add an optab for isnan. This requires changes to the existing folding code to extend the interclass_mathfn infrastructure to support BUILT_IN_ISNAN. It now checks for a valid optab before emitting the generic expansion. There is no change if no optab is defined. Update documentation, includ

Re: [committed v3] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Jakub Jelinek
On Wed, Aug 27, 2025 at 03:18:55PM +0200, Tomasz Kamiński wrote: > For any minimum value of a signed type, its negation (with wraparound) results > in the same value, behaving like zero. Representing the unordered result with > this minimum value, along with 0 for equal, 1 for greater, and -1 for l

Re: [Patch, fortran] PR82205 - parametrized derived types, problems with initialization

2025-08-27 Thread Jerry D
On 8/27/25 3:00 AM, Paul Richard Thomas wrote: This patch corrects the form of PDT constructors so that they are standard conforming: structure-constructor is type-name [ ( type-param-spec-list ) ] ( [ component-spec-list ] ) At present, the type-param-spec-list for PDTs is rolled into the

Re: [Patch, fortran] PR82843 - (PDT) Constructors with PDT components do not work.

2025-08-27 Thread Jerry D
On 8/27/25 3:10 AM, Paul Richard Thomas wrote: This patch corrects errors due to PDT components taking the PDT template as their type in PDT constructors and component references. The latter took a long time to debug because yours truly did not catch on to the basic problem until a light bulb m

Re: [PATCH] vect: Extend peeling and versioning for alignment to VLA modes

2025-08-27 Thread Robin Dapp
we're seeing several dozens of ICEs in apply_scale since this patch (PR121523). I didn't pay too much attention due to vacation etc. but now coming back to this. Any specific spot I should start looking? I had a quick look and part? of the issue is that vect_gen_prolog_loop_niters returns -1

[PATCH] forwprop: Improve the reject case for copy prop [PR107051]

2025-08-27 Thread Andrew Pinski
Currently the code rejects: ``` tmp = *a; *b = tmp; ``` (unless *a == *b). This can be improved such that if a and b are known to share the same base, then only reject it if they overlap; that is the difference of the offsets (from the base) is maybe less than the size. This fixes the testcase in
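The overlap condition described here can be sketched with plain integers. This is a simplification under stated assumptions: both accesses have the same known size, the offsets are compile-time constants rather than poly-ints (so "maybe less than" becomes a plain comparison), and the helper name is hypothetical rather than anything in forwprop.

```cpp
#include <cstdint>

// Hypothetical helper: two accesses of `size` bytes at byte offsets
// off_a and off_b from a common base overlap when the distance between
// the offsets is smaller than the size.
inline bool accesses_may_overlap(std::int64_t off_a, std::int64_t off_b,
                                 std::int64_t size)
{
    const std::int64_t diff = off_a > off_b ? off_a - off_b : off_b - off_a;
    return diff < size;
}
```

Identical offsets (a distance of zero) also report an overlap here; that is the *a == *b case the message mentions as handled separately.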

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Jonathan Wakely
On Wed, 27 Aug 2025 at 16:11, Tomasz Kaminski wrote: > > > > On Wed, Aug 27, 2025 at 5:07 PM Tomasz Kaminski wrote: >> >> >> >> On Wed, Aug 27, 2025 at 5:00 PM Jonathan Wakely wrote: >>> >>> On Wed, 27 Aug 2025 at 15:03, Patrick Palka wrote: >>> > >>> > On Wed, 27 Aug 2025, Jonathan Wakely wrot

RE: [PATCH] Pass reduction var to vectorize_fold_left_reduction directly

2025-08-27 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, August 27, 2025 2:56 PM > To: gcc-patches@gcc.gnu.org > Cc: RISC-V CI ; Tamar Christina > > Subject: [PATCH] Pass reduction var to vectorize_fold_left_reduction directly > > Instead of going via the PHI node def, use the scal

[PATCH] testsuite: arm: Simplify fp16-aapcs tests

2025-08-27 Thread Torbjörn SVENSSON
Reduce fp16-aapcs testcases to return value testing since parameter passing is already tested in aapcs/vfp*.c gcc/testsuite/ChangeLog: * gcc.target/arm/fp16-aapcs.c: New test. * gcc.target/arm/fp16-aapcs-1.c: Removed. * gcc.target/arm/fp16-aapcs-2.c: Likewise. * gc

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Tomasz Kaminski
On Wed, Aug 27, 2025 at 5:07 PM Tomasz Kaminski wrote: > > > On Wed, Aug 27, 2025 at 5:00 PM Jonathan Wakely > wrote: > >> On Wed, 27 Aug 2025 at 15:03, Patrick Palka wrote: >> > >> > On Wed, 27 Aug 2025, Jonathan Wakely wrote: >> > >> > > On Wed, 27 Aug 2025 at 11:05, Tomasz Kamiński >> wrote

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Tomasz Kaminski
On Wed, Aug 27, 2025 at 5:00 PM Jonathan Wakely wrote: > On Wed, 27 Aug 2025 at 15:03, Patrick Palka wrote: > > > > On Wed, 27 Aug 2025, Jonathan Wakely wrote: > > > > > On Wed, 27 Aug 2025 at 11:05, Tomasz Kamiński > wrote: > > > > > > > > For any minimum value of a signed type, its negation (

Re: [PATCH] libstdc++: Use _M_reverse to reverse partial_ordering using operator<=>

2025-08-27 Thread Jonathan Wakely
On Wed, 27 Aug 2025 at 15:53, Tomasz Kamiński wrote: > > The patch r16-3414-gfcb3009a32dc33 changed the representation of unordered to > optimize reversing of order, but it did not update implementation of reversing > operator<=>(0, partial_order). > > libstdc++-v3/ChangeLog: > > * libsupc

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Jonathan Wakely
On Wed, 27 Aug 2025 at 16:00, Jonathan Wakely wrote: > > On Wed, 27 Aug 2025 at 15:03, Patrick Palka wrote: > > > > On Wed, 27 Aug 2025, Jonathan Wakely wrote: > > > > > On Wed, 27 Aug 2025 at 11:05, Tomasz Kamiński wrote: > > > > > > > > For any minimum value of a signed type, its negation (wit

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Jonathan Wakely
On Wed, 27 Aug 2025 at 15:03, Patrick Palka wrote: > > On Wed, 27 Aug 2025, Jonathan Wakely wrote: > > > On Wed, 27 Aug 2025 at 11:05, Tomasz Kamiński wrote: > > > > > > For any minimum value of a signed type, its negation (with wraparound) > > > results > > > in the same value, behaving like ze

[PATCH] libstdc++: Use _M_reverse to reverse partial_ordering using operator<=>

2025-08-27 Thread Tomasz Kamiński
The patch r16-3414-gfcb3009a32dc33 changed the representation of unordered to optimize reversing of order, but it did not update implementation of reversing operator<=>(0, partial_order). libstdc++-v3/ChangeLog: * libsupc++/compare (operator<=>(__cmp_cat::__unspec, partial_orderin

Re: [PATCH 1/7] arm: [MVE intrinsics] rework vgetq_lane vsetq_lane

2025-08-27 Thread Christophe Lyon
FWIW, the series is on forgejo too: https://forge.sourceware.org/gcc/gcc-TEST/pulls/68 On Wed, 27 Aug 2025 at 16:45, Christophe Lyon wrote: > > Implement vgetq_lane and vsetq_lane using the new MVE builtins > framework. > > Although MVE intrinsics are not supported in big-endian mode, we keep > t

Re: [PATCH] arm_mve: Use inline asm for lsll and asrl MVE primitives

2025-08-27 Thread Christophe Lyon
Hi, On Thu, 14 Aug 2025 at 11:27, Christophe Lyon wrote: > > Hi Keith, > > On Tue, 12 Aug 2025 at 18:33, Keith Packard wrote: > > > > The C shift operators do not precisely match the associated ARM > > instructions: shifts of negative values or by negative amounts are > > undefined behavior in C

[PATCH 6/7] arm: [MVE intrinsics] rework sqshll srshrl uqshll urshrl

2025-08-27 Thread Christophe Lyon
Implement sqshll, srshrl, uqshll and urshrl using the new MVE builtins framework. gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (enum which_scalar_shift): Add ss_SQSHLL, ss_SRSHRL, ss_UQSHLL, and ss_URSHRL. (mve_function_scalar_shift): Add support for ss_SQSHLL, ss_

[PATCH 7/7] arm: [MVE intrinsics] rework sqrshr sqshl srshr uqrshl uqshl urshr

2025-08-27 Thread Christophe Lyon
Implement sqrshr, sqshl, srshr, uqrshl, uqshl and urshr using the new MVE builtins framework. The patch fixes a probable copy/paste typo in mve_sqshl_si and mve_srshr_si: operand 1 should have mode SI, and not DI. gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (enum which_scalar_sh

[PATCH 1/7] arm: [MVE intrinsics] rework vgetq_lane vsetq_lane

2025-08-27 Thread Christophe Lyon
Implement vgetq_lane and vsetq_lane using the new MVE builtins framework. Although MVE intrinsics are not supported in big-endian mode, we keep the code to convert lane indices into GCC's vector indices, so that it's already in place in case we want to support big-endian in the future. The patch

[PATCH 5/7] arm: [MVE intrinsics] rework asrl sqrshrl

2025-08-27 Thread Christophe Lyon
Implement asrl and sqrshrl using the new MVE builtins framework. The patch adds a testcase calling asrl (value, -10) to make sure the compiler keeps the asrl instruction: before this patch, it would be interpreted as undefined behavior and return 0. gcc/ChangeLog: * config/arm/arm-mve-bu

[PATCH 4/7] arm: [MVE intrinsics] rework lsll uqrshll uqrshll_sat48

2025-08-27 Thread Christophe Lyon
Implement lsll, uqrshll and uqrshll_sat48 using the new MVE builtins framework. The new enum uses the 'ss_' prefix to avoid clashes with some unspec identifiers in later patches in the series. gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (enum which_scalar_shift): New. (c

[PATCH 2/7] arm: [MVE intrinsics] rework vpnot

2025-08-27 Thread Christophe Lyon
Implement vpnot using the new MVE builtins framework. gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (class mve_function_vpnot): New. (vpnot): New. * config/arm/arm-mve-builtins-base.def (vpnot): New. * config/arm/arm-mve-builtins-base.h (vpnot): New.

[PATCH 3/7] arm: fix MVE asrl lsll lsrl patterns

2025-08-27 Thread Christophe Lyon
The thumb2_asrl, thumb2_lsll and thumb2_lsrl patterns were incorrectly using (match_dup 0) for the first argument of the shift operator. This patch replaces that with (match_operand:DI 1 "arm_general_register_operand" "0") and fixes the related expanders in arm.md to use t

Re: [PATCH] c++/modules: Add explanatory note for incomplete types with definition in different module [PR119844]

2025-08-27 Thread Jason Merrill
On 8/27/25 8:29 AM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? OK. -- >8 -- The confusion in the PR arose because the definition of 'User' in a separate named module did not provide an implementation for the forward-declaration in the global modul

Re: [PATCH v2 1/3] AArch64: Support C/C++ operations on svbool_t

2025-08-27 Thread Jason Merrill
On 8/27/25 6:36 AM, Tejas Belagod wrote: +tree +c_common_bool_type (unsigned int precision, bool unsigned_p) +{ +  /* Standard boolean types.  */ +  if (precision == TYPE_PRECISION (boolean_type_node) +  && unsigned_p == TYPE_UNSIGNED (boolean_type_node)) +    return boolean_type_node; + + 

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Tomasz Kaminski
On Wed, Aug 27, 2025 at 4:03 PM Patrick Palka wrote: > On Wed, 27 Aug 2025, Jonathan Wakely wrote: > > > On Wed, 27 Aug 2025 at 11:05, Tomasz Kamiński > wrote: > > > > > > For any minimum value of a signed type, its negation (with wraparound) > results > > > in the same value, behaving like zero

[PATCH 12/14] lto: Check partitioning of toplevel assembly related symbols

2025-08-27 Thread Michal Jires
Check that must_remain_in_tu is partitioned correctly, and that referenced_from_asm is not renamed. gcc/lto/ChangeLog: * lto-partition.cc (lto_1_to_1_map): must_remain_in_tu check. (privatize_symbol_name_1): referenced_from_asm check. --- gcc/lto/lto-partition.cc | 43 +

[PATCH 04/14] lto: Simplify control variable in loop of balanced partitioning

2025-08-27 Thread Michal Jires
Minor simplification as preparation for next patch. gcc/lto/ChangeLog: * lto-partition.cc (lto_balanced_map): Simplify. --- gcc/lto/lto-partition.cc | 26 -- 1 file changed, 8 insertions(+), 18 deletions(-) diff --git a/gcc/lto/lto-partition.cc b/gcc/lto/lto-part

[PATCH 14/14] lto: Fix SegFault in ICF caused by missing definition

2025-08-27 Thread Michal Jires
ICF assumes that all nodes in summaries are defined. lto_symtab_merge_symbols can set node's definition flag to false. remove_unreachable_nodes then releases its body without setting body_removed. This results in SegFault in ICF. It might be better to solve it in those other places, this patch is

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Patrick Palka
On Wed, 27 Aug 2025, Jonathan Wakely wrote: > On Wed, 27 Aug 2025 at 11:05, Tomasz Kamiński wrote: > > > > For any minimum value of a signed type, its negation (with wraparound) > > results > > in the same value, behaving like zero. Representing the unordered result > > with > > this minimum va

[PATCH 13/14] lto: Toplevel assembly tests

2025-08-27 Thread Michal Jires
gcc/testsuite/ChangeLog: * gcc.dg/lto/toplevel-asm_0.c: New test. * gcc.dg/lto/toplevel-asm_1.c: New test. * gcc.dg/lto/toplevel-asm_2.c: New test. --- gcc/testsuite/gcc.dg/lto/toplevel-asm_0.c | 8 gcc/testsuite/gcc.dg/lto/toplevel-asm_1.c | 7 +++ gcc/testsu

[PATCH 11/14] lto: Disable optimizations conflicting with must_remain_in_tu

2025-08-27 Thread Michal Jires
These optimizations may cross translation unit boundaries, so disable them when must_remain_in_tu is set. gcc/ChangeLog: * cif-code.def (MUST_REMAIN_IN_TU): New. * ipa-icf.cc (sem_function::equals_wpa): False with must_remain_in_tu. (sem_variable::equals_wpa): Like

[PATCH] Pass reduction var to vectorize_fold_left_reduction directly

2025-08-27 Thread Richard Biener
Instead of going via the PHI node def, use the scalar reduction input from the reduction stmt. Bootstrapped and tested on x86_64-unknown-linux-gnu. Sorry for the re-post, this is now re-based on pristine trunk, hopefully making git am happy. * tree-vect-loop.cc (vectorize_fold_left_reduc

[PATCH 09/14] lto: Add toplevel assembly heuristics

2025-08-27 Thread Michal Jires
This new pass heuristically detects symbols referenced by toplevel assembly to prevent their optimization. Heuristics is done by comparing identifiers in assembly to known symbols. The pass is split into 2 passes, in LGEN and in WPA. There must be one pass for WPA to be able to reference any symb

[PATCH 08/14] lto: Add toplevel assembly flags to symtab_node

2025-08-27 Thread Michal Jires
Add referenced_from_asm and must_remain_in_tu and propagate them for following patches. gcc/ChangeLog: * cgraph.h: Add flags. * cgraphclones.cc (cgraph_node::create_clone): Propagate flags. * lto-cgraph.cc (lto_output_node): Likewise. (lto_output_varpool_node): Lik

[PATCH 02/14] lto: Keep lto file data

2025-08-27 Thread Michal Jires
We use lto_file_data in 1to1 partitioning, so we need to not zero it out. Nothing depends on lto_file_data being NULL. gcc/ChangeLog: * cgraph.cc (cgraph_node::release_body): Keep lto_file_data. (cgraph_node::remove): likewise. * lto-section-in.cc (lto_free_function_in_dec

[PATCH 06/14] lto: Partition toplevel assembly in 1to1

2025-08-27 Thread Michal Jires
1to1 partitioning now also partitions toplevel assembly. Other partitionings keep the old behavior of putting all toplevel assembly into single partition. gcc/ChangeLog: * lto-cgraph.cc (compute_ltrans_boundary): Add asm_node. gcc/lto/ChangeLog: * lto-partition.cc (create_partit

[PATCH 10/14] lto: Forbid privatization of symbols referenced from assembly

2025-08-27 Thread Michal Jires
If a privatized symbol needs to be made public, it is renamed. That is not possible for symbols referenced from assembly. Thus we forbid privatization of those symbols. gcc/ChangeLog: * cgraph.h (cgraph_node::only_called_directly_or_aliased_p): Consider referenced_from_asm.

[PATCH 07/14] lto: Stream out partitioned toplevel assembly

2025-08-27 Thread Michal Jires
Toplevel assembly is now streamed as partitioned instead of into the first partition. gcc/ChangeLog: * lto-cgraph.cc (output_symtab): Remove asm_nodes_out. * lto-streamer-out.cc (lto_output_toplevel_asms): Use partitioning. (create_order_remap): Remove asm_nodes_ou

Re: [PATCH] Avoid mult pattern if that will break reduction constraints

2025-08-27 Thread Jakub Jelinek
On Wed, Aug 27, 2025 at 03:50:26PM +0200, Richard Biener wrote: > Ah, thanks for the write-up how to compute the number of uses. > > I'll test the following. > --- a/gcc/tree-vect-patterns.cc > +++ b/gcc/tree-vect-patterns.cc > @@ -4303,6 +4303,8 @@ vect_synth_mult_by_constant (vec_info *vinfo, tr

[PATCH 05/14] lto: Use toplevel_node in lto_symtab_encoder

2025-08-27 Thread Michal Jires
This patch replaces symtab_node with toplevel_node in lto_symtab_encoder and modifies all places where lto_symtab_encoder is used to handle (ignore) asm_node. gcc/ChangeLog: * ipa-icf.cc (sem_item_optimizer::write_summary): Use toplevel_node. (sem_item_optimizer::read_sect

[PATCH 03/14] cgraph: Add toplevel_node

2025-08-27 Thread Michal Jires
asm_node and symbol_node will now inherit from toplevel_node. This is now useful for lto partitioning, in future it should be also useful for toplevel extended assembly. gcc/ChangeLog: * cgraph.h (enum symtab_type): Replace with toplevel_type. (enum toplevel_type): New. (s

[PATCH 01/14] lto: Fix reversed sorting of node order.

2025-08-27 Thread Michal Jires
Sorting by node order in lto partitioning is incorrectly reversed. For default balanced partitioning this caused all noreorder symbols to be partitioned into a single partition where they were sorted again, but correctly. gcc/lto/ChangeLog: * lto-partition.cc (cmp_partitions_order): Rever

Re: [PATCH] PR target/89828 Inernal compiler error on "-fno-omit-frame-pointer"

2025-08-27 Thread Yoshinori Sato
On Tue, 26 Aug 2025 07:50:13 +0900, Jeff Law wrote: > > > > On 8/25/25 9:00 AM, Yoshinori Sato wrote: > > The problem was caused by an erroneous note about creating a stack frame, > > which caused the cur_cfa reg to fail to assert with a value other than > > the frame pointer. > > > > This fix

[PATCH 00/14] lto: Linux LTO toplevel assembly

2025-08-27 Thread Michal Jires
These patches allow us to handle toplevel assembly referencing symbols. Previous linux kernel patches needed to mark any such referenced symbols manually. Currently needed linux patches are here: https://gitlab.com/mixal_iirec/linux_gcc_lto_patches First part of these patches allows toplevel asse

Re: [PATCH] Avoid mult pattern if that will break reduction constraints

2025-08-27 Thread Richard Biener
On Wed, 27 Aug 2025, Jakub Jelinek wrote: > On Wed, Aug 27, 2025 at 02:48:33PM +0200, Richard Biener wrote: > > I don't understand how synth-mult works, but it does introduce > > multiple uses of a reduction variable which will ultimately > > fail vectorization (or ICE with a pending change). S

[PATCH] Pass reduction var to vectorize_fold_left_reduction directly

2025-08-27 Thread Richard Biener
Instead of going via the PHI node def, use the scalar reduction input from the reduction stmt. Bootstrapped and tested on x86_64-unknown-linux-gnu. I'll wait for risc-v CI and hope for aarch64 CI as well ... Thanks, Richard. * tree-vect-loop.cc (vectorize_fold_left_reduction): Get

Re: [PATCH] Avoid mult pattern if that will break reduction constraints

2025-08-27 Thread Jakub Jelinek
On Wed, Aug 27, 2025 at 02:48:33PM +0200, Richard Biener wrote: > I don't understand how synth-mult works, but it does introduce > multiple uses of a reduction variable which will ultimately > fail vectorization (or ICE with a pending change). So avoid > applying the pattern. I've tried to do s

[PATCH] Remove dead code

2025-08-27 Thread Richard Biener
The following removes trivially dead code. Built on x86_64-unknown-linux-gnu, pushed. * tree-vect-loop.cc (vect_transform_cycle_phi): Remove unused reduc_stmt_info. --- gcc/tree-vect-loop.cc | 2 -- 1 file changed, 2 deletions(-) diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vec

[PATCHv11] libstdc++: New generate_canonical impl (P0952, LWG2524) [PR119739]

2025-08-27 Thread Nathan Myers
Changes in v11: * Add doxygen entry for generate_canonical. Changes in v10: * Rewrite entirely after consultation with P0952 authors. * Require radix2 for all float types. * Perform all intermediate calculations on integers. * Use one implementation for all C++11 to C++26. * Optimize for eac

[committed v3] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Tomasz Kamiński
For any minimum value of a signed type, its negation (with wraparound) results in the same value, behaving like zero. Representing the unordered result with this minimum value, along with 0 for equal, 1 for greater, and -1 for less in partial_ordering, allows its value to be reversed using unary ne
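The property relied on here can be checked with a few lines of C++. The sketch below uses a plain signed char and a hypothetical helper name rather than the real partial_ordering internals; only the value assignment (-1 less, 0 equal, 1 greater, minimum value for unordered) comes from the message. It assumes the usual two's-complement conversion back to signed char, which C++20 guarantees and GCC has always provided.

```cpp
#include <climits>

// Hypothetical helper: negate a signed char with wraparound by doing
// the subtraction in unsigned arithmetic, which cannot overflow.
inline signed char negate_wrapping(signed char v)
{
    const unsigned char u = static_cast<unsigned char>(v);
    return static_cast<signed char>(static_cast<unsigned char>(0u - u));
}
```

Negation swaps -1 and 1, keeps 0, and maps SCHAR_MIN to itself, so one negation reverses less and greater while leaving equal and unordered unchanged.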

Re: [PATCH v2] match.pd: Fold (C << x) x -> 0 or 1

2025-08-27 Thread Richard Biener
On Mon, 25 Aug 2025, dhr...@nvidia.com wrote: > From: Dhruv Chawla > > For ==, < and <=, the fold is to 0. For !=, > and >=, the fold is to 1. > This only applies when C != 0. So -50 << 1 < 1 is true, so does this only work for unsigned types, or tree_expr_nonnegative_p in addition to tree_expr

[PATCH] libstdc++: Merge bind_front and bind_back binders

2025-08-27 Thread Tomasz Kamiński
The _Bind_front and _Bind_back class templates are now merged into a single _Binder implementation that accepts _Back as a template parameter. This makes the bind_back implementation available in C++20 mode, allowing it to be used for range adaptor closures. With zero bound arguments, bind_back an

[PATCH] Avoid mult pattern if that will break reduction constraints

2025-08-27 Thread Richard Biener
I don't understand how synth-mult works, but it does introduce multiple uses of a reduction variable which will ultimately fail vectorization (or ICE with a pending change). So avoid applying the pattern. I've tried to do so selectively, possibly preserving pattern-matching x * 4 as x << 2. So

Ping^6: [PATCH v4] get source line for diagnostic from preprocessed file [PR preprocessor/79106]

2025-08-27 Thread Bader, Lucas
Gentle ping for https://gcc.gnu.org/pipermail/gcc-patches/2025-March/676875.html

[PATCH] c++/modules: Add explanatory note for incomplete types with definition in different module [PR119844]

2025-08-27 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? -- >8 -- The confusion in the PR arose because the definition of 'User' in a separate named module did not provide an implementation for the forward-declaration in the global module. This seems likely to be a common mistake while p

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Tomasz Kaminski
On Wed, Aug 27, 2025 at 12:48 PM Jonathan Wakely wrote: > On Wed, 27 Aug 2025 at 11:36, Tomasz Kaminski wrote: > > > > Again, pretty printers needs to get updated: > > diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py > b/libstdc++-v3/python/libstdcxx/v6/printers.py > > index 5f5963cb595.

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Jonathan Wakely
On Wed, 27 Aug 2025 at 11:05, Tomasz Kamiński wrote: > > For any minimum value of a signed type, its negation (with wraparound) results > in the same value, behaving like zero. Representing the unordered result with > this minimum value, along with 0 for equal, 1 for greater, and -1 for less > in

Re: [PATCH] c++: Fix up cpp_warn on __STDCPP_FLOAT*_T__ [PR121520]

2025-08-27 Thread Jason Merrill
On 8/27/25 4:03 AM, Jakub Jelinek wrote: Hi! I got the cpp_warn on __STDCPP_FLOAT*_T__ if we aren't predefining those wrong, so e.g. on powerpc64le we don't diagnose #undef __STDCPP_FLOAT16_T__. I've added it as else if on the if (c_dialect_cxx () && cxx_dialect > cxx20 && !floatn_nx_types[i].ex

Re: [PATCH] c, c++: Allow &__real__ static_var in constant expressions [PR121678]

2025-08-27 Thread Jason Merrill
On 8/27/25 4:13 AM, Jakub Jelinek wrote: Hi! When looking at constexpr references, I've noticed staticp handles COMPONENT_REFs and ARRAY_REFs (the latter if the index is INTEGER_CST), but not {REAL,IMAG}PART_EXPR. I think that is incorrect and causes rejection of constexpr (for C++) or static c

Re: [PATCH] c++, v2: Fix ICE with parameter uses in expansion stmts [PR121575]

2025-08-27 Thread Jason Merrill
On 8/27/25 5:22 AM, Jakub Jelinek wrote: On Tue, Aug 26, 2025 at 12:26:17AM +0200, Jakub Jelinek wrote: Unfortunately that version (unlike the previous one) regresses: ... UNRESOLVED: std/ranges/access/rend.cc -std=gnu++26 compilation failed to produce executable Will try to investigate wha

Re: [PATCH v3 0/3] implement defer statements as per ts 25755

2025-08-27 Thread Joseph Myers
On Tue, 26 Aug 2025, Anna (navi) Figueiredo Gomes wrote: > this is similar to the failure test `i ()` in gcc/testsuite/gcc.dg/defer-2.c, > and the technical specification covers such case: > > > 6.4: > > Constraints: > > Jumps by means of goto in E shall not jump over a defer statement in E.

Re: [PATCH v2 2/2] testsuite: arm: factorize arm_v8_neon_ok flags

2025-08-27 Thread Torbjorn SVENSSON
On 2025-08-18 19:24, Christophe Lyon wrote: Like we do in other effective-targets, add "-mcpu=unset -march=armv8-a" directly when setting et_arm_v8_neon_flags in arm_v8_neon_ok_nocache, to avoid having to add these two flags in all users of arm_v8_neon_ok. This avoids duplication and possible

Re: [PATCH v4 9/9] aarch64: Add memtag-stack tests

2025-08-27 Thread Claudiu Zissulescu-Ianculescu
Hi, On 8/22/25 5:51 PM, Richard Sandiford wrote: >> +/* FIXME - scan-assembler-times-not subg ? */ >> +/* FIXME - generate stgp instead of stg + str ? */ > > Were you planning to address these in this series, or are they TODOs > for future work? Similarly for the other FIXMEs. I am planning a

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Jonathan Wakely
On Wed, 27 Aug 2025 at 11:36, Tomasz Kaminski wrote: > > Again, pretty printers needs to get updated: > diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py > b/libstdc++-v3/python/libstdcxx/v6/printers.py > index 5f5963cb595..cf510433fd4 100644 > --- a/libstdc++-v3/python/libstdcxx/v6/printe

Re: [PATCH] Restrict avx256_avoid_vec_perm only for loop vectorization.

2025-08-27 Thread Hongtao Liu
On Wed, Aug 27, 2025 at 4:53 PM Richard Biener wrote: > > On Wed, Aug 27, 2025 at 6:57 AM liuhongt wrote: > > > > Since kind == vec_perm may not be a real vec_perm, just a broadcast or > > simple load in BB vectorizer. > > Btw, you can now (in some cases) do better, namely you should > always hav

Re: [PATCH v2 1/3] AArch64: Support C/C++ operations on svbool_t

2025-08-27 Thread Tejas Belagod
+tree +c_common_bool_type (unsigned int precision, bool unsigned_p) +{ +  /* Standard boolean types.  */ +  if (precision == TYPE_PRECISION (boolean_type_node) +  && unsigned_p == TYPE_UNSIGNED (boolean_type_node)) +    return boolean_type_node; + +  /* Non-standard boolean types created by

Re: [PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Tomasz Kaminski
Again, pretty printers needs to get updated: diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py index 5f5963cb595..cf510433fd4 100644 --- a/libstdc++-v3/python/libstdcxx/v6/printers.py +++ b/libstdc++-v3/python/libstdcxx/v6/printers.py @@ -1749,7

[PATCH v2] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Tomasz Kamiński
For any minimum value of a signed type, its negation (with wraparound) results in the same value, behaving like zero. Representing the unordered result with this minimum value, along with 0 for equal, 1 for greater, and -1 for less in partial_ordering, allows its value to be reversed using unary ne
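
A minimal sketch of the arithmetic described in the message above, not the libsupc++ implementation: the enumerator names and the `reverse` helper are hypothetical, chosen only for illustration. It assumes C++20, where conversion to a signed type is modular, and shows that wraparound negation swaps -1 and 1 while leaving both 0 and the minimum `signed char` value unchanged.

```cpp
#include <cassert>
#include <limits>

// Hypothetical encoding using the values named in the patch description;
// these names are illustrative and are not taken from libsupc++.
enum : signed char {
  kLess = -1,
  kEqual = 0,
  kGreater = 1,
  kUnordered = std::numeric_limits<signed char>::min()
};

// Negate with wraparound by doing the subtraction in unsigned arithmetic,
// then converting back to signed char (modular conversion in C++20).
constexpr signed char reverse(signed char v) {
  return static_cast<signed char>(0u - static_cast<unsigned char>(v));
}

// Less and greater swap; equal and unordered are fixed points of negation.
static_assert(reverse(kLess) == kGreater);
static_assert(reverse(kGreater) == kLess);
static_assert(reverse(kEqual) == kEqual);
static_assert(reverse(kUnordered) == kUnordered);
```

Because the unordered value is a fixed point of negation, a comparison result can be reversed with one negate and no branch for the unordered case.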

[Patch, fortran] PR82843 - (PDT) Constructors with PDT components do not work.

2025-08-27 Thread Paul Richard Thomas
This patch corrects errors due to PDT components taking the PDT template as their type in PDT constructors and component references. The latter took a long time to debug because yours truly did not catch on to the basic problem until a light bulb moment, triggered by an excess of coffee :-) With th

[Patch, fortran] PR82205 - parametrized derived types, problems with initialization

2025-08-27 Thread Paul Richard Thomas
This patch corrects the form of PDT constructors so that they are standard conforming: structure-constructor is type-name [ ( type-param-spec-list ) ] ( [ component-spec-list ] ) At present, the type-param-spec-list for PDTs is rolled into the component-spec-list. The patch separates the type-par

Re: [PATCH] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Tomasz Kaminski
On Wed, Aug 27, 2025 at 11:15 AM Jonathan Wakely wrote: > On Wed, 27 Aug 2025 at 07:33, Tomasz Kamiński wrote: > > > > For any minimum value of a signed type, its negation (with wraparound) > results > > in the same value, behaving like zero. Representing the unordered result > with > > this min

Re: [PATCH] c++, v2: Fix ICE with parameter uses in expansion stmts [PR121575]

2025-08-27 Thread Jakub Jelinek
On Tue, Aug 26, 2025 at 12:26:17AM +0200, Jakub Jelinek wrote: > Unfortunately that version (unlike the previous one) regresses: ... > UNRESOLVED: std/ranges/access/rend.cc -std=gnu++26 compilation failed to > produce executable > > Will try to investigate what's going on tomorrow^H^H^Htoday. T

[PATCH] tree-optimization/121686 - failed SLP discovery for live recurrence

2025-08-27 Thread Richard Biener
The following adjusts the SLP build for only-live stmts to not only consider vect_induction_def and vect_internal_def that are not part of a reduction but instead consider all non-reduction defs that are not part of a reduction, specifically in this case a recurrence def. This is also a missed opt

Re: [PATCH] libsupc++: Change _Unordered comparison value to minimum value of signed char.

2025-08-27 Thread Jonathan Wakely
On Wed, 27 Aug 2025 at 07:33, Tomasz Kamiński wrote: > > For any minimum value of a signed type, its negation (with wraparound) results > in the same value, behaving like zero. Representing the unordered result with > this minimum value, along with 0 for equal, 1 for greater, and -1 for less > in
