[PATCH] middle-end/111591 - explain why TBAA doesn't need adjustment

2023-12-12 Thread Richard Biener
While tidying the prototype patch I did for the reduced testcase in PR111591, and in the process trying to produce a testcase that is miscompiled by stack slot coalescing with the TBAA info left unaltered, I realized we do not need to adjust TBAA info. The following documents this

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Richard Biener
On Tue, 12 Dec 2023, Peter Bergner wrote: > On 12/12/23 8:36 PM, Jason Merrill wrote: > > This test is failing for me below C++17, I think you need > > > > // { dg-do compile { target c++17 } } > > or > > // { dg-require-effective-target c++17 } > > Sorry about that. Should we do the above or

[PATCH] Force broadcast constant to mem for vec_dup{v4di, v8si, v4df, v8df} when TARGET_AVX2 is not available.

2023-12-12 Thread liuhongt
vpbroadcastd/vpbroadcastq is available under TARGET_AVX2, but the vec_dup{v4di,v8si} pattern is available under AVX with a memory operand. It will cause LRA/Reload to generate a spill and reload if we put the constant in a register. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ready push to

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread Xi Ruoyao
On Wed, 2023-12-13 at 14:32 +0800, Jiahao Xu wrote: > > On 2023/12/13 at 2:21 PM, Xi Ruoyao wrote: > > On Wed, 2023-12-13 at 14:17 +0800, Jiahao Xu wrote: > > > This test was extracted from the hot functions of 526.blender_r. Setting > > > LOGICAL_OP_NON_SHORT_CIRCUIT to 0 resulted in a 26% decrease in

[PATCH] RISC-V: Fix dynamic lmul tests depended on abi

2023-12-12 Thread demin . han
These two tests depend on -mabi. Other toolchain configs would report: fatal error: gnu/stubs-ilp32.h: No such file or directory gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-7.c: Fix abi issue * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-4.c:

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread Jiahao Xu
On 2023/12/13 at 2:21 PM, Xi Ruoyao wrote: On Wed, 2023-12-13 at 14:17 +0800, Jiahao Xu wrote: This test was extracted from the hot functions of 526.blender_r. Setting LOGICAL_OP_NON_SHORT_CIRCUIT to 0 resulted in a 26% decrease in dynamic instruction count and a 13.4% performance improvement. After
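For readers unfamiliar with the macro discussed in this thread, the two evaluation strategies it chooses between can be sketched as below. This is an illustration only, assuming side-effect-free float comparisons; the function names are hypothetical and the code is not taken from the 526.blender_r test mentioned above.

```cpp
// Short-circuit form: the second comparison is evaluated only when the
// first one is true, which typically costs a conditional branch.
bool both_short_circuit(float a, float b, float c, float d) {
    return (a < b) && (c > d);
}

// Non-short-circuit form: both comparisons are always evaluated and the
// two results are combined with a bitwise AND, avoiding the branch but
// always paying for the second (possibly expensive) compare.
bool both_non_short_circuit(float a, float b, float c, float d) {
    return static_cast<bool>((a < b) & (c > d));
}
```

Both functions return the same value for any inputs; which form is cheaper depends on the relative cost of a compare versus a branch on the target.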

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread Xi Ruoyao
On Wed, 2023-12-13 at 14:17 +0800, Jiahao Xu wrote: > This test was extracted from the hot functions of 526.blender_r. Setting > LOGICAL_OP_NON_SHORT_CIRCUIT to 0 resulted in a 26% decrease in dynamic > instruction count and a 13.4% performance improvement. After applying > the patch mentioned

Re: [gcc-wwwdocs PATCH] gcc-13/14: Mention recent update for x86_64 backend

2023-12-12 Thread Gerald Pfeifer
On Fri, 8 Dec 2023, Haochen Jiang wrote: > +++ b/htdocs/gcc-13/changes.html > +Based on ISA extensions enabled on Alder Lake, the switch further enables > +the AVX-IFMA, AVX-VNNI-INT8, AVX-NE-CONVERT, CMPccXADD, ENQCMD and UINTR > +ISA extensions. Personally I would alphabetically

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread Jiahao Xu
On 2023/12/13 at 2:27 AM, Xi Ruoyao wrote: On Tue, 2023-12-12 at 20:39 +0800, Xi Ruoyao wrote: On Tue, 2023-12-12 at 19:59 +0800, Jiahao Xu wrote: I guess here the problem is floating-point compare instruction is much more costly than other instructions but the fact is not correctly modeled yet.

[PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-12 Thread Juzhe-Zhong
Fix VSETVL bug where AVL is polluted .L15: li a3,9 lui a4,%hi(s) sw a3,%lo(j)(t2) sh a5,%lo(s)(a4) <--a4 holds the address of s beq t0,zero,.L42 sw t5,8(t4) vsetvli zero,a4,e8,m8,ta,ma <<--- a4 as avl Actually,

RE: [PATCH] [gcc-wwwdocs]gcc-13/14: Mention Intel new ISA and march support

2023-12-12 Thread Gerald Pfeifer
On Mon, 27 Nov 2023, Jiang, Haochen wrote: >> How about changing this to use "and", as in >> "The switch enables the AMX-FP16, PREFETCHI ISA extensions." >> ? > Ok for me. Done and pushed thusly. Gerald commit 617a25d7d89a9cce121e85b693eed1ee3f94354b Author: Gerald Pfeifer Date: Wed Dec

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 8:36 PM, Jason Merrill wrote: > This test is failing for me below C++17, I think you need > > // { dg-do compile { target c++17 } } > or > // { dg-require-effective-target c++17 } Sorry about that. Should we do the above or should we just add -std=c++17 to dg-options? ...or do we

Re: [PATCH] c++: End lifetime of objects in constexpr after destructor call [PR71093]

2023-12-12 Thread Jason Merrill
On 12/12/23 12:50, Jason Merrill wrote: On 12/12/23 10:24, Jason Merrill wrote: On 12/12/23 06:15, Jakub Jelinek wrote: On Tue, Dec 12, 2023 at 02:13:43PM +0300, Alexander Monakov wrote: On Tue, 12 Dec 2023, Jakub Jelinek wrote: On Mon, Dec 11, 2023 at 05:00:50PM -0500, Jason Merrill

Re: [PATCH] RISC-V: Add Zvfbfmin extension to the -march= option

2023-12-12 Thread Palmer Dabbelt
On Tue, 12 Dec 2023 19:24:51 PST (-0800), zengx...@eswincomputing.com wrote: This patch would like to add new sub extension (aka Zvfbfmin) to the -march= option. It introduces a new data type BF16. Depending on different usage scenarios, the Zvfbfmin extension may depend on 'V' or 'Zve32f'.

[PATCH] RISC-V: Don't make Ztso imply A

2023-12-12 Thread Palmer Dabbelt
I can't actually find anything in the ISA manual that makes Ztso imply A. In theory the memory ordering is just a different thing than the set of available instructions (ie, Ztso without A would still imply TSO for loads and stores). It also seems like a configuration that could be sane to

Re: [PATCH DejaGNU 1/1] Support per-test execution timeout factor

2023-12-12 Thread Jacob Bachmeyer
Maciej W. Rozycki wrote: Add support for the `test_timeout_factor' global variable letting a test case scale the wait timeout used for code execution. This is useful for particularly slow test cases for which increasing the wait timeout globally would be excessive. *

[PATCH] RISC-V: Add Zvfbfmin extension to the -march= option

2023-12-12 Thread Xiao Zeng
This patch would like to add new sub extension (aka Zvfbfmin) to the -march= option. It introduces a new data type BF16. Depending on different usage scenarios, the Zvfbfmin extension may depend on 'V' or 'Zve32f'. This patch only implements dependencies in scenario of Embedded Processor. In

[PATCH #2a/2] strub: indirect volatile parms in wrappers

2023-12-12 Thread Alexandre Oliva
[sorry that the previous, unfinished post got through] On Dec 12, 2023, Richard Biener wrote: > On Tue, Dec 12, 2023 at 3:03 AM Alexandre Oliva wrote: >> DECL_NOT_GIMPLE_REG_P (arg) = 0; > I wonder why you clear this at all? That code seems to be inherited from expand_thunk. ISTR that flag

[PATCH #2a/2]

2023-12-12 Thread Alexandre Oliva
On Dec 12, 2023, Richard Biener wrote: > On Tue, Dec 12, 2023 at 3:03 AM Alexandre Oliva wrote: >> DECL_NOT_GIMPLE_REG_P (arg) = 0; > I wonder why you clear this at all? That code seems to be inherited from expand_thunk. ISTR that flag was not negated when I started the strub implementation,

Re: PING^1 [PATCH] range: Workaround different type precision issue between _Float128 and long double [PR112788]

2023-12-12 Thread Kewen.Lin
Hi Jakub & Andrew, on 2023/12/12 22:42, Jakub Jelinek wrote: > On Tue, Dec 12, 2023 at 09:33:38AM -0500, Andrew MacLeod wrote: >> I leave this for the release managers, but I am not opposed to it for this >> release... It would be nice to remove it for the next release > > I can live with it for

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread chenglulu
On 2023/12/13 at 2:27 AM, Xi Ruoyao wrote: On Tue, 2023-12-12 at 20:39 +0800, Xi Ruoyao wrote: fld.s $f1,$r4,0 fld.s $f0,$r4,4 fld.s $f3,$r4,8 fld.s $f2,$r4,12 fcmp.slt.s $fcc1,$f0,$f3 fcmp.sgt.s $fcc0,$f1,$f2 movcf2gr

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Jason Merrill
On 12/12/23 17:50, Peter Bergner wrote: On 12/12/23 1:26 PM, Richard Biener wrote: On 12.12.2023 at 19:51, Peter Bergner wrote: On 12/12/23 12:45 PM, Peter Bergner wrote: +/* PR target/112822 */ Oops, this should be: /* PR tree-optimization/112822 */ It's fixed on my end. Ok Pushed

[PATCH] i386: Fix PR110790 testcase

2023-12-12 Thread Haochen Jiang
Hi all, This patch will fix the testcase fail previously introduced. Approved by another thread: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640288.html Pushed to trunk. Thx, Haochen gcc/testsuite/ChangeLog: * gcc.target/i386/pr110790-2.c: Change scan-assembler from shrq

RE: [RFC] Intel AVX10.1 Compiler Design and Support

2023-12-12 Thread Jiang, Haochen
> > On the other hand, a new EVEX-capable level might bring earlier adoption > > of EVEX capabilities to AMD CPUs, which still should be an improvement > > over AVX2. This could benefit AMD as well. So I would really like to > > see some AMD feedback here. > > > > There's also the matter that

[PATCH v2] LoongArch: Modify the check type of the vector builtin function.

2023-12-12 Thread chenxiaolong
On the LoongArch architecture, regression testing with the latest gcc14 shows that the vector test cases in the vector directory produce FAIL entries due to unmatched pointer types. To solve this kind of problem, the type of the variable in the check result is modified with the parameter

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread chenglulu
On 2023/12/13 at 2:27 AM, Xi Ruoyao wrote: fld.s $f1,$r4,0 fld.s $f0,$r4,4 fld.s $f3,$r4,8 fld.s $f2,$r4,12 fcmp.slt.s $fcc1,$f0,$f3 fcmp.sgt.s $fcc0,$f1,$f2 movcf2gr $r13,$fcc1 movcf2gr $r12,$fcc0

Re: [PATCH] aarch64/expr: Use ccmp when the outer expression is used twice [PR100942]

2023-12-12 Thread Andrew Pinski
On Tue, Dec 12, 2023 at 12:22 AM Andrew Pinski wrote: > > Ccmp is not used if the result of the and/ior is used by both > a GIMPLE_COND and a GIMPLE_ASSIGN. This improves the code generation > here by using ccmp in this case. > Two changes are required, first we need to allow the outer statement's

Re: [PATCH V4 1/3]rs6000: accurate num_insns_constant_gpr

2023-12-12 Thread Jiufu Guo
Hi, "Kewen.Lin" writes: > Hi Jeff, > > on 2023/12/11 11:26, Jiufu Guo wrote: >> Hi, >> >> Trunk gcc supports more constants to be built via two instructions: >> e.g. "li/lis; xori/xoris/rldicl/rldicr/rldic". >> And then num_insns_constant should also be updated. >> >> Function

Re: [PATCH V4 2/3] Using pli for constant splitting

2023-12-12 Thread Jiufu Guo
Hi, "Kewen.Lin" writes: > Hi, > > on 2023/12/11 11:26, Jiufu Guo wrote: >> Hi, >> >> For constant building e.g. r120=0x, which does not fit 'li or lis', >> 'pli' is used to build this constant via 'emit_move_insn'. >> >> While for a complicated constant, e.g. 0xULL,

Re: [PATCH] c++: Fix warmth propagation for member function templates

2023-12-12 Thread Jason Merrill
On 12/12/23 14:29, Jason Xu wrote: Support was recently added for class-level warmth attributes that are propagated to member functions. The current implementation ignores member function templates and this patch fixes that. Thanks! I'm applying this variant of the patch: From

Re: Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Hongtao Liu
On Tue, Dec 12, 2023 at 10:38 PM Jan Hubicka wrote: > > Hi, > this patch disables use of FMA in matrix multiplication loop for generic (for > x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U. > > For Intel this is neutral both on the matrix multiplication microbenchmark >

Re: [PATCH] c++: unifying constants vs their type [PR99186, PR104867]

2023-12-12 Thread Patrick Palka
On Tue, 12 Dec 2023, Patrick Palka wrote: > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK > for trunk? > > -- >8 -- > > When unifying constants we need to generally treat constants of > different types but same value as different, in light of auto template > parameters.

[PATCH] libcpp: Fix macro expansion for argument of __has_include [PR110558]

2023-12-12 Thread Lewis Hyatt
Hello- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110558 This is a small fix for the libcpp issue noted in the PR. Bootstrap + regtest all languages on x86-64 Linux. Is it ok for trunk please? Also, it's not a regression, having never worked since __has_include was introduced in GCC 5, but

Re: [PATCH GCC 1/1] testsuite: Support test execution timeout factor as a keyword

2023-12-12 Thread Jeff Law
On 12/12/23 07:04, Maciej W. Rozycki wrote: Add support for the `dg-test-timeout-factor' keyword letting a test case scale the wait timeout used for code execution, analogously to `dg-timeout-factor' used for code compilation. This is useful for particularly slow test cases for which

Re: [PATCH DejaGNU 1/1] Support per-test execution timeout factor

2023-12-12 Thread Jeff Law
On 12/12/23 07:04, Maciej W. Rozycki wrote: Add support for the `test_timeout_factor' global variable letting a test case scale the wait timeout used for code execution. This is useful for particularly slow test cases for which increasing the wait timeout globally would be excessive.

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 1:26 PM, Richard Biener wrote: >> On 12.12.2023 at 19:51, Peter Bergner wrote: >> >> On 12/12/23 12:45 PM, Peter Bergner wrote: >>> +/* PR target/112822 */ >> >> Oops, this should be: >> >> /* PR tree-optimization/112822 */ >> >> It's fixed on my end. > > Ok Pushed now that Martin

[PATCH v3] c++: fix ICE with sizeof in a template [PR112869]

2023-12-12 Thread Marek Polacek
On Fri, Dec 08, 2023 at 11:09:15PM -0500, Jason Merrill wrote: > On 12/8/23 16:15, Marek Polacek wrote: > > On Fri, Dec 08, 2023 at 12:09:18PM -0500, Jason Merrill wrote: > > > On 12/5/23 15:31, Marek Polacek wrote: > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > > >

[committed] libstdc++: Fix std::format("{}", 'c')

2023-12-12 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8-- When I added a fast path for std::format("{}", x) in r14-5587-g41a5ea4cab2c59 I forgot to handle char separately from other integral types. That caused std::format("{}", 'c') to return "99" instead of "c". libstdc++-v3/ChangeLog: *
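The kind of mistake described above can be reproduced outside libstdc++ with a small sketch. The two helpers below are hypothetical and are not the library's actual fast path; they only show how treating `char` like any other integral type yields its numeric code ("99" for 'c', assuming an ASCII execution character set), and how a separate `char` branch avoids that.

```cpp
#include <string>
#include <type_traits>

// Hypothetical fast path that handles every integral type the same way.
// A char argument is promoted to int, so 'c' is rendered as its code.
template <typename T>
std::string naive_fast_path(T value) {
    static_assert(std::is_integral_v<T>, "integral types only");
    return std::to_string(value);
}

// Hypothetical corrected fast path: char is formatted as a character,
// every other integral type as a number.
template <typename T>
std::string fixed_fast_path(T value) {
    static_assert(std::is_integral_v<T>, "integral types only");
    if constexpr (std::is_same_v<T, char>)
        return std::string(1, value);
    else
        return std::to_string(value);
}
```

The sketch requires C++17 for `if constexpr` and the `_v` type-trait helpers.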

[committed] libstdc++: Fix std::format output of %C for negative years

2023-12-12 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8-- During discussion of LWG 4022 I noticed that we do not correctly implement floored division for the century. We were just truncating towards zero, rather than applying the floor function. For negative values that rounds the wrong way.
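The difference between truncating and flooring is easy to see with a small sketch. The helper names below are hypothetical and this is not the libstdc++ code; it only illustrates why C++ integer division, which truncates towards zero, gives the wrong century for negative years.

```cpp
// Century obtained with plain integer division, which truncates
// towards zero in C++.
constexpr int truncated_century(int year) {
    return year / 100;
}

// Century obtained with floored division: when the remainder is
// negative, the truncated quotient is one too large, so step it down.
constexpr int floored_century(int year) {
    int quotient = year / 100;
    if (year % 100 < 0)
        --quotient;
    return quotient;
}
```

For year -1 the truncated result is 0 while the floored result is -1; for non-negative years the two functions agree.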

[committed] libstdc++: Remove redundant -std flags from Makefile

2023-12-12 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8-- In r14-4060-gc4baeaecbbf7d0 I moved some files from src/c++98 to src/c++11 but I didn't remove the redundant -std=gnu++11 flags for those files. The flags aren't needed now, because AM_CXXFLAGS for that directory already uses -std=gnu++11. This

[PATCH] btf: change encoding of forward-declared enums [PR111735]

2023-12-12 Thread David Faust
The BTF specification does not formally define a representation for forward-declared enum types such as: enum Foo; Forward-declarations for struct and union types are represented by BTF_KIND_FWD, which has a 1-bit flag distinguishing the two. The de-facto standard format used by other tools

Re: [PATCH] RISC-V: Apply vla vs. vls mode heuristic vector COST model

2023-12-12 Thread Robin Dapp
Given that it's almost verbatim aarch64's implementation and the general approach appears sensible, LGTM. Regards Robin

[PATCH] c++: unifying constants vs their type [PR99186, PR104867]

2023-12-12 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- When unifying constants we need to generally treat constants of different types but same value as different, in light of auto template parameters. This patch fixes this in a minimal way; it seems we could
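Why equal values of different types must be kept apart can be seen from a short C++17 sketch. The names `constant` and `holds_int` are made up for illustration and are not taken from the patch or its testcases.

```cpp
#include <type_traits>

// A class template with an auto non-type template parameter: the type
// of V is part of the template argument, not just its value.
template <auto V>
struct constant {};

// 1 (int) and 1u (unsigned int) have the same value but name
// different specializations.
static_assert(!std::is_same_v<constant<1>, constant<1u>>);
static_assert(std::is_same_v<constant<1>, constant<1>>);

// Deduction recovers both the value and its type from the argument.
template <auto V>
constexpr bool holds_int(constant<V>) {
    return std::is_same_v<decltype(V), int>;
}
```

If unification treated `1` and `1u` as the same constant, the first `static_assert` could not hold.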

Re: [PATCH] c++: Fix warmth propagation for member function templates

2023-12-12 Thread Marek Polacek
On Tue, Dec 12, 2023 at 07:29:40PM +, Jason Xu wrote: > Support was recently added for class-level warmth attributes that are > propagated to member functions. The current implementation ignores > member function templates and this patch fixes that. Thanks for the patch. Is there a bug in

[PATCH v4 3/3] RISC-V: Add support for XCVbi extension in CV32E40P

2023-12-12 Thread Mary Bennett
Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md Contributors: Mary Bennett Nandni Jamnadas Pietra Ferreira Charlie Keaney Jessica Mills Craig Blackmore Simon Cook Jeremy Bennett Helene Chelin gcc/ChangeLog: *

Re: [PATCH] c++: unifying FUNCTION_DECLs [PR93740]

2023-12-12 Thread Jason Merrill
On 12/12/23 13:40, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? OK. I considered removing the is_overloaded_fn test now as well, but it could in theory be hit (and not subsumed by the type_unknown_p test) for e.g. OVERLOAD of a single

[PATCH v4 2/3] RISC-V: Update XCValu constraints to match other vendors

2023-12-12 Thread Mary Bennett
gcc/ChangeLog: * config/riscv/constraints.md: CVP2 -> CV_alu_pow2. * config/riscv/corev.md: Likewise. --- gcc/config/riscv/constraints.md | 15 --- gcc/config/riscv/corev.md | 4 ++-- 2 files changed, 10 insertions(+), 9 deletions(-) diff --git

[PATCH v4 0/3] RISC-V: Support CORE-V XCVELW and XCVBI extensions

2023-12-12 Thread Mary Bennett
Thank you for reviewing my patches! v1 -> v2: * Bring the MEM into the operand for cv.elw. The new predicate is move_operand. * Add comment to riscv.md detailing why corev.md must appear before the generic riscv instructions. v2 -> v3: * Merge patterns for CORE-V branch immediate

[PATCH v4 1/3] RISC-V: Add support for XCVelw extension in CV32E40P

2023-12-12 Thread Mary Bennett
Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md Contributors: Mary Bennett Nandni Jamnadas Pietra Ferreira Charlie Keaney Jessica Mills Craig Blackmore Simon Cook Jeremy Bennett Helene Chelin gcc/ChangeLog: *

[PATCH] c++: Fix warmth propagation for member function templates

2023-12-12 Thread Jason Xu
Support was recently added for class-level warmth attributes that are propagated to member functions. The current implementation ignores member function templates and this patch fixes that. gcc/cp/ChangeLog: * class.cc (propagate_class_warmth_attribute): fix warmth propagation

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Richard Biener
> On 12.12.2023 at 19:51, Peter Bergner wrote: > > On 12/12/23 12:45 PM, Peter Bergner wrote: >> +/* PR target/112822 */ > > Oops, this should be: > > /* PR tree-optimization/112822 */ > > It's fixed on my end. Ok Richard > Peter > > > >

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 12:45 PM, Peter Bergner wrote: > +/* PR target/112822 */ Oops, this should be: /* PR tree-optimization/112822 */ It's fixed on my end. Peter

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 10:50 AM, Martin Jambor wrote: > The testcase has reasonable size but it is specific to ppc64le and its > altivec vectors. My plan is to ask the bug reporter to massage it into > a target specific testcase in bugzilla. Alternatively I can try to > craft a testcase from scratch but

[PATCH pushed] LoongArch: testsuite: Remove XFAIL in vect-ftint-no-inexact.c

2023-12-12 Thread Xi Ruoyao
After r14-6455 this no longer fails. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vect-ftint-no-inexact.c (xfail): Remove. --- Tested on loongarch64-linux-gnu. Pushed as obvious. gcc/testsuite/gcc.target/loongarch/vect-ftint-no-inexact.c | 3 +-- 1 file changed, 1 insertion(+), 2

[PATCH] c++: unifying FUNCTION_DECLs [PR93740]

2023-12-12 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? I considered removing the is_overloaded_fn test now as well, but it could in theory be hit (and not subsumed by the type_unknown_p test) for e.g. OVERLOAD of a single FUNCTION_DECL. I wonder if that's something we'd

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread Xi Ruoyao
On Tue, 2023-12-12 at 20:39 +0800, Xi Ruoyao wrote: > On Tue, 2023-12-12 at 19:59 +0800, Jiahao Xu wrote: > > > I guess here the problem is floating-point compare instruction is much > > > more costly than other instructions but the fact is not correctly > > > modeled yet.  Could you try > > >

Re: [PING][PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2023-12-12 Thread Richard Earnshaw
On 30/11/2023 12:55, Stamatis Markianos-Wright wrote: Hi Andre, Thanks for the comments, see latest revision attached. On 27/11/2023 12:47, Andre Vieira (lists) wrote: Hi Stam, Just some comments. +/* Recursively scan through the DF chain backwards within the basic block and +  

Re: [PATCH] c++: End lifetime of objects in constexpr after destructor call [PR71093]

2023-12-12 Thread Jason Merrill
On 12/12/23 10:24, Jason Merrill wrote: On 12/12/23 06:15, Jakub Jelinek wrote: On Tue, Dec 12, 2023 at 02:13:43PM +0300, Alexander Monakov wrote: On Tue, 12 Dec 2023, Jakub Jelinek wrote: On Mon, Dec 11, 2023 at 05:00:50PM -0500, Jason Merrill wrote: In discussion of PR71093 it came up

[pushed] testsuite: fix is_nothrow_default_constructible8.C

2023-12-12 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- This testcase uses variable templates, a C++14 feature. gcc/testsuite/ChangeLog: * g++.dg/ext/is_nothrow_constructible8.C: Require C++14. --- gcc/testsuite/g++.dg/ext/is_nothrow_constructible8.C | 2 +- 1 file changed, 1
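For reference, a variable template looks like the sketch below. The name `fits_in_pointer` is hypothetical and unrelated to the testcase; the point is only that this declaration form is rejected before C++14, which is why a test using it needs the matching target requirement.

```cpp
// A variable template (C++14): one constexpr variable per type T.
template <typename T>
constexpr bool fits_in_pointer = sizeof(T) <= sizeof(void*);

// Each use names a distinct instantiation of the variable.
static_assert(fits_in_pointer<char>, "char is never larger than a pointer here");
```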

Re: [V2] New pass for sign/zero extension elimination -- not ready for "final" review

2023-12-12 Thread Jeff Law
On 11/29/23 21:10, Joern Rennecke wrote: I originally computed mmask in carry_backpropagate from XEXP (x, 0), but abandoned that when I realized we also get called for RTX_OBJ things. I forgot to adjust the SIGN_EXTEND code, though. Fixed in the attached revised patch. Also made sure to

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Richard Biener
> On 12.12.2023 at 17:50, Martin Jambor wrote: > > Hi, > > PR 112822 revealed a corner case in load_assign_lhs_subreplacements > where it creates invalid gimple: an assignment where on the LHS there > is a complex variable which however is not a gimple register because > it has partial

Re: Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Alexander Monakov
On Tue, 12 Dec 2023, Richard Biener wrote: > On Tue, Dec 12, 2023 at 3:38 PM Jan Hubicka wrote: > > > > Hi, > > this patch disables use of FMA in matrix multiplication loop for generic > > (for > > x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U. > > > > For Intel this is

[PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Martin Jambor
Hi, PR 112822 revealed a corner case in load_assign_lhs_subreplacements where it creates invalid gimple: an assignment where on the LHS there is a complex variable which however is not a gimple register because it has partial defs and on the right hand side there is a VIEW_CONVERT_EXPR. This

Re: Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Jan Hubicka
> > This came up in a separate thread as well, but when doing reassoc of a > chain with > multiple dependent FMAs. > > I can't understand how this uarch detail can affect performance when > as in the testcase > the longest input latency is on the multiplication from a memory load. > Do we

Re: [PATCH] Treat "p" in asms as addressing VOIDmode

2023-12-12 Thread Maciej W. Rozycki
On Mon, 11 Dec 2023, Richard Sandiford wrote: > > It all seems a bit hackish. I don't think ports have had much success > > using 'p' through the decades. I think I generally ended up having to > > go with distinct constraints rather than relying on 'p'. > > > > OK for the trunk, but ewww. >

Re: [PATCH v3 2/6] libgomp, openmp: Add ompx_pinned_mem_alloc

2023-12-12 Thread Andrew Stubbs
On 12/12/2023 10:05, Tobias Burnus wrote: Hi Andrew, On 11.12.23 18:04, Andrew Stubbs wrote: This creates a new predefined allocator as a shortcut for using pinned memory with OpenMP.  The name uses the OpenMP extension space and is intended to be consistent with other OpenMP implementations

Re: [PATCH v3 08/11] aarch64: Generalize writeback ldp/stp patterns

2023-12-12 Thread Richard Sandiford
Alex Coplan writes: > Hi, > > This is a v3 patch which is rebased on top of the SME changes. > Otherwise it is the same as v2, posted here: > > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639367.html > > Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? > >

Re: [PATCH v2 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-12-12 Thread Richard Sandiford
Alex Coplan writes: > Hi, > > This is a v2 version which addresses feedback from Richard's review > here: > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637648.html > > I'll reply inline to address specific comments. > > Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? > >

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-12 Thread Richard Sandiford
Robin Dapp writes: >> - Change the second mode to vec_extract_optab. This is only a name >> lookup, and it seems more natural to continue using the real element mode. > > Am I understanding correctly that this implies we should provide > a vec_extractbi expander? (with the innermode being

Re: GCC/Rust libgrust-v2/to-submit branch

2023-12-12 Thread Thomas Schwinge
Hi Arthur, Pierre-Emmanuel! On 2023-12-12T10:39:50+0100, I wrote: > On 2023-11-27T16:46:08+0100, I wrote: >> On 2023-11-21T16:20:22+0100, Arthur Cohen wrote: >>> On 11/20/23 15:55, Thomas Schwinge wrote: Arthur and Pierre-Emmanuel have prepared a GCC/Rust libgrust-v2/to-submit branch:

Re: [PATCH] c++: End lifetime of objects in constexpr after destructor call [PR71093]

2023-12-12 Thread Jason Merrill
On 12/12/23 06:15, Jakub Jelinek wrote: On Tue, Dec 12, 2023 at 02:13:43PM +0300, Alexander Monakov wrote: On Tue, 12 Dec 2023, Jakub Jelinek wrote: On Mon, Dec 11, 2023 at 05:00:50PM -0500, Jason Merrill wrote: In discussion of PR71093 it came up that more clobber_kind options would be

[PATCH v3] aarch64,arm: Move branch-protection data to targets

2023-12-12 Thread Szabolcs Nagy
The branch-protection types are target specific, not the same on arm and aarch64. This currently affects pac-ret+b-key, but there will be a new type on aarch64 that is not relevant for arm. After the move, change aarch_ identifiers to aarch64_ or arm_ as appropriate. gcc/ChangeLog: *

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-12 Thread Robin Dapp
> - Change the second mode to vec_extract_optab. This is only a name > lookup, and it seems more natural to continue using the real element mode. Am I understanding correctly that this implies we should provide a vec_extractbi expander? (with the innermode being BImode here). Regards Robin

Re: Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Richard Biener
On Tue, Dec 12, 2023 at 3:38 PM Jan Hubicka wrote: > > Hi, > this patch disables use of FMA in matrix multiplication loop for generic (for > x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U. > > For Intel this is neutral both on the matrix multiplication microbenchmark >

Re: [PATCH] multiflags: fix doc warning properly

2023-12-12 Thread Joseph Myers
On Mon, 11 Dec 2023, Alexandre Oliva wrote: > On Dec 11, 2023, Joseph Myers wrote: > > > On Fri, 8 Dec 2023, Alexandre Oliva wrote: > >> @@ -20589,7 +20589,7 @@ allocation before or after interprocedural > >> optimization. > >> This option enables multilib-aware @code{TFLAGS} to be used to

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-12 Thread Richard Sandiford
Robin Dapp writes: > What also works is something like: > > scalar_mode extract_mode = innermode; > if (GET_MODE_CLASS (outermode) == MODE_VECTOR_BOOL) > extract_mode = smallest_int_mode_for_size > (GET_MODE_PRECISION (innermode)); > > however > >> So

Re: PING^1 [PATCH] range: Workaround different type precision issue between _Float128 and long double [PR112788]

2023-12-12 Thread Jakub Jelinek
On Tue, Dec 12, 2023 at 09:33:38AM -0500, Andrew MacLeod wrote: > I leave this for the release managers, but I am not opposed to it for this > release... It would be nice to remove it for the next release I can live with it for GCC 14, so ok, but it is very ugly. We should fix it in a better way

Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Jan Hubicka
Hi, this patch disables use of FMA in matrix multiplication loop for generic (for x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U. For Intel this is neutral both on the matrix multiplication microbenchmark (attached) and spec2k17 where the difference was within noise for

Re: PING^1 [PATCH] range: Workaround different type precision issue between _Float128 and long double [PR112788]

2023-12-12 Thread Andrew MacLeod
I leave this for the release managers, but I am not opposed to it for this release... It would be nice to remove it for the next release Andrew On 12/12/23 01:07, Kewen.Lin wrote: Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639140.html BR, Kewen on

[PATCH] RISC-V: Apply vla vs. vls mode heuristic vector COST model

2023-12-12 Thread Juzhe-Zhong
This patch applies the vla vs. vls mode heuristic, which fixes the following FAILs: FAIL: gcc.target/riscv/rvv/autovec/pr111751.c -O3 -ftree-vectorize scan-assembler-not vset FAIL: gcc.target/riscv/rvv/autovec/pr111751.c -O3 -ftree-vectorize scan-assembler-times li\\s+[a-x0-9]+,0\\s+ret 2 The root

Re: [PATCH] strub: add note on attribute access

2023-12-12 Thread Jan Hubicka
> On Dec 7, 2023, Alexandre Oliva wrote: > > > Thanks for raising the issue. Maybe there should be at least a comment > > there, and perhaps some asserts to check that pointer and reference > > types don't make to indirect_parms. > > Document why attribute access doesn't need the same

Re: [PATCH] ipa/92606 - properly handle no_icf attribute for variables

2023-12-12 Thread Jan Hubicka
> The following adds no_icf handling for variables where the attribute > was rejected. It also fixes the check for no_icf by checking both > the source and the targets decl. > > Bootstrap / regtest running on x86_64-unknown-linux-gnu. > > This would solve the AVR issue with merging of "progmem"

[PATCH] tree-optimization/112961 - include latch in if-conversion CSE

2023-12-12 Thread Richard Biener
The following makes sure to also process the (empty) latch when performing CSE on the if-converted loop body. That's important to get all uses of copies propagated out on the backedge as well. To avoid CSE on the PHI nodes itself which is prohibitive (see PR90402) this temporarily adds a fake

[PATCH DejaGNU 1/1] Support per-test execution timeout factor

2023-12-12 Thread Maciej W. Rozycki
Add support for the `test_timeout_factor' global variable letting a test case scale the wait timeout used for code execution. This is useful for particularly slow test cases for which increasing the wait timeout globally would be excessive. * baseboards/qemu.exp (qemu_load): Handle

[PATCH GCC 1/1] testsuite: Support test execution timeout factor as a keyword

2023-12-12 Thread Maciej W. Rozycki
Add support for the `dg-test-timeout-factor' keyword letting a test case scale the wait timeout used for code execution, analogously to `dg-timeout-factor' used for code compilation. This is useful for particularly slow test cases for which increasing the wait timeout globally would be excessive.

[PATCH DejaGNU/GCC 0/1] Support per-test execution timeout factor

2023-12-12 Thread Maciej W. Rozycki
Hi, This patch quasi-series makes it possible for individual test cases identified as being slow to request more time via the GCC test harness by providing a test execution timeout factor, applied to the tool execution timeout set globally for all the test cases. This is to avoid excessive

Re: [PATCH v2] RISC-V: Supports RISC-V Profiles in '-march' option.

2023-12-12 Thread Christoph Müllner
On Tue, Dec 12, 2023 at 1:08 PM Jiawei wrote: > > Supports RISC-V profiles[1] in -march option. > > Default input set the profile is before other formal extensions. > > V2: Fixes some format errors and adds code comments for parse function > Thanks for Jeff Law's review and comments. > >

Re: [PATCH] RISC-V: Refactor Dynamic LMUL codes

2023-12-12 Thread Robin Dapp
Yes, no harm in doing that. LGTM. Regards Robin

Re: [PATCH] tree-optimization/112736 - avoid overread with non-grouped SLP load

2023-12-12 Thread Richard Biener
On Tue, 12 Dec 2023, Richard Sandiford wrote: > Richard Biener writes: > > The following avoids over/under-read of storage when vectorizing > > a non-grouped load with SLP. Instead of forcing peeling for gaps > > use a smaller load for the last vector which might access excess > > elements.

Re: [PATCH] tree-optimization/112736 - avoid overread with non-grouped SLP load

2023-12-12 Thread Richard Sandiford
Richard Biener writes: > The following avoids over/under-read of storage when vectorizing > a non-grouped load with SLP. Instead of forcing peeling for gaps > use a smaller load for the last vector which might access excess > elements. This builds upon the existing optimization avoiding >

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread Xi Ruoyao
On Tue, 2023-12-12 at 19:59 +0800, Jiahao Xu wrote: > > I guess here the problem is floating-point compare instruction is much > > more costly than other instructions but the fact is not correctly > > modeled yet.  Could you try > >

Re: [RFC] Intel AVX10.1 Compiler Design and Support

2023-12-12 Thread Richard Biener
On Tue, Dec 12, 2023 at 10:05 AM Florian Weimer wrote: > > * Richard Biener: > > > If it were possible I'd axe x86_64-v4. Maybe we should add a x86_64-v3.5 > > that sits inbetween v3 and v4, offering AVX512 but restricted to 256bit > > (and obviously not requiring more of the AVX512 features

Re: Re: [RFC] RISC-V: Support RISC-V Profiles in -march option.

2023-12-12 Thread jiawei
-Original Message- From: "Jeff Law" Sent: 2023-12-12 00:15:44 (Tuesday) To: Jiawei , gcc-patches@gcc.gnu.org Cc: kito.ch...@sifive.com, pal...@dabbelt.com, christoph.muell...@vrull.eu Subject: Re: [RFC] RISC-V: Support RISC-V Profiles in -march option. On 11/20/23 12:14, Jiawei wrote: Supports

[committed] testsuite: Fix up test directive syntax errors

2023-12-12 Thread Jakub Jelinek
Hi! I've noticed +ERROR: gcc.dg/gomp/pr87887-1.c: syntax error in target selector ".-4" for " dg-warning 13 "unsupported return type ‘struct S’ for ‘simd’ functions" { target aarch64*-*-* } .-4 " +ERROR: gcc.dg/gomp/pr87887-1.c: syntax error in target selector ".-4" for " dg-warning 13

[PATCH v2] RISC-V: Supports RISC-V Profiles in '-march' option.

2023-12-12 Thread Jiawei
Supports RISC-V profiles[1] in -march option. Default input set the profile is before other formal extensions. V2: Fixes some format errors and adds code comments for parse function Thanks for Jeff Law's review and comments. [1]https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc

Re: [PATCH V3 3/4] OpenMP: Use enumerators for names of trait-sets and traits

2023-12-12 Thread Tobias Burnus
Hi Sandra, On 07.12.23 16:52, Sandra Loosemore wrote: This patch introduces enumerators to represent trait-set names and trait names, which makes it easier to use tables to control other behavior and for switch statements to dispatch on the tags. The tags are stored in the same place in the

Re: [PATCH] Adjust vectorized cost for reduction.

2023-12-12 Thread Richard Biener
On Tue, Dec 12, 2023 at 7:12 AM liuhongt wrote: > > x86 doesn't support horizontal reduction instructions, reduc_op_scal_m > is emulated with vec_extract_half + op(half vector length) > Take that into account when calculating cost for vectorization. > > Bootstrapped and regtested on
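The emulation described in the quoted message (extract half of the vector, apply the operation at half the vector length, repeat) can be modelled in scalar C. This is only a sketch for illustration and is not taken from the patch: the function name `reduce_add_by_halving` is hypothetical, the operation is assumed to be integer addition, and the lane count is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of a halving reduction (hypothetical helper, not GCC code).
   At each step the upper half of the live lanes is combined into the lower
   half, so the live width halves until one lane holds the result.
   The input buffer is overwritten.  */
int reduce_add_by_halving(int *v, size_t lanes)
{
    /* Assumption: lanes is a non-zero power of two.  */
    assert(lanes != 0 && (lanes & (lanes - 1)) == 0);

    for (size_t half = lanes / 2; half >= 1; half /= 2) {
        /* Models "vec_extract_half + op (half vector length)".  */
        for (size_t i = 0; i < half; i++)
            v[i] += v[i + half];
    }
    return v[0];
}
```

For an 8-lane input this takes three extract-and-add steps (8 to 4, 4 to 2, 2 to 1), which is the kind of per-step work a cost model for an emulated reduction would need to count.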

Re: [PATCH v2] LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT.

2023-12-12 Thread Jiahao Xu
On 2023/12/12 at 7:26 PM, Xi Ruoyao wrote: On Tue, 2023-12-12 at 19:14 +0800, Jiahao Xu wrote: Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0, for a short-circuit branch, use the short-circuit operation instead of the non-short-circuit operation. This gives a 1.8% improvement in SPECCPU 2017 fprate on
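As a rough illustration of the two forms this macro chooses between, both can be written by hand in C. This sketch is not from the patch; the function names are hypothetical, and the real transformation happens inside GCC's expression folding rather than in user code. Which form is faster depends on the target, which is the point of the discussion above.

```c
#include <stdbool.h>

/* Short-circuit form: the second comparison is evaluated only when the
   first one is true, typically through a conditional branch.  */
bool both_less_short_circuit(double a, double b, double c, double d)
{
    return a < b && c < d;
}

/* Non-short-circuit form: both comparisons are always evaluated and the
   two 0/1 results are combined with a bitwise AND, with no branch
   between them.  */
bool both_less_non_short_circuit(double a, double b, double c, double d)
{
    return (a < b) & (c < d);
}
```

Because the operands have no side effects, the two functions return the same value for every input; they differ only in how much comparison work is done and whether a branch is needed.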

Re: [PATCH #1/2] strub: handle volatile promoted args in internal strub [PR112938]

2023-12-12 Thread Richard Biener
On Tue, Dec 12, 2023 at 3:03 AM Alexandre Oliva wrote: > > > When generating code for an internal strub wrapper, don't clear the > DECL_NOT_GIMPLE_REG_P flag of volatile args, and gimplify them both > before and after any conversion. > > While at that, move variable TMP into narrower scopes so

[PATCH v8 2/2] Add gcov MC/DC tests for GDC

2023-12-12 Thread Jørgen Kvalsvik
This is a mostly straight port from the gcov-19.c tests from the C test suite. The only notable differences from C to D are that D flips the true/false outcomes for loop headers, and the D front end ties loop and ternary conditions to slightly different locus. The test for >64 conditions warning
