[PATCH] i386: Sync move_max/store_max with prefer-vector-width [PR112824]

2023-12-13 Thread Hongyu Wang
Hi, Currently move_max follows the tuning feature first, but ideally it should sync with prefer-vector-width when it is explicitly set to keep vector move and operation with same vector size. Bootstrapped/regtested on x86-64-pc-linux-gnu{-m32,} OK for trunk? gcc/ChangeLog: PR

[Committed] RISC-V: Add failed SLP testcase

2023-12-13 Thread Juzhe-Zhong
After recent RVV cost model tweak, I found this PR issue has been fixed. Add testcase and committed. PR target/112387 gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr112387.c: New test. --- .../vect/costmodel/riscv/rvv/pr112387.c | 19 +++ 1

Re: [PATCH] LoongArch: Use the movcf2gr instruction to implement cstore4

2023-12-13 Thread Jiahao Xu
The implementation of this patch has some issues. When I compile 521.wrf with -Ofast -mlasx -flto -muse-movcf2gr, it results in an ICE: during RTL pass: reload module_mp_fast_sbm.fppized.f90: In function 'fast_sbm.constprop': module_mp_fast_sbm.fppized.f90:1369:25: internal compiler error:

[PATCH] tree-optimization/110640 - testcase for fixed bug

2023-12-13 Thread Richard Biener
Pushed. PR tree-optimization/110640 * gcc.dg/torture/pr110640.c: New testcase. --- gcc/testsuite/gcc.dg/torture/pr110640.c | 22 ++ 1 file changed, 22 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/torture/pr110640.c diff --git

[PATCH] match.pd: Simplify (t * u) / (t * v) [PR112994]

2023-12-13 Thread Jakub Jelinek
Hi! On top of the previously posted patch, this simplifies say (x * 16) / (x * 4) into 4. Unlike the previous pattern, this is something we didn't fold previously on GENERIC, so I think it shouldn't be all wrapped with #if GIMPLE. The question whether there should be fold_overflow_warning for
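
A small C sketch of the arithmetic behind this fold (not the match.pd pattern itself; the function names are hypothetical): for signed operands, where overflow is undefined, (x * 16) / (x * 4) reduces to 16 / 4 whenever x is nonzero and neither product overflows.

```c
#include <assert.h>

/* Unsimplified form: (x * 16) / (x * 4).
   Assumes x != 0 and that neither product overflows int. */
int quotient_unfolded(int x)
{
    assert(x != 0);
    return (x * 16) / (x * 4);
}

/* Folded form: the common factor x cancels, leaving 16 / 4. */
int quotient_folded(int x)
{
    (void)x;
    return 16 / 4;
}
```

The two functions agree only under the stated assumptions; with wrapping (unsigned) arithmetic the cancellation would not be valid in general.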

[PATCH] match.pd: Simplify (t * u) / v -> t * (u / v) [PR112994]

2023-12-13 Thread Jakub Jelinek
Hi! The following testcase is optimized just on GENERIC (using strict_overflow_p = false; if (TREE_CODE (arg1) == INTEGER_CST && (tem = extract_muldiv (op0, arg1, code, NULL_TREE, _overflow_p)) != 0) { if

[committed] testsuite: Fix up target-enter-data-1.c on 32-bit targets

2023-12-13 Thread Jakub Jelinek
On Wed, Nov 29, 2023 at 11:43:05AM +, Julian Brown wrote: > * c-c++-common/gomp/target-enter-data-1.c: Adjust scan output. struct bar { int num_vectors; double *vectors; }; is 16 bytes only on 64-bit targets, on 32-bit ones it is just 8 bytes, so the explicit matching of the * 16
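
To make the size difference concrete, here is a minimal sketch assuming the usual LP64 and ILP32 ABIs (4-byte int, pointer-sized alignment for the pointer member). The helper names bar_size and bar_offset are hypothetical and only illustrate why a hard-coded "* 16" in the scan pattern fails on 32-bit targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bar {
    int num_vectors;
    double *vectors;
};

/* Element size used when scaling an index into an array of struct bar:
   16 on typical 64-bit targets (4-byte int, 4 bytes padding, 8-byte pointer),
   8 on typical 32-bit targets (4-byte int, 4-byte pointer). */
size_t bar_size(void)
{
    return sizeof(struct bar);
}

/* Byte offset of element n; guards against size_t overflow. */
size_t bar_offset(size_t n)
{
    assert(n <= SIZE_MAX / sizeof(struct bar));
    return n * sizeof(struct bar);
}
```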

RE: [PATCH] RISC-V: Add RVV builtin vectorization cost model

2023-12-13 Thread Li, Pan2
Committed, thanks Kito. Pan From: Kito Cheng Sent: Thursday, December 14, 2023 2:45 PM To: Juzhe-Zhong Cc: GCC Patches; Kito Cheng; Jeff Law; Robin Dapp Subject: Re: [PATCH] RISC-V: Add RVV builtin vectorization cost model LGTM Juzhe-Zhong <juzhe.zh...@rivai.ai> on Thu, Dec 14, 2023

Re: [PATCH] RISC-V: Add RVV builtin vectorization cost model

2023-12-13 Thread Kito Cheng
LGTM Juzhe-Zhong wrote on Thu, Dec 14, 2023 at 11:24: > This patch fixes PR11153: > > ble a1,zero,.L8 > addiw a5,a1,-1 > li a4,4 > addi sp,sp,-16 > mv a2,a0 > sext.w a3,a1 > bleu a5,a4,.L9 > srliw a4,a3,2 >

Re: Re: [PATCH v3 2/4] RISC-V: Add crypto vector builtin function.

2023-12-13 Thread juzhe.zh...@rivai.ai
I prefer all vector related function registration should be in the same function groups. like aarch64: /* A list of all SVE ACLE functions. */ static CONSTEXPR const function_group_info function_groups[] = { #define DEF_SVE_FUNCTION_GS(NAME, SHAPE, TYPES, GROUPS, PREDS) \ { #NAME, ::NAME,

[PATCH] RISC-V: Add RVV builtin vectorization cost model

2023-12-13 Thread Juzhe-Zhong
This patch fixes PR11153: ble a1,zero,.L8 addiw a5,a1,-1 li a4,4 addi sp,sp,-16 mv a2,a0 sext.w a3,a1 bleu a5,a4,.L9 srliw a4,a3,2 slli a4,a4,4 mv a5,a0 add a4,a4,a0

[PATCH] i386: Remove RAO-INT from Grand Ridge

2023-12-13 Thread Haochen Jiang
Hi all, According to ISE050 published at the end of September, RAO-INT will not be in Grand Ridge anymore. This patch aims to remove it. The documentation comes following: https://cdrdv2.intel.com/v1/dl/getContent/671368 Regtested on x86_64-pc-linux-gnu. Ok for trunk and backport to GCC13?

[r14-6515 Regression] FAIL: c-c++-common/gomp/target-enter-data-1.c -std=c++98 scan-tree-dump-times gimple "map\\(struct:\\*\\(f->bars \\+ \\(sizetype\\) \\(\\([^\\)]+\\) n \\* 16\\)\\) \\[len: 1\\]

2023-12-13 Thread haochen.jiang
On Linux/x86_64, 5fdb150cd4bf8f2da335e3f5c3a17aafcbc66dbe is the first bad commit commit 5fdb150cd4bf8f2da335e3f5c3a17aafcbc66dbe Author: Julian Brown Date: Mon Aug 14 12:41:56 2023 + OpenMP/OpenACC: Rework clause expansion and nested struct handling caused FAIL:

RE: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-13 Thread Zhu, Lipeng
On 2023/12/14 4:52, Thomas Schwinge wrote: > Hi Lipeng! > > On 2023-12-12T02:05:26+, "Zhu, Lipeng" wrote: > > On 2023/12/12 1:45, H.J. Lu wrote: > >> On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng > wrote: > >> > On 2023/12/9 23:23, Jakub Jelinek wrote: > >> > > On Sat, Dec 09, 2023 at

Re: [PATCH] c++: Implement P2582R1, CTAD from inherited constructors

2023-12-13 Thread Jason Merrill
On 11/27/23 10:58, Patrick Palka wrote: gcc/cp/ChangeLog: * cp-tree.h (type_targs_deducible_from): Adjust return type. * pt.cc (alias_ctad_tweaks): Handle C++23 inherited CTAD. (inherited_ctad_tweaks): Define. (type_targs_deducible_from): Return the deduced

Re: [pushed 1/4] c++: copy location to AGGR_INIT_EXPR

2023-12-13 Thread Jason Merrill
On 12/13/23 19:00, Marek Polacek wrote: On Wed, Dec 13, 2023 at 11:47:37AM -0500, Jason Merrill wrote: Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location information. I think the following should be an obvious

[PATCH] gprofng: a new GNU profiler

2023-12-13 Thread vladimir . mezentsev
From: Vladimir Mezentsev These are fixes for releases/gcc-13 for 31109 (gprofng not built and installed in a combined binutils+gcc build). I only cherry-picked 24552056fd5fc677c0d032f54a5cad1c4303d312 and tested my build. ChangeLog: * Makefile.def: Add gprofng module. *

Re: [PATCH] [ICE] Support vpcmov for V4HF/V4BF/V2HF/V2BF under TARGET_XOP.

2023-12-13 Thread Hongtao Liu
On Wed, Dec 13, 2023 at 7:59 PM Jakub Jelinek wrote: > > On Fri, Dec 08, 2023 at 03:12:00PM +0800, liuhongt wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ready push to trunk. > > > > gcc/ChangeLog: > > > > PR target/112904 > > * config/i386/mmx.md

Re: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936]

2023-12-13 Thread chenglulu
On 2023/12/13 at 9:20 PM, Xi Ruoyao wrote: On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote: On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote: Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark results on LA464 and LA664. This allows optimizations like "x * 17" to alsl,

[PATCH] libstdc++: Optimize std::is_trivially_destructible_v

2023-12-13 Thread Jonathan Wakely
Tested x86_64-linux. Does this look right? Can we do it faster, or simplify it? -- >8 -- This reduces the overhead of using std::is_trivially_destructible_v and as a result fixes some recent regressions seen with a non-default GLIBCXX_TESTSUITE_STDS env var: FAIL: 20_util/variant/87619.cc

Re: [PATCH] RISC-V: fix scalar crypto pattern

2023-12-13 Thread Jeff Law
On 12/13/23 02:03, Christoph Müllner wrote: On Wed, Dec 13, 2023 at 9:22 AM Liao Shihua wrote: In Scalar Crypto Built-In functions, some require immediate parameters, but register_operand is incorrectly used in the pattern. E.g.: __builtin_riscv_aes64ks1i(rs1,1) Before: li

[committed] Minor testsuite fallout from c99 changes

2023-12-13 Thread Jeff Law
The alpha port recently failed its weekly test due to a lack of a prototype for the syscall() routine. Fixed thusly and pushed to the trunk. Jeff commit acfd33620af3519b84baecedb0eb6618c2f599a6 Author: Jeff Law Date: Wed Dec 13 17:24:39 2023 -0700 [committed] Minor testsuite fallout

Re: [pushed 1/4] c++: copy location to AGGR_INIT_EXPR

2023-12-13 Thread Marek Polacek
On Wed, Dec 13, 2023 at 11:47:37AM -0500, Jason Merrill wrote: > Tested x86_64-pc-linux-gnu, applying to trunk. > > -- 8< -- > > When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location > information. I think the following should be an obvious fix, so I'll check it in. --

Re: [PATCH 2/2] aarch64: Handle autoinc addresses in ld1rq splitter [PR112906]

2023-12-13 Thread Richard Sandiford
Alex Coplan writes: > This patch uses the new force_reload_address routine added by the > previous patch to fix PR112906. > > Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? OK, thanks, and sorry for the breakage. Richard > > Thanks, > Alex > > gcc/ChangeLog: > > PR

Re: [PATCH 1/2] emit-rtl, lra: Move lra's emit_inc to emit-rtl.cc

2023-12-13 Thread Richard Sandiford
Alex Coplan writes: > Hi, > > In PR112906 we ICE because we try to use force_reg to reload an > auto-increment address, but force_reg can't do this. > > With the aim of fixing the PR by supporting reloading arbitrary > addresses in pre-RA splitters, this patch generalizes >

Re: [PATCH v3 10/11] aarch64: Add new load/store pair fusion pass

2023-12-13 Thread Richard Sandiford
Thanks for the update. The new comments are really nice, and I think make the implementation much easier to follow. I was going to say OK with the changes below, but there's one question/ comment near the end about the double list walk. Alex Coplan writes: > +// Convenience wrapper around

Re: [PATCH] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]

2023-12-13 Thread Peter Bergner
On 11/24/23 3:28 AM, Kewen.Lin wrote: >> + int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode); > > Is it intentional to keep GET_MODE_SIZE (V16QImode) instead of 16? > I think if one day NUM_POLY_INT_COEFFS isn't 1 on rs6000 any more, > we have to add one explicit .to_constant ()

[PATCH V3] RISC-V: XFAIL scan dump fails for autovec PR111311

2023-12-13 Thread Edwin Lu
Clean up scan dump failures on linux rv64 vector targets Juzhe mentioned could be ignored for now. This will help reduce noise and make it more obvious if a bug or regression is introduced. The failures that are still reported are either execution failures or failures that are also present on

[PATCH 2/2] aarch64: Handle autoinc addresses in ld1rq splitter [PR112906]

2023-12-13 Thread Alex Coplan
This patch uses the new force_reload_address routine added by the previous patch to fix PR112906. Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? Thanks, Alex gcc/ChangeLog: PR target/112906 * config/aarch64/aarch64-sve.md (@aarch64_vec_duplicate_vq_le): Use

[PATCH 1/2] emit-rtl, lra: Move lra's emit_inc to emit-rtl.cc

2023-12-13 Thread Alex Coplan
Hi, In PR112906 we ICE because we try to use force_reg to reload an auto-increment address, but force_reg can't do this. With the aim of fixing the PR by supporting reloading arbitrary addresses in pre-RA splitters, this patch generalizes lra-constraints.cc:emit_inc and makes it available to the

RE: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-13 Thread Thomas Schwinge
Hi Lipeng! On 2023-12-12T02:05:26+, "Zhu, Lipeng" wrote: > On 2023/12/12 1:45, H.J. Lu wrote: >> On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng wrote: >> > On 2023/12/9 23:23, Jakub Jelinek wrote: >> > > On Sat, Dec 09, 2023 at 10:39:45AM -0500, Lipeng Zhu wrote: >> > > > This patch tries to

Re: [PATCH v3] c++: fix ICE with sizeof in a template [PR112869]

2023-12-13 Thread Jason Merrill
On 12/12/23 17:48, Marek Polacek wrote: On Fri, Dec 08, 2023 at 11:09:15PM -0500, Jason Merrill wrote: On 12/8/23 16:15, Marek Polacek wrote: On Fri, Dec 08, 2023 at 12:09:18PM -0500, Jason Merrill wrote: On 12/5/23 15:31, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu,

Fix 'libgomp/config/linux/allocator.c' 'size_t' vs. '%ld' format string mismatch (was: Build breakage)

2023-12-13 Thread Thomas Schwinge
Hi! On 2023-12-13T20:36:40+0100, I wrote: > On 2023-12-13T11:15:54-0800, Jerry D via Gcc wrote: >> I am getting this failure to build from clean trunk. > > This is due to commit r14-6499-g348874f0baac0f22c98ab11abbfa65fd172f6bdd > "libgomp: basic pinned memory on Linux", which supposedly was
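
For readers unfamiliar with this class of build break, a minimal sketch of the portable way to print a size_t follows. It is only an illustration of the mismatch named in the subject; it is not taken from the patch, and the function name format_size is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a size_t with "%zu", which matches size_t on every target.
   "%ld" expects a (signed) long, so it mismatches size_t in signedness
   everywhere and in width on targets where long and size_t differ,
   which is what -Wformat diagnoses. */
int format_size(char *buf, size_t buflen, size_t value)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%zu", value);
}
```

An explicit cast to the type the format expects is the other common remedy; which one the actual fix uses is not stated in the snippet above.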

Re: [PATCH] c++: unifying constants vs their type [PR99186, PR104867]

2023-12-13 Thread Jason Merrill
On 12/12/23 16:21, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? OK. -- >8 -- When unifying constants we need to generally treat constants of different types but same value as different, in light of auto template parameters. This patch

Re: [PATCH] libcpp: Fix valgrind errors on pr88974.c [PR112956]

2023-12-13 Thread Jason Merrill
On 12/13/23 03:39, Jakub Jelinek wrote: Hi! On the c-c++-common/cpp/pr88974.c testcase I'm seeing ==600549== Conditional jump or move depends on uninitialised value(s) ==600549== at 0x1DD3A05: cpp_get_token_1(cpp_reader*, unsigned int*) (macro.cc:3050) ==600549== by 0x1DBFC7F:

[pushed] c++: TARGET_EXPR location in default arg [PR96997]

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- My r14-6505-g52b4b7d7f5c7c0 change to copy the location in build_aggr_init_expr reopened PR96997; let's fix it properly this time, by clearing the location like we do for other trees. PR c++/96997 gcc/cp/ChangeLog: *

Re: [PATCH] libgccjit: Add ability to get CPU features

2023-12-13 Thread Antoni Boucher
David: Ping. I guess if we want to have this merged for this release, it should be sooner rather than later (if it's still an option). On Thu, 2023-11-09 at 18:04 -0500, David Malcolm wrote: > On Thu, 2023-11-09 at 17:27 -0500, Antoni Boucher wrote: > > Hi. > > This patch adds support for getting

Re: [RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64

2023-12-13 Thread Kees Cook
On Wed, Dec 13, 2023 at 05:01:07PM +0800, Wang wrote: > On 2023/12/13 16:48, Dan Li wrote: > > + Likun > > > > On Tue, 28 Mar 2023 at 06:18, Sami Tolvanen wrote: > >> On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra > >> wrote: > >>> On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote: > >>> >

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Jason Merrill
On 12/13/23 11:26, Jakub Jelinek wrote: On Wed, Dec 13, 2023 at 11:24:42AM -0500, Jason Merrill wrote: gcc/testsuite/ChangeLog: * g++.dg/pr112822.C: Require C++17. --- gcc/testsuite/g++.dg/pr112822.C | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/testsuite/g++.dg/pr112822.C

Re: [pushed 1/4] c++: copy location to AGGR_INIT_EXPR

2023-12-13 Thread Patrick Palka
On Wed, 13 Dec 2023, Jason Merrill wrote: > Tested x86_64-pc-linux-gnu, applying to trunk. > > -- 8< -- > > When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location > information. > > gcc/cp/ChangeLog: > > * tree.cc (build_aggr_init_expr): Copy EXPR_LOCATION. I made

Re: [committed v2] aarch64: Add missing driver-aarch64 dependencies

2023-12-13 Thread Richard Sandiford
Andrew Carlotti writes: > On Sat, Dec 09, 2023 at 06:42:17PM +, Richard Sandiford wrote: >> Andrew Carlotti writes: >> The .def files are included in TM_H by: >> >> TM_H += $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \ >> $(srcdir)/config/aarch64/aarch64-tuning-flags.def \ >>

Re: [PATCH v2] aarch64: Fix +nocrypto handling

2023-12-13 Thread Richard Sandiford
Andrew Carlotti writes: > Additionally, replace all checks for the AARCH64_FL_CRYPTO bit with > checks for (AARCH64_FL_AES | AARCH64_FL_SHA2) instead. The value of the > AARCH64_FL_CRYPTO bit within isa_flags is now ignored, but it is > retained because removing it would make processing the data

Re: [PATCH] c++: Fix tinst_level::to_list [PR112968]

2023-12-13 Thread Jason Merrill
On 12/13/23 04:49, Jakub Jelinek wrote: Hi! With valgrind checking, there are various errors reported on some C++26 libstdc++ tests, like: ==2009913== Conditional jump or move depends on uninitialised value(s) ==2009913== at 0x914C59: gt_ggc_mx_lang_tree_node(void*) (gt-cp-tree.h:107)

Re: [r14-6468 Regression] FAIL: std/time/year/io.cc -std=gnu++26 execution test on Linux/x86_64

2023-12-13 Thread Jonathan Wakely
On Wed, 13 Dec 2023 at 10:51, haochen.jiang wrote: > > On Linux/x86_64, > > a01462ae8bafa86e7df47a252917ba6899d587cf is the first bad commit > commit a01462ae8bafa86e7df47a252917ba6899d587cf > Author: Jonathan Wakely > Date: Mon Dec 11 15:33:59 2023 + > > libstdc++: Fix std::format

[PATCH] middle-end: Fix up constant handling in emit_conditional_move [PR111260]

2023-12-13 Thread Andrew Pinski
After r14-2667-gceae1400cf24f329393e96dd9720, we force a constant to a register if it is shared with one of the other operands. The problem is that we used the comparison mode for the register, but that could be different from the operand mode. This causes some issues on some targets. To fix it, we either

[pushed 2/4] c++: constant direct-initialization [PR108243]

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- When testing the proposed patch for PR71093 I noticed that it changed the diagnostic for consteval-prop6.C. I then noticed that the diagnostic wasn't very helpful either way; it was complaining about modification of the 'x' variable, but

[pushed 4/4] c++: End lifetime of objects in constexpr after destructor call [PR71093]

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. This is modified from Nathaniel's last version by adjusting for my recent CLOBBER changes and removing the special handling of __in_chrg which is no longer needed since my previous commit. -- 8< -- This patch adds checks for using objects after

[pushed 3/4] c++: fix in-charge parm in constexpr

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- I was puzzled by the proposed patch for PR71093 specifically ignoring the in-charge parameter; the problem turned out to be that when cxx_eval_call_expression jumps from the clone to the cloned function, it assumes that the latter has the

[pushed 1/4] c++: copy location to AGGR_INIT_EXPR

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location information. gcc/cp/ChangeLog: * tree.cc (build_aggr_init_expr): Copy EXPR_LOCATION. gcc/testsuite/ChangeLog: *

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2023 at 11:24:42AM -0500, Jason Merrill wrote: > gcc/testsuite/ChangeLog: > > * g++.dg/pr112822.C: Require C++17. > --- > gcc/testsuite/g++.dg/pr112822.C | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/gcc/testsuite/g++.dg/pr112822.C

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Jason Merrill
On 12/12/23 21:36, Jason Merrill wrote: On 12/12/23 17:50, Peter Bergner wrote: On 12/12/23 1:26 PM, Richard Biener wrote: On 12.12.2023 at 19:51, Peter Bergner wrote: On 12/12/23 12:45 PM, Peter Bergner wrote: +/* PR target/112822 */ Oops, this should be: /* PR

Re: [PATCH v4] A new copy propagation and PHI elimination pass

2023-12-13 Thread Richard Biener
> On 13.12.2023 at 17:12, Filip Kastl wrote: > > >> Hi, this is a patch that I submitted two months ago as an RFC. I added some polish since. It is a new lightweight pass that removes redundant PHI functions and as a bonus does basic copy

Re: [PATCH] tree-optimization/111807 - ICE in verify_sra_access_forest

2023-12-13 Thread Richard Biener
> On 13.12.2023 at 17:07, Martin Jambor wrote: > > Hi, > > sorry for getting to this only so late, my email backlog from my medical > leave still isn't empty. > >> On Mon, Oct 16 2023, Richard Biener wrote: >> The following addresses build_reconstructed_reference failing to >> build

[PATCH v4] A new copy propagation and PHI elimination pass

2023-12-13 Thread Filip Kastl
> > > Hi, > > > > > > this is a patch that I submitted two months ago as an RFC. I added some > > > polish > > > since. > > > > > > It is a new lightweight pass that removes redundant PHI functions and as a > > > bonus does basic copy propagation. With Jan Hubička we measured that it > > > is

Re: [PATCH] tree-optimization/111807 - ICE in verify_sra_access_forest

2023-12-13 Thread Martin Jambor
Hi, sorry for getting to this only so late, my email backlog from my medical leave still isn't empty. On Mon, Oct 16 2023, Richard Biener wrote: > The following addresses build_reconstructed_reference failing to > build references with a different offset than the models and thus > the caller

Re: Disable FMADD in chains for Zen4 and generic

2023-12-13 Thread Jan Hubicka
> > The difference is that Cores understand the fact that fmadd does not need > > all three parameters to start computation, while Zen cores don't. > > > > Since this seems noticeable win on zen and not loss on Core it seems like > > good > > default for generic. > > > > I plan to commit the

[committed] amdgcn: XNACK support

2023-12-13 Thread Andrew Stubbs
Some AMD GCN devices support an "XNACK" mode in which the device can handle page-misses (and maybe other traps in memory instructions), but it's not completely invisible to software. We need this now to support OpenMP Unified Shared Memory (I plan to post updated patches for that in January),

[PATCH v2] extend.texi: Fix typos in LSX intrinsics

2023-12-13 Thread Jiajie Chen
Several typos have been found and fixed: missing semicolons, using variable name instead of type, duplicate functions and wrong types. gcc/ChangeLog: * doc/extend.texi(__lsx_vabsd_di): remove extra `i' in name. (__lsx_vfrintrm_d, __lsx_vfrintrm_s, __lsx_vfrintrne_d,

[wwwdocs][patch] gcc-14/changes.html + project/gomp/: Update OpenMP status

2023-12-13 Thread Tobias Burnus
Attached is an in-between update for the release notes and also for the project status page. The latter contains an implementation-status page that is updated based on the libgomp.texi entries; I think there are more issues, but I found an incomplete update which is now fixed. I probably need

[PATCH v2] aarch64: Fix +nopredres, +nols64 and +nomops

2023-12-13 Thread Andrew Carlotti
On Sat, Dec 09, 2023 at 07:22:49PM +, Richard Sandiford wrote: > Andrew Carlotti writes: > > ... > > This is the only use of native_detect_p, so it'd be good to remove > the field itself. Done > > ... > > > > @@ -447,6 +451,13 @@ host_detect_local_cpu (int argc, const char **argv) > >

[PATCH v2] aarch64: Fix +nocrypto handling

2023-12-13 Thread Andrew Carlotti
Additionally, replace all checks for the AARCH64_FL_CRYPTO bit with checks for (AARCH64_FL_AES | AARCH64_FL_SHA2) instead. The value of the AARCH64_FL_CRYPTO bit within isa_flags is now ignored, but it is retained because removing it would make processing the data in option-extensions.def
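
The kind of check described here can be sketched in a few lines of C. This is an illustration only: the EXAMPLE_FL_* bit positions are invented, the real AARCH64_FL_* definitions are not reproduced, and the sketch assumes that "checks for (AARCH64_FL_AES | AARCH64_FL_SHA2)" means both bits must be set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define EXAMPLE_FL_AES    (UINT64_C(1) << 0)
#define EXAMPLE_FL_SHA2   (UINT64_C(1) << 1)
#define EXAMPLE_FL_CRYPTO (UINT64_C(1) << 2)  /* legacy bit, ignored below */

/* True only when both AES and SHA2 are enabled; the legacy crypto bit
   plays no part in the decision. */
bool example_has_crypto(uint64_t isa_flags)
{
    const uint64_t needed = EXAMPLE_FL_AES | EXAMPLE_FL_SHA2;
    assert((needed & EXAMPLE_FL_CRYPTO) == 0);
    return (isa_flags & needed) == needed;
}
```

Comparing the masked value against the full mask, rather than testing it for nonzero, is what makes the check require both features.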

[committed v2] aarch64 testsuite: Check entire .arch string

2023-12-13 Thread Andrew Carlotti
Add a terminating newline to various tests, and add missing extensions to some test strings. The current output is broken for options_set_4.c, so this test is left unchanged, to be fixed in a subsequent patch. Committed as obvious, with options_set_4.c removed compared to v1.

Re: [PATCH v2 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-12-13 Thread Richard Sandiford
Alex Coplan writes: > On 12/12/2023 15:58, Richard Sandiford wrote: >> Alex Coplan writes: >> > Hi, >> > >> > This is a v2 version which addresses feedback from Richard's review >> > here: >> > >> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637648.html >> > >> > I'll reply inline

[committed v2] aarch64: Add missing driver-aarch64 dependencies

2023-12-13 Thread Andrew Carlotti
On Sat, Dec 09, 2023 at 06:42:17PM +, Richard Sandiford wrote: > Andrew Carlotti writes: > The .def files are included in TM_H by: > > TM_H += $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \ > $(srcdir)/config/aarch64/aarch64-tuning-flags.def \ >

Re: [RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64

2023-12-13 Thread Mark Rutland
On Wed, Dec 13, 2023 at 05:01:07PM +0800, Wang wrote: > On 2023/12/13 16:48, Dan Li wrote: > > + Likun > > > > On Tue, 28 Mar 2023 at 06:18, Sami Tolvanen wrote: > >> On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra wrote: > >>> On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote: > >>> > In

Re: [PATCH v4] aarch64: SVE/NEON Bridging intrinsics

2023-12-13 Thread Richard Sandiford
Richard Ball writes: > ACLE has added intrinsics to bridge between SVE and Neon. > > The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and > SVE vectors. > > This patch adds support to GCC for the following 3 intrinsics: > svset_neonq, svget_neonq and svdup_neonq > >

Re: [PATCH v3 1/6] libgomp: basic pinned memory on Linux

2023-12-13 Thread Andrew Stubbs
On 12/12/2023 09:02, Tobias Burnus wrote: On 11.12.23 18:04, Andrew Stubbs wrote: Implement the OpenMP pinned memory trait on Linux hosts using the mlock syscall.  Pinned allocations are performed using mmap, not malloc, to ensure that they can be unpinned safely when freed. This

[PATCH] LoongArch: Use the movcf2gr instruction to implement cstore4

2023-12-13 Thread Xi Ruoyao
We used a branch to load floating-point comparison results into GPR. This is very slow when the branch is not predictable. Use the movcf2gr instruction to implement cstore4 if movcf2gr is fast enough. gcc/ChangeLog: * config/loongarch/genopts/loongarch.opt.in (muse-movcf2gr): New

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Peter Bergner
On 12/13/23 2:05 AM, Jakub Jelinek wrote: > On Wed, Dec 13, 2023 at 08:51:16AM +0100, Richard Biener wrote: >> On Tue, 12 Dec 2023, Peter Bergner wrote: >> >>> On 12/12/23 8:36 PM, Jason Merrill wrote: This test is failing for me below C++17, I think you need // { dg-do compile {

RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

2023-12-13 Thread Tamar Christina
> > > else if (vect_use_mask_type_p (stmt_info)) > > > { > > > unsigned int precision = stmt_info->mask_precision; > > > scalar_type = build_nonstandard_integer_type (precision, 1); > > > vectype = get_mask_type_for_scalar_type (vinfo, scalar_type, > > > group_size); > > >

RE: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread Li, Pan2
Committed with below comments, thanks Juzhe and Robin. Pan -Original Message- From: Robin Dapp Sent: Wednesday, December 13, 2023 9:56 PM To: Li, Pan2 ; gcc-patches@gcc.gnu.org Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai Subject: Re: [PATCH v1] RISC-V: Refine test cases for both

Re: Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-13 Thread 钟居哲
Thanks Richard. LGTM for RISC-V part. Thanks Robin for fixing it. juzhe.zh...@rivai.ai From: Richard Sandiford Date: 2023-12-13 22:05 To: Robin Dapp CC: Richard Biener; gcc-patches; juzhe.zhong@rivai.ai Subject: Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773]. Robin Dapp

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-13 Thread Richard Sandiford
Robin Dapp writes: > @@ -1758,16 +1759,19 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 > bitsize, poly_uint64 bitnum, >if (VECTOR_MODE_P (outermode) && !MEM_P (op0)) > { >scalar_mode innermode = GET_MODE_INNER (outermode); >enum insn_code icode > =

[committed] aarch64 testsuite: Only run aarch64-ssve tests once

2023-12-13 Thread Andrew Carlotti
Results verified by running `RUNTESTFLAGS="aarch64-ssve.exp=*" make -k -j 56 check-gcc` before and after the change. I initially spotted the issue because the tests were being run a nondeterministic number of times during unrelated regression testing. Committed as obvious.

Re: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread Robin Dapp
Thanks, LGTM but please add a comment like: These test cases used to cause out-of-bounds writes to the stack and therefore showed unreliable behavior. Depending on the execution environment they can either pass or fail. As of now, with the latest QEMU version, they will pass even without the

Re: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread juzhe.zhong
LGTM from my side, but I'd like to see Robin's comments. Thanks. Replied Message From: pan2...@intel.com Date: 12/13/2023 21:49 To: gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai, pan2...@intel.com, rdapp@gmail.com Subject: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

[PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread pan2 . li
From: Pan Li Refine the test cases for: * Name convention. * Add run case. PR target/112929 PR target/112988 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr112929.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr112929-1.c: ...here. *

Re: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936]

2023-12-13 Thread Xi Ruoyao
On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote: On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote: Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark results on LA464 and LA664. This allows optimizations like "x * 17" to alsl, and "x * 68" to alsl and slli.
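
The strength reduction mentioned here rests on simple identities: 17 * x = (x << 4) + x, and 68 * x = ((x << 4) + x) << 2. The C sketch below models a shift-and-add step in the style of alsl, followed by a plain left shift. The function names are hypothetical, and the model uses wrapping 32-bit unsigned arithmetic rather than the exact instruction semantics.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a shift-and-add step: (a << sa) + b, modulo 2^32.
   The shift amount is restricted to 1..4 in this sketch. */
uint32_t shift_add(uint32_t a, uint32_t b, unsigned sa)
{
    assert(sa >= 1 && sa <= 4);
    return (a << sa) + b;
}

/* x * 17 as one shift-and-add: (x << 4) + x. */
uint32_t times17(uint32_t x)
{
    return shift_add(x, x, 4);
}

/* x * 68 as a shift-and-add followed by a shift: ((x << 4) + x) << 2. */
uint32_t times68(uint32_t x)
{
    return shift_add(x, x, 4) << 2;
}
```

Whether the compiler picks these sequences over a multiply depends on the relative instruction costs, which is what the patch adjusts.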

[committed] RISC-V:Add crypto vector implied ISA info.

2023-12-13 Thread Feng Wang
Because the crypto vector extension depends on the Vector extension, add the implied ISA info for the corresponding crypto vector extension. gcc/ChangeLog: * common/config/riscv/riscv-common.cc: Modify implied ISA info. * config/riscv/arch-canonicalize: Add crypto vector

Re: Re: [PATCH v3 2/4] RISC-V: Add crypto vector builtin function.

2023-12-13 Thread Feng Wang
2023-12-13 18:18 juzhe.zhong wrote: > > >+    multiple_p (GET_MODE_BITSIZE (e.arg_mode (0)), >+    GET_MODE_BITSIZE (e.arg_mode (1)), ); > >Change it into gcc_assert (multiple_p (...)) > >+/* A list of all Vector Crypto intrinsic functions.  */ >+static function_group_info

Re: [PATCH 3/3] LoongArch: Add alslsi3_extend

2023-12-13 Thread chenglulu
LGTM! Thanks! On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote: Following the instruction cost fix, we are generating alsl.w $a0, $a0, $a0, 4 instead of li.w $t0, 17 mul.w $a0, $t0 for "x * 17", because alsl.w is 4 times faster than mul.w. But we didn't have a sign-extending pattern for

Re: [PATCH 1/3] LoongArch: Include rtl.h for COSTS_N_INSNS instead of hard coding our own

2023-12-13 Thread chenglulu
LGTM! Thanks. On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote: With loongarch-def.cc switched from C to C++, we can include rtl.h for COSTS_N_INSNS, instead of hard coding our own. This is a non-functional change for now, but it will make the code more future-proof in case COSTS_N_INSNS in rtl.h would be

[PATCH 6/6] Defer assigning vector types until after VF is determined

2023-12-13 Thread Richard Biener
The following defers, for non-gather/scatter and non-pattern stmts, setting of STMT_VINFO_VECTYPE until after we computed the desired vectorization factor. This allows us to use larger vector types when the vectorization factor and the preferred vector mode allow, reducing the number of vector

[PATCH 3/6] Query an appropriate offset vector type in vect_gather_scatter_fn_p

2023-12-13 Thread Richard Biener
The gather_load optab and friends require the offset vector mode to have the same number of lanes as the data vector mode. Restrict the vector type query to that when searching for a proper offset type. * tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Use

[PATCH 5/6] Allow poly_uint64 for group_size args to vector type query routines

2023-12-13 Thread Richard Biener
The following changes the unsigned group_size argument to a poly_uint64 one to avoid too much special-casing in callers for VLA vectors when passing down the effective maximum desirable vector size to vector type query routines. The intent is to be able to pass down the vectorization factor

[PATCH 2/6] Set LOOP_VINFO_VECT_FACTOR only when it is final

2023-12-13 Thread Richard Biener
The following makes sure to keep LOOP_VINFO_VECT_FACTOR at the undetermined value zero until it is final, making LOOP_VINFO_VECT_FACTOR an rvalue and changing some direct references to use the macro. * tree-vectorizer.h (LOOP_VINFO_VECT_FACTOR): Make an rvalue. * tree-vect-loop.cc

[PATCH 4/6] More explicit vector types

2023-12-13 Thread Richard Biener
This reduces more calls to get_vectype_for_scalar_type. * tree-vect-loop.cc (vect_transform_cycle_phi): Specify the vector type for invariant/external defs. * tree-vect-stmts.cc (vectorizable_shift): For invariant or external shifted operands use the result vector

[committed] libstdc++: Fix regression in std::format output of %Y for negative years

2023-12-13 Thread Jonathan Wakely
It seems that what I pushed didn't match what I tested, due to testing on a different machine! Tested x86_64-linux, on the right machine this time. Pushed to trunk. -- >8 -- The change in r14-6468-ga01462ae8bafa8 was only supposed to apply to %C formats, not %Y. libstdc++-v3/ChangeLog:

[PATCH 1/6] Reduce the number of get_vectype_for_scalar_type calls

2023-12-13 Thread Richard Biener
The following removes get_vectype_for_scalar_type calls when we already have the vector type computed. It also avoids some premature and possibly redundant or unnecessary checks during data-ref analysis for gathers. * tree-vect-data-refs.cc (vect_analyze_data_refs): Do not check

[PATCH][0/6][RFC] Relax single-vector-size restriction

2023-12-13 Thread Richard Biener
I've been asked to look into how to best relax the current restriction of the vectorizer that it prefers to use a single vector size throughout loop vectorization. That size is determined by the preferred_simd_mode and the autovectorize_vector_modes hook for other-than-first iterations. The

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread juzhe.zhong
OK. Will add it later. Replied Message From: Robin Dapp Date: 12/13/2023 20:23 To: juzhe.zhong Cc: rdapp@gmail.com, gcc-patches@gcc.gnu.org, kito.ch...@gmail.com, kito.ch...@sifive.com, jeffreya...@gmail.com Subject: Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS] > Do you mean

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
> Do you mean add some comments in tests? I meant add it as a run test as well and comment that the test has caused out-of-bounds writes before and passed by the time of adding it (or so) and is kept regardless. Regards Robin

Re: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936]

2023-12-13 Thread chenglulu
On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote: Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark results on LA464 and LA664. This allows optimizations like "x * 17" to alsl, and "x * 68" to alsl and slli. gcc/ChangeLog: PR target/112936 *
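The two multiplications mentioned above can be modelled in C as a shift-and-add followed by an optional shift. This is a sketch of the arithmetic only: `alsl_w`, `mul17`, and `mul68` are hypothetical names, and the decomposition shown for 68 is one possible sequence, not necessarily the one the compiler emits.

```c
#include <stdint.h>

/* Models the arithmetic of alsl.w rd, rj, rk, sa: rd = (rj << sa) + rk
   in 32 bits.  Unsigned arithmetic keeps wraparound well defined in C;
   sign extension of the result to 64 bits is not modelled here.  */
static uint32_t alsl_w(uint32_t rj, uint32_t rk, unsigned sa)
{
    return (rj << sa) + rk;
}

/* x * 17 == (x << 4) + x: a single alsl.w with a shift amount of 4.  */
static uint32_t mul17(uint32_t x)
{
    return alsl_w(x, x, 4);
}

/* x * 68 == (x * 17) << 2: an alsl.w followed by a shift left by 2.  */
static uint32_t mul68(uint32_t x)
{
    return alsl_w(x, x, 4) << 2;
}
```

Both helpers agree with plain multiplication modulo 2^32, which is what allows the multiply to be replaced when the shift-and-add sequence is cheaper according to the cost table.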

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-13 Thread Robin Dapp
Thanks. The attached v2 goes with your suggestion and adds a vec_extractbi expander. Apart from that it keeps the MODE_PRECISION changes from before and uses insn_data[icode].operand[0]'s mode. Apart from that no changes on the riscv side. Bootstrapped and regtested on x86 and aarch64. On

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread juzhe.zhong
Do you mean add some comments in tests? Replied Message From: Robin Dapp Date: 12/13/2023 20:16 To: juzhe.zhong Cc: rdapp@gmail.com, gcc-patches@gcc.gnu.org, kito.ch...@gmail.com, kito.ch...@sifive.com, jeffreya...@gmail.com Subject: Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
> I don't choose to run since I didn't have issue run on my local > simulator no matter qemu or spike. Yes it was flaky. That's kind of expected with the out-of-bounds writes we did. They can depend on runtime environment and other factors. Of course it's a bit counterintuitive to add a

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread juzhe.zhong
I don't choose to run since I didn't have issue run on my local simulator no matter qemu or spike. So it's better to check vsetvl asm. full available is not consistent between LCM analysis and earliest fusion, so it's safe to postpone it. Replied Message From: Robin Dapp Date: 12/13/2023 20:08

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
Hi Juzhe, in general looks OK to me. Just a question for understanding: > - if (header_info.valid_p () > - && (anticipated_exp_p (header_info) || block_info.full_available)) Why is full_available true if we cannot use it? > +/* { dg-do compile } */ It would be nice if we could

Re: [PATCH] [ICE] Support vpcmov for V4HF/V4BF/V2HF/V2BF under TARGET_XOP.

2023-12-13 Thread Jakub Jelinek
On Fri, Dec 08, 2023 at 03:12:00PM +0800, liuhongt wrote: > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ready push to trunk. > > gcc/ChangeLog: > > PR target/112904 > * config/i386/mmx.md (*xop_pcmov_): New define_insn. > > gcc/testsuite/ChangeLog: > > *

RE: [PATCH v2] RISC-V: Fix dynamic lmul tests depended on abi

2023-12-13 Thread Li, Pan2
Committed, thanks all. Pan From: juzhe.zh...@rivai.ai Sent: Wednesday, December 13, 2023 7:16 PM To: demin.han ; gcc-patches Cc: Li, Pan2 Subject: Re: [PATCH v2] RISC-V: Fix dynamic lmul tests depended on abi LGTM.
