[PATCH] vect: Cost adjacent vector loads/stores together [PR111784]

2023-10-17 Thread Kewen.Lin
Hi, As noted in comments [1][2], this patch changes how some adjacent vector loads/stores are costed, from costing them one by one to costing them together once with the total count. It helps to fix the exposed regression PR111784 on aarch64, as aarch64 specific costing could make different decisions

[PATCH] RISC-V: Optimize consecutive permutation index pattern by vrgather.vi/vx

2023-10-17 Thread Juzhe-Zhong
This patch optimizes the following permutation with a consecutive index pattern: typedef char vnx16i __attribute__ ((vector_size (16))); #define MASK_16 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15 vnx16i __attribute__ ((noinline, noclone)) test_1 (vnx16i x, vnx16i y) {

Re: [PATCH V2 14/14] RISC-V: P14: Adjust and add testcases

2023-10-17 Thread juzhe.zh...@rivai.ai
OK juzhe.zh...@rivai.ai From: Lehua Ding Date: 2023-10-17 19:35 To: gcc-patches CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding Subject: [PATCH V2 14/14] RISC-V: P14: Adjust and add testcases This sub-patch adjusts some testcases and adds some bugfix testcases. PR

Re: [PATCH V2 13/14] RISC-V: P13: Reorganize functions used to modify RTL

2023-10-17 Thread juzhe.zh...@rivai.ai
OK juzhe.zh...@rivai.ai From: Lehua Ding Date: 2023-10-17 19:34 To: gcc-patches CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding Subject: [PATCH V2 13/14] RISC-V: P13: Reorganize functions used to modify RTL This sub-patch reorganizes the functions that are used to modify

Re: [PATCH V2 12/14] RISC-V: P12: Delete riscv-vsetvl.h

2023-10-17 Thread juzhe.zh...@rivai.ai
OK juzhe.zh...@rivai.ai From: Lehua Ding Date: 2023-10-17 19:34 To: gcc-patches CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding Subject: [PATCH V2 12/14] RISC-V: P12: Delete riscv-vsetvl.h This sub-patch deletes the unused header file riscv-vsetvl.h since we no need

Re: [PATCH V2 09/14] RISC-V: P9: Cleanup post optimize phase

2023-10-17 Thread juzhe.zh...@rivai.ai
LGTM. juzhe.zh...@rivai.ai From: Lehua Ding Date: 2023-10-17 19:34 To: gcc-patches CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding Subject: [PATCH V2 09/14] RISC-V: P9: Cleanup post optimize phase This sub-patch deletes partial post optimize code(which implement in the

Re: [PATCH V2 08/14] RISC-V: P8: Unified insert and delete of vsetvl insn into Phase 4

2023-10-17 Thread juzhe.zh...@rivai.ai
LGTM. juzhe.zh...@rivai.ai From: Lehua Ding Date: 2023-10-17 19:34 To: gcc-patches CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding Subject: [PATCH V2 08/14] RISC-V: P8: Unified insert and delete of vsetvl insn into Phase 4 This sub-patch moves the modification of rtl

Re: [PATCH V2 07/14] RISC-V: P7: Move earliest fuse and lcm code to pre_vsetvl class

2023-10-17 Thread juzhe.zh...@rivai.ai
LGTM. juzhe.zh...@rivai.ai From: Lehua Ding Date: 2023-10-17 19:34 To: gcc-patches CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding Subject: [PATCH V2 07/14] RISC-V: P7: Move earliest fuse and lcm code to pre_vsetvl class This patch adjust move the code phase 2 and 3

Re: [PATCH V2 06/14] RISC-V: P6: Add computing reaching definition data flow

2023-10-17 Thread juzhe.zh...@rivai.ai
Copy and paste the original comments: -/* Compute the local properties of each recorded expression. - - Local properties are those that are defined by the block, irrespective of - other blocks. - - An expression is transparent in a block if its operands are not modified - in the block. -

Re: [PATCH V2 06/14] RISC-V: P6: Add computing reaching definition data flow

2023-10-17 Thread juzhe.zh...@rivai.ai
compute_vsetvl_lcm_data -> compute_lcm_local_properties juzhe.zh...@rivai.ai From: Lehua Ding Date: 2023-10-17 19:34 To: gcc-patches CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding Subject: [PATCH V2 06/14] RISC-V: P6: Add computing reaching definition data flow

Re: [PATCH V2 05/14] RISC-V: P5: combine phase 1 and 2

2023-10-17 Thread juzhe.zh...@rivai.ai
LGTM on algorithm of local analysis. juzhe.zh...@rivai.ai From: Lehua Ding Date: 2023-10-17 19:34 To: gcc-patches CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding Subject: [PATCH V2 05/14] RISC-V: P5: combine phase 1 and 2 This sub-patch combines phase 1 and 2 to use the

Re: [PATCH V2 11/14] RISC-V: P11: Adjust vector_block_info to vsetvl_block_info class

2023-10-17 Thread juzhe.zh...@rivai.ai
+ const vsetvl_info _header_info () const + { +gcc_assert (!empty_p ()); +return infos.is_empty () ? m_info : infos[0]; + } Change it into get_entry_info (be consistent with mode-switching naming which also uses LCM). + const vsetvl_info _footer_info () const + { +gcc_assert

Re: [PATCH V2 04/14] RISC-V: P4: move method from pass_vsetvl to pre_vsetvl

2023-10-17 Thread juzhe.zh...@rivai.ai
LGTM this patch. juzhe.zh...@rivai.ai From: Lehua Ding Date: 2023-10-17 19:34 To: gcc-patches CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding Subject: [PATCH V2 04/14] RISC-V: P4: move method from pass_vsetvl to pre_vsetvl This sub-patch remove the method about

Re: [PATCH V2 03/14] RISC-V: P3: Refactor vector_infos_manager

2023-10-17 Thread juzhe.zh...@rivai.ai
+ demand_system dem; + auto_vec vector_block_infos; + + /* data for avl reaching defintion. */ + sbitmap avl_regs; + sbitmap *avl_def_in; + sbitmap *avl_def_out; + sbitmap *reg_def_loc; + + /* data for vsetvl info reaching defintion. */ + vsetvl_info unknow_info; + auto_vec

Re: [PATCH V2 00/14] Refactor and cleanup vsetvl pass

2023-10-17 Thread Lehua Ding
Hi Patrick, Thanks a lot for reporting these failures, very important. I'll locate the causes since my previous run was with these parameters: -march=gcv_zvfh_zfh + -cmodel=medany + spike did not encounter these fails. On 2023/10/18 4:25, Patrick O'Neill wrote: Hi Lehua! I ran the gcc

[r14-4629 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-10-17 Thread Jiang, Haochen
On Linux/x86_64, 3179ad72f67f31824c444ef30ef171ad7495d274 is the first bad commit commit 3179ad72f67f31824c444ef30ef171ad7495d274 Author: Richard Biener rguent...@suse.de Date: Fri Oct 13 12:32:51 2023 +0200 OMP SIMD inbranch call vectorization for AVX512 style

Re: [PATCH] LoongArch: Use fcmp.caf.s instead of movgr2cf for zeroing a fcc

2023-10-17 Thread chenglulu
On 2023/10/17 at 10:24 PM, WANG Xuerui wrote: On 10/17/23 22:06, Xi Ruoyao wrote: During the review of a LLVM change [1], on LA464 we found that zeroing "an" LLVM change (because the word LLVM is pronounced letter-by-letter) a fcc with fcmp.caf.s is much faster than a movgr2cf from $r0.

[PATCH v1] RISC-V: Remove the type size restriction of vectorizer

2023-10-17 Thread pan2 . li
From: Pan Li vectorizable_call has a restriction on the size of the data type: DF to DI is allowed but SF to DI isn't. You may see the message below when trying to vectorize a function call like lrintf. void test_lrintf (long *out, float *in, unsigned count) { for (unsigned i = 0; i < count; i++)
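For context, a loop of the shape this message refers to might look like the sketch below: each 32-bit float (SF) is rounded into a long, which is 64 bits (DI) on LP64 targets. The function name round_to_long is a placeholder and the loop body is an assumption, since the snippet above is truncated before the body.

```cpp
#include <cmath>

// Hypothetical loop: float input, long output (SF to DI on LP64 targets).
// std::lrintf rounds according to the current rounding mode.
void round_to_long (long *out, const float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = std::lrintf (in[i]);
}
```

Whether the vectorizer handles such a loop depends on the target and on whether the input and output element sizes are allowed to differ, which is the restriction the patch title says it removes.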

Re: [PATCH] RISC-V: Enable more tests for dynamic LMUL and bug fix[PR111832]

2023-10-17 Thread juzhe.zh...@rivai.ai
Committed. juzhe.zh...@rivai.ai From: Juzhe-Zhong Date: 2023-10-17 15:30 To: gcc-patches CC: kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc; Juzhe-Zhong Subject: [PATCH] RISC-V: Enable more tests for dynamic LMUL and bug fix[PR111832] Last time, Robin has mentioned that dynamic LMUL will

Re: [PATCH 0/3] Add Intel new cpu archs

2023-10-17 Thread Hongtao Liu
On Mon, Oct 16, 2023 at 2:25 PM Haochen Jiang wrote: > > Hi all, > > The patches aim to add new cpu archs Clear Water Forest and > Panther Lake. Here comes the documentation: > > https://cdrdv2.intel.com/v1/dl/getContent/671368 > > Also in the patches, I refactored how we detect cpu according to

[PATCH] libstdc++: testsuite: Enhance codecvt_unicode with tests for length()

2023-10-17 Thread Dimitrij Mijoski
We can test codecvt::length() with the same data that we test codecvt::in(). For each call of in() we add another call to length(). Some additional small cosmetic changes are applied. libstdc++-v3/ChangeLog: * testsuite/22_locale/codecvt/codecvt_unicode.h: Test length() ---

[PATCH 2/2] aarch64: Put LR save slot first in more cases

2023-10-17 Thread Richard Sandiford
Now that the prologue and epilogue code iterates over saved registers in offset order, we can put the LR save slot first without compromising LDP/STP formation. This isn't worthwhile when shadow call stacks are enabled, since the first two registers are also push/pop candidates, and LR cannot be

[PATCH 1/2] aarch64: Use vecs to store register save order

2023-10-17 Thread Richard Sandiford
aarch64_save/restore_callee_saves looped over registers in register number order. This in turn meant that we could only use LDP and STP for registers that were consecutive both number-wise and offset-wise (after unsaved registers are excluded). This patch instead builds lists of the registers

Re: [PATCH] c++: accepts-invalid with =delete("") [PR111840]

2023-10-17 Thread Jason Merrill
On 10/17/23 17:38, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? OK. -- >8 -- r6-2367 added a DECL_INITIAL check to cp_parser_simple_declaration so that we don't emit multiple errors in g++.dg/parse/error57.C. But that means we don't diagnose int f1()

[PATCH] c++: accepts-invalid with =delete("") [PR111840]

2023-10-17 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- r6-2367 added a DECL_INITIAL check to cp_parser_simple_declaration so that we don't emit multiple errors in g++.dg/parse/error57.C. But that means we don't diagnose int f1() = delete("george_crumb"); anymore, because fn

[pushed] c++: mangling tweaks

2023-10-17 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- Most of this is introducing the abi_check function to reduce the verbosity of most places that check -fabi-version. The start_mangling change is to avoid needing to zero-initialize additional members of the mangling globals, though I'm not

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-17 Thread Jason Merrill
On 9/25/23 21:56, waffl3x wrote: On the plus side, I took my time to figure out how to best to pass down information about whether a param is an xobj param. My initial impression on what you were suggesting was to push another node on the front of the list, but I stared at it for a few hours

Re: [PATCH 11/11] aarch64: Add new load/store pair fusion pass.

2023-10-17 Thread Andrew Pinski
On Tue, Oct 17, 2023 at 1:52 PM Alex Coplan wrote: > > This adds a new aarch64-specific RTL-SSA pass dedicated to forming load > and store pairs (LDPs and STPs). > > As a motivating example for the kind of thing this improves, take the > following testcase: > > extern double c[20]; > > double

Re: [PATCH v3] c++: Fix compile-time-hog in cp_fold_immediate_r [PR111660]

2023-10-17 Thread Marek Polacek
On Tue, Oct 17, 2023 at 04:49:52PM -0400, Jason Merrill wrote: > On 10/16/23 20:39, Marek Polacek wrote: > > On Sat, Oct 14, 2023 at 01:13:22AM -0400, Jason Merrill wrote: > > > On 10/13/23 14:53, Marek Polacek wrote: > > > > On Thu, Oct 12, 2023 at 09:41:43PM -0400, Jason Merrill wrote: > > > > >

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-17 Thread Jason Merrill
On 9/25/23 21:56, waffl3x wrote: Also, just a quick update on my copyright assignment, I have sent an e-mail to the FSF and haven't gotten a response yet. From what I was reading, I am confident that it's my preferred option going forward though. Hopefully they get back to me soon. Any

[PATCH 11/11] aarch64: Add new load/store pair fusion pass.

2023-10-17 Thread Alex Coplan
This adds a new aarch64-specific RTL-SSA pass dedicated to forming load and store pairs (LDPs and STPs). As a motivating example for the kind of thing this improves, take the following testcase: extern double c[20]; double f(double x) { double y = x*x; y += c[16]; y += c[17]; y +=

[PATCH 10/11] aarch64: Generalise TFmode load/store pair patterns

2023-10-17 Thread Alex Coplan
This patch generalises the TFmode load/store pair patterns to TImode and TDmode. This brings them in line with the DXmode patterns, and uses the same technique with separate mode iterators (TX and TX2) to allow for distinct modes in each arm of the load/store pair. For example, in combination

Re: [PATCH v3] c++: Fix compile-time-hog in cp_fold_immediate_r [PR111660]

2023-10-17 Thread Jason Merrill
On 10/16/23 20:39, Marek Polacek wrote: On Sat, Oct 14, 2023 at 01:13:22AM -0400, Jason Merrill wrote: On 10/13/23 14:53, Marek Polacek wrote: On Thu, Oct 12, 2023 at 09:41:43PM -0400, Jason Merrill wrote: On 10/12/23 17:04, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu,

[PATCH 09/11] aarch64, testsuite: Fix up pr71727.c

2023-10-17 Thread Alex Coplan
The test is trying to check that we don't use q-register stores with -mstrict-align, so actually check specifically for that. This is a prerequisite to avoid regressing: scan-assembler-not "add\tx0, x0, :" with the upcoming ldp fusion pass, as we change where the ldps are formed such that a

[PATCH 08/11] aarch64, testsuite: Tweak sve/pcs/args_9.c to allow stps

2023-10-17 Thread Alex Coplan
With the new ldp/stp pass enabled, there is a change in the codegen for this test as follows: add x8, sp, 16 ptrue p3.h, mul3 str p3, [x8] - str x8, [sp, 8] - str x9, [sp] + stp x9, x8, [sp] ptrue p3.d, vl8 ptrue

[PATCH 07/11] aarch64, testsuite: Prevent stp in lr_free_1.c

2023-10-17 Thread Alex Coplan
The test is looking for individual stores which are able to be merged into stp instructions. The test currently passes -fno-schedule-fusion -fno-peephole2, presumably to prevent these stores from being turned into stps, but this is no longer sufficient with the new ldp/stp fusion pass. As such,

[PATCH 06/11] haifa-sched: Allow for NOTE_INSN_DELETED at start of epilogue

2023-10-17 Thread Alex Coplan
haifa-sched.cc:remove_notes asserts that it lands on a real (non-note) insn after advancing past NOTE_INSN_EPILOGUE_BEG, but with the upcoming post-RA aarch64 load pair pass enabled, we can land on NOTE_INSN_DELETED. This patch adjusts remove_notes to remove these if they occur at the start of

[PATCH 05/11] rtl-ssa: Support for inserting new insns

2023-10-17 Thread Alex Coplan
The upcoming aarch64 load pair pass needs to form store pairs, and can re-order stores over loads when alias analysis determines this is safe. In the case that both mem defs have uses in the RTL-SSA IR, and both stores require re-ordering over their uses, we represent that as (tentative) deletion

[PATCH 04/11] rtl-ssa: Support inferring uses of mem in change_insns

2023-10-17 Thread Alex Coplan
Currently, rtl_ssa::change_insns requires all new uses and defs to be specified explicitly. This turns out to be rather inconvenient for forming load pairs in the new aarch64 load pair pass, as the pass has to determine which mem def the final load pair consumes, and then obtain or create a

[PATCH 03/11] rtl-ssa: Add entry point to allow re-parenting uses

2023-10-17 Thread Alex Coplan
This is needed by the upcoming aarch64 load pair pass, as it can re-order stores (when alias analysis determines this is safe) and thus change which mem def a given use consumes (in the RTL-SSA view, there is no alias disambiguation of memory). Bootstrapped/regtested as a series on

[PATCH 02/11] rtl-ssa: Add drop_memory_access helper

2023-10-17 Thread Alex Coplan
Add a helper routine to access-utils.h which removes the memory access from an access_array, if it has one. Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? gcc/ChangeLog: * rtl-ssa/access-utils.h (drop_memory_access): New. --- gcc/rtl-ssa/access-utils.h | 11

[PATCH 01/11] rtl-ssa: Fix bug in function_info::add_insn_after

2023-10-17 Thread Alex Coplan
In the case that !insn->is_debug_insn () && next->is_debug_insn (), this function was missing an update of the prev pointer on the first nondebug insn following the sequence of debug insns starting at next. This can lead to corruption of the insn chain, in that we end up with:

[PATCH 00/11] aarch64: Add new load/store pair fusion pass

2023-10-17 Thread Alex Coplan
Hi, This patch series adds a new aarch64-specific RTL-SSA pass for forming load and store pairs (LDPs and STPs). See the cover letter on patch 11/11 for more details on the pass itself. Patch 1/11 fixes a latent bug in RTL-SSA. Patches 2-5 add features to RTL-SSA that are needed by the pass.

Re: [PATCH] c++: Add missing auto_diagnostic_groups to constexpr.cc

2023-10-17 Thread Jason Merrill
On 10/17/23 12:34, Marek Polacek wrote: On Tue, Oct 17, 2023 at 09:35:21PM +1100, Nathaniel Shead wrote: Marek pointed out in another patch of mine [1] that I was missing an auto_diagnostic_group to correctly associate informative notes with their errors in structured error outputs. This patch

Re: [PATCH V2 00/14] Refactor and cleanup vsetvl pass

2023-10-17 Thread Patrick O'Neill
Hi Lehua! I ran the gcc testsuite on qemu before/after applying your patches to 305034e3 rv32/64gcv [1]. Baseline = Summary of gcc testsuite = | # of unexpected case / # of unique unexpected case |

[COMMITTED] RISC-V/testsuite/pr111466.c: update test and expected output

2023-10-17 Thread Vineet Gupta
Update the test to potentially generate two SEXT.W instructions: one for incoming function arg, other for function return. But after commit 8eb9cdd14218 ("expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg") the test is not supposed to generate either of them so fix the expected

PING Re: [PATCH v2 RFA] diagnostic: add permerror variants with opt

2023-10-17 Thread Jason Merrill
Ping? On 10/3/23 17:09, Jason Merrill wrote: This revision changes from using DK_PEDWARN for permerror-with-option to using DK_PERMERROR. Tested x86_64-pc-linux-gnu. OK for trunk? -- 8< -- In the discussion of promoting some pedwarns to be errors by default, rather than move them all into

Re: [PATCH v2] RISC-V/testsuite/pr111466.c: update test and expected output

2023-10-17 Thread Jeff Law
On 10/17/23 12:51, Vineet Gupta wrote: Update the test to potentially generate two SEXT.W instructions: one for incoming function arg, other for function return. But after commit 8eb9cdd14218 ("expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg") the test is not supposed to

[x86 PATCH] PR target/110511: Fix reg allocation for widening multiplications.

2023-10-17 Thread Roger Sayle
This patch contains clean-ups of the widening multiplication patterns in i386.md, and provides variants of the existing highpart multiplication peephole2 transformations (that tidy up register allocation after reload), and thereby fixes PR target/110511, which is a superfluous move instruction.

Re: PR111648: Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding

2023-10-17 Thread Prathamesh Kulkarni
On Tue, 17 Oct 2023 at 02:40, Richard Sandiford wrote: > > Prathamesh Kulkarni writes: > > On Wed, 11 Oct 2023 at 16:57, Prathamesh Kulkarni > > wrote: > >> > >> On Wed, 11 Oct 2023 at 16:42, Prathamesh Kulkarni > >> wrote: > >> > > >> > On Mon, 9 Oct 2023 at 17:05, Richard Sandiford > >> >

[PATCH v2] RISC-V/testsuite/pr111466.c: update test and expected output

2023-10-17 Thread Vineet Gupta
Update the test to potentially generate two SEXT.W instructions: one for incoming function arg, other for function return. But after commit 8eb9cdd14218 ("expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg") the test is not supposed to generate either of them so fix the expected

[PATCH] RISC-V/testsuite/pr111466.c: fix expected output to not detect SEXT.W

2023-10-17 Thread Vineet Gupta
gcc/testsuite/ChangeLog: * gcc.target/riscv/pr111466.c: Change to scan-assembler-not to not detect sext.w. Signed-off-by: Vineet Gupta --- gcc/testsuite/gcc.target/riscv/pr111466.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

Re: [RFC] expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg [target/111466]

2023-10-17 Thread Vineet Gupta
On 10/16/23 21:07, Jeff Law wrote: On 9/28/23 15:43, Vineet Gupta wrote: RISC-V suffers from extraneous sign extensions, despite/given the ABI guarantee that 32-bit quantities are sign-extended into 64-bit registers, meaning incoming SI function args need not be explicitly sign extended

Re: [patch] fortran/intrinsic.texi: Improve SIGNAL intrinsic entry

2023-10-17 Thread Harald Anlauf
Hi Tobias, On 10/17/23 19:36, Tobias Burnus wrote: Hi Harald, On 17.10.23 19:02, Harald Anlauf wrote: your latest patch - which you already pushed - removes the intrinsic declaration of signal. Only to 'signal' or also to 'sleep'? I have now added both in the attach patch. you are right:

RE: [x86 PATCH] PR 106245: Split (x<<31)>>31 as -(x&1) in i386.md

2023-10-17 Thread Roger Sayle
Hi Uros, Thanks for the speedy review. > From: Uros Bizjak > Sent: 17 October 2023 17:38 > > On Tue, Oct 17, 2023 at 3:08 PM Roger Sayle > wrote: > > > > > > This patch is the backend piece of a solution to PRs 101955 and > > 106245, that adds a define_insn_and_split to the i386 backend, to

Re: [patch] fortran/intrinsic.texi: Improve SIGNAL intrinsic entry

2023-10-17 Thread Tobias Burnus
Hi Harald, On 17.10.23 19:02, Harald Anlauf wrote: your latest patch - which you already pushed - removes the intrinsic declaration of signal. Only to 'signal' or also to 'sleep'? I have now added both in the attach patch. (Not yet committed.) Tobias - Siemens Electronic

Re: [PATCH v22 02/31] c-family, c++: Look up built-in traits via identifier node

2023-10-17 Thread Patrick Palka
On Tue, 17 Oct 2023, Ken Matsui wrote: > Since RID_MAX soon reaches 255 and all built-in traits are used approximately > once in a C++ translation unit, this patch removes all RID values for built-in > traits and uses the identifier node to look up the specific trait. Rather > than holding

Re: [patch] fortran/intrinsic.texi: Improve SIGNAL intrinsic entry

2023-10-17 Thread Harald Anlauf
Tobias, your latest patch - which you already pushed - removes the intrinsic declaration of signal. This can lead to a user's confusion and undesired results when the code is compiled e.g. with -std=f2018, because call signal (10, 1) ! 10 = SIGUSR1 and 1 = SIG_IGN (on some systems) could

Re: [x86 PATCH] PR 106245: Split (x<<31)>>31 as -(x&1) in i386.md

2023-10-17 Thread Uros Bizjak
On Tue, Oct 17, 2023 at 3:08 PM Roger Sayle wrote: > > > This patch is the backend piece of a solution to PRs 101955 and 106245, > that adds a define_insn_and_split to the i386 backend, to perform sign > extension of a single (least significant) bit using AND $1 then NEG. > > Previously,

Re: [PATCH] c++: Add missing auto_diagnostic_groups to constexpr.cc

2023-10-17 Thread Marek Polacek
On Tue, Oct 17, 2023 at 09:35:21PM +1100, Nathaniel Shead wrote: > Marek pointed out in another patch of mine [1] that I was missing an > auto_diagnostic_group to correctly associate informative notes with > their errors in structured error outputs. This patch goes through > constexpr.cc to

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Richard Sandiford
Robin Dapp writes: > Thank you for the explanation. > > So, assuming I added an IFN_VCOND_MASK and IFN_VCOND_MASK_LEN along > with the respective helper and expand functions, what would be the > way forward? IMO it'd be worth starting with the _LEN form only. > Generate an IFN_VCOND_MASK(_LEN)

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Richard Sandiford
Richard Biener writes: > On Mon, Oct 16, 2023 at 11:59 PM Richard Sandiford > wrote: >> >> Robin Dapp writes: >> >> Why are the contents of this if statement wrong for COND_LEN? >> >> If the "else" value doesn't matter, then the masked form can use >> >> the "then" value for all elements. I

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Robin Dapp
Thank you for the explanation. So, assuming I added an IFN_VCOND_MASK and IFN_VCOND_MASK_LEN along with the respective helper and expand functions, what would be the way forward? Generate an IFN_VCOND_MASK(_LEN) here instead of a VEC_COND_EXPR? How would I make sure all of match.pd's vec_cond

Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++

2023-10-17 Thread Tobias Burnus
On 17.10.23 15:34, Jakub Jelinek wrote: On Tue, Oct 17, 2023 at 03:12:46PM +0200, Tobias Burnus wrote: C++11 (and C23) attribute do not seem to be properly handled: [[omp::decl (declare target,indirect(1))]] int foo(void) { return 5; } [[omp::decl (declare target indirect)]] int bar(void) {

Re: [PATCH] LoongArch: Use fcmp.caf.s instead of movgr2cf for zeroing a fcc

2023-10-17 Thread WANG Xuerui
On 10/17/23 22:06, Xi Ruoyao wrote: During the review of a LLVM change [1], on LA464 we found that zeroing "an" LLVM change (because the word LLVM is pronounced letter-by-letter) a fcc with fcmp.caf.s is much faster than a movgr2cf from $r0. Similarly, "an" fcc [1]:

[PATCH] LoongArch: Use fcmp.caf.s instead of movgr2cf for zeroing a fcc

2023-10-17 Thread Xi Ruoyao
During the review of a LLVM change [1], on LA464 we found that zeroing a fcc with fcmp.caf.s is much faster than a movgr2cf from $r0. [1]: https://github.com/llvm/llvm-project/pull/69300 gcc/ChangeLog: * config/loongarch/loongarch.md (movfcc): Use fcmp.caf.s for zeroing a fcc.

Re: Check that passes do not forget to define profile

2023-10-17 Thread Jan Hubicka
> So OK to commit this? > > This patch makes sure the profile_count information is initialized for the > new > bb created in move_sese_region_to_fn. > > gcc/ChangeLog: > > * tree-cfg.cc (move_sese_region_to_fn): Initialize profile_count for > new basic block. > > Bootstrapped and

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Richard Sandiford
Robin Dapp writes: >>> I don't know much about valueisation either :) But it does feel >>> like we're working around the lack of a LEN form of COND_EXPR. >>> In other words, it seems odd that we can do: >>> >>> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) >>> >>> but we can't do: >>> >>>

Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++

2023-10-17 Thread Jakub Jelinek
On Tue, Oct 17, 2023 at 03:12:46PM +0200, Tobias Burnus wrote: > C++11 (and C23) attribute do not seem to be properly handled: > > [[omp::decl (declare target,indirect(1))]] > int foo(void) { return 5; } > [[omp::decl (declare target indirect)]] > int bar(void) { return 8; } Isn't that correct?

Re: [PATCH v8] tree-ssa-sink: Improve code sinking pass

2023-10-17 Thread Ajit Agarwal
Hello Richard: Below review comments are incorporated in version 10 of the patch, Please review and let me know if its okay for trunk. Thanks & Regards Ajit On 17/10/23 2:47 pm, Richard Biener wrote: > On Tue, Oct 17, 2023 at 10:53 AM Ajit Agarwal wrote: >> >> Hello Richard: >> >> On 17/10/23

[PATCH v10] tree-ssa-sink: Improve code sinking pass

2023-10-17 Thread Ajit Agarwal
Currently, code sinking will sink code at the use points with loop having same nesting depth. The following patch improves code sinking by placing the sunk code in immediate dominator with same loop nest depth. Review comments are incorporated. For example : void bar(); int j; void foo(int a,

Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++

2023-10-17 Thread Tobias Burnus
Hi Kwok, hi Jakub, hi all, some first comments based on both playing around and reading the patch - and some generic comments to any patch reader. In general, the patch looks good. I just observe: * There is an issue with [[omp::decl(...)]]' * () - there is a C/C++ inconsistency in what is

[x86 PATCH] PR 106245: Split (x<<31)>>31 as -(x&1) in i386.md

2023-10-17 Thread Roger Sayle
This patch is the backend piece of a solution to PRs 101955 and 106245, that adds a define_insn_and_split to the i386 backend, to perform sign extension of a single (least significant) bit using AND $1 then NEG. Previously, (x<<31)>>31 would be generated as sall $31, %eax // 3
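The identity behind this split can be checked at the source level. The sketch below is only an illustration with placeholder function names; it does the left shift on an unsigned value to avoid signed overflow, and it assumes the conversion back to a signed type wraps and the right shift is arithmetic, which holds for GCC and is guaranteed from C++20.

```cpp
#include <cstdint>

// Shift form: move bit 0 into the sign position, then shift it back down.
std::int32_t sign_extend_bit_shift (std::int32_t x)
{
  std::uint32_t up = static_cast<std::uint32_t> (x) << 31;
  return static_cast<std::int32_t> (up) >> 31;  // assumed arithmetic shift
}

// AND/NEG form: isolate bit 0, then negate, giving 0 or -1.
std::int32_t sign_extend_bit_neg (std::int32_t x)
{
  return -(x & 1);
}
```

Both functions return -1 when the least significant bit of x is set and 0 otherwise.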

[PATCH] tree-optimization/111846 - put simd-clone-info into SLP tree

2023-10-17 Thread Richard Biener
The following avoids bogously re-using the simd-clone-info we currently hang off stmt_info from two different SLP contexts where a different number of lanes should have chosen a different best simdclone. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR

Re: [PATCH] wide-int-print: Don't print large numbers hexadecimally for print_dec{,s,u}

2023-10-17 Thread Richard Biener
On Tue, 17 Oct 2023, Jakub Jelinek wrote: > Hi! > > The following patch implements printing of wide_int/widest_int numbers > decimally when asked for that using print_dec{,s,u}, even if they have > precision larger than 64 and get_len () above 1 (right now we printed > them hexadecimally and

[PATCH v22 31/31] libstdc++: Optimize std::is_pointer compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_pointer by dispatching to the new __is_pointer built-in trait. libstdc++-v3/ChangeLog: * include/bits/cpp_type_traits.h (__is_pointer): Use __is_pointer built-in trait. * include/std/type_traits (is_pointer):

[PATCH v22 11/31] libstdc++: Optimize std::is_unbounded_array compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_unbounded_array by dispatching to the new __is_unbounded_array built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_unbounded_array_v): Use __is_unbounded_array built-in trait. Signed-off-by: Ken Matsui

[PATCH v22 28/31] c++: Implement __remove_pointer built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::remove_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __remove_pointer. * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_POINTER. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of

[PATCH v22 22/31] c++: Implement __is_reference built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_reference. gcc/cp/ChangeLog: * cp-trait.def: Define __is_reference. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_REFERENCE. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

[PATCH v22 17/31] libstdc++: Optimize std::is_member_pointer compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_member_pointer by dispatching to the new __is_member_pointer built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_member_pointer): Use __is_member_pointer built-in trait. (is_member_pointer_v):

[PATCH v22 20/31] c++: Implement __is_member_object_pointer built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_member_object_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __is_member_object_pointer. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_MEMBER_OBJECT_POINTER. * semantics.cc (trait_expr_value): Likewise.

[PATCH v22 15/31] libstdc++: Optimize std::is_scoped_enum compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_scoped_enum by dispatching to the new __is_scoped_enum built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_scoped_enum): Use __is_scoped_enum built-in trait. (is_scoped_enum_v): Likewise.

[PATCH v22 25/31] libstdc++: Optimize std::is_function compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_function by dispatching to the new __is_function built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_function): Use __is_function built-in trait. (is_function_v): Likewise. Optimize its

[PATCH v22 04/31] c++: Implement __is_const built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_const. gcc/cp/ChangeLog: * cp-trait.def: Define __is_const. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Robin Dapp
>> I don't know much about valueisation either :) But it does feel >> like we're working around the lack of a LEN form of COND_EXPR. >> In other words, it seems odd that we can do: >> >> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) >> >> but we can't do: >> >> IFN_COND_LEN (mask, a, b, len,
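For readers unfamiliar with the conditional length operations quoted above, here is a scalar model of what IFN_COND_LEN_ADD (mask, a, b, else, len, bias) computes per lane. It is an illustrative sketch based on the documented cond_len_add semantics, not code from the patch. The function name `cond_len_add` and the use of `std::array` are assumptions made for the example.

```cpp
#include <array>
#include <cstddef>

// Scalar model (hypothetical helper, for illustration only):
// lane i is "active" when i < len + bias and mask[i] is set.
// Active lanes get a[i] + b[i]; inactive lanes take else_vals[i].
template <std::size_t N>
constexpr std::array<int, N> cond_len_add(const std::array<bool, N>& mask,
                                          const std::array<int, N>& a,
                                          const std::array<int, N>& b,
                                          const std::array<int, N>& else_vals,
                                          std::size_t len,
                                          std::ptrdiff_t bias)
{
  std::array<int, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    const bool in_range = static_cast<std::ptrdiff_t>(i)
                          < static_cast<std::ptrdiff_t>(len) + bias;
    out[i] = (in_range && mask[i]) ? a[i] + b[i] : else_vals[i];
  }
  return out;
}
```

With b set to all zeros, as in the quoted IFN_COND_LEN_ADD (mask, a, 0, b, len, bias), the model reduces to a masked, length-limited select between a and the else operand, which is the effect the quoted message says has no direct LEN form.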

[PATCH v22 13/31] libstdc++: Optimize std::is_bounded_array compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_bounded_array by dispatching to the new __is_bounded_array built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_bounded_array_v): Use __is_bounded_array built-in trait. Signed-off-by: Ken Matsui ---

[PATCH v22 09/31] libstdc++: Optimize std::is_array compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_array by dispatching to the new __is_array built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_array): Use __is_array built-in trait. (is_array_v): Likewise. Signed-off-by: Ken Matsui ---
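The dispatch pattern this entry describes can be sketched as follows. This is a minimal illustration, not the actual libstdc++ change: the `sketch` namespace is hypothetical, and the `__has_builtin` guard with a partial-specialization fallback is an assumption about how one might write portable code around the `__is_array` built-in.

```cpp
#include <cstddef>
#include <type_traits>

// Older compilers may not provide __has_builtin at all.
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

namespace sketch {  // hypothetical namespace, for illustration only

#if __has_builtin(__is_array)
// Fast path: a single built-in query, no partial specializations
// for the compiler to match.
template <typename T>
struct is_array : std::bool_constant<__is_array(T)> {};
#else
// Fallback: the classic library-only definition.
template <typename T>
struct is_array : std::false_type {};
template <typename T>
struct is_array<T[]> : std::true_type {};
template <typename T, std::size_t N>
struct is_array<T[N]> : std::true_type {};
#endif

template <typename T>
inline constexpr bool is_array_v = is_array<T>::value;

}  // namespace sketch
```

Both branches give the same answers for ordinary bounded and unbounded array types; the built-in path simply avoids the template instantiation work, which is the compile-time saving these patches aim for.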

[PATCH v22 18/31] c++: Implement __is_member_function_pointer built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_member_function_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __is_member_function_pointer. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_MEMBER_FUNCTION_POINTER. * semantics.cc (trait_expr_value):

[PATCH v21 18/30] libstdc++: Optimize std::is_member_function_pointer compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_member_function_pointer by dispatching to the new __is_member_function_pointer built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_member_function_pointer): Use __is_member_function_pointer built-in

[PATCH v22 26/31] c++: Implement __is_object built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_object. gcc/cp/ChangeLog: * cp-trait.def: Define __is_object. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_OBJECT. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

[PATCH v22 30/31] c++: Implement __is_pointer built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __is_pointer. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_POINTER. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

[PATCH v21 23/30] c++: Implement __is_function built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_function. gcc/cp/ChangeLog: * cp-trait.def: Define __is_function. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_FUNCTION. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

[PATCH v22 16/31] c++: Implement __is_member_pointer built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_member_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __is_member_pointer. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_MEMBER_POINTER. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr):

[PATCH v22 24/31] c++: Implement __is_function built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_function. gcc/cp/ChangeLog: * cp-trait.def: Define __is_function. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_FUNCTION. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

[PATCH v21 16/30] libstdc++: Optimize std::is_member_pointer compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_member_pointer by dispatching to the new __is_member_pointer built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_member_pointer): Use __is_member_pointer built-in trait. (is_member_pointer_v):

[PATCH v22 19/31] libstdc++: Optimize std::is_member_function_pointer compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_member_function_pointer by dispatching to the new __is_member_function_pointer built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_member_function_pointer): Use __is_member_function_pointer built-in

[PATCH v22 14/31] c++: Implement __is_scoped_enum built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_scoped_enum. gcc/cp/ChangeLog: * cp-trait.def: Define __is_scoped_enum. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_SCOPED_ENUM. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.

[PATCH v21 17/30] c++: Implement __is_member_function_pointer built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_member_function_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __is_member_function_pointer. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_MEMBER_FUNCTION_POINTER. * semantics.cc (trait_expr_value):

[PATCH v22 10/31] c++: Implement __is_unbounded_array built-in trait

2023-10-17 Thread Ken Matsui
This patch implements built-in trait for std::is_unbounded_array. gcc/cp/ChangeLog: * cp-trait.def: Define __is_unbounded_array. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_UNBOUNDED_ARRAY. * semantics.cc (trait_expr_value): Likewise.

[PATCH v22 23/31] libstdc++: Optimize std::is_reference compilation performance

2023-10-17 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_reference by dispatching to the new __is_reference built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (is_reference): Use __is_reference built-in trait. (is_reference_v): Likewise. Signed-off-by:
