[PATCH][_Hashtable] Fix merge

2023-10-18 Thread François Dumont
libstdc++: [_Hashtable] Do not reuse untrusted cached hash code On merge reuse merged node cached hash code only if we are on the same type of hash and this hash is stateless. Usage of function pointers or std::function as hash functor will prevent this optimization. libstdc++-v3/ChangeLog  

[PATCH] aarch64: [PR110986] Emit csinv again for `a ? ~b : b`

2023-10-18 Thread Andrew Pinski
After r14-3110-g7fb65f10285, the canonical form for `a ? ~b : b` changed to `-(a) ^ b`, which means that for aarch64 we need to add a few new insn patterns to catch this and change it back to the canonical form for the aarch64 backend. A secondary pattern was needed to support a
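
The equivalence of the two forms can be checked at the source level. This is a small sketch under the assumption that `a` is a boolean (0 or 1); `select_form` and `xor_form` are hypothetical names, and nothing here reflects the actual aarch64 insn patterns.

```cpp
#include <cstdint>

// Source-level form: select ~b when a is true, b otherwise.
constexpr std::uint32_t select_form(bool a, std::uint32_t b) {
    return a ? ~b : b;
}

// Canonical form: -(a) is all-ones when a is true and zero otherwise,
// so the XOR either flips every bit of b or leaves it unchanged.
constexpr std::uint32_t xor_form(bool a, std::uint32_t b) {
    return (0u - static_cast<std::uint32_t>(a)) ^ b;
}

static_assert(select_form(true, 0x1234u) == xor_form(true, 0x1234u), "a == 1");
static_assert(select_form(false, 0x1234u) == xor_form(false, 0x1234u), "a == 0");
```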

[PATCH 6/6] PowerPC: Add support for 1,024 bit DMR registers.

2023-10-18 Thread Michael Meissner
This patch is a preliminary patch to add the full 1,024 bit dense math registers (DMRs) for -mcpu=future. The MMA 512-bit accumulators map onto the top of the DMR register. This patch only adds the new 1,024 bit register support. It does not add support for any instructions that need 1,024 bit

[PATCH 5/6] PowerPC: Switch to dense math names for all MMA operations.

2023-10-18 Thread Michael Meissner
This patch changes the assembler instruction names for MMA instructions from the original name used in power10 to the new name when used with the dense math system. I.e. xvf64gerpp becomes dmxvf64gerpp. The assembler will emit the same bits for either spelling. The patches have been tested on

[PATCH 4/6] PowerPC: Make MMA insns support DMR registers.

2023-10-18 Thread Michael Meissner
This patch changes the MMA instructions to use either FPR registers (-mcpu=power10) or DMRs (-mcpu=future). In this patch, the existing MMA instruction names are used. A macro (__PPC_DMR__) is defined if the MMA instructions use the DMRs. The patches have been tested on both little and big

[PATCH 3/6] PowerPC: Add support for accumulators in DMR registers.

2023-10-18 Thread Michael Meissner
The MMA subsystem added the notion of accumulator registers as an optional feature of ISA 3.1 (power10). In ISA 3.1, these accumulators overlapped with the traditional floating point registers 0..31, but logically the accumulator registers were separate from the FPR registers. In ISA 3.1, it was

[PATCH 2/6] PowerPC: Make -mcpu=future enable -mblock-ops-vector-pair.

2023-10-18 Thread Michael Meissner
This patch re-enables generating load and store vector pair instructions when doing certain memory copy operations when -mcpu=future is used. During power10 development, it was determined that using store vector pair instructions was problematic in a few cases, so we disabled generating load

Re: [PATCH 1/6] PowerPC: Add -mcpu=future option

2023-10-18 Thread Michael Meissner
This patch implements support for a potential future PowerPC cpu. Features added with -mcpu=future, may or may not be added to new PowerPC processors. This patch adds support for the -mcpu=future option. If you use -mcpu=future, the macro __ARCH_PWR_FUTURE__ is defined, and the assembler

[PATCH 0/6] PowerPC Future patches

2023-10-18 Thread Michael Meissner
This patch is very preliminary support for a potential new feature to the PowerPC that extends the current power10 MMA architecture. This feature may or may not be present in any specific future PowerPC processor. In the current MMA subsystem for Power10, there are 8 512-bit accumulator

[COMMITTED] Fix expansion of `(a & 2) != 1`

2023-10-18 Thread Andrew Pinski
I had a thinko in r14-1600-ge60593f3881c72a96a3fa4844d73e8a2cd14f670 where we would remove the `& CST` part if we ended up not calling expand_single_bit_test. This fixes the problem by introducing a new variable that will be used for calling expand_single_bit_test. As far as I know this can only
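
To see why dropping the `& CST` part is wrong for the expression in the subject, here is a short source-level sketch. It is not the expander code; `masked_test` and `unmasked_test` are hypothetical names used only for illustration.

```cpp
// (a & 2) can only be 0 or 2, so comparing it against 1 is always true.
constexpr bool masked_test(unsigned a) {
    return (a & 2u) != 1u;
}

// With the mask dropped, the comparison is made against a itself,
// which gives a different answer for a == 1.
constexpr bool unmasked_test(unsigned a) {
    return a != 1u;
}

static_assert(masked_test(1u), "masked form is true for a == 1");
static_assert(!unmasked_test(1u), "unmasked form is false for a == 1");
```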

[PATCH] c++: Make -Wunknown-pragmas controllable by #pragma GCC diagnostic [PR89038]

2023-10-18 Thread Lewis Hyatt
Hello- The PR points out that my fix for PR53431 was incomplete and did not handle -Wunknown-pragmas. This is a one-line fix to correct that, is it OK for trunk and for GCC 13 backport please? bootstrap + regtest all languages on x86-64 Linux. Thanks! -Lewis -- >8 -- As noted on the PR, commit

Re: [PATCH V2 7/7] aarch64: Add system register duplication check selftest

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > Add a build-time test to check whether system register data, as > imported from `aarch64-sys-reg.def' has any duplicate entries. > > Duplicate entries are defined as any two SYSREG entries in the .def > file which share the same encoding values (as specified by its

Re: [PATCH V2 6/7] aarch64: Add front-end argument type checking for target builtins

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > In implementing the ACLE read/write system register builtins it was > observed that leaving argument type checking to be done at expand-time > meant that poorly-formed function calls were being "fixed" by certain > optimization passes, meaning bad code wasn't being

Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-18 Thread Qing Zhao
> On Oct 5, 2023, at 4:08 PM, Siddhesh Poyarekar wrote: > > On 2023-08-25 11:24, Qing Zhao wrote: >> This is the 3rd version of the patch, per our discussion based on the >> review comments for the 1st and 2nd version, the major changes in this >> version are: > > Hi Qing, > > I hope the

Re: [PATCH V2 4/7] aarch64: Add basic target_print_operand support for CONST_STRING

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > Motivated by the need to print system register names in output > assembly, this patch adds the required logic to > `aarch64_print_operand' to accept rtxs of type CONST_STRING and > process these accordingly. > > Consequently, an rtx such as: > > (set (reg/i:DI 0

Re: [PATCH V2 2/7] aarch64: Add support for aarch64-sys-regs.def

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > This patch defines the structure of a new .def file used for > representing the aarch64 system registers, what information it should > hold and the basic framework in GCC to process this file. > > Entries in the aarch64-system-regs.def file should be as follows: > >

Re: [PATCH] libcpp: testsuite: Add test for fixed _Pragma bug [PR82335]

2023-10-18 Thread Lewis Hyatt
May I please ping this one, and/or, is it something straightforward enough I can just commit it as obvious? Thanks! https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631814.html -Lewis On Mon, Oct 2, 2023 at 6:23 PM Lewis Hyatt wrote: > > Hello- > >

Re: [V3][PATCH 2/3] Use the counted_by atribute info in builtin object size [PR108896]

2023-10-18 Thread Qing Zhao
Hi, Sid, Thanks a lot for the detailed comments. See my responses embedded below. Qing > On Oct 5, 2023, at 4:01 PM, Siddhesh Poyarekar wrote: > > > > On 2023-08-25 11:24, Qing Zhao wrote: >> Use the counted_by attribute info in builtin object size to compute the >> subobject size for

Re: [PATCH V2 5/7] aarch64: Implement system register r/w arm ACLE intrinsic functions

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > Implement the aarch64 intrinsics for reading and writing system > registers with the following signatures: > > uint32_t __arm_rsr(const char *special_register); > uint64_t __arm_rsr64(const char *special_register); > void* __arm_rsrp(const char

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> Could you by the way add this mention this PR: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111791 > Add the test of this PR ? Commented in that PR. This patch does not help there. Regards Robin

Re: [PATCH v2] gcc: Introduce -fhardened

2023-10-18 Thread Qing Zhao
Marek, Sorry for the late comment (I was just back from a long vacation immediately after Cauldron). One question: Is the option “-fhardened” for production build or for development build? If it’s for development build, then adding -ftrivial-auto-var-init=pattern is reasonable since the

Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-18 Thread Qing Zhao
> On Oct 6, 2023, at 4:01 PM, Martin Uecker wrote: > > Am Freitag, dem 06.10.2023 um 06:50 -0400 schrieb Siddhesh Poyarekar: >> On 2023-10-06 01:11, Martin Uecker wrote: >>> Am Donnerstag, dem 05.10.2023 um 15:35 -0700 schrieb Kees Cook: On Thu, Oct 05, 2023 at 04:08:52PM -0400, Siddhesh

Re: [PATCH V2 3/7] aarch64: Implement system register validation tools

2023-10-18 Thread Richard Sandiford
Generally looks really good. Some comments below. Victor Do Nascimento writes: > Given the implementation of a mechanism of encoding system registers > into GCC, this patch provides the mechanism of validating their use by > the compiler. In particular, this involves: > > 1. Ensuring a

Re: PR111648: Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding

2023-10-18 Thread Prathamesh Kulkarni
On Wed, 18 Oct 2023 at 23:22, Richard Sandiford wrote: > > Prathamesh Kulkarni writes: > > On Tue, 17 Oct 2023 at 02:40, Richard Sandiford > > wrote: > >> Prathamesh Kulkarni writes: > >> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > >> > index 4f8561509ff..55a6a68c16c 100644 > >> >

Re: [PATCH 10/11] aarch64: Generalise TFmode load/store pair patterns

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > This patch generalises the TFmode load/store pair patterns to TImode and > TDmode. This brings them in line with the DXmode patterns, and uses the > same technique with separate mode iterators (TX and TX2) to allow for > distinct modes in each arm of the load/store pair. >

Re: [PATCH 09/11] aarch64, testsuite: Fix up pr71727.c

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > The test is trying to check that we don't use q-register stores with > -mstrict-align, so actually check specifically for that. > > This is a prerequisite to avoid regressing: > > scan-assembler-not "add\tx0, x0, :" > > with the upcoming ldp fusion pass, as we change where

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-18 Thread Jason Merrill
On 10/18/23 13:28, waffl3x wrote: I will try to get something done today, but I was struggling with writing some of the tests, there's also a lot more of them now. I also wrote a bunch of musings in comments that I would like feedback on. My most concrete question is, how exactly should I be

Re: [PATCH 08/11] aarch64, testsuite: Tweak sve/pcs/args_9.c to allow stps

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > With the new ldp/stp pass enabled, there is a change in the codegen for > this test as follows: > > add x8, sp, 16 > ptrue p3.h, mul3 > str p3, [x8] > - str x8, [sp, 8] > - str x9, [sp] > + stp x9, x8, [sp] >

Re: [PATCH 07/11] aarch64, testsuite: Prevent stp in lr_free_1.c

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > The test is looking for individual stores which are able to be merged > into stp instructions. The test currently passes -fno-schedule-fusion > -fno-peephole2, presumably to prevent these stores from being turned > into stps, but this is no longer sufficient with the new

Re: [PATCH 04/11] rtl-ssa: Support inferring uses of mem in change_insns

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > Currently, rtl_ssa::change_insns requires all new uses and defs to be > specified explicitly. This turns out to be rather inconvenient for > forming load pairs in the new aarch64 load pair pass, as the pass has to > determine which mem def the final load pair consumes, and

Re: [PATCH 03/11] rtl-ssa: Add entry point to allow re-parenting uses

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > This is needed by the upcoming aarch64 load pair pass, as it can > re-order stores (when alias analysis determines this is safe) and thus > change which mem def a given use consumes (in the RTL-SSA view, there is > no alias disambiguation of memory). > >

Re: [PATCH 02/11] rtl-ssa: Add drop_memory_access helper

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > Add a helper routine to access-utils.h which removes the memory access > from an access_array, if it has one. > > Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? > > gcc/ChangeLog: > > * rtl-ssa/access-utils.h (drop_memory_access): New. > --- >

Re: [PATCH 01/11] rtl-ssa: Fix bug in function_info::add_insn_after

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > In the case that !insn->is_debug_insn () && next->is_debug_insn (), this > function was missing an update of the prev pointer on the first nondebug > insn following the sequence of debug insns starting at next. > > This can lead to corruption of the insn chain, in that we

Re: PR111648: Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding

2023-10-18 Thread Richard Sandiford
Prathamesh Kulkarni writes: > On Tue, 17 Oct 2023 at 02:40, Richard Sandiford > wrote: >> Prathamesh Kulkarni writes: >> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc >> > index 4f8561509ff..55a6a68c16c 100644 >> > --- a/gcc/fold-const.cc >> > +++ b/gcc/fold-const.cc >> > @@ -10684,9

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-18 Thread Jakub Jelinek
On Wed, Oct 18, 2023 at 05:28:10PM +, waffl3x wrote: > I've seen plenty of these G_ or _ macros on strings around like in > grokfndecl for these errors. > > G_("static member function %qD cannot have cv-qualifier") > G_("non-member function %qD cannot have cv-qualifier") > > G_("static

Re: [x86 PATCH] PR 106245: Split (x<<31)>>31 as -(x&1) in i386.md

2023-10-18 Thread Uros Bizjak
On Tue, Oct 17, 2023 at 7:54 PM Roger Sayle wrote: > > > Hi Uros, > Thanks for the speedy review. > > > From: Uros Bizjak > > Sent: 17 October 2023 17:38 > > > > On Tue, Oct 17, 2023 at 3:08 PM Roger Sayle > > wrote: > > > > > > > > > This patch is the backend piece of a solution to PRs 101955
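
The transformation named in the subject line, `(x<<31)>>31` to `-(x&1)`, can be illustrated for 32-bit integers as follows. This is only a sketch: `shift_form` and `neg_form` are hypothetical names, and it assumes two's complement conversion and an arithmetic right shift on signed values (guaranteed from C++20 on, and what GCC does in earlier modes).

```cpp
#include <cstdint>

// Shift form: move bit 0 into the sign bit, then shift it back
// arithmetically so that it is replicated across the whole word.
// The left shift is done on an unsigned value to avoid signed overflow.
inline std::int32_t shift_form(std::int32_t x) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(x) << 31) >> 31;
}

// Negation form: isolate bit 0 and negate it, giving 0 or -1.
inline std::int32_t neg_form(std::int32_t x) {
    return -(x & 1);
}
```

Both functions return -1 when the low bit of x is set and 0 otherwise, regardless of the remaining bits.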

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-18 Thread waffl3x
> > I will try to get something done today, but I was struggling with > > writing some of the tests, there's also a lot more of them now. I also > > wrote a bunch of musings in comments that I would like feedback on. > > > > My most concrete question is, how exactly should I be testing a > >

[committed] pru: Implement TARGET_INSN_COST

2023-10-18 Thread Dimitar Dimitrov
This patch slightly improves the embench-iot benchmark score for PRU code size. There is also a small improvement in a few real-world firmware programs. Embench-iot size -- Benchmark before after delta -

Re: [PATCH V2 00/14] Refactor and cleanup vsetvl pass

2023-10-18 Thread Patrick O'Neill
Hi Luhua, Here are the excerpts from the debug log. I think the full log files are too large to send over email. rv32_gcv avl_single-32.c: Executing on host: /scratch/tc-testing/tc-oct-17-vsetvli-refactor/build/build-gcc-linux-stage2/gcc/xgcc

[avr,committed] LibF7: Implement a function that was missing for devices without MUL.

2023-10-18 Thread Georg-Johann Lay
This implements the worker function for double multiplication for devices without MUL instruction. Johann -- LibF7: Implement mul_mant for devices without MUL instruction. libgcc/config/avr/libf7/ * libf7-asm.sx (mul_mant): Implement for devices without MUL. * asm-defs.h

Re: [PATCH] c++/modules: ICE with lambda initializing local var [PR105322]

2023-10-18 Thread Patrick Palka
On Wed, 18 Oct 2023, Patrick Palka wrote: > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for > trunk? Note that this doesn't fix the other testcase in the PR, which doesn't use any lambdas and which ICEs in the same way: export module pr105322; auto f() {

[PATCH] c++/modules: ICE with lambda initializing local var [PR105322]

2023-10-18 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- For a local variable initialized by a lambda: auto f = []{}; The corresponding BLOCK_VARS contains the variable declaration first, followed by the closure type declaration, consistent with the

RE: [Patch] nvptx: Use fatal_error when -march= is missing not an assert [PR111093]

2023-10-18 Thread Roger Sayle
Hi Tomas, Tobias and Tom, Thanks for asking. Interestingly, I've a patch (attached) from last year that tackled some of the issues here. The surface problem is that nvptx's march and misa are related in complicated ways. Specifying an arch defines the range of valid isa's, and specifying an

[3/3] WIP/RFC: Fix name mangling for target_clones

2023-10-18 Thread Andrew Carlotti
This is a partial patch to make the mangling of function version names for target_clones match those generated using the target or target_version attributes. It modifies the name of function versions, but does not yet rename the resolved symbol, resulting in a duplicate symbol name (and an error

[2/3] [aarch64] Add function multiversioning support

2023-10-18 Thread Andrew Carlotti
This adds initial support for function multiversioning on aarch64 using the target_version and target_clones attributes. This mostly follows the Beta specification in the ACLE [1], with a few differences that remain to be fixed: - Symbol mangling for target_clones differs from that for target_version

[1/3] Add support for target_version attribute

2023-10-18 Thread Andrew Carlotti
This patch adds support for the "target_version" attribute to the middle end and the C++ frontend, which will be used to implement function multiversioning in the aarch64 backend. Note that C++ is currently the only frontend which supports multiversioning using the "target" attribute, whereas the

[0/3] target_version and aarch64 function multiversioning

2023-10-18 Thread Andrew Carlotti
This series adds support for function multiversioning on aarch64. There are a few minor issues in patch 2/3, that I intend to fix in future versions or follow-up patches. I also have some open questions about the correctness of existing function multiversioning implementations [1], that could

Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-18 Thread Siddhesh Poyarekar
[Sorry, I forgot to respond to this] On 2023-10-06 16:01, Martin Uecker wrote: Am Freitag, dem 06.10.2023 um 06:50 -0400 schrieb Siddhesh Poyarekar: On 2023-10-06 01:11, Martin Uecker wrote: Am Donnerstag, dem 05.10.2023 um 15:35 -0700 schrieb Kees Cook: On Thu, Oct 05, 2023 at 04:08:52PM

Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Qing Zhao
> On Oct 18, 2023, at 11:18 AM, Siddhesh Poyarekar wrote: > > On 2023-10-18 10:51, Qing Zhao wrote: > + member FIELD_DECL is a valid field of the containing structure's > fieldlist, > + FIELDLIST, Report error and remove this attribute when it's not. */ > +static void

aarch64: Replace duplicated selftests

2023-10-18 Thread Andrew Carlotti
Pushed as obvious. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_test_fractional_cost): Test <= instead of testing < twice. diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index

Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Siddhesh Poyarekar
On 2023-10-18 10:51, Qing Zhao wrote: + member FIELD_DECL is a valid field of the containing structure's fieldlist, + FIELDLIST, Report error and remove this attribute when it's not. */ +static void +verify_counted_by_attribute (tree fieldlist, tree field_decl) +{ + tree attr_counted_by

Re: [PATCH] vect: Allow same precision for bit-precision conversions.

2023-10-18 Thread Richard Biener
> Am 18.10.2023 um 16:19 schrieb Robin Dapp : > > Hi, > > even though there was no full conclusion yet I took the liberty of > just posting this as a patch in case of further discussion. > > In PR/111794 we miss a vectorization because on riscv type precision and > mode precision differ for

[PATCH V2 3/7] aarch64: Implement system register validation tools

2023-10-18 Thread Victor Do Nascimento
Given the implementation of a mechanism of encoding system registers into GCC, this patch provides the mechanism of validating their use by the compiler. In particular, this involves: 1. Ensuring a supplied string corresponds to a known system register name. System registers can be

[PATCH V2 6/7] aarch64: Add front-end argument type checking for target builtins

2023-10-18 Thread Victor Do Nascimento
In implementing the ACLE read/write system register builtins it was observed that leaving argument type checking to be done at expand-time meant that poorly-formed function calls were being "fixed" by certain optimization passes, meaning bad code wasn't being properly picked up in checking.

[PATCH V2 2/7] aarch64: Add support for aarch64-sys-regs.def

2023-10-18 Thread Victor Do Nascimento
This patch defines the structure of a new .def file used for representing the aarch64 system registers, what information it should hold and the basic framework in GCC to process this file. Entries in the aarch64-system-regs.def file should be as follows: SYSREG (NAME, CPENC (sn,op1,cn,cm,op2),

[PATCH V2 7/7] aarch64: Add system register duplication check selftest

2023-10-18 Thread Victor Do Nascimento
Add a build-time test to check whether system register data, as imported from `aarch64-sys-reg.def' has any duplicate entries. Duplicate entries are defined as any two SYSREG entries in the .def file which share the same encoding values (as specified by its `CPENC' field) and where the

[PATCH V2 1/7] aarch64: Sync system register information with Binutils

2023-10-18 Thread Victor Do Nascimento
This patch adds the `aarch64-sys-regs.def' file, originally written for Binutils, to GCC. In so doing, it provides GCC with the necessary information for teaching the compiler about system registers known to the assembler and how these can be used. By aligning the representation of data common to

[PATCH V2 4/7] aarch64: Add basic target_print_operand support for CONST_STRING

2023-10-18 Thread Victor Do Nascimento
Motivated by the need to print system register names in output assembly, this patch adds the required logic to `aarch64_print_operand' to accept rtxs of type CONST_STRING and process these accordingly. Consequently, an rtx such as: (set (reg/i:DI 0 x0) (unspec:DI [(const_string

[PATCH V2 0/7] aarch64: Add support for __arm_rsr and __arm_wsr ACLE function family

2023-10-18 Thread Victor Do Nascimento
This revision of the patch series addresses the following key pieces of upstream feedback: * `aarch64-sys-regs.def', being identical in content to the file with the same name in Binutils, now retains the copyright header from Binutils. * We migrate away from the binary search handling of

[PATCH V2 5/7] aarch64: Implement system register r/w arm ACLE intrinsic functions

2023-10-18 Thread Victor Do Nascimento
Implement the aarch64 intrinsics for reading and writing system registers with the following signatures: uint32_t __arm_rsr(const char *special_register); uint64_t __arm_rsr64(const char *special_register); void* __arm_rsrp(const char *special_register); float

Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Qing Zhao
>>> + member FIELD_DECL is a valid field of the containing structure's >>> fieldlist, >>> + FIELDLIST, Report error and remove this attribute when it's not. */ >>> +static void >>> +verify_counted_by_attribute (tree fieldlist, tree field_decl) >>> +{ >>> + tree attr_counted_by =

Re: [PATCH v2] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-10-18 Thread David Edelsohn
[Resending from correct email.] Hi, Surya Thanks for working on this issue and creating a patch. It helps if you explicitly send patches to Segher and me, and copy gcc-patches. +/* Return true if insn is a non-permuting load/store. */ +static bool +non_permuting_mem_insn (swap_web_entry

Re: [PATCH v2] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-10-18 Thread David Edelsohn
Hi, Surya Thanks for working on this issue and creating a patch. It helps if you explicitly send patches to Segher and me, and copy gcc-patches. +/* Return true if insn is a non-permuting load/store. */ +static bool +non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i) +{ +

Re: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-10-18 Thread Qing Zhao
Hi, Sid, Thanks a lot for your time and effort to review this patch set! And sorry for my late reply due to a long vacation immediately after Cauldron, just came back this Monday.. See my reply embedded below: > On Oct 5, 2023, at 2:51 PM, Siddhesh Poyarekar wrote: > > On 2023-08-25 11:24,

[PATCH6/8] omp: Reorder call for TARGET_SIMD_CLONE_ADJUST (was Re: [PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM)

2023-10-18 Thread Andre Vieira (lists)
This patch moves the call to TARGET_SIMD_CLONE_ADJUST until after the arguments and return types have been transformed into vector types. It also constructs the adjustments and retval modifications after this call, allowing targets to alter the types of the arguments and return of the clone

Re: [PATCH 8/8] aarch64: Add SVE support for simd clones [PR 96342]

2023-10-18 Thread Andre Vieira (lists)
Rebased, no major changes, still needs review. On 30/08/2023 10:19, Andre Vieira (lists) via Gcc-patches wrote: This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum

Re: [PATCH 4/8] vect: don't allow fully masked loops with non-masked simd clones [PR 110485]

2023-10-18 Thread Andre Vieira (lists)
Rebased on top of trunk, minor change to check if loop_vinfo since we now do some slp vectorization for simd_clones. I assume the previous OK still holds. On 30/08/2023 13:54, Richard Biener wrote: On Wed, 30 Aug 2023, Andre Vieira (lists) wrote: When analyzing a loop and choosing a

[PATCH 0/8] omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS

2023-10-18 Thread Andre Vieira (lists)
Refactor simd clone handling code ahead of support for poly simdlen. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_subparts): Remove. (simd_clone_init_simd_arrays): Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS. (ipa_simd_modify_function_body):

Re: [PATCH 5/8] vect: Use inbranch simdclones in masked loops

2023-10-18 Thread Andre Vieira (lists)
Rebased, needs review. On 30/08/2023 10:13, Andre Vieira (lists) via Gcc-patches wrote: This patch enables the compiler to use inbranch simdclones when generating masked loops in autovectorization. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function

Re: [Patch 3/8] vect: Fix vect_get_smallest_scalar_type for simd clones

2023-10-18 Thread Andre Vieira (lists)
Made it a local function and changed prototype according to comments. Is this OK? gcc/ChangeLog: * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Special case simd clone calls and only use types that are mapped to vectors. (simd_clone_call_p): New

Re: [Patch 2/8] parloops: Allow poly nit and bound

2023-10-18 Thread Andre Vieira (lists)
Posting the changed patch for completion, already reviewed. On 30/08/2023 13:32, Richard Biener wrote: On Wed, 30 Aug 2023, Andre Vieira (lists) wrote: Teach parloops how to handle a poly nit and bound ahead of the changes to enable non-constant simdlen. Can you use poly_int_tree_p to

Re: [PATCH 1/8] parloops: Copy target and optimizations when creating a function clone

2023-10-18 Thread Andre Vieira (lists)
Just posting a rebase for completion. On 30/08/2023 13:31, Richard Biener wrote: On Wed, 30 Aug 2023, Andre Vieira (lists) wrote: SVE simd clones require to be compiled with a SVE target enabled or the argument types will not be created properly. To achieve this we need to copy

Re: aarch64, vect, omp: Add SVE support for simd clones [PR 96342]

2023-10-18 Thread Andre Vieira (lists)
Hi, I noticed I had missed one of the preparatory patches at the start of this series (first one) added now, also removed the 'vect: Add vector_mode paramater to simd_clone_usable' since after review we no longer deemed it necessary. And replaced the old vect: Add

RE: [x86 PATCH] PR target/110551: Fix reg allocation for widening multiplications.

2023-10-18 Thread Roger Sayle
Many thanks to Tobias Burnus for pointing out the mistake/typo in the PR number. This fix is for PR 110551, not PR 110511. I'll update the ChangeLog and filename of the new testcase, if approved. Sorry for any inconvenience/confusion. Cheers, Roger -- > -Original Message- > From:

[PATCH] vect: Allow same precision for bit-precision conversions.

2023-10-18 Thread Robin Dapp
Hi, even though there was no full conclusion yet I took the liberty of just posting this as a patch in case of further discussion. In PR/111794 we miss a vectorization because on riscv type precision and mode precision differ for mask types. We can still vectorize when allowing assignments with

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-18 Thread Jason Merrill
On 10/18/23 07:46, waffl3x wrote: Any progress on this, or do I need to coax the process along? :) Yeah, I've been working on it since the copyright assignment process has finished, originally I was going to note that on my next update which I had hoped to finish today or tomorrow. Well, in

Re: Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread 钟居哲
Could you by the way add this mention this PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111791 Add the test of this PR ? juzhe.zh...@rivai.ai   From: Robin Dapp Date: 2023-10-18 21:51 To: juzhe.zh...@rivai.ai; gcc-patches; palmer; kito.cheng; jeffreyalaw CC: rdapp.gcc

Re: [Backport RFA] lra: Avoid unfolded plus-0

2023-10-18 Thread Vladimir Makarov
On 10/18/23 09:37, Richard Sandiford wrote: Vlad, is it OK if I backport the patch below to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111528 ? Jakub has given a conditional OK on irc. Ok.  It should be safe.  I don't expect any issues because of this.

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
I didn't push this yet because it would have introduced an UNRESOLVED that my summary script didn't catch. Normally I go with just contrib/test_summary but that only filters out FAIL and XPASS. I should really be using compare_testsuite_log.py from riscv-gnu-toolchain/scripts. It was caused by

[Backport RFA] lra: Avoid unfolded plus-0

2023-10-18 Thread Richard Sandiford
Vlad, is it OK if I backport the patch below to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111528 ? Jakub has given a conditional OK on irc. Thanks, Richard Richard Sandiford writes: > While backporting another patch to an earlier release, I hit a > situation in which

Re: [PATCH V5] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]

2023-10-18 Thread juzhe.zh...@rivai.ai
Hi, this patch fixes the V4 issue: Previously as Richard S commented: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633178.html slp_op and mask_vectype are only initialised when mask_index >= 0. Shouldn't this code be under mask_index >= 0 too? Also, when do we encounter mismatched

[PATCH V5] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]

2023-10-18 Thread Juzhe-Zhong
This patch fixes the following FAILs in RISC-V regression: FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects scan-tree-dump vect "Loop contains only SLP stmts" FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts" FAIL: gcc.dg/vect/vect-gather-3.c -flto

[PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread juzhe.zh...@rivai.ai
LGTM popcount patch. juzhe.zh...@rivai.ai

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Jakub Jelinek
On Wed, Oct 18, 2023 at 01:33:40PM +0200, Jakub Jelinek wrote: > Making it guaranteed that it has at least one argument say through > template poly_int(const U &, const T &...) {} > fixes it for 4.8/4.9 as well. So, perhaps (but so far totally untested, the other bootstrap is still running):
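
The idea quoted above, a variadic constructor that is guaranteed to take at least one argument, can be sketched as follows. This is only an illustration of the general pattern under stated assumptions: `coeff_vec` is a hypothetical stand-in and not GCC's poly_int, and the sketch does not reproduce the g++ 4.8/4.9 problem itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a small fixed-size coefficient vector.
// It is NOT GCC's poly_int; it only shows the constructor shape.
template <std::size_t N>
struct coeff_vec
{
  long coeffs[N];

  // Default construction is handled by an ordinary constructor.
  coeff_vec() : coeffs{} {}

  // Naming the first parameter separately from the pack means this
  // template can never be instantiated with zero arguments, so it does
  // not compete with the default constructor above.
  template <typename U, typename... T>
  coeff_vec(const U &first, const T &... rest)
    : coeffs{static_cast<long>(first), static_cast<long>(rest)...}
  {
    static_assert(sizeof...(T) < N, "too many coefficients");
  }
};
```

With this shape, `coeff_vec<2> z;` uses the default constructor, while `coeff_vec<2> a(3);` and `coeff_vec<2> b(3, 4);` use the template; coefficients that are not supplied are zero-initialized.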

RE: [PATCH v1] RISC-V: Remove the type size restriction of vectorizer

2023-10-18 Thread Li, Pan2
Thanks Richard, let's wait for a while in case there are comments from others due to not familiar with these parts. Pan -Original Message- From: Richard Biener Sent: Wednesday, October 18, 2023 2:34 PM To: Li, Pan2 Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang ;

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-10-18 Thread waffl3x
> Any progress on this, or do I need to coax the process along? :) Yeah, I've been working on it since the copyright assignment process has finished, originally I was going to note that on my next update which I had hoped to finish today or tomorrow. Well, in truth I was hoping to send one the

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> I saw you didn't extend VI -> V_VLSI. I guess it will fail SLP on popcount. Added VLS modes and your test in v2. Testsuite looks unchanged on my side (vect, dg, rvv). Regards Robin Subject: [PATCH v2] RISC-V: Add popcount fallback expander. I didn't manage to get back to the generic

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Jakub Jelinek
On Wed, Oct 18, 2023 at 11:23:49AM +0100, Richard Sandiford wrote: > > --- a/gcc/cse.cc > > +++ b/gcc/cse.cc > > @@ -4951,8 +4951,14 @@ cse_insn (rtx_insn *insn) > > && is_a <scalar_int_mode> (mode, &int_mode) > > && (extend_op = load_extend_op (int_mode)) != UNKNOWN) > > { > > +#if

Re: [Patch] OpenMP: Avoid ICE with LTO and 'omp allocate (was: [Patch] Fortran: Support OpenMP's 'allocate' directive for stack vars)

2023-10-18 Thread Jakub Jelinek
On Wed, Oct 18, 2023 at 12:56:01PM +0200, Tobias Burnus wrote: > On 18.10.23 11:36, Jakub Jelinek wrote: > > On Wed, Oct 18, 2023 at 11:12:44AM +0200, Thomas Schwinge wrote: > > > +FAIL: gfortran.dg/gomp/allocate-13.f90 -O (internal compiler > > > error: tree code 'statement_list' is not

[Patch] OpenMP: Avoid ICE with LTO and 'omp allocate (was: [Patch] Fortran: Support OpenMP's 'allocate' directive for stack vars)

2023-10-18 Thread Tobias Burnus
On 18.10.23 11:36, Jakub Jelinek wrote: On Wed, Oct 18, 2023 at 11:12:44AM +0200, Thomas Schwinge wrote: +FAIL: gfortran.dg/gomp/allocate-13.f90 -O (internal compiler error: tree code 'statement_list' is not supported in LTO streams) Any references to GENERIC code in clauses etc.

[PATCH v2] libstdc++: testsuite: Enhance codecvt_unicode with tests for length()

2023-10-18 Thread Dimitrij Mijoski
We can test codecvt::length() with the same data that we test codecvt::in(). For each call of in() we add another call to length(). Some additional small cosmetic changes are applied. libstdc++-v3/ChangeLog: * testsuite/22_locale/codecvt/codecvt_unicode.h: Test length() ---
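
The approach described above, pairing each in() call with a length() call on the same input, can be sketched as below. This is a minimal illustration and not the code from codecvt_unicode.h: the struct `InLengthResult` and the function `in_and_length` are hypothetical names, and the sketch assumes the standard `std::codecvt<char16_t, char, std::mbstate_t>` facet (UTF-8 to UTF-16, deprecated since C++20) obtained from the classic locale.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>

// Hypothetical result record for one in()/length() pair.
struct InLengthResult
{
  std::codecvt_base::result res;  // status returned by in()
  std::size_t consumed_by_in;     // bytes of input that in() consumed
  std::size_t produced_by_in;     // UTF-16 code units that in() wrote
  int counted_by_length;          // bytes that length() reports
};

// Hypothetical helper: run in() and length() over the same byte range,
// allowing at most max_out output code units (capped at 64).
inline InLengthResult
in_and_length(const char *from, const char *from_end, std::size_t max_out)
{
  using cvt_t = std::codecvt<char16_t, char, std::mbstate_t>;
  const cvt_t &cvt = std::use_facet<cvt_t>(std::locale::classic());

  char16_t out[64];
  if (max_out > 64)
    max_out = 64;

  // Convert for real, recording how far the input and output advanced.
  std::mbstate_t st_in{};
  const char *from_next = nullptr;
  char16_t *to_next = nullptr;
  std::codecvt_base::result res
    = cvt.in(st_in, from, from_end, from_next, out, out + max_out, to_next);

  // Ask length() the same question with a fresh state and the same limit.
  std::mbstate_t st_len{};
  int len = cvt.length(st_len, from, from_end, max_out);

  return {res,
          static_cast<std::size_t>(from_next - from),
          static_cast<std::size_t>(to_next - out),
          len};
}
```

For well-formed input that fits in the output limit, the byte count reported by length() is expected to equal the number of bytes in() consumed, which is the property a paired test checks.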

Re: [PATCH V2] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-18 Thread juzhe.zh...@rivai.ai
More details of the VSETVL bug: Loop: 10ddc: 9ed030d7 vmv1r.v v1,v13 10de0: b21040d7 vncvt.x.x.w v1,v1 10de4: 5e0785d7 vmv.v.v v11,v15 10de8: b700a5d7 vmacc.vv v11,v1,v16 10dec:

Re: [PATCH] Avoid compile time hog on vect_peel_nonlinear_iv_init for nonlinear induction vec_step_op_mul when iteration count is too big. There's loop in vect_peel_nonlinear_iv_init to get

2023-10-18 Thread Richard Biener
On Wed, 18 Oct 2023, liuhongt wrote: > Also give up vectorization when niters_skip is negative which will be > used for fully masked loop. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR tree-optimization/111820 > PR

Re: [PATCH] tree-ssa-math-opts: Fix up match_uaddc_usubc [PR111845]

2023-10-18 Thread Richard Biener
On Wed, 18 Oct 2023, Jakub Jelinek wrote: > Hi! > > GCC ICEs on the first testcase. Successful match_uaddc_usubc ends up with > some dead stmts which DCE will remove (hopefully) later all. > The ICE is because one of the dead stmts refers to a freed SSA_NAME. > The code already gsi_removes a

Re: [PATCH] libstdc++: testsuite: Enhance codecvt_unicode with tests for length()

2023-10-18 Thread Dimitrij Mijoski
On Wed, 2023-10-18 at 10:52 +0100, Jonathan Wakely wrote: > On Tue, 17 Oct 2023 at 23:51, Dimitrij Mijoski wrote: > > > > We can test codecvt::length() with the same data that we test > > codecvt::in(). For each call of in() we add another call to length(). > > Some additional small cosmetic

Re: [PATCH] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-18 Thread juzhe.zh...@rivai.ai
Forget about this patch. Commit log code example is wrong, fixed it in V2: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633420.html Thanks. juzhe.zh...@rivai.ai From: Juzhe-Zhong Date: 2023-10-18 18:21 To: gcc-patches CC: kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc;

[PATCH V2] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-18 Thread Juzhe-Zhong
Confirmed that the dynamic LMUL algorithm works well, choosing LMUL = 4 for the PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848 But it generates horrible register spilling. The root cause is that we didn't hoist the vmv.v.x outside the loop, which increases the SLP loop register pressure. So,

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Richard Sandiford
Jakub Jelinek writes: > On Sun, Oct 15, 2023 at 12:43:10PM +0100, Richard Sandiford wrote: >> It seemed like there was considerable support for bumping the minimum >> to beyond 4.8. I think we should wait until a decision has been made >> before adding more 4.8 workarounds. > > I think adding a

[PATCH] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-18 Thread Juzhe-Zhong
Confirmed that the dynamic LMUL algorithm works well, choosing LMUL = 4 for the PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848 But it generates horrible register spilling. The root cause is that we didn't hoist the vmv.v.x outside the loop, which increases the SLP loop register pressure. So,

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Jakub Jelinek
On Sun, Oct 15, 2023 at 12:43:10PM +0100, Richard Sandiford wrote: > It seemed like there was considerable support for bumping the minimum > to beyond 4.8. I think we should wait until a decision has been made > before adding more 4.8 workarounds. I think adding a workaround until that decision
