PR111754

2023-11-27 Thread juzhe.zh...@rivai.ai
Hi, there is a regression in RISC-V caused by this patch: FAIL: gcc.dg/vect/pr111754.c -flto -ffat-lto-objects scan-tree-dump optimized "return { 0.0, 9.0e\\+0, 0.0, 0.0 }" FAIL: gcc.dg/vect/pr111754.c scan-tree-dump optimized "return { 0.0, 9.0e\\+0, 0.0, 0.0 }" I have checked the dump is :

[PATCH v1] LoongArch: Remove duplicate definition of CLZ_DEFINED_VALUE_AT_ZERO.

2023-11-27 Thread Li Wei
In the r14-5547 commit, C[LT]Z_DEFINED_VALUE_AT_ZERO were defined at the same time, but in fact, CLZ_DEFINED_VALUE_AT_ZERO has already been defined, so remove the duplicate definition. gcc/ChangeLog: * config/loongarch/loongarch.h (CTZ_DEFINED_VALUE_AT_ZERO): Add description.

[PATCH] Take register pressure into account for vec_construct when the components are not loaded from memory.

2023-11-27 Thread liuhongt
For vec_construct, the components must be live at the same time if they're not loaded from memory; when the number of those components exceeds the available registers, spills happen. Try to account for that with a rough estimation. ??? Ideally, we should have an overall estimation of register pressure if

Re: [PATCH][RFC] middle-end/110237 - wrong MEM_ATTRs for partial loads/stores

2023-11-27 Thread Richard Biener
On Mon, 27 Nov 2023, Jeff Law wrote: > > > On 11/27/23 05:39, Robin Dapp wrote: > >> The easiest way to avoid running into the alias analysis problem is > >> to scrap the MEM_EXPR when we expand the internal functions for > >> partial loads/stores. That avoids the disambiguation we run into >

[PATCH] Expand: Pass down equality only flag to cmpmem expand

2023-11-27 Thread HAO CHEN GUI
Hi, This patch passes down the equality only flags from emit_block_cmp_hints to cmpmem optab so that the target specific expand can generate optimized insns for equality only compare. Targets (e.g. rs6000) can generate more efficient insn sequence if the block compare is equality only.

[PATCH v1 2/2] LoongArch: Optimize vector constant extract-{even/odd} permutation.

2023-11-27 Thread Li Wei
For vector constant extract-{even/odd} permutation, replace the default [x]vshuf instruction combination with the [x]vilv{l/h} instruction, which reduces the instruction count and improves performance. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_is_odd_extraction):

[PATCH v1 1/2] LoongArch: Accelerate optimization of scalar signed/unsigned popcount.

2023-11-27 Thread Li Wei
In LoongArch, the vector popcount has corresponding instructions, while the scalar does not. Currently, the scalar popcount is calculated through a loop, and the value of a non-power of two needs to be iterated several times, so the vector popcount instruction is considered for optimization.
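
The remark above about values that are not a power of two needing several iterations can be illustrated with a generic loop-based popcount. This is only a sketch of the general technique (clearing the lowest set bit per iteration); the name popcount_loop is hypothetical and this is not claimed to be the sequence GCC emits for LoongArch.

```c
#include <stdint.h>

/* Generic loop-based popcount (illustrative only, not GCC's expansion).
   Each iteration clears the lowest set bit, so the loop runs once per
   set bit: a power of two takes one iteration, other nonzero values
   take several.  */
unsigned
popcount_loop (uint64_t x)
{
  unsigned count = 0;

  while (x != 0)
    {
      x &= x - 1;   /* Clear the lowest set bit.  */
      count++;
    }
  return count;
}
```

A single vector popcount instruction replaces this data-dependent loop with a fixed-latency operation, which is presumably the motivation for the patch.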

Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp

2023-11-27 Thread Michael Meissner
I tried using this patch to compare with the vector size attribute patch I posted. I could not build it as a cross compiler on my x86_64 because the assembler gives the following error: Error: operand out of domain (11 is not a multiple of 2) for std_stacktrace-elf.o. If you look at the

Re: [PATCH 3/4] c23: aliasing of compatible tagged types

2023-11-27 Thread Martin Uecker
On Tuesday, 28.11.2023 at 01:00 +, Joseph Myers wrote: > On Sun, 26 Nov 2023, Martin Uecker wrote: > > > My understanding is that it is used for aliasing analysis and also > > checking of conversions. TYPE_CANONICAL must be consistent with > > the idea the middle-end has about type

Re: Re: [PATCH v2] gimple-match.pd Add more optimization for gimple_cond

2023-11-27 Thread Andrew Pinski
On Mon, Nov 27, 2023 at 10:04 PM Feng Wang wrote: > > On 2023-11-28 11:06 Andrew Pinski wrote: > >On Mon, Nov 27, 2023 at 6:56 PM Feng Wang > >wrote: > >> > >> The link of PATCH v1: > >> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg326661.html > >> This patch add another condition

[V2] New pass for sign/zero extension elimination -- not ready for "final" review

2023-11-27 Thread Jeff Law
I've still got some comments from Richard S to work through, but some folks are trying to play with this and thus I want to get the fixes to date in their hands. Changes since V1: - Fix handling of CALL_INSN_FUNCTION_USAGE so we don't apply PATTERN to an EXPR_LIST. - Various comments and

Re: Re: [PATCH v2] gimple-match.pd Add more optimization for gimple_cond

2023-11-27 Thread Feng Wang
On 2023-11-28 11:06  Andrew Pinski wrote: >On Mon, Nov 27, 2023 at 6:56 PM Feng Wang wrote: >> >> The link of PATCH v1: >> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg326661.html >> This patch add another condition for gimple-cond optimization. Refer to >> the following test case.

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Jeff Law
On 11/27/23 11:19, Joern Rennecke wrote: You are applying PATTERN to an INSN_LIST. I know :-) That was the late change to clean up some of the horrific control flow in the code. jeff

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Jeff Law
On 11/27/23 10:36, Joern Rennecke wrote: On 11/20/23 11:26, Richard Sandiford wrote: + + mask = GET_MODE_MASK (GET_MODE (SUBREG_REG (x))) << bit; + if (!mask) + mask = -0x1ULL; Not sure I follow this. What does the -0x1ULL constant indicate? Also, isn't it the

Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns

2023-11-27 Thread Jeff Law
On 11/27/23 19:46, Fei Gao wrote: On 2023-11-20 14:46  Jeff Law wrote: On 10/30/23 21:35, Fei Gao wrote: So just a few notes to further illustrate why I'm currently looking to take the VRULL+Ventana implementation.  The code above would be much better handled by just calling

Re: [PATCH 1/4] [RISC-V] prefer Zicond primitive semantics to SFB

2023-11-27 Thread Jeff Law
On 11/27/23 20:09, Kito Cheng wrote: Personally I don't like to play with the pattern order to tweak the code gen since it kinda introduces implicit relation/rule here, but I guess the only way to prevent that is to duplicate the pattern for SFB again, which is not an ideal solution... I

Re: [PATCH 4/4] [V2] [ifcvt] prefer SFB to Zicond for x=c ? (y op CONST) : y.

2023-11-27 Thread Jeff Law
On 11/27/23 19:32, Fei Gao wrote: In x=c ? (y op CONST) : y cases, Zicond based czero ifcvt generates more true dependency in code sequence than SFB based movcc. So exit noce_try_cond_zero_arith in such cases to have a better code sequence generated by noce_try_cmove_arith. Take the

[PATCH] MATCH: Fix invalid signed boolean type usage

2023-11-27 Thread Andrew Pinski
This fixes the incorrect assumption made in r14-3721-ge6bcf839894783 that doing the negation after the conversion would be valid; it is not valid for boolean types. OK? Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: PR

Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp

2023-11-27 Thread Michael Meissner
On Fri, Nov 24, 2023 at 05:31:20PM +0800, Kewen.Lin wrote: > Hi Ajit, > > Don't forget to CC David (CC-ed) :), some comments are inlined below. > > on 2023/10/8 03:04, Ajit Agarwal wrote: > > Hello All: > > > > This patch add new pass to replace contiguous addresses vector load lxv > > with

Re: [PATCH 0/4] Add vector pair support to PowerPC attribute((vector_size(32)))

2023-11-27 Thread Michael Meissner
On Fri, Nov 24, 2023 at 05:41:02PM +0800, Kewen.Lin wrote: > on 2023/11/20 16:56, Michael Meissner wrote: > > On Mon, Nov 20, 2023 at 08:24:35AM +0100, Richard Biener wrote: > >> I wouldn't expose the "fake" larger modes to the vectorizer but rather > >> adjust m_suggested_unroll_factor (which you

[PATCH 0/5] LoongArch: Add -mrecip option support

2023-11-27 Thread Jiahao Xu
LoongArch V1.1 adds support for approximate instructions, which are used along with additional Newton-Raphson steps to implement single precision floating-point division, square root and reciprocal square root operations for better throughput. Control the generation of approximate

Re: [PATCH v6 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-27 Thread waffl3x
On Sunday, November 26th, 2023 at 7:40 PM, Jason Merrill wrote: > > > On 11/26/23 20:44, waffl3x wrote: > > > > > > > The other problem I'm having is > > > > > > > > > > > > auto f0 = [n = 5, ](this auto const&){ n = 10; }; > > > > > > This errors just fine, the lambda is unconditionally

[PATCH 3/5] LoongArch: Redefine pattern for xvfrecip/vfrecip instructions.

2023-11-27 Thread Jiahao Xu
Redefine the patterns for the [x]vfrecip instructions to use rtx code instead of unspec, and enable [x]vfrecip instructions to be generated during auto-vectorization. gcc/ChangeLog: * config/loongarch/lasx.md (lasx_xvfrecip_): Renamed to .. (recip3): .. this. *

[PATCH 5/5] LoongArch: Vectorized loop unrolling is not performed on divf/sqrtf/rsqrtf with turns on -mrecip.

2023-11-27 Thread Jiahao Xu
Using -mrecip generates a sequence of instructions to replace divf, sqrtf and rsqrtf. The number of generated instructions is close to or exceeds the maximum issue of the LoongArch, so vectorized loop unrolling is not performed on them. gcc/ChangeLog: * config/loongarch/loongarch.cc

[PATCH 4/5] LoongArch: New options -mrecip and -mrecip= with ffast-math.

2023-11-27 Thread Jiahao Xu
When -mrecip option is turned on, use approximate reciprocal instructions and approximate reciprocal square root instructions with additional Newton-Raphson steps to implement single precision floating-point division, square root and reciprocal square root operations for better throughput.
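
The Newton-Raphson refinement mentioned above can be sketched in plain C. This is a minimal illustration of the standard iteration formulas, not the RTL the patch generates; the function names are hypothetical, and the initial estimate stands in for the result of a hardware approximate instruction.

```c
/* One Newton-Raphson step for 1/a, given an estimate x of 1/a:
   x' = x * (2 - a * x).  The relative error is roughly squared
   by each step.  */
float
nr_recip_step (float a, float x)
{
  return x * (2.0f - a * x);
}

/* One Newton-Raphson step for 1/sqrt(a), given an estimate y:
   y' = y * (1.5 - 0.5 * a * y * y).  */
float
nr_rsqrt_step (float a, float y)
{
  return y * (1.5f - 0.5f * a * y * y);
}

/* Division b / a computed as b * (1/a), using one refinement step.
   ESTIMATE is a hypothetical stand-in for the value an approximate
   reciprocal instruction would return.  */
float
approx_div (float b, float a, float estimate)
{
  return b * nr_recip_step (a, estimate);
}
```

Because the refined result is only an approximation of the correctly rounded quotient, this kind of expansion is normally tied to relaxed floating-point options, which matches the subject line's mention of ffast-math.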

[PATCH 2/5] LoongArch: Use standard pattern name for xvfrsqrt/vfrsqrt instructions.

2023-11-27 Thread Jiahao Xu
Rename lasx_xvfrsqrt*/lsx_vfrsqrt* to rsqrt2 to align with standard pattern name. gcc/ChangeLog: * config/loongarch/lasx.md (lasx_xvfrsqrt_): Renamed to .. (*rsqrt2): .. this. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vfrsqrt_d): Redefine to standard

[PATCH 1/5] LoongArch: Add support for approximate instructions.

2023-11-27 Thread Jiahao Xu
LA664 introduces new instructions for reciprocal approximation and reciprocal square root approximation. It includes the scalar instructions frecipe and frsqrte, as well as their corresponding vector instructions [x]vfrecipe and [x]vfrsqrte. This patch adds define_insn/builtins/intrinsics for

Re: [PATCH 1/4] [RISC-V] prefer Zicond primitive semantics to SFB

2023-11-27 Thread Kito Cheng
Personally I don't like to play with the pattern order to tweak the code gen since it kinda introduces implicit relation/rule here, but I guess the only way to prevent that is to duplicate the pattern for SFB again, which is not an ideal solution... Anyway, it's obviously a better code gen, so

Re: [PATCH v2] gimple-match.pd Add more optimization for gimple_cond

2023-11-27 Thread Andrew Pinski
On Mon, Nov 27, 2023 at 6:56 PM Feng Wang wrote: > > The link of PATCH v1: > https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg326661.html > This patch add another condition for gimple-cond optimization. Refer to > the following test case. > int foo1 (int data, int res) > { > res = data

Re: Re: [PATCH 4/4] [ifcvt] if convert x=c ? y : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
On 2023-11-20 15:10  Jeff Law wrote: > > > >On 10/30/23 01:25, Fei Gao wrote: > >> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc >> index 6e341fc4d4b..cfa9bc4b850 100644 >> --- a/gcc/ifcvt.cc >> +++ b/gcc/ifcvt.cc >> @@ -2911,7 +2911,7 @@ noce_try_sign_mask (struct noce_if_info *if_info) >>   static

Re: [PATCH V2] introduce light expander sra

2023-11-27 Thread Jiufu Guo
Hi, Thanks so much for your helpful review! Richard Biener writes: > On Fri, Oct 27, 2023 at 3:51 AM Jiufu Guo wrote: >> >> Hi, >> >> Compare with previous version: >> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632399.html >> This version supports TI/VEC mode of the access. >> >>

Re: Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
On 2023-11-20 14:59  Jeff Law wrote: > > > >On 10/30/23 01:25, Fei Gao wrote: >> Conditional add, if zero >> rd = (rc == 0) ? (rs1 + rs2) : rs1 >> --> >> czero.nez rd, rs2, rc >> add rd, rs1, rd >> >> Conditional add, if non-zero >> rd = (rc != 0) ? (rs1 + rs2) : rs1 >> --> >> czero.eqz rd, rs2,

[PATCH v2] gimple-match.pd Add more optimization for gimple_cond

2023-11-27 Thread Feng Wang
The link of PATCH v1: https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg326661.html This patch adds another condition for gimple-cond optimization. Refer to the following test case. int foo1 (int data, int res) { res = data & 0xf; res |= res << 4; if (res < 0x22) return 0x22;

Re: Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
On 2023-11-20 14:46  Jeff Law wrote: > > > >On 10/30/23 21:35, Fei Gao wrote: > >>> So just a few notes to further illustrate why I'm currently looking to >>> take the VRULL+Ventana implementation.  The code above would be much >>> better handled by just calling noce_emit_cmove.  noce_emit_cmove

[PATCH 2/4] [ifcvt] optimize x=c ? (y op z) : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
op=[PLUS, MINUS, IOR, XOR, ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT] SIGN_EXTEND, ZERO_EXTEND and SUBREG have been considered to support SImode on 64-bit machines. Conditional op, if zero rd = (rc == 0) ? (rs1 op rs2) : rs1 --> czero.nez rd, rs2, rc op rd, rs1, rd Conditional op, if non-zero
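
The transformation above works because zero is the identity for the second operand of every listed op, so zeroing rs2 turns "rs1 op rs2" into plain rs1. A minimal C model of the Zicond semantics, shown for PLUS, follows. The helper names are hypothetical, and the non-zero case is written from the czero.eqz semantics rather than from the truncated text.

```c
#include <stdint.h>

/* Model of "czero.eqz rd, rs1, rs2": rd = (rs2 == 0) ? 0 : rs1.  */
uint64_t
czero_eqz (uint64_t rs1, uint64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* Model of "czero.nez rd, rs1, rs2": rd = (rs2 != 0) ? 0 : rs1.  */
uint64_t
czero_nez (uint64_t rs1, uint64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* rd = (rc == 0) ? (rs1 + rs2) : rs1, expressed as
     czero.nez rd, rs2, rc
     add       rd, rs1, rd  */
uint64_t
cond_add_if_zero (uint64_t rc, uint64_t rs1, uint64_t rs2)
{
  uint64_t rd = czero_nez (rs2, rc);   /* rs2 if rc == 0, else 0.  */
  return rs1 + rd;
}

/* Mirrored case, rd = (rc != 0) ? (rs1 + rs2) : rs1, using czero.eqz
   (assumed operand order, derived from the instruction semantics).  */
uint64_t
cond_add_if_nonzero (uint64_t rc, uint64_t rs1, uint64_t rs2)
{
  uint64_t rd = czero_eqz (rs2, rc);   /* rs2 if rc != 0, else 0.  */
  return rs1 + rd;
}
```

For example, cond_add_if_zero (0, 5, 7) yields 12, while any non-zero rc leaves the result at 5; no branch is involved in either case.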

[PATCH 4/4] [V2] [ifcvt] prefer SFB to Zicond for x=c ? (y op CONST) : y.

2023-11-27 Thread Fei Gao
In x=c ? (y op CONST) : y cases, Zicond based czero ifcvt generates more true dependencies in the code sequence than SFB based movcc. So exit noce_try_cond_zero_arith in such cases to have a better code sequence generated by noce_try_cmove_arith. Take the following case for example. CFLAGS:

[PATCH 3/4] [ifcvt] optimize x=c ? (y op const_int) : y by RISC-V Zicond like insns

2023-11-27 Thread Fei Gao
op=[PLUS, MINUS, IOR, XOR, ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT] Co-authored-by: Xiao Zeng gcc/ChangeLog: * ifcvt.cc (noce_cond_zero_shift_op_supported): check if OP is shift like operation (noce_cond_zero_binary_op_supported): restructure & call

[PATCH 1/4] [RISC-V] prefer Zicond primitive semantics to SFB

2023-11-27 Thread Fei Gao
Move Zicond md files ahead of SFB to recognize Zicond first. Take the following case for example. CFLAGS: -mtune=sifive-7-series -march=rv64gc_zicond -mabi=lp64d long primitiveSemantics_00(long a, long b) { return a == 0 ? 0 : b; } before patch: primitiveSemantics_00: bne

Re: [PATCH] RISC-V: Fix VSETVL PASS regression

2023-11-27 Thread juzhe.zhong
Committed, as it passed zvl128/256/512/1024 with no regressions. Replied Message: From: Juzhe-Zhong; Date: 11/27/2023 21:24; To: gcc-patches@gcc.gnu.org; Cc: kito.ch...@gmail.com, kito.ch...@sifive.com, jeffreya...@gmail.com, rdapp@gmail.com, Juzhe-Zhong; Subject: [PATCH] RISC-V: Fix VSETVL PASS regression

Re: [PATCH 5/5] diagnostics: don't print annotation lines when there's no column info

2023-11-27 Thread David Malcolm
On Tue, 2023-11-21 at 17:20 -0500, David Malcolm wrote: > gcc/ChangeLog: > * diagnostic-show-locus.cc > (layout::maybe_add_location_range): > Don't print annotation lines for ranges when there's no > column > info. > (selftest::test_one_liner_no_column): New. >  

Re: [PATCH 4/5] diagnostics: add diagnostic_context::get_location_text

2023-11-27 Thread David Malcolm
On Tue, 2023-11-21 at 17:20 -0500, David Malcolm wrote: > No functional change intended. > > gcc/ChangeLog: > * diagnostic.cc (diagnostic_get_location_text): Convert to... > (diagnostic_context::get_location_text): ...this, and convert > return type from char * to

Re: [PATCH] libcpp: Fix unsigned promotion for unevaluated divide by zero [PR112701]

2023-11-27 Thread Joseph Myers
On Mon, 27 Nov 2023, Lewis Hyatt wrote: > Hello- > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112701 > > Here is a one-line fix to an edge case in libcpp's expression evaluator > noted in the PR. Bootstrap + regtest all languages on x86-64 Linux. Is it OK > please? Thanks! OK. -- Joseph

Re: [PATCH 3/4] c23: aliasing of compatible tagged types

2023-11-27 Thread Joseph Myers
On Sun, 26 Nov 2023, Martin Uecker wrote: > My understanding is that it is used for aliasing analysis and also > checking of conversions. TYPE_CANONICAL must be consistent with > the idea the middle-end has about type conversions. But as long > as we do not give the same TYPE_CANONICAL to types

Re: c: tree: target: C2x (...) function prototypes and va_start relaxation

2023-11-27 Thread Joseph Myers
On Sat, 25 Nov 2023, Gerald Pfeifer wrote: > On Fri, 21 Oct 2022, Joseph Myers wrote: > > C2x allows function prototypes to be given as (...), a prototype > > meaning a variable-argument function with no named arguments. > > I noticed this did not make it into gcc-13/changes.html ? Was that >

[PATCH] libcpp: Fix unsigned promotion for unevaluated divide by zero [PR112701]

2023-11-27 Thread Lewis Hyatt
Hello- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112701 Here is a one-line fix to an edge case in libcpp's expression evaluator noted in the PR. Bootstrap + regtest all languages on x86-64 Linux. Is it OK please? Thanks! -Lewis -- >8 -- When libcpp encounters a divide by zero while

Re: [committed v2] libstdc++: Define std::ranges::to for C++23 (P1206R7) [PR111055]

2023-11-27 Thread Hans-Peter Nilsson
> From: Jonathan Wakely > Date: Thu, 23 Nov 2023 17:51:38 + > libstdc++-v3/ChangeLog: > > PR libstdc++/111055 > * include/bits/ranges_base.h (from_range_t): Define new tag > type. > (from_range): Define new tag object. > * include/bits/version.def

Re: [PATCH] fold-mem-offsets: Fix powerpc64le-linux profiledbootstrap [PR111601]

2023-11-27 Thread Andrew Pinski
On Mon, Nov 27, 2023 at 3:51 PM Jakub Jelinek wrote: > > On Mon, Nov 27, 2023 at 09:52:14PM +0100, Jakub Jelinek wrote: > > On Mon, Oct 16, 2023 at 01:11:01PM -0600, Jeff Law wrote: > > > > gcc/ChangeLog: > > > > > > > > * Makefile.in: Add fold-mem-offsets.o. > > > > * passes.def: Schedule a

[PATCH] fold-mem-offsets: Fix powerpc64le-linux profiledbootstrap [PR111601]

2023-11-27 Thread Jakub Jelinek
On Mon, Nov 27, 2023 at 09:52:14PM +0100, Jakub Jelinek wrote: > On Mon, Oct 16, 2023 at 01:11:01PM -0600, Jeff Law wrote: > > > gcc/ChangeLog: > > > > > > * Makefile.in: Add fold-mem-offsets.o. > > > * passes.def: Schedule a new pass. > > > * tree-pass.h (make_pass_fold_mem_offsets):

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-27 Thread Michael Meissner
On Fri, Nov 10, 2023 at 06:03:40PM -0600, Peter Bergner wrote: > On 8/25/23 6:20 AM, Kewen.Lin wrote: > > btw, I was also expecting that we don't implicitly set > > OPTION_MASK_PCREL any more for Power10, that is to remove > > OPTION_MASK_PCREL from OTHER_POWER10_MASKS. > > So my patch removes

[PATCH] aarch64: Improve cost of `a ? {-,}1 : b`

2023-11-27 Thread Andrew Pinski
While looking into PR 112454, I found the cost for `(if_then_else (cmp) (const_int 1) (reg))` was being recorded as 8 (or `COSTS_N_INSNS (2)`) but it should have been 4 (or `COSTS_N_INSNS (1)`). This improves the cost by not adding the cost of `(const_int 1)` to the total cost. It does not does

[COMMITTED] Fix time-profiler-3.c after r14-5628-g53ba8d669550d3

2023-11-27 Thread Andrew Pinski
This testcase started to fail after r14-5628-g53ba8d669550d3 because IPA-VRP can now figure out that the functions return a constant value, so there was nothing left for profiling to profile. This disables IPA-VRP for this testcase to be able to profile again. Bootstrapped/tested on

RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

2023-11-27 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:40 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 9/21]middle-end: implement vectorizable_early_exit for > codegen of exit code > > Hi All, > >

RE: [PATCH 10/21]middle-end: implement relevancy analysis support for control flow

2023-11-27 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:40 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 10/21]middle-end: implement relevancy analysis support for > control flow > > Hi All, > > This

RE: [PATCH 12/21]middle-end: Add remaining changes to peeling and vectorizer to support early breaks

2023-11-27 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:41 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 12/21]middle-end: Add remaining changes to peeling and > vectorizer to support early breaks > >

RE: [PATCH 13/21]middle-end: Update loop form analysis to support early break

2023-11-27 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:41 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 13/21]middle-end: Update loop form analysis to support > early break > > Hi All, > > This sets

RE: [PATCH 8/21]middle-end: update vectorizable_live_reduction with support for multiple exits and different exits

2023-11-27 Thread Tamar Christina
> > > This is a respun patch with a fix for VLA. > > > > This adds support to vectorizable_live_reduction to handle multiple > > exits by doing a search for which exit the live value should be > > materialized in. > > > > Additionally which value in the index we're after depends on whether > >

[PATCH]middle-end: refactor vectorizable_live_operation into helper method for codegen

2023-11-27 Thread Tamar Christina
Hi All, To make code review of the updates to add multiple exit supports to vectorizable_live_operation easier I've extracted the refactoring part to its own patch. This patch is a straight extract of the function with no functional changes. Bootstrapped Regtested on aarch64-none-linux-gnu and

[PATCH]middle-end: prevent LIM from hoisting vector compares from gconds if target does not support it.

2023-11-27 Thread Tamar Christina
Hi All, LIM notices that in some cases the condition and the results are loop invariant and tries to move them out of the loop. While the resulting code is operationally sound, moving the compare out of the gcond results in generating code that no longer branches, so cbranch is no longer

Re: [PATCH v7] Implement new RTL optimizations pass: fold-mem-offsets.

2023-11-27 Thread Jakub Jelinek
On Mon, Oct 16, 2023 at 01:11:01PM -0600, Jeff Law wrote: > > gcc/ChangeLog: > > > > * Makefile.in: Add fold-mem-offsets.o. > > * passes.def: Schedule a new pass. > > * tree-pass.h (make_pass_fold_mem_offsets): Declare. > > * common.opt: New options. > > * doc/invoke.texi:

Re: [PATCH v3 00/11] : More warnings as errors by default

2023-11-27 Thread Sam James
Florian Weimer writes: > * Jeff Law: > >> On 11/20/23 02:55, Florian Weimer wrote: >>> This revision addresses Marek's comment about handling >>> -Wdeclaration-missing-parameter-type properly in conjunction with >>> -fpermissive. A new test (permerror-fpermissive-nowarning.c) >>> demonstrates

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Jeff Law
On 11/27/23 13:03, Richard Sandiford wrote: Joern Rennecke writes: On 11/20/23 11:26, Richard Sandiford wrote: + /* ?!? What is the point of this adjustment to DST_MASK? */ + if (code == PLUS || code == MINUS + || code == MULT || code == ASHIFT) + dst_mask + = dst_mask ?

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Richard Sandiford
[Sorry for the slow response] Jeff Law writes: > On 11/20/23 11:26, Richard Sandiford wrote: >> >>scalar_int_mode outer_mode; >>if (!is_a (GET_MODE (x), _mode) >>|| GET_MODE_BITSIZE (outer_mode) > 64) >> continue; > Wouldn't we also want to verify that the

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Richard Sandiford
Joern Rennecke writes: > On 11/20/23 11:26, Richard Sandiford wrote: >>> + /* ?!? What is the point of this adjustment to DST_MASK? */ >>> + if (code == PLUS || code == MINUS >>> + || code == MULT || code == ASHIFT) >>> + dst_mask >>> + = dst_mask ? ((2ULL << floor_log2 (dst_mask))

[PATCH] Fortran: deferred-length character optional dummy arguments [PR93762,PR100651]

2023-11-27 Thread Harald Anlauf
Dear all, the attached patch fixes the passing of deferred-length character to optional dummy arguments: the character length shall be passed by reference, not by value. Original analysis of the issue by Steve in PR93762, independently done by FX in PR100651. The patch fixes both PRs.

Re: hurd: Add default-pie and static-pie support

2023-11-27 Thread Samuel Thibault
Thomas Schwinge wrote on Mon, 27 Nov 2023 15:52:02 +0100: > On 2023-10-28T21:20:39+0200, Samuel Thibault wrote: > > This fixes the Hurd spec in the default-pie case, and adds static-pie > > support. > > I understand that your change does work for you as-is, so I've now pushed > that to

Re: hurd: Add multilib paths for gnu-x86_64

2023-11-27 Thread Samuel Thibault
Hello, Thomas Schwinge wrote on Mon, 27 Nov 2023 15:48:33 +0100: > On 2023-10-28T21:19:59+0200, Samuel Thibault wrote: > > This is essentially based on t-linux64 version. > > Yes, but isn't the overall setup diverged from GNU/Linux? Not sure what you mean exactly? I just meant that the

Re: [PATCH v2] Fixed problem with BTF defining smaller enums.

2023-11-27 Thread David Faust
Hi Cupertino, On 11/27/23 09:21, Cupertino Miranda wrote: > Hi everyone, > > David: Thanks for the v1 review. > > This version adds the following; > - test case, > - improves condition logic, > - fixes mask typo. > > Looking forward to your review. v2 LGTM, please apply. Thanks! > > v1

Re: [r14-5666 Regression] FAIL: gcc.dg/tree-prof/time-profiler-3.c scan-ipa-dump-times profile "Read tp_first_run: 2" 1 on Linux/x86_64

2023-11-27 Thread Andrew Pinski
On Mon, Nov 27, 2023 at 12:00 AM Sebastian Huber wrote: > > On 26.11.23 12:18, haochen.jiang wrote: > > On Linux/x86_64, > > > > 41aacdea55c5d795a7aa195357d966645845d00e is the first bad commit > > commit 41aacdea55c5d795a7aa195357d966645845d00e > > Author: Sebastian Huber > > Date: Mon Nov 20

Re: [PATCH v6 0/21]middle-end: Support early break/return auto-vectorization

2023-11-27 Thread Richard Sandiford
Catching up on backlog, so this might already be resolved, but: Richard Biener writes: > On Tue, 7 Nov 2023, Tamar Christina wrote: > >> > -Original Message- >> > From: Richard Biener >> > Sent: Tuesday, November 7, 2023 9:43 AM >> > To: Tamar Christina >> > Cc:

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Joern Rennecke
You are applying PATTERN to an INSN_LIST. diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc index 52032b50951..4523654538c 100644 --- a/gcc/ext-dce.cc +++ b/gcc/ext-dce.cc @@ -122,10 +122,9 @@ safe_for_live_propagation (rtx_code code) optimziation phase during use handling will be. */ static

Re: [PATCH] tree-sra: Avoid returns of references to SRA candidates

2023-11-27 Thread Andrew Pinski
On Mon, Nov 27, 2023 at 10:16 AM Martin Jambor wrote: > > Hi, > > The enhancement to address PR 109849 contained an important thinko, > namely that any reference that is passed to a function and does not > escape must also not happen to be aliased by the return value of the > function. This has

Re: [PATCH V2 3/3] OpenMP: Use enumerators for names of trait-sets and traits

2023-11-27 Thread Tobias Burnus
On 27.11.23 18:19, Tobias Burnus wrote: + { "unified_address", + (1 << OMP_TRAIT_SET_IMPLEMENTATION), + OMP_TRAIT_PROPERTY_NONE, true, + NULL + }, I don't understand this code. This looks as if "requires" and "unified_address" are on the same level but in my understanding they

[PATCH] tree-sra: Avoid returns of references to SRA candidates

2023-11-27 Thread Martin Jambor
Hi, The enhancement to address PR 109849 contained an important thinko, namely that any reference that is passed to a function and does not escape must also not happen to be aliased by the return value of the function. This has quickly transpired as bugs PR 112711 and PR 112721. Just as

[committed] arm: libgcc: tweak warning from __sync_synchronize

2023-11-27 Thread Richard Earnshaw
My previous patch to add an implementation of __sync_synchronize with a warning trips a testsuite failure in fortran (and possibly other languages as well) as the framework expects no blank lines in the output, but this warning was generating one. So remove the newline from the end of the

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Joern Rennecke
On 11/20/23 11:26, Richard Sandiford wrote: >> + /* ?!? What is the point of this adjustment to DST_MASK? */ >> + if (code == PLUS || code == MINUS >> + || code == MULT || code == ASHIFT) >> + dst_mask >> + = dst_mask ? ((2ULL << floor_log2 (dst_mask)) - 1) : 0; > > Yeah, sympathise
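
To make the quoted DST_MASK adjustment concrete, the expression can be evaluated outside GCC. In the sketch below, floor_log2_u64 is a hypothetical portable stand-in for GCC's floor_log2, and widen_to_low_bits is an illustrative name. One plausible reading, which is an assumption here and not confirmed by the truncated thread text, is that for carry-propagating operations such as PLUS or MULT a live result bit depends on all lower input bits, so the mask is extended down to bit 0.

```c
#include <stdint.h>

/* Hypothetical stand-in for GCC's floor_log2: index of the highest
   set bit of X, or -1 when X is zero.  */
int
floor_log2_u64 (uint64_t x)
{
  int n = -1;

  while (x != 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* The adjustment quoted above: keep the highest set bit of DST_MASK
   and set every bit below it.  When bit 63 is set, 2ULL << 63 wraps
   to 0 and the subtraction yields an all-ones mask.  */
uint64_t
widen_to_low_bits (uint64_t dst_mask)
{
  return dst_mask ? ((2ULL << floor_log2_u64 (dst_mask)) - 1) : 0;
}
```

For instance, a mask of 0x10 becomes 0x1f and a mask of 0x5 becomes 0x7, while an empty mask stays empty.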

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Joern Rennecke
On 11/20/23 11:26, Richard Sandiford wrote: >> + >> + mask = GET_MODE_MASK (GET_MODE (SUBREG_REG (x))) << bit; >> + if (!mask) >> + mask = -0x1ULL; > > Not sure I follow this. What does the -0x1ULL constant indicate? > Also, isn't it the mask of the outer register that

[PATCH v2] Fortran: fix reallocation on assignment of polymorphic variables [PR110415]

2023-11-27 Thread Andrew Jenner
This is the second version of the patch - previous discussion at: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636671.html This patch adds the testcase from PR110415 and fixes the bug. The problem is that in a couple of places in trans_class_assignment in trans-expr.cc, we need to

[PATCH v2] Fixed problem with BTF defining smaller enums.

2023-11-27 Thread Cupertino Miranda
Hi everyone, David: Thanks for the v1 review. This version adds the following; - test case, - improves condition logic, - fixes mask typo. Looking forward to your review. v1 at: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636391.html Cheers, Cupertino commit

Re: [PATCH V2 3/3] OpenMP: Use enumerators for names of trait-sets and traits

2023-11-27 Thread Tobias Burnus
Hi Sandra, [BTW: 1/3 will eventually need to be rebased as it no longer applies cleanly; I have not checked 2/3 or 3/3 yet.] 1/3+2/3 look good to me; unless Jakub has some comments, I think they can go in. Regarding 3/3, some first comments. I still want to read it a bit more carefully and play with

Re: PR111754

2023-11-27 Thread Richard Sandiford
Prathamesh Kulkarni writes: > PR111754: Rework encoding of result for VEC_PERM_EXPR with constant input > vectors. > > gcc/ChangeLog: > PR middle-end/111754 > * fold-const.cc (fold_vec_perm_cst): Set result's encoding to sel's > encoding, and set res_nelts_per_pattern to 2 if

Re: [RFC] vect: disable multiple calls of poly simdclones

2023-11-27 Thread Andre Vieira (lists)
On 06/11/2023 07:52, Richard Biener wrote: On Fri, 3 Nov 2023, Andre Vieira (lists) wrote: Hi, The current codegen code to support VF's that are multiples of a simdclone simdlen relies on BIT_FIELD_REF to create multiple input vectors. This does not work for non-constant simdclones, so we

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Jeff Law
On 11/27/23 04:30, Andrew Stubbs wrote: I tried this patch for AMD GCN. We have a similar problem with excess extends, but also for vector modes. Each lane has a minimum 32 bits and GCC's normal assumption is that vector registers have precisely the number of bits they need, so the amdgcn

Re: [RFA] New pass for sign/zero extension elimination

2023-11-27 Thread Jeff Law
On 11/26/23 09:42, rep.dot@gmail.com wrote: On 22 November 2023 23:23:41 CET, Jeff Law wrote: On 11/20/23 11:56, Dimitar Dimitrov wrote: On Sun, Nov 19, 2023 at 05:47:56PM -0700, Jeff Law wrote: ... + enum rtx_code xcode = GET_CODE (x); + if (xcode == SET) + { +

RE: [PATCH] aarch64: Improve cost of `a ? {-,}1 : b`

2023-11-27 Thread Andrew Pinski (QUIC)
> -Original Message- > From: Richard Sandiford > Sent: Monday, November 27, 2023 7:35 AM > To: Andrew Pinski (QUIC) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] aarch64: Improve cost of `a ? {-,}1 : b` > > Andrew Pinski writes: > > While looking into PR 112454, I found the cost

Re: Ping: [PATCH] Fix PR112419

2023-11-27 Thread Martin Uecker
Am Montag, dem 27.11.2023 um 16:54 +0100 schrieb Martin Uecker: > Am Montag, dem 27.11.2023 um 08:36 -0700 schrieb Jeff Law: > > > > On 11/23/23 10:05, Hans-Peter Nilsson wrote: > > > > From: Hans-Peter Nilsson > > > > Date: Thu, 16 Nov 2023 05:24:06 +0100 > > > > > > > > > From: Martin Uecker

Re: [PATCH] c++: Implement P2582R1, CTAD from inherited constructors

2023-11-27 Thread Patrick Palka
On Fri, 24 Nov 2023, Patrick Palka wrote: > On Wed, 22 Nov 2023, Patrick Palka wrote: > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for > > trunk? > > > > -- >8 -- > > > > This patch implements C++23 class template argument deduction from > > inherited

Re: [PATCH v2 3/7] aarch64: Add eh_return compile tests

2023-11-27 Thread Szabolcs Nagy
The 11/26/2023 14:37, Richard Sandiford wrote: > Szabolcs Nagy writes: > > +++ b/gcc/testsuite/gcc.target/aarch64/eh_return-3.c > > @@ -0,0 +1,30 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O2 -mbranch-protection=pac-ret+leaf" } */ > > Probably best to add -fno-schedule-insns

Re: Ping: [PATCH] Fix PR112419

2023-11-27 Thread Martin Uecker
Am Montag, dem 27.11.2023 um 08:36 -0700 schrieb Jeff Law: > > On 11/23/23 10:05, Hans-Peter Nilsson wrote: > > > From: Hans-Peter Nilsson > > > Date: Thu, 16 Nov 2023 05:24:06 +0100 > > > > > > > From: Martin Uecker > > > > Date: Tue, 07 Nov 2023 06:56:25 +0100 > > > > > > > Am Montag, dem

Re: GCC/Rust libgrust-v2/to-submit branch

2023-11-27 Thread Thomas Schwinge
Hi! On 2023-11-21T16:20:22+0100, Arthur Cohen wrote: > A newer version of the library has been force-pushed to the branch > `libgrust-v2/to-submit`. > On 11/20/23 15:55, Thomas Schwinge wrote: >> Arthur and Pierre-Emmanuel have prepared a GCC/Rust libgrust-v2/to-submit >> branch:

Re: [PATCH][RFC] middle-end/110237 - wrong MEM_ATTRs for partial loads/stores

2023-11-27 Thread Jeff Law
On 11/27/23 05:39, Robin Dapp wrote: The easiest way to avoid running into the alias analysis problem is to scrap the MEM_EXPR when we expand the internal functions for partial loads/stores. That avoids the disambiguation we run into which is realizing that we store to an object of less size

Re: [PATCH] aarch64: Improve cost of `a ? {-,}1 : b`

2023-11-27 Thread Richard Sandiford
Richard Sandiford writes: > Andrew Pinski writes: >> While looking into PR 112454, I found the cost for >> `(if_then_else (cmp) (const_int 1) (reg))` was being recorded as 8 >> (or `COSTS_N_INSNS (2)`) but it should have been 4 (or `COSTS_N_INSNS (1)`). >> This improves the cost by not adding

Re: Ping: [PATCH] Fix PR112419

2023-11-27 Thread Jeff Law
On 11/23/23 10:05, Hans-Peter Nilsson wrote: From: Hans-Peter Nilsson Date: Thu, 16 Nov 2023 05:24:06 +0100 From: Martin Uecker Date: Tue, 07 Nov 2023 06:56:25 +0100 Am Montag, dem 06.11.2023 um 21:01 -0700 schrieb Jeff Law: On 11/6/23 20:58, Hans-Peter Nilsson wrote: This patch

Re: [PATCH] aarch64: Improve cost of `a ? {-,}1 : b`

2023-11-27 Thread Richard Sandiford
Andrew Pinski writes: > While looking into PR 112454, I found the cost for > `(if_then_else (cmp) (const_int 1) (reg))` was being recorded as 8 > (or `COSTS_N_INSNS (2)`) but it should have been 4 (or `COSTS_N_INSNS (1)`). > This improves the cost by not adding the cost of `(const_int 1)` to >
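
For readers following the numbers in the preview above: GCC's rtl.h defines COSTS_N_INSNS (N) as ((N) * 4), which is why a cost of 8 corresponds to COSTS_N_INSNS (2) and a cost of 4 to COSTS_N_INSNS (1). The sketch below only illustrates that arithmetic. The function model_select_cost and its parameter are hypothetical names made up for this note, and the assumption that the constant was charged exactly one instruction is inferred from the 8-versus-4 figures, not taken from the aarch64 backend.

```c
#include <assert.h>

/* Same definition as in GCC's rtl.h: the cost of N "fast" instructions.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical model, NOT the aarch64 rtx_costs code: the cost of the
   conditional select itself, plus an optional extra charge for the
   (const_int 1) operand.  The size of that extra charge is an assumption
   inferred from the figures quoted in the message.  */
int
model_select_cost (int charge_for_constant)
{
  int cost = COSTS_N_INSNS (1);      /* the select instruction */
  if (charge_for_constant)
    cost += COSTS_N_INSNS (1);       /* assumed charge for (const_int 1) */
  return cost;
}
```

Under these assumptions, model_select_cost (1) gives the old figure of 8 and model_select_cost (0) gives the figure of 4 that the message says should have been recorded.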

Re: PR111754

2023-11-27 Thread Prathamesh Kulkarni
On Fri, 24 Nov 2023 at 03:13, Richard Sandiford wrote: > > Prathamesh Kulkarni writes: > > On Thu, 26 Oct 2023 at 09:43, Prathamesh Kulkarni > > wrote: > >> > >> On Thu, 26 Oct 2023 at 04:09, Richard Sandiford > >> wrote: > >> > > >> > Prathamesh Kulkarni writes: > >> > > On Wed, 25 Oct 2023

Re: [patch] OpenMP: Add -Wopenmp and use it

2023-11-27 Thread Christophe Lyon
On Mon, 27 Nov 2023 at 11:33, Tobias Burnus wrote: > > Hi, > > On 27.11.23 11:20, Christophe Lyon wrote: > > > I think the lack of final '.' in: > > Indeed - but you are lagging a bit behind: > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638128.html > > [committed] c-family/c.opt

Re: [PATCH 0/3] [GCC] arm: vld1q_types_xN ACLE intrinsics

2023-11-27 Thread Richard Earnshaw
On 06/10/2023 10:49, ezra.sito...@arm.com wrote: Add xN variants of vld1q_types intrinsic. These patches are all OK, but please fix commit message formatting in line with the comments on the earlier series. R.

Re: [PATCH 0/3] [GCC] arm: vld1_types_xN ACLE intrinsics

2023-11-27 Thread Richard Earnshaw
On 19/10/2023 14:41, ezra.sito...@arm.com wrote: Add xN variants of vld1_types intrinsic for AArch32. These patches are all OK, but please fix the commit message formatting as with earlier series. R.

Re: [PATCH 3/3] [GCC] arm: vst1q_types_x4 ACLE intrinsics

2023-11-27 Thread Richard Earnshaw
On 10/10/2023 15:04, ezra.sito...@arm.com wrote: From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1q intrinsic for AArch32. This patch adds the _x4 variants of the vst1q intrinsic. OK, but see earlier comments about formatting. R.

Re: [PATCH 2/3] [GCC] arm: vst1q_types_x3 ACLE intrinsics

2023-11-27 Thread Richard Earnshaw
On 10/10/2023 15:04, ezra.sito...@arm.com wrote: From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1q intrinsic for AArch32. This patch adds the _x3 variants of the vst1q intrinsic. OK, but format lines to <= 70 columns please. R.

Re: [PATCH 1/3] [GCC] arm: vst1q_types_x2 ACLE intrinsics

2023-11-27 Thread Richard Earnshaw
On 10/10/2023 15:04, ezra.sito...@arm.com wrote: From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1q intrinsic for AArch32. This patch adds the _x2 variants of the vst1q intrinsic. Tests use xN so that the latter variants (_x3, _x4)
