Re: [PATCH] Bug 111071: fix the subr with -1 to not due to the simplify.

2023-09-04 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > "yanzhang.wang--- via Gcc-patches" writes: >> From: Yanzhang Wang >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/aarch64/sve/acle/asm/subr_s8.c: Modify subr with -1 >> to not. >> >> Signed-off-by: Yanzhang Wang >> --- >> >> Tested on my local arm

Re: [PATCH] testsuite: aarch64: Adjust SVE ACLE tests to new generated code

2023-09-04 Thread Richard Sandiford via Gcc-patches
Thiago Jung Bauermann via Gcc-patches writes: > Since commit e7a36e4715c7 "[PATCH] RISC-V: Support simplify (-1-x) for > vector." these tests fail on aarch64-linux: > > === g++ tests === > > Running g++:g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ... > FAIL:
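The "(-1-x)" simplification named in that commit title rests on the two's-complement identity -1 - x == ~x, which would explain why a reverse subtract from -1 can be emitted as a NOT. A minimal sketch follows, assuming 8-bit signed elements as in the subr_s8 test; the helper names subr_s8, not_s8 and identity_holds_for_all_s8 are hypothetical and are not GCC or ACLE functions.

```cpp
#include <cassert>  // for the checks in the usage note
#include <cstdint>

// Reverse subtraction in the SUBR sense: immediate minus operand.
// (Hypothetical helper, not an ACLE intrinsic.)
constexpr std::int8_t subr_s8(std::int8_t x, std::int8_t imm) {
    return static_cast<std::int8_t>(imm - x);
}

// Bitwise NOT on the same element type.
constexpr std::int8_t not_s8(std::int8_t x) {
    return static_cast<std::int8_t>(~x);
}

// Exhaustively compare the two forms over every 8-bit value.
inline bool identity_holds_for_all_s8() {
    for (int v = -128; v <= 127; ++v) {
        const std::int8_t x = static_cast<std::int8_t>(v);
        if (subr_s8(x, -1) != not_s8(x)) return false;
    }
    return true;
}
```

Calling assert(identity_holds_for_all_s8()) from a test driver checks all 256 inputs; the sketch says nothing about which instruction a given GCC revision actually selects.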

Re: [PATCH v3] mklog: handle Signed-off-by, minor cleanup

2023-09-04 Thread Richard Sandiford via Gcc-patches
Marc Poulhiès via Gcc-patches writes: > Richard Sandiford via Gcc-patches writes: >>> +# this regex matches the first line of the "end" in the initial commit >>> message >>> +FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by|co-authored-by|#): >

Re: [PATCH] testsuite: Remove unwanted 'dg-do run' from gcc.dg/vect tests

2023-09-04 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > Tests under gcc.dg/vect use check_vect_support_and_set_flags to set > compilation flags as appropriate for the target, but they also set > dg-do-what-default to 'run' or 'compile', depending on the actual > target hardware (or simulator) capabilities. > >

Re: [PATCH] Bug 111071: fix the subr with -1 to not due to the simplify.

2023-09-04 Thread Richard Sandiford via Gcc-patches
"yanzhang.wang--- via Gcc-patches" writes: > From: Yanzhang Wang > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/sve/acle/asm/subr_s8.c: Modify subr with -1 > to not. > > Signed-off-by: Yanzhang Wang > --- > > Tested on my local arm environment and passed. Thanks Andrew Pinski's

Re: [PATCH]AArch64 xorsign: Fix scalar xorsign lowering

2023-09-01 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, September 1, 2023 2:36 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; Kyrylo Tkachov >> Subject: Re: [PATCH]AArch64 xorsign: Fix scalar

Re: [PATCH]AArch64 xorsign: Fix scalar xorsign lowering

2023-09-01 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > In GCC-9 our scalar xorsign pattern broke and we didn't notice it because the > testcase was not strong enough. With this commit > > 8d2d39587d941a40f25ea0144cceb677df115040 is the first bad commit > commit 8d2d39587d941a40f25ea0144cceb677df115040 > Author:

Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-09-01 Thread Richard Sandiford via Gcc-patches
Uros Bizjak via Gcc-patches writes: > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > wrote: >> >> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: >> > From: Kong Lingling >> > >> > In inline asm, we do not know if the insn can use EGPR, so disable

Re: [PATCH] expmed: Allow extract_bit_field via mem for low-precision modes.

2023-09-01 Thread Richard Sandiford via Gcc-patches
Robin Dapp via Gcc-patches writes: >> It's not just a question of which byte though. It's also a question >> of which bit. >> >> One option would be to code-generate for even X and for odd X, and select >> between them at runtime. But that doesn't scale well to 2+2X and 1+1X. >> >> Otherwise

[PATCH] lra: Avoid unfolded plus-0

2023-08-31 Thread Richard Sandiford via Gcc-patches
While backporting another patch to an earlier release, I hit a situation in which lra_eliminate_regs_1 would eliminate an address to: (plus (reg:P R) (const_int 0)) This address compared not-equal to plain: (reg:P R) which caused an ICE in a later peephole2. (The ICE showed up in

[PATCH] aarch64: Fix return register handling in untyped_call

2023-08-31 Thread Richard Sandiford via Gcc-patches
While working on another patch, I hit a problem with the aarch64 expansion of untyped_call. The expander emits the usual: (set (mem ...) (reg resN)) instructions to store the result registers to memory, but it didn't say in RTL where those resN results came from. This eventually led to a

Re: Question about dynamic choosing vectorization factor for RVV

2023-08-31 Thread Richard Sandiford via Gcc
"juzhe.zh...@rivai.ai" writes: > Thanks Richi. > > I am trying to figure out how to adjust finish_cost to lower the LMUL > > For example: > > void > foo (int32_t *__restrict a, int32_t *__restrict b, int n) > { > for (int i = 0; i < n; i++) > a[i] = a[i] + b[i]; > } > > preferred_simd_mode

Re: [PATCH] expmed: Allow extract_bit_field via mem for low-precision modes.

2023-08-30 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: >> But in the VLA case, doesn't it instead have precision 4+4X? >> The problem then is that we can't tell at compile time which >> byte that corresponds to. So... > > Yes 4 + 4x. I keep getting confused with poly modes :) > In this case we want to extract the bitnum [3 4] = 3

RE: [PATCH] expmed: Allow extract_bit_field via mem for low-precision modes.

2023-08-30 Thread Richard Sandiford via Gcc-patches
[Sorry for any weird MUA issues, don't have access to my usual set-up.] > when looking at a riscv ICE in vect-live-6.c I noticed that we > assume that the variable part (coeffs[1] * x1) of the to-be-extracted > bit number in extract_bit_field_1 is a multiple of BITS_PER_UNIT. > > This means that

[PATCH] attribs: Use existing traits for excl_hash_traits

2023-08-29 Thread Richard Sandiford via Gcc-patches
excl_hash_traits can be defined more simply by reusing existing traits. Tested on aarch64-linux-gnu. OK to install? Richard gcc/ * attribs.cc (excl_hash_traits): Delete. (test_attribute_exclusions): Use pair_hash and nofree_string_hash instead. --- gcc/attribs.cc | 45

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-08-29 Thread Richard Sandiford via Gcc-patches
Jeff Law writes: > On 8/24/23 08:06, Robin Dapp via Gcc-patches wrote: >> Ping. I refined the code and some comments a bit and added a test >> case. >> >> My question in general would still be: Is this something we want >> given that we potentially move some of combine's work a bit towards >>

Re: [RFC] > WIDE_INT_MAX_PREC support in wide-int

2023-08-29 Thread Richard Sandiford via Gcc-patches
Just some off-the-cuff thoughts. Might think differently when I've had more time... Richard Biener writes: > On Mon, 28 Aug 2023, Jakub Jelinek wrote: > >> Hi! >> >> While the _BitInt series isn't committed yet, I had a quick look at >> lifting the current lowest limitation on maximum _BitInt

Re: [PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-24 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, Richard and Richi. > > Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. > It's supported in tree-ssa-math-opts.cc. However, GCC failed to support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. > > Consider this following case: > #define

Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns

2023-08-24 Thread Richard Sandiford via Gcc-patches
Jeff Law writes: > On 8/22/23 02:08, juzhe.zh...@rivai.ai wrote: >> Yes, I agree long-term we want every-thing be optimized as early as >> possible. >> >> However, IMHO, it's impossible we can support every conditional patterns >> in the middle-end (match.pd). >> It's a really big number. >>

Re: [PATCH 03/11] aarch64: Use br instead of ret for eh_return

2023-08-24 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Rather than hiding this in target code, perhaps we should add a > target-independent concept of an "eh_return taken" flag, say > EH_RETURN_TAKEN_RTX. > > We could define it so that, on targets that define EH_RETURN_TAKEN_RTX, > a register EH_RETURN_STACKADJ_RTX and a

Re: [PATCH] tree-optimization/111115 - SLP of masked stores

2023-08-24 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following adds the capability to do SLP on .MASK_STORE, I do not > plan to add interleaving support. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? LGTM, thanks. Richard > Thanks, > Richard. > > PR tree-optimization/15 > gcc/ > *

[PATCH] aarch64: Account for different Advanced SIMD fusing options

2023-08-24 Thread Richard Sandiford via Gcc-patches
The scalar FNMADD/FNMSUB and SVE FNMLA/FNMLS instructions mean that either side of a subtraction can start an accumulator chain. However, Advanced SIMD doesn't have an equivalent instruction. This means that, for Advanced SIMD, a subtraction can only be fused if the second operand is a

Re: [PATCH] AArch64: Fix MOPS memmove operand corruption [PR111121]

2023-08-23 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > Hi Richard, > > (that's quick!) > >> + if (size > max_copy_size || size > max_mops_size) >> +return aarch64_expand_cpymem_mops (operands, is_memmove); >> >> Could you explain this a bit more? If I've followed the logic correctly, >> max_copy_size will always be 0

Re: [PATCH] rtl: Forward declare rtx_code

2023-08-23 Thread Richard Sandiford via Gcc-patches
"Richard Earnshaw (lists)" writes: > On 23/08/2023 16:49, Richard Sandiford via Gcc-patches wrote: >> Richard Earnshaw via Gcc-patches writes: >>> Now that we require C++ 11, we can safely forward declare rtx_code >>> so that we can use it i

Re: [PATCH] rtl: use rtx_code for gen_ccmp_first and gen_ccmp_next

2023-08-23 Thread Richard Sandiford via Gcc-patches
Richard Earnshaw via Gcc-patches writes: > Note, this patch is dependent on the patch I posted yesterday to > forward declare rtx_code in coretypes.h. > > -- > Now that we have a forward declaration of rtx_code in coretypes.h, we > can adjust these hooks to take rtx_code arguments rather than

Re: [PATCH] rtl: Forward declare rtx_code

2023-08-23 Thread Richard Sandiford via Gcc-patches
Richard Earnshaw via Gcc-patches writes: > Now that we require C++ 11, we can safely forward declare rtx_code > so that we can use it in target hooks. > > gcc/ChangeLog > * coretypes.h (rtx_code): Add forward declaration. > * rtl.h (rtx_code): Make compatible with forward declaration.
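The C++11 feature this relies on is the opaque enum declaration: an enum may be declared without its enumerators as long as its underlying type is fixed. A minimal sketch of the pattern, with both "headers" collapsed into one file; the names demo_code, DEMO_PLUS and is_first_code and the choice of unsigned int are assumptions made for illustration, not the declarations in coretypes.h or rtl.h.

```cpp
#include <cassert>  // for the checks in the usage note

// "coretypes.h"-style part: opaque declaration. C++11 permits this only
// because the underlying type is stated explicitly.
enum demo_code : unsigned int;

// A hook-like prototype can take the enum by value before any enumerator
// is visible, because the type is already complete.
inline bool is_first_code(demo_code code);

// "rtl.h"-style part: the full definition must repeat the same underlying
// type to stay compatible with the forward declaration.
enum demo_code : unsigned int { DEMO_PLUS, DEMO_MINUS, DEMO_LAST };

inline bool is_first_code(demo_code code) { return code == DEMO_PLUS; }
```

Code that only passes the value around needs just the first declaration; code that names enumerators needs the full definition.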

Re: [PATCH] AArch64: Fix MOPS memmove operand corruption [PR111121]

2023-08-23 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > A MOPS memmove may corrupt registers since there is no copy of the input > operands to temporary > registers. Fix this by calling aarch64_expand_cpymem which does this. Also > fix an issue with > STRICT_ALIGNMENT being ignored if TARGET_MOPS is true, and avoid

Re: [PATCH 03/11] aarch64: Use br instead of ret for eh_return

2023-08-23 Thread Richard Sandiford via Gcc-patches
Szabolcs Nagy writes: > The expected way to handle eh_return is to pass the stack adjustment > offset and landing pad address via > > EH_RETURN_STACKADJ_RTX > EH_RETURN_HANDLER_RTX > > to the epilogue that is shared between normal return paths and the > eh_return paths. EH_RETURN_HANDLER_RTX

Re: [PATCH v2] mklog: handle Signed-Off-By, minor cleanup

2023-08-22 Thread Richard Sandiford via Gcc-patches
Marc Poulhiès via Gcc-patches writes: > Consider Signed-Off-By lines as part of the ending of the initial > commit to avoid having these in the middle of the log when the > changelog part is injected after. > > This is particularly useful with: > > $ git gcc-commit-mklog --amend -s > > that can

Re: [PATCH] Remove XFAIL from gcc/testsuite/gcc.dg/unroll-7.c

2023-08-21 Thread Richard Sandiford via Gcc-patches
Thiago Jung Bauermann via Gcc-patches writes: > This test passes since commit e41103081bfa "Fix undefined behaviour in > profile_count::differs_from_p", so remove the xfail annotation. > > Tested on aarch64-linux-gnu, armv8l-linux-gnueabihf and x86_64-linux-gnu. > > gcc/testsuite/ChangeLog: >

Re: [PATCH] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-21 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 16 Aug 2023, Juzhe-Zhong wrote: > >> Hi, Richard and Richi. >> >> Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. >> It's supported in tree-ssa-math-opts.cc. However, GCC failed to support >> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.

Re: [PATCH] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-21 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, Richard and Richi. > > Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. > It's supported in tree-ssa-math-opts.cc. However, GCC failed to support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. > > Consider this following case: > #define

Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm

2023-08-21 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Mon, 21 Aug 2023 at 12:26, Richard Biener wrote: >> >> On Sat, 19 Aug 2023, Prathamesh Kulkarni wrote: >> >> > On Fri, 18 Aug 2023 at 14:52, Richard Biener wrote: >> > > >> > > On Fri, 18 Aug 2023, Richard Sandiford wrote: >> > > >> > > > Richard Biener writes:

Re: [PATCH] c: Add support for [[__extension__ ...]]

2023-08-18 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Joseph Myers writes: >> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote: >> >>> Would it be OK to add support for: >>> >>> [[__extension__ ...]] >>> >>> to suppress the pedwarn about

Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm

2023-08-18 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following avoids running into somehow flawed logic in fold_vec_perm > for non-VLA vectors. > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > Richard. > > PR tree-optimization/111048 > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA >

Re: [PATCH] c: Add support for [[__extension__ ...]]

2023-08-17 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> Am 17.08.2023 um 13:25 schrieb Richard Sandiford via Gcc-patches >> : >> >> Joseph Myers writes: >>>> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote: >>>> >>>> Would it be OK to add support f

[PATCH] c: Add support for [[__extension__ ...]]

2023-08-17 Thread Richard Sandiford via Gcc-patches
Joseph Myers writes: > On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote: > >> Would it be OK to add support for: >> >> [[__extension__ ...]] >> >> to suppress the pedwarn about using [[]] prior to C2X? Then we can > > That seem

Re: [PATCH] doc: Fixes to RTL-SSA sample code

2023-08-17 Thread Richard Sandiford via Gcc-patches
Alex Coplan writes: > Hi, > > This patch fixes up the code examples in the RTL-SSA documentation (the > sections on making insn changes) to reflect the current API. > > The main issues are as follows: > - rtl_ssa::recog takes an obstack_watermark & as the first parameter. >Presumably this is

Re: [WIP RFC] Add support for keyword-based attributes

2023-08-16 Thread Richard Sandiford via Gcc-patches
Joseph Myers writes: > On Mon, 17 Jul 2023, Michael Matz via Gcc-patches wrote: > >> So, essentially you want unignorable attributes, right? Then implement >> exactly that: add one new keyword "__known_attribute__" (invent a better >> name, maybe :) ), semantics exactly as with __attribute__

Re: [PATCH v2][GCC] aarch64: Add support for Cortex-A720 CPU

2023-08-16 Thread Richard Sandiford via Gcc-patches
Richard Ball writes: > v2: Add missing PROFILE feature flag. > > This patch adds support for the Cortex-A720 CPU to GCC. > > No regressions on aarch64-none-elf. > > Ok for master? > > gcc/ChangeLog: > > * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Cortex- > A720 CPU. >

Re: [PATCH] IFN: Fix vector extraction into promoted subreg.

2023-08-16 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: >> However: >> >> | #define vec_extract_direct { 3, 3, false } >> >> This looks wrong. The numbers are argument numbers (or -1 for a return >> value). vec_extract only takes 2 arguments, so 3 looks to be out-of-range. >> >> | #define direct_vec_extract_optab_supported_p

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-16 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: >> Unfortunately, the patch regressed following tests on ppc64le and >> armhf respectively: >> gcc.target/powerpc/vec-perm-ctor.c scan-tree-dump-not optimized >> "VIEW_CONVERT_EXPR" >> gcc.dg/tree-ssa/forwprop-20.c scan-tree-dump-not forwprop1 "VEC_PERM_EXPR" >> >>

Re: [PATCH] IFN: Fix vector extraction into promoted subreg.

2023-08-16 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Robin, Richard and Richi. > > I am wondering whether we can just simply replace the VEC_EXTRACT expander > with binary? > > Like this :? > > DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW, > - vec_extract, vec_extract) > +

Re: [PATCH] Handle TYPE_OVERFLOW_UNDEFINED vectorized BB reductions

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following changes the gate to perform vectorization of BB reductions > to use needs_fold_left_reduction_p which in turn requires handling > TYPE_OVERFLOW_UNDEFINED types in the epilogue code generation by > promoting any operations generated there to use unsigned

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> OK, fair enough. So the idea is: see where we end up and then try to >> improve/factor the APIs in a less peephole way? > > Yeah, I think that's the only good way forward. OK, no objection from me. Sorry for holding the patch up. Richard

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 15 Aug 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Tue, 15 Aug 2023, Kewen.Lin wrote: >> > >> >> Hi Stefan, >> >> >> >> on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote: >> >> > Hi everyone, >> >> > >> >> > I have bootstrapped and

Re: [PATCH][RFC] tree-optimization/92335 - Improve sinking heuristics for vectorization

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Mon, 14 Aug 2023, Prathamesh Kulkarni wrote: >> On Mon, 7 Aug 2023 at 13:19, Richard Biener >> wrote: >> > It doesn't seem to make a difference for x86. That said, the "fix" is >> > probably sticking the correct target on the dump-check, it seems >> > that

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 15 Aug 2023, Kewen.Lin wrote: > >> Hi Stefan, >> >> on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote: >> > Hi everyone, >> > >> > I have bootstrapped and regtested the patch below on s390. For the >> > 64-bit target I do not see any changes regarding the

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, Aug 15, 2023 at 4:44 AM Kewen.Lin wrote: >> >> on 2023/8/14 22:16, Richard Sandiford wrote: >> > No, it was more that 219-142=77, so it seems like a lot of lines >> > are being duplicated rather than simply being moved. (Unlike for >> > VMAT_LOAD_STORE_LANES,

Re: [PATCH] Add support for vector conitional not

2023-08-14 Thread Richard Sandiford via Gcc-patches
Andrew Pinski via Gcc-patches writes: > Like the support conditional neg (r12-4470-g20dcda98ed376cb61c74b2c71), > this just adds conditional not too. > Also we should be able to turn `(a ? -1 : 0) ^ b` into a conditional > not. > > OK? Bootstrapped and tested on x86_64-linux-gnu and
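The rewrite mentioned in that message, `(a ? -1 : 0) ^ b` into a conditional not, works because XOR with all-ones inverts every bit and XOR with zero changes nothing. A scalar sketch of the equivalence, assuming 32-bit integers; xor_with_mask and cond_not are hypothetical names, and the vector form in the patch is not reproduced here.

```cpp
#include <cassert>  // for the checks in the usage note
#include <cstdint>

// The masked form from the message: XOR with an all-ones or all-zeros mask.
constexpr std::int32_t xor_with_mask(bool a, std::int32_t b) {
    return (a ? -1 : 0) ^ b;
}

// The conditional-not form: invert b only where the condition holds.
constexpr std::int32_t cond_not(bool a, std::int32_t b) {
    return a ? ~b : b;
}

// Compile-time spot checks of the equivalence.
static_assert(xor_with_mask(true, 0x1234) == cond_not(true, 0x1234), "mask set");
static_assert(xor_with_mask(false, 0x1234) == cond_not(false, 0x1234), "mask clear");
```

The same reasoning applies per lane when a is a vector mask.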

Re: [RFC] GCC Security policy

2023-08-14 Thread Richard Sandiford via Gcc-patches
I think it would help to clarify what the aim of the security policy is. Specifically: (1) What service do we want to provide to users by classifying one thing as a security bug and another thing as not a security bug? (2) What service do we want to provide to the GNU community by the same

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-14 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > Hi Richard, > > on 2023/8/14 20:20, Richard Sandiford wrote: >> Thanks for the clean-ups. But... >> >> "Kewen.Lin" writes: >>> Hi, >>> >>> Following Richi's suggestion [1], this patch is to move the >>> handlings on VMAT_GATHER_SCATTER in the final loop nest >>> of

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-14 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Thu, 10 Aug 2023 at 21:27, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> >> static bool >> >> is_simple_vla_size (poly_uint64 size) >> >> { >> >> if (size.is_constant ()) >> >> return false; >> >> for (int i = 1; i < ARRAY_SIZE

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-14 Thread Richard Sandiford via Gcc-patches
Thanks for the clean-ups. But... "Kewen.Lin" writes: > Hi, > > Following Richi's suggestion [1], this patch is to move the > handlings on VMAT_GATHER_SCATTER in the final loop nest > of function vectorizable_load to its own loop. Basically > it duplicates the final loop nest, clean up some

Re: [PATCH] genrecog: Add SUBREG_BYTE.to_constant check to the genrecog

2023-08-14 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, there is genrecog issue happens in RISC-V backend. > > This is the ICE info: > > 0xfa3ba4 poly_int_pod<2u, unsigned short>::to_constant() const > ../../../riscv-gcc/gcc/poly-int.h:504 > 0x28eaa91 recog_5 >

Re: [PATCH V3] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-11 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Fri, 11 Aug 2023, juzhe.zh...@rivai.ai wrote: > >> Hi, Richi. >> >> > 1. Target is using loop MASK as the partial vector loop control. >> >> I don't think it checks for this? >> >> I am not sure whether I understand EXTRACT_LAST correctly. >> But if target doesn't

Re: [PATCH] tree-optimization/110979 - fold-left reduction and partial vectors

2023-08-11 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > When we vectorize fold-left reductions with partial vectors but > no target operation available we use a vector conditional to force > excess elements to zero. But that doesn't correctly preserve > the sign of zero. The following patch disables partial vector > support
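The sign-of-zero problem described there can be reproduced with scalars: under the default rounding mode, -0.0 + 0.0 is +0.0, so padding inactive lanes with +0.0 can change the sign of a sum that should be -0.0. A minimal sketch, assuming IEEE 754 doubles and no -ffast-math; fold_left_sum and its parameters are hypothetical, and the -0.0 fill is shown only for contrast, since the patch itself disables partial vector support rather than changing the fill value.

```cpp
#include <cassert>  // for the checks in the usage note
#include <cmath>

// In-order (fold-left) sum over `lanes` elements. Lanes at index >= active
// are treated as inactive and replaced by `fill`, mimicking a vector
// conditional that forces excess elements to a fixed value.
inline double fold_left_sum(double init, const double *v, int lanes,
                            int active, double fill) {
    double acc = init;
    for (int i = 0; i < lanes; ++i)
        acc += (i < active) ? v[i] : fill;
    return acc;
}
```

With init = -0.0 and a single active element -0.0, the exact result is -0.0; filling three inactive lanes with +0.0 yields +0.0, whereas filling them with -0.0 keeps the sign, which std::signbit can confirm.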

Re: [RFC] GCC Security policy

2023-08-10 Thread Richard Sandiford via Gcc-patches
Siddhesh Poyarekar writes: > On 2023-08-08 10:30, Siddhesh Poyarekar wrote: >>> Do you have a suggestion for the language to address libgcc, >>> libstdc++, etc. and libiberty, libbacktrace, etc.? >> >> I'll work on this a bit and share a draft. > > Hi David, > > Here's what I came up with for

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-10 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: >> static bool >> is_simple_vla_size (poly_uint64 size) >> { >> if (size.is_constant ()) >> return false; >> for (int i = 1; i < ARRAY_SIZE (size.coeffs); ++i) >> if (size[i] != (i <= 1 ? size[0] : 0)) > Just wondering is this should be (i == 1 ? size[0] :

Re: [PATCH] VR-VALUES: Simplify comparison using range pairs

2023-08-10 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, Aug 10, 2023 at 3:44 PM Richard Sandiford > wrote: >> >> Richard Biener via Gcc-patches writes: >> > On Wed, Aug 9, 2023 at 6:16 PM Andrew Pinski via Gcc-patches >> > wrote: >> >> >> >> If `A` has a range of `[0,0][100,INF]` and the comparison >> >> of `A <

Re: [PATCH] VR-VALUES: Simplify comparison using range pairs

2023-08-10 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Wed, Aug 9, 2023 at 6:16 PM Andrew Pinski via Gcc-patches > wrote: >> >> If `A` has a range of `[0,0][100,INF]` and the comparison >> of `A < 50`. This should be optimized to `A <= 0` (which then >> will be optimized to just `A == 0`). >> This patch
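The range argument quoted above can be checked directly: if A only takes values in [0,0][100,INF], then A < 50, A <= 0 and A == 0 are all true for exactly the same values. A small sketch, modelling INF as INT_MAX; in_range and comparisons_agree are hypothetical helpers and do not correspond to anything in vr-values.

```cpp
#include <cassert>  // for the checks in the usage note
#include <climits>

// Membership test for the two-pair range [0,0][100,INF] from the message,
// with INF modelled as INT_MAX.
constexpr bool in_range(int a) { return a == 0 || a >= 100; }

// True when the original comparison and both simplified forms agree for a.
constexpr bool comparisons_agree(int a) {
    return (a < 50) == (a <= 0) && (a <= 0) == (a == 0);
}
```

Inside the range the three comparisons always agree; outside it they need not, for example a = 10 satisfies a < 50 but not a <= 0, which is why the simplification depends on the range information.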

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-10 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Wed, Aug 09, 2023 at 06:27:20PM +0100, Richard Sandiford wrote: >> Jakub Jelinek writes: >> > On Wed, Aug 09, 2023 at 05:55:28PM +0100, Richard Sandiford wrote: >> >> Jakub: do you remember what the reason was? I don't mind dropping >> >> "function", but it feels

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-09 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Wed, Aug 09, 2023 at 05:55:28PM +0100, Richard Sandiford wrote: >> Jakub: do you remember what the reason was? I don't mind dropping >> "function", but it feels weird to drop the quotes around "simd". >> Seems like, if we do that, there'll one day be a patch to add >>

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-09 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Here is my new version, see inline response to your comments. > > New cover letter: > > This patch enables the use of mixed-types for simd clones for AArch64, > adds aarch64 as a target_vect_simd_clones and corrects the way the > simdlen is chosen for

Re: [PATCH] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-09 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richi. > >>> that should be > >>> || (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo) >>> && !LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo)) > >>> I think. It seems to imply that SLP isn't supported with >>> masking/lengthing. > > Oh, yes. At first glance, the

Re: [PATCH] aarch64: SVE/NEON Bridging intrinsics

2023-08-09 Thread Richard Sandiford via Gcc-patches
Richard Ball writes: > ACLE has added intrinsics to bridge between SVE and Neon. > > The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and > SVE vectors. > > This patch adds support to GCC for the following 3 intrinsics: > svset_neonq, svget_neonq and svdup_neonq > >

Re: [PATCH][GCC] aarch64: Add support for Cortex-A520 CPU

2023-08-08 Thread Richard Sandiford via Gcc-patches
Richard Ball writes: > This patch adds support for the Cortex-A520 CPU to GCC. > > No regressions on aarch64-none-elf. > > Ok for master? > > > gcc/ChangeLog: > > * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add > Cortex-A520 CPU. > * config/aarch64/aarch64-tune.md:

Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-08-08 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Hi, > > This patch enables the use of mixed-types for simd clones for AArch64 > and adds aarch64 as a target_vect_simd_clones. > > Bootstrapped and regression tested on aarch64-unknown-linux-gnu > > gcc/ChangeLog: > > * config/aarch64/aarch64.cc

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-08 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Fri, 4 Aug 2023 at 20:36, Richard Sandiford > wrote: >> >> Full review this time, sorry for skipping the tests earlier. > Thanks for the detailed review! Please find my responses inline below. >> >> Prathamesh Kulkarni writes: >> > diff --git

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-04 Thread Richard Sandiford via Gcc-patches
Full review this time, sorry for skipping the tests earlier. Prathamesh Kulkarni writes: > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > index 7e5494dfd39..680d0e54fd4 100644 > --- a/gcc/fold-const.cc > +++ b/gcc/fold-const.cc > @@ -85,6 +85,10 @@ along with GCC; see the file

Re: [PATCH] tree-optimization/110838 - vectorization of widened right shifts

2023-08-04 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following fixes a problem with my last attempt of avoiding > out-of-bound shift values for vectorized right shifts of widened > operands. Instead of truncating the shift amount with a bitwise > and we actually need to saturate it to the target precision. > > The
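The difference between truncating and saturating the shift amount shows up as soon as the amount reaches the narrow precision. A scalar sketch, assuming a 16-bit value widened to 32 bits, an arithmetic right shift on negative values (guaranteed from C++20, the usual behaviour before that), and saturation to precision minus one; the three helper names are hypothetical and the vectorizer's own code is not shown.

```cpp
#include <algorithm>
#include <cassert>  // for the checks in the usage note
#include <cstdint>

// Reference: widen to 32 bits, shift, then narrow back.
inline std::int16_t widened_shift(std::int16_t x, int amount) {
    return static_cast<std::int16_t>(static_cast<std::int32_t>(x) >> amount);
}

// Narrow shift with the amount saturated to 15, the last in-bounds value
// for a 16-bit operand.
inline std::int16_t saturated_shift(std::int16_t x, int amount) {
    return static_cast<std::int16_t>(x >> std::min(amount, 15));
}

// Narrow shift with the amount truncated by a bitwise AND, for contrast.
inline std::int16_t masked_shift(std::int16_t x, int amount) {
    return static_cast<std::int16_t>(x >> (amount & 15));
}
```

For amounts 0 to 31 the saturated form matches the widened reference, while the masked form goes wrong at 16: widened_shift(1234, 16) is 0 but masked_shift(1234, 16) returns 1234 because 16 & 15 is 0.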

Re: [RFC] Combine zero_extract and sign_extend for TARGET_TRULY_NOOP_TRUNCATION

2023-08-04 Thread Richard Sandiford via Gcc-patches
YunQiang Su writes: > PR #104914 > > On TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true platforms, > zero_extract (SI, SI) can be sign-extended. So, if a zero_extract (DI, > DI) following with an sign_extend(SI, DI) can be merged to a single > zero_extract (SI, SI). > > gcc/ChangeLog: >

Re: [PATCH]AArch64 update costing for MLA by invariant

2023-08-03 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> >> Do you see vect_constant_defs in practice, or is this just for >> >> completeness? >> >> I would expect any constants to appear as direct operands. I don't >> >> mind keeping it if it's just a belt-and-braces thing though. >> > >> > In the latency case where I had

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-03 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Prathamesh Kulkarni writes: >> On Tue, 25 Jul 2023 at 18:25, Richard Sandiford >> wrote: >>> >>> Hi, >>> >>> Thanks for the rework and sorry for the slow review. >> Hi Richard, >> Thanks for the suggestions! Please find my responses inline below. >>> >>> Prathamesh

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-03 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Tue, 25 Jul 2023 at 18:25, Richard Sandiford > wrote: >> >> Hi, >> >> Thanks for the rework and sorry for the slow review. > Hi Richard, > Thanks for the suggestions! Please find my responses inline below. >> >> Prathamesh Kulkarni writes: >> > Hi Richard, >> >

Re: [PATCH]AArch64 Undo vec_widen_shiftl optabs [PR106346]

2023-08-03 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> > + >> > +(define_constraint "D3" >> > + "@internal >> > + A constraint that matches vector of immediates that is with 0 to >> > +(bits(mode)/2)-1." >> > + (and (match_code "const,const_vector") >> > + (match_test "aarch64_const_vec_all_same_in_range_p (op, 0, >>

[PATCH] poly_int: Handle more can_div_trunc_p cases

2023-08-03 Thread Richard Sandiford via Gcc-patches
can_div_trunc_p (a, b, &Q, &r) tries to compute a Q and r that satisfy the usual conditions for truncating division: (1) a = b * Q + r (2) |b * Q| <= |a| (3) |r| < |b| We can compute Q using the constant component (the case when all indeterminates are zero). Since |r| < |b| for the
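For ordinary integers, the three conditions listed in that message are exactly what C++'s built-in / and % provide. A sketch restricted to plain long values with b nonzero and no overflow; conditions_hold and trunc_div_ok are hypothetical helpers, and the poly_int case that the patch actually handles is not modelled.

```cpp
#include <cassert>  // for the checks in the usage note
#include <cstdlib>

// Check conditions (1)-(3) for a candidate quotient q and remainder r.
inline bool conditions_hold(long a, long b, long q, long r) {
    return a == b * q + r                       // (1) a = b * Q + r
        && std::labs(b * q) <= std::labs(a)     // (2) |b * Q| <= |a|
        && std::labs(r) < std::labs(b);         // (3) |r| < |b|
}

// Built-in division truncates toward zero and % takes the sign of a,
// so this pair should always pass for nonzero b.
inline bool trunc_div_ok(long a, long b) {
    return conditions_hold(a, b, a / b, a % b);
}
```

Condition (2) is what separates truncation from flooring: for a = -7 and b = 2 the floor quotient -4 with remainder 1 satisfies (1) and (3) but fails (2), since |2 * -4| is 8.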

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-08-03 Thread Richard Sandiford via Gcc-patches
Hao Liu OS writes: > Hi Richard, > > Update the patch with a simple case (see below case and comments). It shows > a live stmt may not have reduction def, which introduce the ICE. > > Is it OK for trunk? OK, thanks. Richard > > Fix the assertion failure on empty reduction define in

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-02 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > [...] >> >> in vect_determine_precisions_from_range. Maybe we should drop >> >> the shift handling from there and instead rely on >> >> vect_determine_precisions_from_users, extending: >> >> >> >> if (TREE_CODE (shift) != INTEGER_CST >> >> || !wi::ltu_p

Re: [PATCH][gensupport]: Don't segfault on empty attrs list

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > Currently we segfault when len == 0 for an attribute list. > > essentially [cons: =0, 1, 2, 3; attrs: ] segfaults but should be equivalent to > [cons: =0, 1, 2, 3] and [cons: =0, 1, 2, 3; attrs:]. This fixes it by just > returning early and leaving it to the

Re: [PATCH]AArch64 Undo vec_widen_shiftl optabs [PR106346]

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > In GCC 11 we implemented the vectorizer optab for widening left shifts, > however this optab is only supported for uniform shift constants. > > At the moment GCC still has two loop vectorization strategies (classical loop > and > SLP based loop vec) and the

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-02 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 1 Aug 2023, Richard Sandiford wrote: > >> Richard Sandiford writes: >> > Richard Biener via Gcc-patches writes: >> >> The following makes sure to limit the shift operand when vectorizing >> >> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift

Re: [PATCH]AArch64 update costing for MLA by invariant

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> Tamar Christina writes: >> > Hi All, >> > >> > When determining issue rates we currently discount non-constant MLA >> > accumulators for Advanced SIMD but don't do it for the latency. >> > >> > This means the costs for Advanced SIMD with a constant accumulator are >> >

Re: [PATCH]AArch64 update costing for combining vector conditionals

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > boolean comparisons have different cost depending on the mode. e.g. > a && b when predicated doesn't require an addition instruction, the AND is > free Nit (for the commit msg): additional Maybe: for SVE, a && b doesn't require an additional instruction

Re: [PATCH]AArch64 update costing for MLA by invariant

2023-08-02 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > When determining issue rates we currently discount non-constant MLA > accumulators > for Advanced SIMD but don't do it for the latency. > > This means the costs for Advanced SIMD with a constant accumulator are wrong > and > results in us costing SVE and

Re: [PATCH 2/5] [RISC-V] Generate Zicond instruction for basic semantics

2023-08-02 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 8/1/23 05:18, Richard Sandiford wrote: >> >> Where were you seeing the requirement for pointer equality? genrecog.cc >> at least uses rtx_equal_p, and I think it has to. E.g. some patterns >> use (match_dup ...) to match output and input mems, and mem

Re: [PATCH 2/5] [RISC-V] Generate Zicond instruction for basic semantics

2023-08-01 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 7/19/23 04:11, Xiao Zeng wrote: >> This patch completes the recognition of the basic semantics >> defined in the spec, namely: >> >> Conditional zero, if condition is equal to zero >>rd = (rs2 == 0) ? 0 : rs1 >> Conditional zero, if condition is non zero
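The conditional-zero semantics quoted above can be modelled in a few lines of C. This is only a sketch: the snippet is cut off before the second formula, so `czero_nez` below follows my reading of the Zicond specification (the mirror image of the first form) rather than the quoted text, and `select_via_zicond` is a hypothetical helper showing one common way the two operations combine into a conditional select.

```c
#include <assert.h>
#include <stdint.h>

/* Conditional zero, if condition is equal to zero:
   rd = (rs2 == 0) ? 0 : rs1  */
uint64_t czero_eqz(uint64_t rs1, uint64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* Conditional zero, if condition is non zero (assumed from the
   Zicond spec, not from the truncated snippet):
   rd = (rs2 != 0) ? 0 : rs1  */
uint64_t czero_nez(uint64_t rs1, uint64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* Hypothetical helper: cond ? a : b built from the two operations.
   Exactly one of the two operands of '|' is zeroed, so the OR
   returns the other one unchanged.  */
uint64_t select_via_zicond(uint64_t cond, uint64_t a, uint64_t b)
{
  return czero_eqz(a, cond) | czero_nez(b, cond);
}

/* Compares the composed select against the plain C conditional.  */
void zicond_self_check(void)
{
  const uint64_t values[] = { 0, 1, 42, UINT64_MAX };
  for (int c = 0; c < 4; c++)
    for (int i = 0; i < 4; i++)
      for (int j = 0; j < 4; j++)
        assert(select_via_zicond(values[c], values[i], values[j])
               == (values[c] ? values[i] : values[j]));
}
```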

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-01 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Richard Biener via Gcc-patches writes: >> The following makes sure to limit the shift operand when vectorizing >> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift >> operand otherwise invokes undefined behavior. When we determine >> whether we can

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-01 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > The following makes sure to limit the shift operand when vectorizing > (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift > operand otherwise invokes undefined behavior. When we determine > whether we can demote the operand we know we at
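To make the shift-limiting idea concrete, here is a small scalar sketch. It assumes a 16-bit `short` (modelled with `int16_t`) and an arithmetic right shift for negative values, which GCC provides but ISO C leaves implementation-defined. The function names are invented for this example. Plain C promotes `x` to `int` before shifting, so the clamp below stands in for what a genuine 16-bit vector lane would need: a shift amount no larger than 15.

```c
#include <assert.h>
#include <stdint.h>

/* The original form: widen to 32 bits, shift, then narrow.
   Valid for 0 <= amount <= 31.  */
int16_t shift_in_wide_type(int16_t x, unsigned amount)
{
  return (int16_t)((int32_t)x >> amount);
}

/* The demoted form: the shift amount is limited to 15, the largest
   in-bounds amount for a 16-bit element.  Shifting by 15 already
   leaves only copies of the sign bit, so larger amounts give the
   same result.  */
int16_t shift_in_narrow_type(int16_t x, unsigned amount)
{
  unsigned limited = amount > 15 ? 15 : amount;
  return (int16_t)(x >> limited);
}

/* Exhaustively compares both forms for every 16-bit value and every
   shift amount that is valid in the 32-bit type.  */
void shift_self_check(void)
{
  for (int v = INT16_MIN; v <= INT16_MAX; v++)
    for (unsigned amount = 0; amount <= 31; amount++)
      assert(shift_in_wide_type((int16_t)v, amount)
             == shift_in_narrow_type((int16_t)v, amount));
}
```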

Re: [PATCH V2] VECT: Support CALL vectorization for COND_LEN_*

2023-07-31 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, Richard and Richi. > > Based on the suggestions from Richard: > https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html > > This patch chooses approach (1) that Richard provided, meaning: > > RVV implements cond_* optabs as expanders. RVV therefore supports > both

Re: [PATCH] internal-fn: Refine macro define of COND_* and COND_LEN_* internal functions

2023-07-31 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > > Based on previous discussions, we should make COND_* and COND_LEN_* > consistent. > > So, this patch defines these internal functions together using these 2 > wrappers: > > #ifndef DEF_INTERNAL_COND_FN > #define
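The general technique being discussed, one wrapper macro that emits a COND_* entry and its COND_LEN_* counterpart together, can be sketched with a toy X-macro. Everything below (`DEF_COND_FN`, `DEF_FN`, `enum demo_fn`, `demo_fn_names`) is hypothetical and much simpler than the real definitions in GCC's internal-fn.def, which the truncated snippet does not show in full.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper: each use produces two entries, so the
   COND_ and COND_LEN_ variants cannot drift apart.  */
#define DEF_COND_FN(NAME) \
  DEF_FN (COND_##NAME)    \
  DEF_FN (COND_LEN_##NAME)

/* The list of functions, written once.  */
#define DEMO_FN_LIST \
  DEF_COND_FN (ADD)  \
  DEF_COND_FN (SUB)

/* First expansion: an enumerator per entry.  */
enum demo_fn
{
#define DEF_FN(CODE) DEMO_##CODE,
  DEMO_FN_LIST
#undef DEF_FN
  DEMO_FN_COUNT
};

/* Second expansion: a name string per entry, in the same order.  */
static const char *const demo_fn_names[] = {
#define DEF_FN(CODE) #CODE,
  DEMO_FN_LIST
#undef DEF_FN
};

/* Returns 1 if FN's generated name equals NAME.  */
int demo_fn_is(enum demo_fn fn, const char *name)
{
  return strcmp(demo_fn_names[fn], name) == 0;
}

/* Checks that each COND_LEN_ entry directly follows its COND_ entry.  */
void demo_fn_self_check(void)
{
  assert(DEMO_FN_COUNT == 4);
  assert(DEMO_COND_LEN_ADD == DEMO_COND_ADD + 1);
  assert(DEMO_COND_LEN_SUB == DEMO_COND_SUB + 1);
}
```

Keeping the paired definitions behind one macro means adding a new conditional operation needs a single line in the list.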

Re: [PATCH] Add POLY_INT_CST support to fold_ctor_reference in gimple-fold.cc

2023-07-31 Thread Richard Sandiford via Gcc-patches
Richard Ball writes: > Add POLY_INT_CST support to code within > fold_ctor_reference. This code previously > only supported INTEGER_CST which caused a > bug when using VEC_PERM_EXPR with SVE vectors. Just to add for others: this is a prerequisite for a follow-on patch, so the change will be

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-31 Thread Richard Sandiford via Gcc-patches
Hao Liu OS writes: >> Which test case do you see this for? The two tests in the patch still >> seem to report correct latencies for me if I make the change above. > > Not the newly added tests. It is still the existing case causing the > previous ICE (i.e. assertion problem):

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-28 Thread Richard Sandiford via Gcc-patches
Sorry for the slow response. Hao Liu OS writes: >> Ah, thanks. In that case, Hao, I think we can avoid the ICE by changing: >> >> if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar) >> && vect_is_reduction (stmt_info)) >> >> to: >> >> if ((kind == scalar_stmt ||

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jul 26, 2023 at 11:14 AM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches >> > wrote: >> >> >> >> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that >> >> > we're not

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches > wrote: >> >> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that we're >> > not papering over an issue elsewhere. >> >> Yes, I also wonder if this is an issue in vectorizable_reduction.

Re: vectorizer: Avoid an OOB access from vectorization

2023-07-25 Thread Richard Sandiford via Gcc-patches
Was leaving a bit of time in case Richi had any comments, but: Matthew Malcomson writes: > Our checks for whether the vectorization of a given loop would make an > out of bounds access miss the case when the vector we load is so large > as to span multiple iterations worth of data (while only

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-07-25 Thread Richard Sandiford via Gcc-patches
Hi, Thanks for the rework and sorry for the slow review. Prathamesh Kulkarni writes: > Hi Richard, > This is a reworking of the patch to extend fold_vec_perm to handle VLA vectors. > The attached patch unifies handling of VLS and VLA vector_csts, while > using fallback code > for ctors. > > For VLS

Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*

2023-07-25 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. >>> I think we should have an internal-fn helper that returns IFN_COND_LEN_* >>> for a given IFN_COND_*. It could handle IFN_MASK_LOAD -> IFN_MASK_LEN_LOAD >>> etc. too. > Could you name this helper function for me? Does it call >

Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*

2023-07-25 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks Richard. > > Do you suggest we should add a macro like this first: > > #ifndef DEF_INTERNAL_COND_FN > #define DEF_INTERNAL_COND_FN(NAME, FLAGS, OPTAB, TYPE) \ > DEF_INTERNAL_OPTAB_FN (COND_##NAME, FLAGS, cond_##optab, cond_##TYPE) > DEF_INTERNAL_OPTAB_FN
