Re: [PATCH/RFC 08/10] aarch64: Don't use CEIL for vector_store in aarch64_stp_sequence_cost

2023-09-18 Thread Richard Sandiford via Gcc-patches
Kewen Lin writes: > This costing adjustment patch series exposes one issue in > aarch64 specific costing adjustment for STP sequence. It > causes the below test cases to fail: > > - gcc/testsuite/gcc.target/aarch64/ldp_stp_15.c > - gcc/testsuite/gcc.target/aarch64/ldp_stp_16.c > - gcc/tests

Re: [PATCH V2] internal-fn: Support undefined rtx for uninitialized SSA_NAME

2023-09-17 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 > > As Richard and Richi suggested, we recognize uninitialized SSA_NAME and > convert it > into SCRATCH rtx if the target predicate allows SCRATCH. > > It can help to reduce redundant data move instructions

Re: [AArch64][testsuite] Adjust vect_copy_lane_1.c for new code-gen

2023-09-17 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi, > After 27de9aa152141e7f3ee66372647d0f2cd94c4b90, there's a following > regression: > FAIL: gcc.target/aarch64/vect_copy_lane_1.c scan-assembler-times > ins\\tv0.s\\[1\\], v1.s\\[0\\] 3 > > This happens because for the following function from vect_copy_lane_1.c:

Re: [PATCH] AArch64: Improve immediate expansion [PR105928]

2023-09-17 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > Support immediate expansion of immediates which can be created from 2 MOVKs > and a shifted ORR or BIC instruction. Change aarch64_split_dimode_const_store > to apply if we save one instruction. > > This reduces the number of 4-instruction immediates in SPECINT/FP by 5%.

Re: [PATCH] internal-fn: Convert uninitialized SSA_NAME into SCRATCH rtx[PR110751]

2023-09-17 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 > > As Richard and Richi suggested, we recognize uninitialized SSA_NAME and > convert it > into SCRATCH rtx if the target predicate allows SCRATCH. > > It can help to reduce redundant data move instructions

[PATCH] aarch64: Fix loose ldpstp check [PR111411]

2023-09-15 Thread Richard Sandiford via Gcc-patches
aarch64_operands_ok_for_ldpstp contained the code:

  /* One of the memory accesses must be a mempair operand.
     If it is not the first one, they need to be swapped by the
     peephole.  */
  if (!aarch64_mem_pair_operand (mem_1, GET_MODE (mem_1))
      && !aarch64_mem_pair_operand (mem_2, GET

[PATCH] aarch64: Restore SVE WHILE costing

2023-09-14 Thread Richard Sandiford via Gcc-patches
AArch64 previously costed WHILELO instructions on the first call to add_stmt_cost. This was because, at the time, only add_stmt_cost had access to the loop_vec_info. However, after the AVX512 changes, we only calculate the masks later. This patch moves the WHILELO costing to finish_cost, which is

[PATCH] aarch64: Coerce addresses to be suitable for LD1RQ

2023-09-14 Thread Richard Sandiford via Gcc-patches
In the following test:

  svuint8_t ld(uint8_t *ptr)
  {
    return svld1rq(svptrue_b8(), ptr + 2);
  }

ptr + 2 is a valid address for an Advanced SIMD load, but not for an SVE load. We therefore ended up generating:

  ldr     q0, [x0, 2]
  dup     z0.q, z0.q[0]

This patch makes us generate

Re: [PATCH] AArch64: List official cores before codenames

2023-09-13 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > List official cores first so that -cpu=native does not show a codename with -v > or in errors/warnings. Nice spot. > Passes regress, OK for commit? > > gcc/ChangeLog: > * config/aarch64/aarch64-cores.def (neoverse-n1): Place before ares. > (neoverse-v1):

[PATCH 17/19] aarch64: Explicitly record probe registers in frame info

2023-09-12 Thread Richard Sandiford via Gcc-patches
The stack frame is currently divided into three areas:

A: the area above the hard frame pointer
B: the SVE saves below the hard frame pointer
C: the outgoing arguments

If the stack frame is allocated in one chunk, the allocation needs a probe if the frame size is >= guard_size - 1KiB. In additio

[PATCH 19/19] aarch64: Make stack smash canary protect saved registers

2023-09-12 Thread Richard Sandiford via Gcc-patches
AArch64 normally puts the saved registers near the bottom of the frame, immediately above any dynamic allocations. But this means that a stack-smash attack on those dynamic allocations could overwrite the saved registers without needing to reach as far as the stack smash canary. The same thing co

[PATCH 16/19] aarch64: Simplify probe of final frame allocation

2023-09-12 Thread Richard Sandiford via Gcc-patches
Previous patches ensured that the final frame allocation only needs a probe when the size is strictly greater than 1KiB. It's therefore safe to use the normal 1024 probe offset in all cases. The main motivation for doing this is to simplify the code and reduce the number of special cases. gcc/

[PATCH 08/19] aarch64: Rename locals_offset to bytes_above_locals

2023-09-12 Thread Richard Sandiford via Gcc-patches
locals_offset was described as:

  /* Offset from the base of the frame (incoming SP) to the
     top of the locals area.  This value is always a multiple of
     STACK_BOUNDARY.  */

This is implicitly an “upside down” view of the frame: the incoming SP is at offset 0, and anything N bytes below

[PATCH 18/19] aarch64: Remove below_hard_fp_saved_regs_size

2023-09-12 Thread Richard Sandiford via Gcc-patches
After previous patches, it's no longer necessary to store saved_regs_size and below_hard_fp_saved_regs_size in the frame info. All measurements instead use the top or bottom of the frame as reference points. gcc/ * config/aarch64/aarch64.h (aarch64_frame::saved_regs_size) (aarch64_

[PATCH 14/19] aarch64: Tweak stack clash boundary condition

2023-09-12 Thread Richard Sandiford via Gcc-patches
The AArch64 ABI says that, when stack clash protection is used, there can be a maximum of 1KiB of unprobed space at sp on entry to a function. Therefore, we need to probe when allocating >= guard_size - 1KiB of data (>= rather than >). This is what GCC does. If an allocation is exactly guard_siz
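The boundary condition described above can be written out as a one-line predicate. This is only a sketch of the arithmetic, not GCC's code: the names kCallerUnprobed and allocation_needs_probe are made up for illustration, and the 1KiB figure is the one quoted from the ABI in the message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name: the number of bytes at sp that may already be
// unprobed on entry to a function (1KiB, per the message above).
constexpr std::int64_t kCallerUnprobed = 1024;

// Returns true if a single allocation of `size` bytes needs a probe when
// the guard region is `guard_size` bytes.  Note the use of >= rather
// than >: an unprobed allocation of exactly guard_size - 1KiB, added to
// the 1KiB the caller may have left unprobed, would already span a full
// guard_size bytes.
constexpr bool allocation_needs_probe(std::int64_t size,
                                      std::int64_t guard_size) {
  return size >= guard_size - kCallerUnprobed;
}
```

With a 64KiB guard, for example, an allocation of 65536 - 1024 bytes is reported as needing a probe, while one byte less is not.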

[PATCH 04/19] aarch64: Add bytes_below_saved_regs to frame info

2023-09-12 Thread Richard Sandiford via Gcc-patches
The frame layout code currently hard-codes the assumption that the number of bytes below the saved registers is equal to the size of the outgoing arguments. This patch abstracts that value into a new field of aarch64_frame. gcc/ * config/aarch64/aarch64.h (aarch64_frame::bytes_below_saved

[PATCH 15/19] aarch64: Put LR save probe in first 16 bytes

2023-09-12 Thread Richard Sandiford via Gcc-patches
-fstack-clash-protection uses the save of LR as a probe for the next allocation. The next allocation could be:

* another part of the static frame, e.g. when allocating SVE save slots or outgoing arguments
* an alloca in the same function
* an allocation made by a callee function

However, whe

[PATCH 13/19] aarch64: Minor initial adjustment tweak

2023-09-12 Thread Richard Sandiford via Gcc-patches
This patch just changes a calculation of initial_adjust to one that makes it slightly more obvious that the total adjustment is frame.frame_size. gcc/ * config/aarch64/aarch64.cc (aarch64_layout_frame): Tweak calculation of initial_adjust for frames in which all saves are S

[PATCH 10/19] aarch64: Tweak frame_size comment

2023-09-12 Thread Richard Sandiford via Gcc-patches
This patch fixes another case in which a value was described with an “upside-down” view. gcc/ * config/aarch64/aarch64.h (aarch64_frame::frame_size): Tweak comment. --- gcc/config/aarch64/aarch64.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/aarch64/

[PATCH 03/19] aarch64: Explicitly handle frames with no saved registers

2023-09-12 Thread Richard Sandiford via Gcc-patches
If a frame has no saved registers, it can be allocated in one go. There is no need to treat the areas below and above the saved registers as separate. And if we allocate the frame in one go, it should be allocated as the initial_adjust rather than the final_adjust. This allows the frame size to g

[PATCH 11/19] aarch64: Measure reg_offset from the bottom of the frame

2023-09-12 Thread Richard Sandiford via Gcc-patches
reg_offset was measured from the bottom of the saved register area. This made perfect sense with the original layout, since the bottom of the saved register area was also the hard frame pointer address. It became slightly less obvious with SVE, since we save SVE registers below the hard frame point

[PATCH 09/19] aarch64: Rename hard_fp_offset to bytes_above_hard_fp

2023-09-12 Thread Richard Sandiford via Gcc-patches
Similarly to the previous locals_offset patch, hard_fp_offset was described as:

  /* Offset from the base of the frame (incoming SP) to the
     hard_frame_pointer.  This value is always a multiple of
     STACK_BOUNDARY.  */
  poly_int64 hard_fp_offset;

which again took an “upside-down” view: h

[PATCH 06/19] aarch64: Tweak aarch64_save/restore_callee_saves

2023-09-12 Thread Richard Sandiford via Gcc-patches
aarch64_save_callee_saves and aarch64_restore_callee_saves took a parameter called start_offset that gives the offset of the bottom of the saved register area from the current stack pointer. However, it's more convenient for later patches if we use the bottom of the entire frame as the reference po

[PATCH 02/19] aarch64: Avoid a use of callee_offset

2023-09-12 Thread Richard Sandiford via Gcc-patches
When we emit the frame chain, i.e. when we reach Here in this statement of aarch64_expand_prologue:

  if (emit_frame_chain)
    {
      // Here
      ...
    }

the stack is in one of two states:

- We've allocated up to the frame chain, but no more.
- We've allocated the whole frame, and the fra

[PATCH 12/19] aarch64: Simplify top of frame allocation

2023-09-12 Thread Richard Sandiford via Gcc-patches
After previous patches, it no longer really makes sense to allocate the top of the frame in terms of varargs_and_saved_regs_size and saved_regs_and_above. gcc/ * config/aarch64/aarch64.cc (aarch64_layout_frame): Simplify the allocation of the top of the frame. --- gcc/config/aarch

[PATCH 07/19] aarch64: Only calculate chain_offset if there is a chain

2023-09-12 Thread Richard Sandiford via Gcc-patches
After previous patches, it is no longer necessary to calculate a chain_offset in cases where there is no chain record. gcc/ * config/aarch64/aarch64.cc (aarch64_expand_prologue): Move the calculation of chain_offset into the emit_frame_chain block. --- gcc/config/aarch64/aarch64.c

[PATCH 05/19] aarch64: Add bytes_below_hard_fp to frame info

2023-09-12 Thread Richard Sandiford via Gcc-patches
Following on from the previous bytes_below_saved_regs patch, this one records the number of bytes that are below the hard frame pointer. This eventually replaces below_hard_fp_saved_regs_size. If a frame pointer is not needed, the epilogue adds final_adjust to the stack pointer before restoring re

[PATCH 01/19] aarch64: Use local frame vars in shrink-wrapping code

2023-09-12 Thread Richard Sandiford via Gcc-patches
aarch64_layout_frame uses a shorthand for referring to cfun->machine->frame:

  aarch64_frame &frame = cfun->machine->frame;

This patch does the same for some other heavy users of the structure. No functional change intended.

gcc/
	* config/aarch64/aarch64.cc (aarch64_save_callee_saves): U

[PATCH 00/19] aarch64: Fix -fstack-protector issue

2023-09-12 Thread Richard Sandiford via Gcc-patches
This series of patches fixes deficiencies in GCC's -fstack-protector implementation for AArch64 when using dynamically allocated stack space. This is CVE-2023-4039. See: https://developer.arm.com/Arm%20Security%20Center/GCC%20Stack%20Protector%20Vulnerability%20AArch64 https://github.com/metaredt

Re: [PATCH] pretty-print: Fix up pp_wide_int [PR111329]

2023-09-11 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > Hi! > > The recent pp_wide_int changes for _BitInt support (because not all > wide_ints fit into the small fixed size digit_buffer anymore) apparently > broke > +FAIL: gcc.dg/analyzer/out-of-bounds-diagram-1-debug.c (test for excess > errors) > +FAIL: gcc.dg/analyzer/out-o

[PATCH] Allow target attributes in non-gnu namespaces

2023-09-08 Thread Richard Sandiford via Gcc-patches
Currently there are four static sources of attributes:

- LANG_HOOKS_ATTRIBUTE_TABLE
- LANG_HOOKS_COMMON_ATTRIBUTE_TABLE
- LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE
- TARGET_ATTRIBUTE_TABLE

All of the attributes in these tables go in the "gnu" namespace. This means that they can use the traditional GNU __

Re: [PATCH V3] Support folding min(poly,poly) to const

2023-09-08 Thread Richard Sandiford via Gcc-patches
Lehua Ding writes: > V3 change: Address Richard's comments. > > Hi, > > This patch adds support that tries to fold `MIN (poly, poly)` to > a constant. Consider the following C Code: > > ``` > void foo2 (int* restrict a, int* restrict b, int n) > { > for (int i = 0; i < 3; i += 1) > a[i]
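To illustrate when a MIN of two polynomial values can fold to one of its operands, here is a minimal sketch. It models a value as c0 + c1 * x for a runtime x >= 0; the names toy_poly, toy_known_le and toy_fold_min are hypothetical and this is not the code from the patch.

```cpp
#include <cassert>

// Toy stand-in for a two-coefficient polynomial value c0 + c1 * x,
// where x is a non-negative runtime quantity.
struct toy_poly {
  long c0;
  long c1;
};

// True if a <= b holds for every x >= 0.  With x unbounded above, that
// is the case exactly when both coefficients of a are <= those of b.
constexpr bool toy_known_le(toy_poly a, toy_poly b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// Tries to fold MIN (a, b).  Returns true and stores the smaller operand
// in *result if the ordering is known for all x; otherwise returns false
// and leaves *result untouched.
inline bool toy_fold_min(toy_poly a, toy_poly b, toy_poly *result) {
  if (toy_known_le(a, b)) {
    *result = a;
    return true;
  }
  if (toy_known_le(b, a)) {
    *result = b;
    return true;
  }
  return false;
}
```

Under this model MIN (3, 4 + 4x) folds to the constant 3, whereas MIN (8, 4 + 4x) cannot be folded because the answer depends on x.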

Re: [PATCH V2] Support folding min(poly,poly) to const

2023-09-08 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Lehua Ding writes: >> Hi, >> >> This patch adds support that tries to fold `MIN (poly, poly)` to >> a constant. Consider the following C Code: >> >> ``` >> void foo2 (int* restrict a, int* restrict b, int n) >> { >> for (int i = 0; i < 3; i += 1) >> a[i] += b

Re: [PATCH V2] Support folding min(poly,poly) to const

2023-09-08 Thread Richard Sandiford via Gcc-patches
Lehua Ding writes: > Hi, > > This patch adds support that tries to fold `MIN (poly, poly)` to > a constant. Consider the following C Code: > > ``` > void foo2 (int* restrict a, int* restrict b, int n) > { > for (int i = 0; i < 3; i += 1) > a[i] += b[i]; > } > ``` > > Before this patch: >

Re: [PATCH] Support folding min(poly,poly) to const

2023-09-07 Thread Richard Sandiford via Gcc-patches
Lehua Ding writes: > Hi, > > This patch adds support that tries to fold `MIN (poly, poly)` to > a constant. Consider the following C Code: > > ``` > void foo2 (int* restrict a, int* restrict b, int n) > { > for (int i = 0; i < 3; i += 1) > a[i] += b[i]; > } > ``` > > Before this patch: >

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-09-07 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: > Hi Richard, > > I did some testing with the attached v2 that does not restrict to UNARY > anymore. As feared ;) there is some more fallout that I'm detailing below. > > On Power there is one guality fail (pr43051-1.c) that I would take > the liberty of ignoring for now. > > O

[PATCH] Tweak language choice in config-list.mk

2023-09-07 Thread Richard Sandiford via Gcc-patches
When I tried to use config-list.mk, the build for every triple except the build machine's failed for m2. This is because, unlike other languages, m2 builds target objects during all-gcc. The build will therefore fail unless you have access to an appropriate binutils (or an equivalent). That's qu

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-09-06 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: > Hi Richard, > > I did some testing with the attached v2 that does not restrict to UNARY > anymore. As feared ;) there is some more fallout that I'm detailing below. > > On Power there is one guality fail (pr43051-1.c) that I would take > the liberty of ignoring for now. > > O

Re: [PATCH] LoongArch: Fix unintentional bash-ism in r14-3665.

2023-09-06 Thread Richard Sandiford via Gcc-patches
Yang Yujie writes:
> gcc/ChangeLog:
>
> * config.gcc: remove non-POSIX syntax "<<<".

OK. Thanks for the quick fix.

Richard.

> ---
> gcc/config.gcc | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index b2fe7c7ceef..6d4c8becd28 10

Re: [PATCH v1 2/6] LoongArch: improved target configuration interface

2023-09-05 Thread Richard Sandiford via Gcc-patches
Yang Yujie writes: > @@ -5171,25 +5213,21 @@ case "${target}" in > # ${with_multilib_list} should not contain whitespaces, > # consecutive commas or slashes. > if echo "${with_multilib_list}" \ > - | grep -E -e "[[:space:]]" -e '[,/][,/]' -e '[

Re: [PATCH 10/11] aarch64: Fix branch-protection error message tests

2023-09-05 Thread Richard Sandiford via Gcc-patches
Szabolcs Nagy writes: > Update tests for the new branch-protection parser errors. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/branch-protection-attr.c: Update. > * gcc.target/aarch64/branch-protection-option.c: Update. OK, thanks. (And I agree these are better messages. :))

Re: [PATCH 07/11] aarch64: Disable branch-protection for pcs tests

2023-09-05 Thread Richard Sandiford via Gcc-patches
Szabolcs Nagy writes: > The tests manipulate the return address in abitest-2.h and thus not > compatible with -mbranch-protection=pac-ret+leaf or > -mbranch-protection=gcs. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/aapcs64/func-ret-1.c: Disable branch-protection. > * gcc.ta

Re: [PATCH 06/11] aarch64: Fix pac-ret eh_return tests

2023-09-05 Thread Richard Sandiford via Gcc-patches
Szabolcs Nagy writes: > This is needed since eh_return no longer prevents pac-ret in the > normal return path. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/return_address_sign_1.c: Move func4 to ... > * gcc.target/aarch64/return_address_sign_2.c: ... here and fix the > s

Re: [PATCH 05/11] aarch64: Add eh_return compile tests

2023-09-05 Thread Richard Sandiford via Gcc-patches
Szabolcs Nagy writes: > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/eh_return-2.c: New test. > * gcc.target/aarch64/eh_return-3.c: New test. OK. I wonder if it's worth using check-function-bodies for -3.c though. It would then be easy to verify that the autiasp only occurs on t

Re: [PATCH 04/11] aarch64: Do not force a stack frame for EH returns

2023-09-05 Thread Richard Sandiford via Gcc-patches
Szabolcs Nagy writes: > EH returns no longer rely on clobbering the return address on the stack > so forcing a stack frame is not necessary. > > This does not actually change the code gen for the unwinder since there > are calls before the EH return. > > gcc/ChangeLog: > > * config/aarch64/a

Re: [PATCH 01/11] aarch64: AARCH64_ISA_RCPC was defined twice

2023-09-05 Thread Richard Sandiford via Gcc-patches
Szabolcs Nagy writes:
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.h (AARCH64_ISA_RCPC): Remove dup.

OK, thanks.

Richard

> ---
> gcc/config/aarch64/aarch64.h | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 2b0fc

Re: testsuite: Port 'check-function-bodies' to nvptx

2023-09-05 Thread Richard Sandiford via Gcc-patches
Thomas Schwinge writes: > Hi! > > On 2023-09-04T23:05:05+0200, I wrote: >> On 2019-07-16T15:04:49+0100, Richard Sandiford >> wrote: >>> This patch therefore adds a new check-function-bodies dg-final test > >>> The regexps in parse_function_bodies are fairly general, but might >>> still need to b

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-09-05 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: >> So I don't think I have a good feel for the advantages and disadvantages >> of doing this. Robin's analysis of the aarch64 changes was nice and >> detailed though. I think the one that worries me most is the addressing >> mode one. fwprop is probably the first chance we ge

Re: RFC: Introduce -fhardened to enable security-related flags

2023-09-04 Thread Richard Sandiford via Gcc-patches
Qing Zhao via Gcc-patches writes: >> On Aug 29, 2023, at 3:42 PM, Marek Polacek via Gcc-patches >> wrote: >> >> Improving the security of software has been a major trend in the recent >> years. Fortunately, GCC offers a wide variety of flags that enable extra >> hardening. These flags aren't

Re: [PATCH] Bug 111071: fix the subr with -1 to not due to the simplify.

2023-09-04 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > "yanzhang.wang--- via Gcc-patches" writes: >> From: Yanzhang Wang >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/aarch64/sve/acle/asm/subr_s8.c: Modify subr with -1 >> to not. >> >> Signed-off-by: Yanzhang Wang >> --- >> >> Tested on my local arm environm

Re: [PATCH] testsuite: aarch64: Adjust SVE ACLE tests to new generated code

2023-09-04 Thread Richard Sandiford via Gcc-patches
Thiago Jung Bauermann via Gcc-patches writes: > Since commit e7a36e4715c7 "[PATCH] RISC-V: Support simplify (-1-x) for > vector." these tests fail on aarch64-linux: > > === g++ tests === > > Running g++:g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ... > FAIL: gcc.target/aarch

Re: [PATCH v3] mklog: handle Signed-off-by, minor cleanup

2023-09-04 Thread Richard Sandiford via Gcc-patches
Marc Poulhiès via Gcc-patches writes: > Richard Sandiford via Gcc-patches writes: >>> +# this regex matches the first line of the "end" in the initial commit >>> message >>> +FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by|co-authored-by|#):

Re: [PATCH] testsuite: Remove unwanted 'dg-do run' from gcc.dg/vect tests

2023-09-04 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > Tests under gcc.dg/vect use check_vect_support_and_set_flags to set > compilation flags as appropriate for the target, but they also set > dg-do-what-default to 'run' or 'compile', depending on the actual > target hardware (or simulator) capabilities. > >

Re: [PATCH] Bug 111071: fix the subr with -1 to not due to the simplify.

2023-09-04 Thread Richard Sandiford via Gcc-patches
"yanzhang.wang--- via Gcc-patches" writes: > From: Yanzhang Wang > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/sve/acle/asm/subr_s8.c: Modify subr with -1 > to not. > > Signed-off-by: Yanzhang Wang > --- > > Tested on my local arm environment and passed. Thanks Andrew Pinski's

Re: [PATCH]AArch64 xorsign: Fix scalar xorsign lowering

2023-09-01 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, September 1, 2023 2:36 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; Kyrylo Tkachov >> Subject: Re: [PATCH]AArch64 xorsign: Fix scalar x

Re: [PATCH]AArch64 xorsign: Fix scalar xorsign lowering

2023-09-01 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > In GCC-9 our scalar xorsign pattern broke and we didn't notice it because the > testcase was not strong enough. With this commit > > 8d2d39587d941a40f25ea0144cceb677df115040 is the first bad commit > commit 8d2d39587d941a40f25ea0144cceb677df115040 > Author: S

Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-09-01 Thread Richard Sandiford via Gcc-patches
Uros Bizjak via Gcc-patches writes: > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > wrote: >> >> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: >> > From: Kong Lingling >> > >> > In inline asm, we do not know if the insn can use EGPR, so disable E

Re: [PATCH] expmed: Allow extract_bit_field via mem for low-precision modes.

2023-09-01 Thread Richard Sandiford via Gcc-patches
Robin Dapp via Gcc-patches writes: >> It's not just a question of which byte though. It's also a question >> of which bit. >> >> One option would be to code-generate for even X and for odd X, and select >> between them at runtime. But that doesn't scale well to 2+2X and 1+1X. >> >> Otherwise I

[PATCH] lra: Avoid unfolded plus-0

2023-08-31 Thread Richard Sandiford via Gcc-patches
While backporting another patch to an earlier release, I hit a situation in which lra_eliminate_regs_1 would eliminate an address to:

  (plus (reg:P R) (const_int 0))

This address compared not-equal to plain:

  (reg:P R)

which caused an ICE in a later peephole2. (The ICE showed up in gfort
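The mismatch described above comes from comparing expressions structurally: a register plus zero and the bare register denote the same address but are different trees. The following is a toy sketch of that effect, assuming a made-up toy_addr type; it does not use GCC's RTL API.

```cpp
#include <cassert>

// Toy address expression: either a plain register, or a (plus reg offset)
// node.  All names here are hypothetical.
struct toy_addr {
  bool is_plus;  // true for (plus (reg) (const_int offset))
  int reg;       // register number
  long offset;   // only meaningful when is_plus is true
};

// Structural equality: the two expressions must have the same shape.
inline bool toy_same(const toy_addr &a, const toy_addr &b) {
  if (a.is_plus != b.is_plus || a.reg != b.reg)
    return false;
  return !a.is_plus || a.offset == b.offset;
}

// Canonicalise an address so that (plus R 0) becomes plain R.
inline toy_addr toy_fold(toy_addr a) {
  if (a.is_plus && a.offset == 0)
    return toy_addr{false, a.reg, 0};
  return a;
}
```

Before folding, the plus-0 form and the plain register compare as different; after toy_fold they compare equal.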

[PATCH] aarch64: Fix return register handling in untyped_call

2023-08-31 Thread Richard Sandiford via Gcc-patches
While working on another patch, I hit a problem with the aarch64 expansion of untyped_call. The expander emits the usual:

  (set (mem ...) (reg resN))

instructions to store the result registers to memory, but it didn't say in RTL where those resN results came from. This eventually led to a fail

Re: [PATCH] expmed: Allow extract_bit_field via mem for low-precision modes.

2023-08-30 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: >> But in the VLA case, doesn't it instead have precision 4+4X? >> The problem then is that we can't tell at compile time which >> byte that corresponds to. So... > > Yes 4 + 4x. I keep getting confused with poly modes :) > In this case we want to extract the bitnum [3 4] = 3

RE: [PATCH] expmed: Allow extract_bit_field via mem for low-precision modes.

2023-08-30 Thread Richard Sandiford via Gcc-patches
[Sorry for any weird MUA issues, don't have access to my usual set-up.] > when looking at a riscv ICE in vect-live-6.c I noticed that we > assume that the variable part (coeffs[1] * x1) of the to-be-extracted > bit number in extract_bit_field_1 is a multiple of BITS_PER_UNIT. > > This means that b

[PATCH] attribs: Use existing traits for excl_hash_traits

2023-08-29 Thread Richard Sandiford via Gcc-patches
excl_hash_traits can be defined more simply by reusing existing traits.

Tested on aarch64-linux-gnu. OK to install?

Richard

gcc/
	* attribs.cc (excl_hash_traits): Delete.
	(test_attribute_exclusions): Use pair_hash and nofree_string_hash instead.
---
 gcc/attribs.cc | 45

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-08-29 Thread Richard Sandiford via Gcc-patches
Jeff Law writes: > On 8/24/23 08:06, Robin Dapp via Gcc-patches wrote: >> Ping. I refined the code and some comments a bit and added a test >> case. >> >> My question in general would still be: Is this something we want >> given that we potentially move some of combine's work a bit towards >> t

Re: [RFC] > WIDE_INT_MAX_PREC support in wide-int

2023-08-29 Thread Richard Sandiford via Gcc-patches
Just some off-the-cuff thoughts. Might think differently when I've had more time... Richard Biener writes: > On Mon, 28 Aug 2023, Jakub Jelinek wrote: > >> Hi! >> >> While the _BitInt series isn't committed yet, I had a quick look at >> lifting the current lowest limitation on maximum _BitInt p

Re: [PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-24 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, Richard and Richi. > > Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. > It's supported in tree-ssa-math-opts.cc. However, GCC failed to support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. > > Consider this following case: > #define TEST_TYPE(T

Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns

2023-08-24 Thread Richard Sandiford via Gcc-patches
Jeff Law writes: > On 8/22/23 02:08, juzhe.zh...@rivai.ai wrote: >> Yes, I agree long-term we want every-thing be optimized as early as >> possible. >> >> However, IMHO, it's impossible we can support every conditional patterns >> in the middle-end (match.pd). >> It's a really big number. >> >

Re: [PATCH 03/11] aarch64: Use br instead of ret for eh_return

2023-08-24 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Rather than hiding this in target code, perhaps we should add a > target-independent concept of an "eh_return taken" flag, say > EH_RETURN_TAKEN_RTX. > > We could define it so that, on targets that define EH_RETURN_TAKEN_RTX, > a register EH_RETURN_STACKADJ_RTX and a re

Re: [PATCH] tree-optimization/111115 - SLP of masked stores

2023-08-24 Thread Richard Sandiford via Gcc-patches
Richard Biener writes:
> The following adds the capability to do SLP on .MASK_STORE, I do not
> plan to add interleaving support.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

LGTM, thanks.

Richard

> Thanks,
> Richard.
>
> PR tree-optimization/111115
> gcc/
> * tree-v

[PATCH] aarch64: Account for different Advanced SIMD fusing options

2023-08-24 Thread Richard Sandiford via Gcc-patches
The scalar FNMADD/FNMSUB and SVE FNMLA/FNMLS instructions mean that either side of a subtraction can start an accumulator chain. However, Advanced SIMD doesn't have an equivalent instruction. This means that, for Advanced SIMD, a subtraction can only be fused if the second operand is a multiplicati

Re: [PATCH] AArch64: Fix MOPS memmove operand corruption [PR111121]

2023-08-23 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > Hi Richard, > > (that's quick!) > >> + if (size > max_copy_size || size > max_mops_size) >> +return aarch64_expand_cpymem_mops (operands, is_memmove); >> >> Could you explain this a bit more? If I've followed the logic correctly, >> max_copy_size will always be 0 for

Re: [PATCH] rtl: Forward declare rtx_code

2023-08-23 Thread Richard Sandiford via Gcc-patches
"Richard Earnshaw (lists)" writes: > On 23/08/2023 16:49, Richard Sandiford via Gcc-patches wrote: >> Richard Earnshaw via Gcc-patches writes: >>> Now that we require C++ 11, we can safely forward declare rtx_code >>> so that we can use it i

Re: [PATCH] rtl: use rtx_code for gen_ccmp_first and gen_ccmp_next

2023-08-23 Thread Richard Sandiford via Gcc-patches
Richard Earnshaw via Gcc-patches writes: > Note, this patch is dependent on the patch I posted yesterday to > forward declare rtx_code in coretypes.h. > > -- > Now that we have a forward declaration of rtx_code in coretypes.h, we > can adjust these hooks to take rtx_code arguments rather than

Re: [PATCH] rtl: Forward declare rtx_code

2023-08-23 Thread Richard Sandiford via Gcc-patches
Richard Earnshaw via Gcc-patches writes: > Now that we require C++ 11, we can safely forward declare rtx_code > so that we can use it in target hooks. > > gcc/ChangeLog > * coretypes.h (rtx_code): Add forward declaration. > * rtl.h (rtx_code): Make compatible with forward declaration.
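For readers unfamiliar with the C++11 feature mentioned above: an enum can be declared without its enumerators as long as its underlying type is fixed, which lets a header mention the type before the full definition is visible. The sketch below uses a made-up demo_code enum rather than rtx_code, and the choice of unsigned as the underlying type is an assumption for illustration.

```cpp
#include <cassert>

// Opaque declaration: legal since C++11 because the underlying type is
// given explicitly.  No enumerators are known at this point.
enum demo_code : unsigned;

// A hook-style prototype can already take the enum as a parameter.
bool demo_is_comparison(demo_code code);

// The full definition comes later (e.g. in another header) and must use
// the same underlying type as the opaque declaration.
enum demo_code : unsigned { DEMO_PLUS, DEMO_EQ, DEMO_NE };

bool demo_is_comparison(demo_code code) {
  return code == DEMO_EQ || code == DEMO_NE;
}
```

A translation unit that only passes the value through needs just the first declaration; only code that names the enumerators needs the full definition.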

Re: [PATCH] AArch64: Fix MOPS memmove operand corruption [PR111121]

2023-08-23 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > A MOPS memmove may corrupt registers since there is no copy of the input > operands to temporary > registers. Fix this by calling aarch64_expand_cpymem which does this. Also > fix an issue with > STRICT_ALIGNMENT being ignored if TARGET_MOPS is true, and avoid crashing

Re: [PATCH 03/11] aarch64: Use br instead of ret for eh_return

2023-08-23 Thread Richard Sandiford via Gcc-patches
Szabolcs Nagy writes: > The expected way to handle eh_return is to pass the stack adjustment > offset and landing pad address via > > EH_RETURN_STACKADJ_RTX > EH_RETURN_HANDLER_RTX > > to the epilogue that is shared between normal return paths and the > eh_return paths. EH_RETURN_HANDLER_RTX

Re: [PATCH v2] mklog: handle Signed-Off-By, minor cleanup

2023-08-21 Thread Richard Sandiford via Gcc-patches
Marc Poulhiès via Gcc-patches writes: > Consider Signed-Off-By lines as part of the ending of the initial > commit to avoid having these in the middle of the log when the > changelog part is injected after. > > This is particularly useful with: > > $ git gcc-commit-mklog --amend -s > > that can

Re: [PATCH] Remove XFAIL from gcc/testsuite/gcc.dg/unroll-7.c

2023-08-21 Thread Richard Sandiford via Gcc-patches
Thiago Jung Bauermann via Gcc-patches writes: > This test passes since commit e41103081bfa "Fix undefined behaviour in > profile_count::differs_from_p", so remove the xfail annotation. > > Tested on aarch64-linux-gnu, armv8l-linux-gnueabihf and x86_64-linux-gnu. > > gcc/testsuite/ChangeLog: >

Re: [PATCH] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-21 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 16 Aug 2023, Juzhe-Zhong wrote: > >> Hi, Richard and Richi. >> >> Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. >> It's supported in tree-ssa-math-opts.cc. However, GCC failed to support >> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. >>

Re: [PATCH] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-21 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, Richard and Richi. > > Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. > It's supported in tree-ssa-math-opts.cc. However, GCC failed to support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. > > Consider this following case: > #define TEST_TYPE(T

Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm

2023-08-21 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Mon, 21 Aug 2023 at 12:26, Richard Biener wrote: >> >> On Sat, 19 Aug 2023, Prathamesh Kulkarni wrote: >> >> > On Fri, 18 Aug 2023 at 14:52, Richard Biener wrote: >> > > >> > > On Fri, 18 Aug 2023, Richard Sandiford wrote: >> > > >> > > > Richard Biener writes:

Re: [PATCH] c: Add support for [[__extension__ ...]]

2023-08-18 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Joseph Myers writes: >> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote: >> >>> Would it be OK to add support for: >>> >>> [[__extension__ ...]] >>> >>> to suppress the pedwarn about u

Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm

2023-08-18 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following avoids running into somehow flawed logic in fold_vec_perm > for non-VLA vectors. > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > Richard. > > PR tree-optimization/111048 > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA >

Re: [PATCH] c: Add support for [[__extension__ ...]]

2023-08-17 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> Am 17.08.2023 um 13:25 schrieb Richard Sandiford via Gcc-patches >> : >> >> Joseph Myers writes: >>>> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote: >>>> >>>> Would it be OK to add support f

[PATCH] c: Add support for [[__extension__ ...]]

2023-08-17 Thread Richard Sandiford via Gcc-patches
Joseph Myers writes: > On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote: > >> Would it be OK to add support for: >> >> [[__extension__ ...]] >> >> to suppress the pedwarn about using [[]] prior to C2X? Then we can > > That seems lik

Re: [PATCH] doc: Fixes to RTL-SSA sample code

2023-08-17 Thread Richard Sandiford via Gcc-patches
Alex Coplan writes: > Hi, > > This patch fixes up the code examples in the RTL-SSA documentation (the > sections on making insn changes) to reflect the current API. > > The main issues are as follows: > - rtl_ssa::recog takes an obstack_watermark & as the first parameter. >Presumably this is

Re: [WIP RFC] Add support for keyword-based attributes

2023-08-16 Thread Richard Sandiford via Gcc-patches
Joseph Myers writes: > On Mon, 17 Jul 2023, Michael Matz via Gcc-patches wrote: > >> So, essentially you want unignorable attributes, right? Then implement >> exactly that: add one new keyword "__known_attribute__" (invent a better >> name, maybe :) ), semantics exactly as with __attribute__ (i

Re: [PATCH v2][GCC] aarch64: Add support for Cortex-A720 CPU

2023-08-16 Thread Richard Sandiford via Gcc-patches
Richard Ball writes: > v2: Add missing PROFILE feature flag. > > This patch adds support for the Cortex-A720 CPU to GCC. > > No regressions on aarch64-none-elf. > > Ok for master? > > gcc/ChangeLog: > > * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Cortex- > A720 CPU. >

Re: [PATCH] IFN: Fix vector extraction into promoted subreg.

2023-08-16 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: >> However: >> >> | #define vec_extract_direct { 3, 3, false } >> >> This looks wrong. The numbers are argument numbers (or -1 for a return >> value). vec_extract only takes 2 arguments, so 3 looks to be out-of-range. >> >> | #define direct_vec_extract_optab_supported_p dir

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-08-16 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: >> Unfortunately, the patch regressed following tests on ppc64le and >> armhf respectively: >> gcc.target/powerpc/vec-perm-ctor.c scan-tree-dump-not optimized >> "VIEW_CONVERT_EXPR" >> gcc.dg/tree-ssa/forwprop-20.c scan-tree-dump-not forwprop1 "VEC_PERM_EXPR" >> >> This

Re: [PATCH] IFN: Fix vector extraction into promoted subreg.

2023-08-15 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Robin, Richard and Richi. > > I am wondering whether we can just simply replace the VEC_EXTRACT expander > with binary? > > Like this :? > > DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW, > - vec_extract, vec_extract) > + vec_ex

Re: [PATCH] Handle TYPE_OVERFLOW_UNDEFINED vectorized BB reductions

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following changes the gate to perform vectorization of BB reductions > to use needs_fold_left_reduction_p which in turn requires handling > TYPE_OVERFLOW_UNDEFINED types in the epilogue code generation by > promoting any operations generated there to use unsigned arith

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> OK, fair enough. So the idea is: see where we end up and then try to >> improve/factor the APIs in a less peephole way? > > Yeah, I think that's the only good way forward. OK, no objection from me. Sorry for holding the patch up. Richard

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 15 Aug 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Tue, 15 Aug 2023, Kewen.Lin wrote: >> > >> >> Hi Stefan, >> >> >> >> on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote: >> >> > Hi everyone, >> >> > >> >> > I have bootstrapped and reg

Re: [PATCH][RFC] tree-optimization/92335 - Improve sinking heuristics for vectorization

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Mon, 14 Aug 2023, Prathamesh Kulkarni wrote: >> On Mon, 7 Aug 2023 at 13:19, Richard Biener >> wrote: >> > It doesn't seem to make a difference for x86. That said, the "fix" is >> > probably sticking the correct target on the dump-check, it seems >> > that vect_fold_

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 15 Aug 2023, Kewen.Lin wrote: > >> Hi Stefan, >> >> on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote: >> > Hi everyone, >> > >> > I have bootstrapped and regtested the patch below on s390. For the >> > 64-bit target I do not see any changes regarding the te

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-15 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, Aug 15, 2023 at 4:44 AM Kewen.Lin wrote: >> >> on 2023/8/14 22:16, Richard Sandiford wrote: >> > No, it was more that 219-142=77, so it seems like a lot of lines >> > are being duplicated rather than simply being moved. (Unlike for >> > VMAT_LOAD_STORE_LANES, whi

Re: [PATCH] Add support for vector conitional not

2023-08-14 Thread Richard Sandiford via Gcc-patches
Andrew Pinski via Gcc-patches writes: > Like the support conditional neg (r12-4470-g20dcda98ed376cb61c74b2c71), > this just adds conditional not too. > Also we should be able to turn `(a ? -1 : 0) ^ b` into a conditional > not. > > OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-

Re: [RFC] GCC Security policy

2023-08-14 Thread Richard Sandiford via Gcc-patches
I think it would help to clarify what the aim of the security policy is. Specifically: (1) What service do we want to provide to users by classifying one thing as a security bug and another thing as not a security bug? (2) What service do we want to provide to the GNU community by the same

Re: [PATCH] vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

2023-08-14 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > Hi Richard, > > on 2023/8/14 20:20, Richard Sandiford wrote: >> Thanks for the clean-ups. But... >> >> "Kewen.Lin" writes: >>> Hi, >>> >>> Following Richi's suggestion [1], this patch is to move the >>> handlings on VMAT_GATHER_SCATTER in the final loop nest >>> of functio
