Kewen Lin writes:
> This costing adjustment patch series exposes an issue in the
> aarch64-specific costing adjustment for the STP sequence. It
> causes the tests below to fail:
>
> - gcc/testsuite/gcc.target/aarch64/ldp_stp_15.c
> - gcc/testsuite/gcc.target/aarch64/ldp_stp_16.c
> - gcc/tests
Juzhe-Zhong writes:
> According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
>
> As Richard and Richi suggested, we recognize an uninitialized SSA_NAME
> and convert it into a SCRATCH rtx if the target predicate allows SCRATCH.
>
> It can help to reduce redundant data move instructions
Prathamesh Kulkarni writes:
> Hi,
> After 27de9aa152141e7f3ee66372647d0f2cd94c4b90, there's the following
> regression:
> FAIL: gcc.target/aarch64/vect_copy_lane_1.c scan-assembler-times
> ins\\tv0.s\\[1\\], v1.s\\[0\\] 3
>
> This happens because for the following function from vect_copy_lane_1.c:
Wilco Dijkstra writes:
> Support expansion of immediates which can be created from 2 MOVKs
> and a shifted ORR or BIC instruction. Change aarch64_split_dimode_const_store
> to apply if we save one instruction.
>
> This reduces the number of 4-instruction immediates in SPECINT/FP by 5%.
Juzhe-Zhong writes:
> According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
>
> As Richard and Richi suggested, we recognize an uninitialized SSA_NAME
> and convert it into a SCRATCH rtx if the target predicate allows SCRATCH.
>
> It can help to reduce redundant data move instructions
aarch64_operands_ok_for_ldpstp contained the code:
/* One of the memory accesses must be a mempair operand.
If it is not the first one, they need to be swapped by the
peephole. */
if (!aarch64_mem_pair_operand (mem_1, GET_MODE (mem_1))
&& !aarch64_mem_pair_operand (mem_2, GET
AArch64 previously costed WHILELO instructions on the first call
to add_stmt_cost. This was because, at the time, only add_stmt_cost
had access to the loop_vec_info.
However, after the AVX512 changes, we only calculate the masks later.
This patch moves the WHILELO costing to finish_cost, which is
In the following test:
svuint8_t ld(uint8_t *ptr) { return svld1rq(svptrue_b8(), ptr + 2); }
ptr + 2 is a valid address for an Advanced SIMD load, but not for
an SVE load. We therefore ended up generating:
ldr q0, [x0, 2]
dup z0.q, z0.q[0]
This patch makes us generate
Wilco Dijkstra writes:
> List official cores first so that -mcpu=native does not show a codename with -v
> or in errors/warnings.
Nice spot.
> Passes regress, OK for commit?
>
> gcc/ChangeLog:
> * config/aarch64/aarch64-cores.def (neoverse-n1): Place before ares.
> (neoverse-v1):
The stack frame is currently divided into three areas:
A: the area above the hard frame pointer
B: the SVE saves below the hard frame pointer
C: the outgoing arguments
If the stack frame is allocated in one chunk, the allocation needs a
probe if the frame size is >= guard_size - 1KiB. In additio
AArch64 normally puts the saved registers near the bottom of the frame,
immediately above any dynamic allocations. But this means that a
stack-smash attack on those dynamic allocations could overwrite the
saved registers without needing to reach as far as the stack smash
canary.
The same thing co
Previous patches ensured that the final frame allocation only needs
a probe when the size is strictly greater than 1KiB. It's therefore
safe to use the normal 1024 probe offset in all cases.
The main motivation for doing this is to simplify the code and
reduce the number of special cases.
gcc/
locals_offset was described as:
/* Offset from the base of the frame (incoming SP) to the
top of the locals area. This value is always a multiple of
STACK_BOUNDARY. */
This is implicitly an “upside-down” view of the frame: the incoming
SP is at offset 0, and anything N bytes below
After previous patches, it's no longer necessary to store
saved_regs_size and below_hard_fp_saved_regs_size in the frame info.
All measurements instead use the top or bottom of the frame as
reference points.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::saved_regs_size)
(aarch64_
The AArch64 ABI says that, when stack clash protection is used,
there can be a maximum of 1KiB of unprobed space at sp on entry
to a function. Therefore, we need to probe when allocating
>= guard_size - 1KiB of data (>= rather than >). This is what
GCC does.
If an allocation is exactly guard_siz
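The probing rule quoted above can be sketched as a small predicate. This is an illustrative model only; the function name and types are hypothetical, not GCC's actual frame-layout code:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the rule described above (names are
   hypothetical): with stack clash protection, at most 1KiB of
   unprobed space may exist at SP on function entry, so an
   allocation needs a probe when it is >= guard_size - 1KiB.  */
static bool
allocation_needs_probe (long size, long guard_size)
{
  return size >= guard_size - 1024;
}
```

For a typical 64KiB guard, an allocation of 64KiB - 1KiB is therefore the smallest one that requires a probe.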
The frame layout code currently hard-codes the assumption that
the number of bytes below the saved registers is equal to the
size of the outgoing arguments. This patch abstracts that
value into a new field of aarch64_frame.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::bytes_below_saved
-fstack-clash-protection uses the save of LR as a probe for the next
allocation. The next allocation could be:
* another part of the static frame, e.g. when allocating SVE save slots
or outgoing arguments
* an alloca in the same function
* an allocation made by a callee function
However, whe
This patch just changes a calculation of initial_adjust
to one that makes it slightly more obvious that the total
adjustment is frame.frame_size.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Tweak
calculation of initial_adjust for frames in which all saves
are S
This patch fixes another case in which a value was described with
an “upside-down” view.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::frame_size): Tweak comment.
---
gcc/config/aarch64/aarch64.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/gcc/config/aarch64/
If a frame has no saved registers, it can be allocated in one go.
There is no need to treat the areas below and above the saved
registers as separate.
And if we allocate the frame in one go, it should be allocated
as the initial_adjust rather than the final_adjust. This allows the
frame size to g
reg_offset was measured from the bottom of the saved register area.
This made perfect sense with the original layout, since the bottom
of the saved register area was also the hard frame pointer address.
It became slightly less obvious with SVE, since we save SVE
registers below the hard frame point
Similarly to the previous locals_offset patch, hard_fp_offset
was described as:
/* Offset from the base of the frame (incoming SP) to the
hard_frame_pointer. This value is always a multiple of
STACK_BOUNDARY. */
poly_int64 hard_fp_offset;
which again took an “upside-down” view: h
aarch64_save_callee_saves and aarch64_restore_callee_saves took
a parameter called start_offset that gives the offset of the
bottom of the saved register area from the current stack pointer.
However, it's more convenient for later patches if we use the
bottom of the entire frame as the reference po
When we emit the frame chain, i.e. when we reach the point marked
"Here" in this statement of aarch64_expand_prologue:
if (emit_frame_chain)
{
// Here
...
}
the stack is in one of two states:
- We've allocated up to the frame chain, but no more.
- We've allocated the whole frame, and the fra
After previous patches, it no longer really makes sense to allocate
the top of the frame in terms of varargs_and_saved_regs_size and
saved_regs_and_above.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Simplify
the allocation of the top of the frame.
---
gcc/config/aarch
After previous patches, it is no longer necessary to calculate
a chain_offset in cases where there is no chain record.
gcc/
* config/aarch64/aarch64.cc (aarch64_expand_prologue): Move the
calculation of chain_offset into the emit_frame_chain block.
---
gcc/config/aarch64/aarch64.c
Following on from the previous bytes_below_saved_regs patch, this one
records the number of bytes that are below the hard frame pointer.
This eventually replaces below_hard_fp_saved_regs_size.
If a frame pointer is not needed, the epilogue adds final_adjust
to the stack pointer before restoring re
aarch64_layout_frame uses a shorthand for referring to
cfun->machine->frame:
aarch64_frame &frame = cfun->machine->frame;
This patch does the same for some other heavy users of the structure.
No functional change intended.
gcc/
* config/aarch64/aarch64.cc (aarch64_save_callee_saves): U
This series of patches fixes deficiencies in GCC's -fstack-protector
implementation for AArch64 when using dynamically allocated stack space.
This is CVE-2023-4039. See:
https://developer.arm.com/Arm%20Security%20Center/GCC%20Stack%20Protector%20Vulnerability%20AArch64
https://github.com/metaredt
Jakub Jelinek writes:
> Hi!
>
> The recent pp_wide_int changes for _BitInt support (because not all
> wide_ints fit into the small fixed size digit_buffer anymore) apparently
> broke
> +FAIL: gcc.dg/analyzer/out-of-bounds-diagram-1-debug.c (test for excess
> errors)
> +FAIL: gcc.dg/analyzer/out-o
Currently there are four static sources of attributes:
- LANG_HOOKS_ATTRIBUTE_TABLE
- LANG_HOOKS_COMMON_ATTRIBUTE_TABLE
- LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE
- TARGET_ATTRIBUTE_TABLE
All of the attributes in these tables go in the "gnu" namespace.
This means that they can use the traditional GNU __
Lehua Ding writes:
> V3 change: Address Richard's comments.
>
> Hi,
>
> This patch adds support for trying to fold `MIN (poly, poly)` to
> a constant. Consider the following C Code:
>
> ```
> void foo2 (int* restrict a, int* restrict b, int n)
> {
> for (int i = 0; i < 3; i += 1)
> a[i]
Richard Sandiford writes:
> Lehua Ding writes:
>> Hi,
>>
>> This patch adds support for trying to fold `MIN (poly, poly)` to
>> a constant. Consider the following C Code:
>>
>> ```
>> void foo2 (int* restrict a, int* restrict b, int n)
>> {
>> for (int i = 0; i < 3; i += 1)
>> a[i] += b
Lehua Ding writes:
> Hi,
>
> This patch adds support for trying to fold `MIN (poly, poly)` to
> a constant. Consider the following C Code:
>
> ```
> void foo2 (int* restrict a, int* restrict b, int n)
> {
> for (int i = 0; i < 3; i += 1)
> a[i] += b[i];
> }
> ```
>
> Before this patch:
>
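As a rough model of the fold being discussed, assuming (as with GCC's poly_int) that the runtime parameter is non-negative: `MIN (a, b0 + b1*x)` can be folded to the constant `a` whenever `a <= b0` and `b1 >= 0`. A hypothetical sketch, not the actual poly_int API:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch, not GCC's poly_int API: a degree-1 poly
   value b0 + b1*x with unknown runtime parameter x >= 0.  */
struct poly { long b0, b1; };

/* MIN (a, b) folds to the constant a iff a <= b0 + b1*x for all
   x >= 0, i.e. iff b1 >= 0 and a <= b0.  */
static bool
min_folds_to_const (long a, struct poly b)
{
  return b.b1 >= 0 && a <= b.b0;
}
```

For example, a 3-iteration loop vectorized with a vector length of 4 + 4x lanes gives MIN (3, 4 + 4x), which folds to 3.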
Lehua Ding writes:
> Hi,
>
> This patch adds support for trying to fold `MIN (poly, poly)` to
> a constant. Consider the following C Code:
>
> ```
> void foo2 (int* restrict a, int* restrict b, int n)
> {
> for (int i = 0; i < 3; i += 1)
> a[i] += b[i];
> }
> ```
>
> Before this patch:
>
Robin Dapp writes:
> Hi Richard,
>
> I did some testing with the attached v2 that does not restrict to UNARY
> anymore. As feared ;) there is some more fallout that I'm detailing below.
>
> On Power there is one guality fail (pr43051-1.c) that I would take
> the liberty of ignoring for now.
>
> O
When I tried to use config-list.mk, the build for every triple except
the build machine's failed for m2. This is because, unlike other
languages, m2 builds target objects during all-gcc. The build will
therefore fail unless you have access to an appropriate binutils
(or an equivalent). That's qu
Robin Dapp writes:
> Hi Richard,
>
> I did some testing with the attached v2 that does not restrict to UNARY
> anymore. As feared ;) there is some more fallout that I'm detailing below.
>
> On Power there is one guality fail (pr43051-1.c) that I would take
> the liberty of ignoring for now.
>
> O
Yang Yujie writes:
> gcc/ChangeLog:
>
> * config.gcc: remove non-POSIX syntax "<<<".
OK. Thanks for the quick fix.
Richard.
> ---
> gcc/config.gcc | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index b2fe7c7ceef..6d4c8becd28 10
Yang Yujie writes:
> @@ -5171,25 +5213,21 @@ case "${target}" in
> # ${with_multilib_list} should not contain whitespaces,
> # consecutive commas or slashes.
> if echo "${with_multilib_list}" \
> - | grep -E -e "[[:space:]]" -e '[,/][,/]' -e '[
Szabolcs Nagy writes:
> Update tests for the new branch-protection parser errors.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/branch-protection-attr.c: Update.
> * gcc.target/aarch64/branch-protection-option.c: Update.
OK, thanks. (And I agree these are better messages. :))
Szabolcs Nagy writes:
> The tests manipulate the return address in abitest-2.h and thus are not
> compatible with -mbranch-protection=pac-ret+leaf or
> -mbranch-protection=gcs.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/aapcs64/func-ret-1.c: Disable branch-protection.
> * gcc.ta
Szabolcs Nagy writes:
> This is needed since eh_return no longer prevents pac-ret in the
> normal return path.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/return_address_sign_1.c: Move func4 to ...
> * gcc.target/aarch64/return_address_sign_2.c: ... here and fix the
> s
Szabolcs Nagy writes:
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/eh_return-2.c: New test.
> * gcc.target/aarch64/eh_return-3.c: New test.
OK.
I wonder if it's worth using check-function-bodies for -3.c though.
It would then be easy to verify that the autiasp only occurs on t
Szabolcs Nagy writes:
> EH returns no longer rely on clobbering the return address on the stack
> so forcing a stack frame is not necessary.
>
> This does not actually change the code gen for the unwinder since there
> are calls before the EH return.
>
> gcc/ChangeLog:
>
> * config/aarch64/a
Szabolcs Nagy writes:
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.h (AARCH64_ISA_RCPC): Remove dup.
OK, thanks.
Richard
> ---
> gcc/config/aarch64/aarch64.h | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 2b0fc
Thomas Schwinge writes:
> Hi!
>
> On 2023-09-04T23:05:05+0200, I wrote:
>> On 2019-07-16T15:04:49+0100, Richard Sandiford
>> wrote:
>>> This patch therefore adds a new check-function-bodies dg-final test
>
>>> The regexps in parse_function_bodies are fairly general, but might
>>> still need to b
Robin Dapp writes:
>> So I don't think I have a good feel for the advantages and disadvantages
>> of doing this. Robin's analysis of the aarch64 changes was nice and
>> detailed though. I think the one that worries me most is the addressing
>> mode one. fwprop is probably the first chance we ge
Qing Zhao via Gcc-patches writes:
>> On Aug 29, 2023, at 3:42 PM, Marek Polacek via Gcc-patches
>> wrote:
>>
>> Improving the security of software has been a major trend in the recent
>> years. Fortunately, GCC offers a wide variety of flags that enable extra
>> hardening. These flags aren't
Richard Sandiford writes:
> "yanzhang.wang--- via Gcc-patches" writes:
>> From: Yanzhang Wang
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/aarch64/sve/acle/asm/subr_s8.c: Modify subr with -1
>> to not.
>>
>> Signed-off-by: Yanzhang Wang
>> ---
>>
>> Tested on my local arm environm
Thiago Jung Bauermann via Gcc-patches writes:
> Since commit e7a36e4715c7 "[PATCH] RISC-V: Support simplify (-1-x) for
> vector." these tests fail on aarch64-linux:
>
> === g++ tests ===
>
> Running g++:g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ...
> FAIL: gcc.target/aarch
Marc Poulhiès via Gcc-patches writes:
> Richard Sandiford via Gcc-patches writes:
>>> +# this regex matches the first line of the "end" in the initial commit
>>> message
>>> +FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by|co-authored-by|#):
Christophe Lyon via Gcc-patches writes:
> Tests under gcc.dg/vect use check_vect_support_and_set_flags to set
> compilation flags as appropriate for the target, but they also set
> dg-do-what-default to 'run' or 'compile', depending on the actual
> target hardware (or simulator) capabilities.
>
>
"yanzhang.wang--- via Gcc-patches" writes:
> From: Yanzhang Wang
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/sve/acle/asm/subr_s8.c: Modify subr with -1
> to not.
>
> Signed-off-by: Yanzhang Wang
> ---
>
> Tested on my local arm environment and passed. Thanks Andrew Pinski's
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Friday, September 1, 2023 2:36 PM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov
>> Subject: Re: [PATCH]AArch64 xorsign: Fix scalar x
Tamar Christina writes:
> Hi All,
>
> In GCC-9 our scalar xorsign pattern broke and we didn't notice it because the
> testcase was not strong enough. With this commit
>
> 8d2d39587d941a40f25ea0144cceb677df115040 is the first bad commit
> commit 8d2d39587d941a40f25ea0144cceb677df115040
> Author: S
Uros Bizjak via Gcc-patches writes:
> On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> wrote:
>>
>> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
>> > From: Kong Lingling
>> >
>> > In inline asm, we do not know if the insn can use EGPR, so disable E
Robin Dapp via Gcc-patches writes:
>> It's not just a question of which byte though. It's also a question
>> of which bit.
>>
>> One option would be to code-generate for even X and for odd X, and select
>> between them at runtime. But that doesn't scale well to 2+2X and 1+1X.
>>
>> Otherwise I
While backporting another patch to an earlier release, I hit a
situation in which lra_eliminate_regs_1 would eliminate an address to:
(plus (reg:P R) (const_int 0))
This address compared not-equal to plain:
(reg:P R)
which caused an ICE in a later peephole2. (The ICE showed up in
gfort
While working on another patch, I hit a problem with the aarch64
expansion of untyped_call. The expander emits the usual:
(set (mem ...) (reg resN))
instructions to store the result registers to memory, but it didn't
say in RTL where those resN results came from. This eventually led
to a fail
Robin Dapp writes:
>> But in the VLA case, doesn't it instead have precision 4+4X?
>> The problem then is that we can't tell at compile time which
>> byte that corresponds to. So...
>
> Yes 4 + 4x. I keep getting confused with poly modes :)
> In this case we want to extract the bitnum [3 4] = 3
[Sorry for any weird MUA issues, don't have access to my usual set-up.]
> when looking at a riscv ICE in vect-live-6.c I noticed that we
> assume that the variable part (coeffs[1] * x1) of the to-be-extracted
> bit number in extract_bit_field_1 is a multiple of BITS_PER_UNIT.
>
> This means that b
excl_hash_traits can be defined more simply by reusing existing traits.
Tested on aarch64-linux-gnu. OK to install?
Richard
gcc/
* attribs.cc (excl_hash_traits): Delete.
(test_attribute_exclusions): Use pair_hash and nofree_string_hash
instead.
---
gcc/attribs.cc | 45
Jeff Law writes:
> On 8/24/23 08:06, Robin Dapp via Gcc-patches wrote:
>> Ping. I refined the code and some comments a bit and added a test
>> case.
>>
>> My question in general would still be: Is this something we want
>> given that we potentially move some of combine's work a bit towards
>> t
Just some off-the-cuff thoughts. Might think differently when
I've had more time...
Richard Biener writes:
> On Mon, 28 Aug 2023, Jakub Jelinek wrote:
>
>> Hi!
>>
>> While the _BitInt series isn't committed yet, I had a quick look at
>> lifting the current lowest limitation on maximum _BitInt p
Juzhe-Zhong writes:
> Hi, Richard and Richi.
>
> Currently, GCC supports COND_LEN_FMA for floating-point without -ffast-math.
> It's supported in tree-ssa-math-opts.cc. However, GCC failed to support
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.
>
> Consider this following case:
> #define TEST_TYPE(T
Jeff Law writes:
> On 8/22/23 02:08, juzhe.zh...@rivai.ai wrote:
>> Yes, I agree long-term we want every-thing be optimized as early as
>> possible.
>>
>> However, IMHO, it's impossible we can support every conditional patterns
>> in the middle-end (match.pd).
>> It's a really big number.
>>
>
Richard Sandiford writes:
> Rather than hiding this in target code, perhaps we should add a
> target-independent concept of an "eh_return taken" flag, say
> EH_RETURN_TAKEN_RTX.
>
> We could define it so that, on targets that define EH_RETURN_TAKEN_RTX,
> a register EH_RETURN_STACKADJ_RTX and a re
Richard Biener writes:
> The following adds the capability to do SLP on .MASK_STORE, I do not
> plan to add interleaving support.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
LGTM, thanks.
Richard
> Thanks,
> Richard.
>
> PR tree-optimization/15
> gcc/
> * tree-v
The scalar FNMADD/FNMSUB and SVE FNMLA/FNMLS instructions mean
that either side of a subtraction can start an accumulator chain.
However, Advanced SIMD doesn't have an equivalent instruction.
This means that, for Advanced SIMD, a subtraction can only be
fused if the second operand is a multiplicati
Wilco Dijkstra writes:
> Hi Richard,
>
> (that's quick!)
>
>> + if (size > max_copy_size || size > max_mops_size)
>> +return aarch64_expand_cpymem_mops (operands, is_memmove);
>>
>> Could you explain this a bit more? If I've followed the logic correctly,
>> max_copy_size will always be 0 for
"Richard Earnshaw (lists)" writes:
> On 23/08/2023 16:49, Richard Sandiford via Gcc-patches wrote:
>> Richard Earnshaw via Gcc-patches writes:
>>> Now that we require C++ 11, we can safely forward declare rtx_code
>>> so that we can use it i
Richard Earnshaw via Gcc-patches writes:
> Note, this patch is dependent on the patch I posted yesterday to
> forward declare rtx_code in coretypes.h.
>
> --
> Now that we have a forward declaration of rtx_code in coretypes.h, we
> can adjust these hooks to take rtx_code arguments rather than
Richard Earnshaw via Gcc-patches writes:
> Now that we require C++ 11, we can safely forward declare rtx_code
> so that we can use it in target hooks.
>
> gcc/ChangeLog
> * coretypes.h (rtx_code): Add forward declaration.
> * rtl.h (rtx_code): Make compatible with forward declaration.
Wilco Dijkstra writes:
> A MOPS memmove may corrupt registers since there is no copy of the input
> operands to temporary
> registers. Fix this by calling aarch64_expand_cpymem which does this. Also
> fix an issue with
> STRICT_ALIGNMENT being ignored if TARGET_MOPS is true, and avoid crashing
Szabolcs Nagy writes:
> The expected way to handle eh_return is to pass the stack adjustment
> offset and landing pad address via
>
> EH_RETURN_STACKADJ_RTX
> EH_RETURN_HANDLER_RTX
>
> to the epilogue that is shared between normal return paths and the
> eh_return paths. EH_RETURN_HANDLER_RTX
Marc Poulhiès via Gcc-patches writes:
> Consider Signed-Off-By lines as part of the ending of the initial
> commit to avoid having these in the middle of the log when the
> changelog part is injected after.
>
> This is particularly useful with:
>
> $ git gcc-commit-mklog --amend -s
>
> that can
Thiago Jung Bauermann via Gcc-patches writes:
> This test passes since commit e41103081bfa "Fix undefined behaviour in
> profile_count::differs_from_p", so remove the xfail annotation.
>
> Tested on aarch64-linux-gnu, armv8l-linux-gnueabihf and x86_64-linux-gnu.
>
> gcc/testsuite/ChangeLog:
>
Richard Biener writes:
> On Wed, 16 Aug 2023, Juzhe-Zhong wrote:
>
>> Hi, Richard and Richi.
>>
>> Currently, GCC supports COND_LEN_FMA for floating-point without -ffast-math.
>> It's supported in tree-ssa-math-opts.cc. However, GCC failed to support
>> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.
>>
Juzhe-Zhong writes:
> Hi, Richard and Richi.
>
> Currently, GCC supports COND_LEN_FMA for floating-point without -ffast-math.
> It's supported in tree-ssa-math-opts.cc. However, GCC failed to support
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.
>
> Consider this following case:
> #define TEST_TYPE(T
Prathamesh Kulkarni writes:
> On Mon, 21 Aug 2023 at 12:26, Richard Biener wrote:
>>
>> On Sat, 19 Aug 2023, Prathamesh Kulkarni wrote:
>>
>> > On Fri, 18 Aug 2023 at 14:52, Richard Biener wrote:
>> > >
>> > > On Fri, 18 Aug 2023, Richard Sandiford wrote:
>> > >
>> > > > Richard Biener writes:
Richard Sandiford writes:
> Joseph Myers writes:
>> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote:
>>
>>> Would it be OK to add support for:
>>>
>>> [[__extension__ ...]]
>>>
>>> to suppress the pedwarn about u
Richard Biener writes:
> The following avoids running into somehow flawed logic in fold_vec_perm
> for non-VLA vectors.
>
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> PR tree-optimization/111048
> * fold-const.cc (fold_vec_perm_cst): Check for non-VLA
>
Richard Biener writes:
>> Am 17.08.2023 um 13:25 schrieb Richard Sandiford via Gcc-patches
>> :
>>
>> Joseph Myers writes:
>>>> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote:
>>>>
>>>> Would it be OK to add support f
Joseph Myers writes:
> On Wed, 16 Aug 2023, Richard Sandiford via Gcc-patches wrote:
>
>> Would it be OK to add support for:
>>
>> [[__extension__ ...]]
>>
>> to suppress the pedwarn about using [[]] prior to C2X? Then we can
>
> That seems lik
Alex Coplan writes:
> Hi,
>
> This patch fixes up the code examples in the RTL-SSA documentation (the
> sections on making insn changes) to reflect the current API.
>
> The main issues are as follows:
> - rtl_ssa::recog takes an obstack_watermark & as the first parameter.
>Presumably this is
Joseph Myers writes:
> On Mon, 17 Jul 2023, Michael Matz via Gcc-patches wrote:
>
>> So, essentially you want unignorable attributes, right? Then implement
>> exactly that: add one new keyword "__known_attribute__" (invent a better
>> name, maybe :) ), semantics exactly as with __attribute__ (i
Richard Ball writes:
> v2: Add missing PROFILE feature flag.
>
> This patch adds support for the Cortex-A720 CPU to GCC.
>
> No regressions on aarch64-none-elf.
>
> Ok for master?
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Cortex-
> A720 CPU.
>
Robin Dapp writes:
>> However:
>>
>> | #define vec_extract_direct { 3, 3, false }
>>
>> This looks wrong. The numbers are argument numbers (or -1 for a return
>> value). vec_extract only takes 2 arguments, so 3 looks to be out-of-range.
>>
>> | #define direct_vec_extract_optab_supported_p dir
Prathamesh Kulkarni writes:
>> Unfortunately, the patch regressed following tests on ppc64le and
>> armhf respectively:
>> gcc.target/powerpc/vec-perm-ctor.c scan-tree-dump-not optimized
>> "VIEW_CONVERT_EXPR"
>> gcc.dg/tree-ssa/forwprop-20.c scan-tree-dump-not forwprop1 "VEC_PERM_EXPR"
>>
>> This
"juzhe.zh...@rivai.ai" writes:
> Hi, Robin, Richard and Richi.
>
> I am wondering whether we can just simply replace the VEC_EXTRACT expander
> with binary?
>
> Like this :?
>
> DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, ECF_CONST | ECF_NOTHROW,
> - vec_extract, vec_extract)
> + vec_ex
Richard Biener writes:
> The following changes the gate to perform vectorization of BB reductions
> to use needs_fold_left_reduction_p which in turn requires handling
> TYPE_OVERFLOW_UNDEFINED types in the epilogue code generation by
> promoting any operations generated there to use unsigned arith
Richard Biener writes:
>> OK, fair enough. So the idea is: see where we end up and then try to
>> improve/factor the APIs in a less peephole way?
>
> Yeah, I think that's the only good way forward.
OK, no objection from me. Sorry for holding the patch up.
Richard
Richard Biener writes:
> On Tue, 15 Aug 2023, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > On Tue, 15 Aug 2023, Kewen.Lin wrote:
>> >
>> >> Hi Stefan,
>> >>
>> >> on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote:
>> >> > Hi everyone,
>> >> >
>> >> > I have bootstrapped and reg
Richard Biener writes:
> On Mon, 14 Aug 2023, Prathamesh Kulkarni wrote:
>> On Mon, 7 Aug 2023 at 13:19, Richard Biener
>> wrote:
>> > It doesn't seem to make a difference for x86. That said, the "fix" is
>> > probably sticking the correct target on the dump-check, it seems
>> > that vect_fold_
Richard Biener writes:
> On Tue, 15 Aug 2023, Kewen.Lin wrote:
>
>> Hi Stefan,
>>
>> on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote:
>> > Hi everyone,
>> >
>> > I have bootstrapped and regtested the patch below on s390. For the
>> > 64-bit target I do not see any changes regarding the te
Richard Biener writes:
> On Tue, Aug 15, 2023 at 4:44 AM Kewen.Lin wrote:
>>
>> on 2023/8/14 22:16, Richard Sandiford wrote:
>> > No, it was more that 219-142=77, so it seems like a lot of lines
>> > are being duplicated rather than simply being moved. (Unlike for
>> > VMAT_LOAD_STORE_LANES, whi
Andrew Pinski via Gcc-patches writes:
> Like the support conditional neg (r12-4470-g20dcda98ed376cb61c74b2c71),
> this just adds conditional not too.
> Also we should be able to turn `(a ? -1 : 0) ^ b` into a conditional
> not.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-
I think it would help to clarify what the aim of the security policy is.
Specifically:
(1) What service do we want to provide to users by classifying one thing
as a security bug and another thing as not a security bug?
(2) What service do we want to provide to the GNU community by the same
"Kewen.Lin" writes:
> Hi Richard,
>
> on 2023/8/14 20:20, Richard Sandiford wrote:
>> Thanks for the clean-ups. But...
>>
>> "Kewen.Lin" writes:
>>> Hi,
>>>
>>> Following Richi's suggestion [1], this patch is to move the
>>> handlings on VMAT_GATHER_SCATTER in the final loop nest
>>> of functio