Re: [V2 PATCH] Handle bitop with INTEGER_CST in analyze_and_compute_bitop_with_inv_effect.

2023-11-12 Thread Hongtao Liu
On Fri, Nov 10, 2023 at 5:12 PM Richard Biener wrote: > > On Wed, Nov 8, 2023 at 9:22 AM Hongtao Liu wrote: > > > > On Wed, Nov 8, 2023 at 3:53 PM Richard Biener > > wrote: > > > > > > On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu wrote: > > > > > > > > On Tue, Nov 7, 2023 at 10:34 PM Richard

Re: [PATCH] Fix (fcopysign x, NEGATIVE_CONST) -> (fneg (fabs x)) simplification [PR112483]

2023-11-12 Thread Andrew Pinski
On Sun, Nov 12, 2023, 23:10 Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Monday, November 13, 2023 6:55 AM > > To: Xi Ruoyao > > Cc: gcc-patches@gcc.gnu.org; chenglulu ; > > i...@xen0n.name; xucheng...@loongson.cn; Tamar Christina > > ;

Re: [PATCH] Fix (fcopysign x, NEGATIVE_CONST) -> (fneg (fabs x)) simplification [PR112483]

2023-11-12 Thread Xi Ruoyao
On Mon, 2023-11-13 at 07:09 +, Tamar Christina wrote: > In the case of e.g. loongarch64 it looks like the target actually has an > fcopysign > instruction, so wouldn't this rewriting by simplify-rtx be a de-optimization? Yes, it seems to be a de-optimization on LoongArch. For this micro-benchmark:

Re: [PATCH] Simplify vector ((VCE?(a cmp b ? -1 : 0)) < 0) ? c : d to just VCE:((a cmp b) ? (VCE c) : (VCE d)).

2023-11-12 Thread Hongtao Liu
On Fri, Nov 10, 2023 at 2:14 PM liuhongt wrote: > > When I'm working on PR112443, I notice there's some misoptimizations: > after we fold _mm{,256}_blendv_epi8/pd/ps into gimple, the backend > fails to combine it back to v{,p}blendv{v,ps,pd} since the pattern is > too complicated, so I think

Re: [x86 PATCH] Improve reg pressure of double-word right-shift then truncate.

2023-11-12 Thread Uros Bizjak
On Sun, Nov 12, 2023 at 10:03 PM Roger Sayle wrote: > > > This patch improves register pressure during reload, inspired by PR 97756. > Normally, a double-word right-shift by a constant produces a double-word > result, the highpart of which is dead when followed by a truncation. > The dead code

RE: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

2023-11-12 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, November 13, 2023 7:09 AM > To: Andrew Pinski > Cc: Tamar Christina ; Prathamesh Kulkarni > ; gcc-patches@gcc.gnu.org; nd > ; j...@ventanamicro.com > Subject: Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to >

RE: [PATCH] Fix (fcopysign x, NEGATIVE_CONST) -> (fneg (fabs x)) simplification [PR112483]

2023-11-12 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, November 13, 2023 6:55 AM > To: Xi Ruoyao > Cc: gcc-patches@gcc.gnu.org; chenglulu ; > i...@xen0n.name; xucheng...@loongson.cn; Tamar Christina > ; tschwi...@gcc.gnu.org; Roger Sayle > > Subject: Re: [PATCH] Fix (fcopysign x,

Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

2023-11-12 Thread Richard Biener
On Fri, 10 Nov 2023, Andrew Pinski wrote: > On Fri, Nov 10, 2023 at 5:12 AM Richard Biener wrote: > > > > On Fri, 10 Nov 2023, Tamar Christina wrote: > > > > > > > > Hi Prathamesh, > > > > > > Yes Arm requires SIMD for copysign. The testcases fail because they don't > > > turn on Neon. > > > >

Re: [PATCH] gimple-range-cache: Fix ICEs when dumping details [PR111967]

2023-11-12 Thread Richard Biener
On Sat, Nov 11, 2023 at 9:36 AM Jakub Jelinek wrote: > > Hi! > > The following testcase ICEs when dumping details. > When m_ssa_ranges vector is created, it is safe_grow_cleared (num_ssa_names), > but when some new SSA_NAME is added, we strangely grow it to > num_ssa_names + 1 instead and

Re: [RFC] Intel AVX10.1 Compiler Design and Support

2023-11-12 Thread Hongtao Liu
On Fri, Nov 10, 2023 at 6:15 PM Richard Biener wrote: > > On Fri, Nov 10, 2023 at 2:42 AM Haochen Jiang wrote: > > > > Hi all, > > > > This RFC patch aims to add AVX10.1 options. After we added -m[no-]evex512 > > support, it makes a lot easier to add them comparing to the August version. > >

Re: [PATCH] Fix (fcopysign x, NEGATIVE_CONST) -> (fneg (fabs x)) simplification [PR112483]

2023-11-12 Thread Richard Biener
On Sun, Nov 12, 2023 at 9:27 PM Xi Ruoyao wrote: > > (fcopysign x, NEGATIVE_CONST) can be simplified to (fneg (fabs x)), but > a logic error in the code caused it mistakenly simplified to (fneg x) > instead. OK. > gcc/ChangeLog: > > PR rtl-optimization/112483 > * simplify-rtx.cc

Re: [PATCH] testsuite: Fix bad-mapper-1.C test failures with posix_spawn

2023-11-12 Thread Richard Biener
On Sun, Nov 12, 2023 at 12:12 AM Brendan Shanks wrote: > > bad-mapper-1.C has been failing since the posix_spawn codepath was added > to libiberty, adjust the check to accept the changed error message. > > Patch has been verified on x86_64 Linux. OK > gcc/testsuite: > > *

Re: [PATCH] Avoid generate vblendps with ymm16+

2023-11-12 Thread Hongtao Liu
On Sat, Nov 11, 2023 at 4:11 AM Jakub Jelinek wrote: > > On Thu, Nov 09, 2023 at 03:27:11PM +0800, Hongtao Liu wrote: > > On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1 wrote: > > > > > > This patch aims to avoid generate vblendps with ymm16+, And have > > > bootstrapped and tested on

[PATCH 2/1] c++/modules: Allow exporting a typedef redeclaration

2023-11-12 Thread Nathaniel Shead
I happened to be browsing the standard a bit later and noticed that we incorrectly reject the example given below. Bootstrapped on x86_64-pc-linux-gnu; regtesting ongoing but modules.exp completed with no errors. -- >8 -- A typedef doesn't create a new entity, and thus should be allowed to be

[PATCH 3/4] c-family, C++: Handle clang attributes [PR109877].

2023-11-12 Thread Iain Sandoe
This adds the ability to defer the validation of numeric attribute arguments until the sequence is parsed if the attribute being handled is one known to be 'clang form'. We do this by considering the arguments to be strings regardless of content and defer the interpretation of those strings until

[PATCH 4/4] Darwin: Implement clang availability attribute [PR109877].

2023-11-12 Thread Iain Sandoe
This implements the handling of the clang-form "availability" attribute, which is the most important case used in the macOS SDKs. PR c++/109877 gcc/ChangeLog: * config/darwin-protos.h (darwin_handle_weak_import_attribute): New.

[PATCH 2/4] c-family, C: handle clang attributes [PR109877].

2023-11-12 Thread Iain Sandoe
This adds the ability to defer the validation of numeric attribute arguments until the sequence is parsed if the attribute being handled is one known to be 'clang form'. We do this by considering the arguments to be strings regardless of content and defer the interpretation of those strings until

[PATCH 1/4] c-family: Add handling for clang-style attributes [PR109877].

2023-11-12 Thread Iain Sandoe
This patch set is not actually particularly new; I have been maintaining it locally on Darwin branches and it has been tested on several versions of Darwin both with and without Alex's __has_{feature, extension} patch. This is one of the three most significant blockers to importing the macOS SDKs

[no subject]

2023-11-12 Thread Iain Sandoe
This patch set is not actually particularly new; I have been maintaining it locally on Darwin branches and it has been tested on several versions of Darwin both with and without Alex's __has_{feature, extension} patch. This is one of the three most significant blockers to importing the macOS

Re: [PATCH] LoongArch: Use simplify_gen_subreg instead of gen_rtx_SUBREG in loongarch_expand_vec_cond_mask_expr [PR112476]

2023-11-12 Thread chenglulu
On 2023/11/12 9:00 AM, Xi Ruoyao wrote: GCC internal says: 'subreg's of 'subreg's are not supported. Using 'simplify_gen_subreg' is the recommended way to avoid this problem. Unfortunately loongarch_expand_vec_cond_mask_expr might create nested subreg under certain circumstances,

Re: [PATCH 0/7] ira/lra: Support subreg coalesce

2023-11-12 Thread Lehua Ding
On 2023/11/13 9:11, juzhe.zh...@rivai.ai wrote: Ah, nice!  How configurable are the bit ranges? I think Lehua's patch is configurable for bit ranges, since his patch allows the target to track subreg liveness flexibly according to REGMODE_NATURAL_SIZE +/* Return true if REGNO is a pseudo and

Re: [PATCH v2] LoongArch: Optimize single-used address with -mexplicit-relocs=auto for fld/fst

2023-11-12 Thread chenglulu
On 2023/11/11 6:58 PM, Xi Ruoyao wrote: fld and fst have same address mode as ld.w and st.w, so the same optimization as r14-4851 should be applied for them too. gcc/ChangeLog: * config/loongarch/loongarch.md (LD_AT_LEAST_32_BIT): New mode iterator. (ST_ANY): New mode

RE: [PATCH v2] DSE: Allow vector type for get_stored_val when read < store

2023-11-12 Thread Li, Pan2
Update v4 in below link, please help to ignore v3. https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636216.html Sorry for inconvenience. Pan -Original Message- From: Li, Pan2 Sent: Sunday, November 12, 2023 10:31 AM To: Richard Sandiford ; Jeff Law Cc:

[PATCH v4] DSE: Allow vector type for get_stored_val when read < store

2023-11-12 Thread pan2 . li
From: Pan Li Update in v4: * Merge upstream and removed some independent changes. Update in v3: * Take known_le instead of known_lt for vector size. * Return NULL_RTX when gap is not equal 0 and not constant. Update in v2: * Move vector type support to get_stored_val. Original log: This

RE: [PATCH v1] RISC-V: Fix RVV dynamic frm tests failure

2023-11-12 Thread Li, Pan2
Committed, thanks Juzhe. Pan From: juzhe.zh...@rivai.ai Sent: Monday, November 13, 2023 11:11 AM To: Li, Pan2 ; gcc-patches Cc: Li, Pan2 ; Wang, Yanzhang ; kito.cheng Subject: Re: [PATCH v1] RISC-V: Fix RVV dynamic frm tests failure OK

Re: [PATCH v1] RISC-V: Fix RVV dynamic frm tests failure

2023-11-12 Thread juzhe.zh...@rivai.ai
OK juzhe.zh...@rivai.ai From: pan2.li Date: 2023-11-13 11:10 To: gcc-patches CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng Subject: [PATCH v1] RISC-V: Fix RVV dynamic frm tests failure From: Pan Li The enhancement of mode-switching performs some optimization when emitting the frm backup

[PATCH v1] RISC-V: Fix RVV dynamic frm tests failure

2023-11-12 Thread pan2 . li
From: Pan Li The enhancement of mode-switching performs some optimization when emitting the frm backup insn; some redundant fsrm insns are removed for the following test cases. This patch would like to adjust the asm check for the above optimization. gcc/testsuite/ChangeLog: *

Re: [PATCH v2] In the pipeline, USE or CLOBBER should delay execution if it starts a new live range.

2023-11-12 Thread Jeff Law
On 11/12/23 19:16, Jin Ma wrote: Unfortunately this patch has triggered a bootstrap comparison failure on loongarch64-linux-gnu: https://gcc.gnu.org/PR112497. It's also causing simple build failures on other targets. For example c6x-elf aborts when compiling gcc.c-torture/execute/pr82210

Re: [PATCH v2] In the pipeline, USE or CLOBBER should delay execution if it starts a new live range.

2023-11-12 Thread Jin Ma
> > > > Unfortunately this patch has triggered a bootstrap comparison failure on > > loongarch64-linux-gnu: https://gcc.gnu.org/PR112497. > It's also causing simple build failures on other targets. For example > c6x-elf aborts when compiling gcc.c-torture/execute/pr82210 (and others) > with

RE: [PATCH] Avoid generate vblendps with ymm16+

2023-11-12 Thread Hu, Lin1
On Saturday, November 11, 2023 4:11 AM, Jakub Jelinek wrote: > On Thu, Nov 09, 2023 at 03:27:11PM +0800, Hongtao Liu wrote: > > On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1 wrote: > > > > > > This patch aims to avoid generate vblendps with ymm16+, And have > > > bootstrapped and tested on

Re: [PATCH v1] RISC-V: Support FP l/ll round and rint HF mode autovec

2023-11-12 Thread juzhe.zh...@rivai.ai
LGTM. juzhe.zh...@rivai.ai From: pan2.li Date: 2023-11-12 21:47 To: gcc-patches CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng Subject: [PATCH v1] RISC-V: Support FP l/ll round and rint HF mode autovec From: Pan Li This patch would like to support the FP below API auto vectorization

Re: Re: [PATCH 0/7] ira/lra: Support subreg coalesce

2023-11-12 Thread juzhe.zh...@rivai.ai
>> Ah, nice! How configurable are the bit ranges? I think Lehua's patch is configurable for bit ranges, since his patch allows the target to track subreg liveness flexibly according to REGMODE_NATURAL_SIZE +/* Return true if REGNO is a pseudo and MODE is a multil regs size. */ +bool

Re: [PATCH V2] VECT: Support mask_len_strided_load/mask_len_strided_store in loop vectorize

2023-11-12 Thread juzhe.zh...@rivai.ai
Hi. Ping this patch which is last optab pattern for RVV support. The mask_len_strided_load/mask_len_strided_store document has been approved: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635103.html Bootstrap on X86 and regtest no regression. Tested on aarch64 no regression. Tested

Re: [PATCH v2 1/7] aarch64: Use br instead of ret for eh_return

2023-11-12 Thread Hans-Peter Nilsson
> From: Szabolcs Nagy > Date: Fri, 3 Nov 2023 15:36:08 + I don't see others commenting on this patch, and you're not mentioning this aspect, so I wonder: > * config/aarch64/aarch64.h (EH_RETURN_TAKEN_RTX): Define. > (EH_RETURN_STACKADJ_RTX): Change to R5. >

Re: [RFC PATCH] Detecting lifetime-dse issues via Valgrind [PR66487]

2023-11-12 Thread Sam James
Sam James writes: > Alexander Monakov writes: > [...] >> >> I'm very curious what you mean by "this has come up with LLVM [] too": >> ttbomk, >> LLVM doesn't do such lifetime-based optimization yet, which is why compiling >> LLVM with LLVM doesn't break it. Can you share some examples? Or do

[PATCH] PR112380: Defend against CLOBBERs in RTX expressions in combine.cc

2023-11-12 Thread Roger Sayle
This patch addresses PR rtl-optimization/112380, an ICE-on-valid regression where a (clobber (const_int 0)) encounters a sanity checking gcc_assert (at line 7554) in simplify-rtx.cc. These CLOBBERs are used internally by GCC's combine pass much like error_mark_node is used by various language

Re: [PATCH] libgccjit: Fix GGC segfault when using -flto

2023-11-12 Thread David Malcolm
On Fri, 2023-11-10 at 18:14 -0500, David Malcolm wrote: > On Fri, 2023-11-10 at 11:02 -0500, Antoni Boucher wrote: > > Hi. > > This patch fixes the segfault when using -flto with libgccjit (bug > > 111396). > > > > You mentioned in bugzilla that this didn't fix the reproducer for > > you, > >

[x86 PATCH] Improve reg pressure of double-word right-shift then truncate.

2023-11-12 Thread Roger Sayle
This patch improves register pressure during reload, inspired by PR 97756. Normally, a double-word right-shift by a constant produces a double-word result, the highpart of which is dead when followed by a truncation. The dead code calculating the high part gets cleaned up post-reload, so the
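
To illustrate the observation in the excerpt above (the high part of a double-word right-shift is dead once the result is truncated), here is a minimal C sketch. It is not taken from the patch. The function names are hypothetical, `unsigned __int128` is a GCC extension assumed to be available on the target, and the shift count is assumed to satisfy 0 < n < 64.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: build the double-word value, shift it right by a
   constant, then truncate to a single word.  The high word of the
   shifted result is computed and then thrown away.  */
static uint64_t shift_then_truncate(uint64_t hi, uint64_t lo, unsigned n)
{
  unsigned __int128 x = ((unsigned __int128) hi << 64) | lo;
  return (uint64_t) (x >> n);
}

/* Equivalent form that only ever computes the low word, so no
   register is needed for the dead high part.  Assumes 0 < n < 64.  */
static uint64_t low_word_only(uint64_t hi, uint64_t lo, unsigned n)
{
  assert(n > 0 && n < 64);
  return (lo >> n) | (hi << (64 - n));
}
```

Both functions return the same value for any in-range shift count; the second simply never materialises the high word of the shifted result.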

[PATCH] Fix (fcopysign x, NEGATIVE_CONST) -> (fneg (fabs x)) simplification [PR112483]

2023-11-12 Thread Xi Ruoyao
(fcopysign x, NEGATIVE_CONST) can be simplified to (fneg (fabs x)), but a logic error in the code caused it mistakenly simplified to (fneg x) instead. gcc/ChangeLog: PR rtl-optimization/112483 * simplify-rtx.cc (simplify_binary_operation_1) : Fix the simplification of
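
The identity behind this fix can be checked at the C level. The sketch below is only an illustration using the standard `copysign`, `fabs` and `signbit` facilities from `<math.h>`; it is not the RTL code from simplify-rtx.cc, and the helper names are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* The intended result of the simplification: -|x|.  */
static double fneg_fabs(double x) { return -fabs(x); }

/* The result the excerpt describes as the mistaken one: -x.  */
static double fneg_only(double x) { return -x; }

/* Returns 1 when copysign (x, negative_const) agrees with -|x|,
   both in value and in sign bit.  */
static int matches_fneg_fabs(double x, double negative_const)
{
  assert(signbit(negative_const));
  double a = copysign(x, negative_const);
  double b = fneg_fabs(x);
  return a == b && !signbit(a) == !signbit(b);
}
```

For a negative input such as x = -3.0, `copysign (x, -1.0)` stays at -3.0, whereas plain negation gives 3.0, which is why dropping the `fabs` is wrong.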

Re: [PATCH v2] In the pipeline, USE or CLOBBER should delay execution if it starts a new live range.

2023-11-12 Thread Xi Ruoyao
On Sun, 2023-11-12 at 11:02 -0700, Jeff Law wrote: > > > On 11/12/23 10:41, Xi Ruoyao wrote: > > On Sat, 2023-11-11 at 13:12 -0700, Jeff Law wrote: > > > > > > > > > On 8/14/23 05:22, Jin Ma wrote: > > > > CLOBBER and USE does not represent real instructions, but in the > > > > process of

Re: [PATCH v2] In the pipeline, USE or CLOBBER should delay execution if it starts a new live range.

2023-11-12 Thread Jeff Law
On 11/12/23 10:41, Xi Ruoyao wrote: On Sat, 2023-11-11 at 13:12 -0700, Jeff Law wrote: On 8/14/23 05:22, Jin Ma wrote: CLOBBER and USE does not represent real instructions, but in the process of pipeline optimization, they will wait for transmission in ready list like other insns, without

Re: [PATCH v2] In the pipeline, USE or CLOBBER should delay execution if it starts a new live range.

2023-11-12 Thread Xi Ruoyao
On Sat, 2023-11-11 at 13:12 -0700, Jeff Law wrote: > > > On 8/14/23 05:22, Jin Ma wrote: > > CLOBBER and USE does not represent real instructions, but in the > > process of pipeline optimization, they will wait for transmission > > in ready list like other insns, without considering resource > >

[committed] i386: Remove *stack_protect_set_4s__di alternative that will never match

2023-11-12 Thread Uros Bizjak
The relevant peephole2 will never generate alternative (=m,=,0,m) because operand 1 is not dead before the peephole2 pattern. gcc/ChangeLog: * config/i386/i386.md (*stack_protect_set_4s__di): Remove alternative 0. Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Uros. diff

[PATCH 5/5] Add an aligned_register_operand predicate

2023-11-12 Thread Richard Sandiford
This patch adds a target-independent aligned_register_operand predicate, for use with register constraints that use filters to impose an alignment. The definition deliberately jettisons some of the historical baggage in general_operand. gcc/ * common.md (aligned_register_operand): New

[PATCH 4/5] ira: Handle register filters

2023-11-12 Thread Richard Sandiford
This patch makes IRA apply register filters when picking hard registers. All the new code should be optimised away on targets that don't use register filters. On targets that do use them, the new register_filters bitfield is expected to be only a handful of bits. Information about register

[PATCH 3/5] lra: Handle register filters

2023-11-12 Thread Richard Sandiford
This patch makes LRA apply register filters. This plus the recog change is enough for correct code generation, but a follow-on IRA patch improves the allocation. All the new code should be optimised away on targets that don't use register filters. That's because get_register_filter just wraps

[PATCH 2/5] recog: Handle register filters

2023-11-12 Thread Richard Sandiford
The main (but simplest) part of this patch makes constrain_operands take register filters into account. The rest of the patch adds register filter information to operand_alternative. Generally, if two register constraints have different register filters, it's better if they're in separate

[PATCH 1/5] Add register filter operand to define_register_constraint

2023-11-12 Thread Richard Sandiford
The main way of enforcing registers to be aligned is through HARD_REGNO_MODE_OK. But this is a global property that applies to all operands. A given (regno, mode) pair is either globally valid or globally invalid. This patch instead adds a way of specifying that individual operands must be

[PATCH 0/5] Add support for operand-specific alignment requirements

2023-11-12 Thread Richard Sandiford
SME has various instructions that require aligned register tuples. However, the associated tuple modes are already widely used and do not need to be aligned in other contexts. It therefore isn't appropriate to force alignment in TARGET_HARD_REGNO_MODE_OK. There are also strided loads and stores

[PATCH v1] RISC-V: Support FP l/ll round and rint HF mode autovec

2023-11-12 Thread pan2 . li
From: Pan Li This patch would like to support the FP below API auto vectorization with different type size

+-----------+----------+----------+
| API       | RV64     | RV32     |
+-----------+----------+----------+
| lrintf16  | HF => DI | HF => SI |
| llrintf16 | HF => DI | HF => DI

[PATCH v3] DSE: Allow vector type for get_stored_val when read < store

2023-11-12 Thread pan2 . li
From: Pan Li Update in v3: * Take known_le instead of known_lt for vector size. * Return NULL_RTX when gap is not equal 0 and not constant. Update in v2: * Move vector type support to get_stored_val. Original log: This patch would like to allow the vector mode in the get_stored_val in the

Re: [PATCH 0/7] ira/lra: Support subreg coalesce

2023-11-12 Thread Lehua Ding
Hi Vladimir, While you're starting your review, please review v3 version that fixes some ICE issues, thanks. https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636178.html On 2023/11/12 20:01, Lehua Ding wrote: Hi Vladimir, On 2023/11/10 4:24, Vladimir Makarov wrote: On 11/7/23

[PATCH V3 7/7] lra: Support subreg live range track and conflict detect

2023-11-12 Thread Lehua Ding
This patch supports tracking the liveness of a subreg in a lra pass, with the goal of getting it to agree with ira's register allocation scheme. There is some duplication, maybe in the future this part of the code logic can be harmonized. gcc/ChangeLog: * ira-build.cc

[PATCH V3 5/7] ira: Add all nregs >= 2 pseudos to tracke subreg list

2023-11-12 Thread Lehua Ding
This patch relaxes the subreg track capability to all subreg registers. gcc/ChangeLog: * ira-build.cc (get_reg_unit_size): New. (has_same_nregs): New. (ira_set_allocno_class): Adjust. --- gcc/ira-build.cc | 41 - 1 file changed, 36

[PATCH V3 3/7] ira: Support subreg live range track

2023-11-12 Thread Lehua Ding
This patch supports tracking subreg liveness. It first extends ira_object_t objects[2] to std::vector objects, which can hold more than one object, and is used to collect all access via subreg in program and the partial_in and partial_out of the basic block live in/out. Then there is a

[PATCH V3 6/7] lra: Switch to live_subreg data flow

2023-11-12 Thread Lehua Ding
This patch switches the live_reg data in lra to live_subreg data, and the situation will be more complicated than in ira because this part of the data is modified in lra also and the live_subreg data will be recalculated. gcc/ChangeLog: * lra-coalesce.cc (update_live_info):

Re: [PATCH V2 0/7] ira/lra: Support subreg coalesce

2023-11-12 Thread Lehua Ding
These patches found a new bug and I resend a v3 version, I'm sorry about this. V3: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636178.html On 2023/11/12 17:58, Lehua Ding wrote: Hi, These patches try to support subreg coalesce feature in register allocation passes (ira and lra).

[PATCH V3 4/7] ira: Support subreg copy

2023-11-12 Thread Lehua Ding
This patch changes the previous way of creating a copy between allocnos to objects. gcc/ChangeLog: * ira-build.cc (find_allocno_copy): Removed. (find_object): New. (ira_create_copy): Adjust. (add_allocno_copy_to_list): Adjust.

[PATCH V3 1/7] df: Add DF_LIVE_SUBREG problem

2023-11-12 Thread Lehua Ding
This patch adds a live_subreg problem to extend the original live_reg to track the liveness of subreg. We will only try to trace pseudo registers whose mode size is a multiple of the natural size and eventually a small portion of the inside will appear to use subreg. With live_reg problem, live_subreg

[PATCH V3 2/7] ira: Switch to live_subreg data

2023-11-12 Thread Lehua Ding
This patch switches the use of live_reg data to live_subreg data. gcc/ChangeLog: * ira-build.cc (create_bb_allocnos): Switch. (create_loop_allocnos): Ditto. * ira-color.cc (ira_loop_edge_freq): Ditto. * ira-emit.cc (generate_edge_moves): Ditto.

[PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-12 Thread Lehua Ding
V3 Changes: 1. fix three ICE. 2. rebase Hi, These patches try to support subreg coalesce feature in register allocation passes (ira and lra). Let's consider a RISC-V program (https://godbolt.org/z/ec51d91aT): ``` #include void foo (int32_t *in, int32_t *out, size_t m) { vint32m2_t

Re: [PATCH 0/7] ira/lra: Support subreg coalesce

2023-11-12 Thread Lehua Ding
Hi Vladimir, On 2023/11/10 4:24, Vladimir Makarov wrote: On 11/7/23 22:47, Lehua Ding wrote: Lehua Ding (7):    ira: Refactor the handling of register conflicts to make it more general    ira: Add live_subreg problem and apply to ira pass    ira: Support subreg live range track    ira:

[PATCH] c++/modules: check mismatching exports for class tags [PR98885]

2023-11-12 Thread Nathaniel Shead
I think the error message is still a little bit unclear but I couldn't come up with something clearer that was similarly concise and matching the existing style. (Also I noticed that the linked PR was assigned to Nathan but there hadn't been activity for a while, and I've been looking into these

Re: [PATCH 0/7] ira/lra: Support subreg coalesce

2023-11-12 Thread Richard Sandiford
钟居哲 writes: > Hi, Richard. > >>> Maybe dead lanes are better tracked at the gimple level though, not sure. >>> (But AArch64 might need to lower lane operations more than it does now if >>> we want gimple to handle it.) > > We were trying to address such issue at GIMPLE level at the beginning. >

Re: [PATCH 2/3] Add generated .opt.urls files

2023-11-12 Thread Iain Buclaw
Excerpts from David Malcolm's message of November 10, 2023 10:42 pm: > gcc/d/ChangeLog: > * lang.opt.urls: New file, autogenerated by > regenerate-opt-urls.py. > --- > gcc/d/lang.opt.urls | 95 + > create mode 100644 gcc/d/lang.opt.urls > [abridged view of

Re: [PATCH 0/7] ira/lra: Support subreg coalesce

2023-11-12 Thread Lehua Ding
Hi Dimitar, I solved the problem you reported in V2 patch (https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636166.html), is it possible for you to help confirm this? Thank you very much. On 2023/11/9 0:56, Dimitar Dimitrov wrote: On Wed, Nov 08, 2023 at 11:47:33AM +0800, Lehua Ding

[PATCH V2 6/7] lra: Switch to live_subreg data flow

2023-11-12 Thread Lehua Ding
This patch switches the live_reg data in lra to live_subreg data, and the situation will be more complicated than in ira because this part of the data is modified in lra also and the live_subreg data will be recalculated. gcc/ChangeLog: * lra-coalesce.cc (update_live_info):

[PATCH V2 1/7] df: Add DF_LIVE_SUBREG problem

2023-11-12 Thread Lehua Ding
This patch adds a live_subreg problem to extend the original live_reg to track the liveness of subreg. We will only try to trace pseudo registers whose mode size is a multiple of the natural size and eventually a small portion of the inside will appear to use subreg. With live_reg problem, live_subreg

[PATCH V2 5/7] ira: Add all nregs >= 2 pseudos to tracke subreg list

2023-11-12 Thread Lehua Ding
This patch relaxes the subreg track capability to all subreg registers. gcc/ChangeLog: * ira-build.cc (get_reg_unit_size): New. (has_same_nregs): New. (ira_set_allocno_class): Adjust. --- gcc/ira-build.cc | 41 - 1 file changed, 36

[PATCH V2 7/7] lra: Support subreg live range track and conflict detect

2023-11-12 Thread Lehua Ding
This patch supports tracking the liveness of a subreg in a lra pass, with the goal of getting it to agree with ira's register allocation scheme. There is some duplication, maybe in the future this part of the code logic can be harmonized. gcc/ChangeLog: * ira-build.cc

[PATCH V2 0/7] ira/lra: Support subreg coalesce

2023-11-12 Thread Lehua Ding
Hi, These patches try to support subreg coalesce feature in register allocation passes (ira and lra). Let's consider a RISC-V program (https://godbolt.org/z/ec51d91aT): ``` #include void foo (int32_t *in, int32_t *out, size_t m) { vint32m2_t result = __riscv_vle32_v_i32m2 (in, 32);

[PATCH V2 4/7] ira: Support subreg copy

2023-11-12 Thread Lehua Ding
This patch changes the previous way of creating a copy between allocnos to objects. gcc/ChangeLog: * ira-build.cc (find_allocno_copy): Removed. (find_object): New. (ira_create_copy): Adjust. (add_allocno_copy_to_list): Adjust.

[PATCH V2 3/7] ira: Support subreg live range track

2023-11-12 Thread Lehua Ding
This patch supports tracking subreg liveness. It first extends ira_object_t objects[2] to std::vector objects, which can hold more than one object, and is used to collect all access via subreg in program and the partial_in and partial_out of the basic block live in/out. Then there is a

[PATCH V2 2/7] ira: Switch to live_subreg data

2023-11-12 Thread Lehua Ding
This patch switches the use of live_reg data to live_subreg data. gcc/ChangeLog: * ira-build.cc (create_bb_allocnos): Switch. (create_loop_allocnos): Ditto. * ira-color.cc (ira_loop_edge_freq): Ditto. * ira-emit.cc (generate_edge_moves): Ditto.

Re: [RFC PATCH] Detecting lifetime-dse issues via Valgrind [PR66487]

2023-11-12 Thread Sam James
Alexander Monakov writes: > On Sat, 11 Nov 2023, Sam James wrote: > >> > Valgrind client requests are offered as macros that emit inline asm. For >> > use >> > in code generation, we need to wrap it in a built-in. We know that >> > implementing >> > such a built-in in libgcc is

Re: [RFC PATCH] Detecting lifetime-dse issues via Valgrind [PR66487]

2023-11-12 Thread Alexander Monakov
On Sat, 11 Nov 2023, Sam James wrote: > > Valgrind client requests are offered as macros that emit inline asm. For > > use > > in code generation, we need to wrap it in a built-in. We know that > > implementing > > such a built-in in libgcc is undesirable, [...]. > > Perhaps less

Re: [RFC PATCH] Detecting lifetime-dse issues via Valgrind

2023-11-12 Thread Alexander Monakov
On Sat, 11 Nov 2023, Arsen Arsenović wrote: > > +#else > > +# define VALGRIND_MAKE_MEM_UNDEFINED(ptr, sz) __builtin_trap () > > +#endif > > + > > +void __valgrind_make_mem_undefined (void *ptr, unsigned long sz) > > +{ > > + VALGRIND_MAKE_MEM_UNDEFINED (ptr, sz); > > +} > > Would it be