[Bug rtl-optimization/105753] [avr] ICE: in add_clobbers, at config/avr/avr-dimode.md:2705
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105753 --- Comment #19 from CVS Commits --- The master branch has been updated by Georg-Johann Lay : https://gcc.gnu.org/g:80348e6aec44966e20ca1ca823247ce1381071eb commit r14-1016-g80348e6aec44966e20ca1ca823247ce1381071eb Author: Triffid Hunter Date: Sat May 20 07:50:00 2023 +0200 target/105753: Fix ICE in add_clobbers due to extra PARALLEL in insn. This patch removes the superfluous parallel in [u]divmod patterns in the AVR backend. Effect of extra parallel is that add_clobbers reaches gcc_unreachable() because the clobbers for [u]divmod are missing. If an insn has multiple parts like clobbers, the parallel around the parts of the insn pattern is implicit. gcc/ PR target/105753 * config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi): Remove superfluous "parallel" in insn pattern. ([u]divmod4): Tidy code. Use gcc_unreachable() instead of printing error text to assembly. gcc/testsuite/ PR target/105753 * gcc.target/avr/torture/pr105753.c: New test.
[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907 Andrew Pinski changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|pinskia at gcc dot gnu.org |unassigned at gcc dot gnu.org --- Comment #10 from Andrew Pinski --- I applied the patches now after approval, r14-1014-gc5df248509b48 is the one that makes the difference here. I am not working on improving the ^1 part though so leaving it open for that.
[Bug target/55181] [10/11/12/13/14 Regression] Expensive shift loop where a bit-testing instruction could be used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55181 --- Comment #28 from Andrew Pinski --- I forgot to mention this was fixed by r14-1014-gc5df248509b48 .
[Bug target/55181] [10/11/12/13/14 Regression] Expensive shift loop where a bit-testing instruction could be used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55181 Andrew Pinski changed: What|Removed |Added Target||avr Target Milestone|10.5|14.0 Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #27 from Andrew Pinski --- This is now fixed on the trunk for GCC 14. I have no plans on backporting the patches.
Re: [PATCH] [RISC-V] Fix riscv_expand_conditional_move.
On 4/27/23 20:21, Die Li wrote: Two issues have been observed in the current riscv_expand_conditional_move implementation. 1. Before the introduction of TARGET_XTHEADCONDMOV, op0 of the comparison expression was used for the mode comparison with word_mode, but after TARGET_XTHEADCONDMOV was merged with TARGET_SFB_ALU, dest of the if-then-else is used for the mode comparison with word_mode, and from the md file the mode of dest is DI or SI, which can differ from word_mode on RV64. 2. TARGET_XTHEADCONDMOV cannot be generated when the mode of the comparison is E_VOIDmode. This patch solves the issues above. An example from the newly added test case follows.
Testcase:
int ConNmv_reg_reg_reg(int x, int y, int z, int n){
  if (x != y) return z;
  return n;
}
Cflags: -O2 -march=rv64gc_xtheadcondmov -mabi=lp64d
before patch:
ConNmv_reg_reg_reg:
  bne a0,a1,.L23
  mv a2,a3
.L23:
  mv a0,a2
  ret
after patch:
ConNmv_reg_reg_reg:
  sub a1,a0,a1
  th.mveqz a2,zero,a1
  th.mvnez a3,zero,a1
  or a0,a2,a3
  ret
Co-Authored by: Fei Gao Signed-off-by: Die Li gcc/ChangeLog: * config/riscv/riscv.cc (riscv_expand_conditional_move): Fix mode checking. gcc/testsuite/ChangeLog: * gcc.target/riscv/xtheadcondmov-indirect-rv32.c: New test. * gcc.target/riscv/xtheadcondmov-indirect-rv64.c: New test.
--- gcc/config/riscv/riscv.cc | 4 +- .../riscv/xtheadcondmov-indirect-rv32.c | 116 ++ .../riscv/xtheadcondmov-indirect-rv64.c | 116 ++ 3 files changed, 234 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect-rv32.c create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect-rv64.c diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 1529855a2b4..30ace45dc5f 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -3411,7 +3411,7 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx cons, rtx alt) && GET_MODE_CLASS (mode) == MODE_INT && reg_or_0_operand (cons, mode) && reg_or_0_operand (alt, mode) - && GET_MODE (op) == mode + && (GET_MODE (op) == mode || GET_MODE (op) == E_VOIDmode) So I nearly suggested we just drop this check. In general comparisons don't have modes. But I don't think it's going to hurt and it lines up with the predicates that test for conditions. Note that some of the new tests are still failing (though they certainly do much better after your patch) . FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c -O1 check-function-bodies ConNmv_imm_imm_r > FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c -O2 check-function-bodies ConNmv_imm_imm_reg FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c -O2 -flto -fno-use-linker-plugin -flto-partition=none check-function-bodies ConNmv_imm_imm_reg FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects check-function-bodies ConNmv_imm_imm_reg FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c -O3 -g check-function-bodies ConNmv_imm_imm_reg [ ... and a few more instances omitted ... ] I went ahead and pushed the patch, but you might want to double-check the state of those failing tests. Jeff
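For readers following the "after patch" sequence, here is a rough C model of what the four instructions compute. It is only a sketch: the helpers th_mveqz and th_mvnez are hypothetical stand-ins written for this note, and they assume the XTheadCondMov semantics "th.mveqz rd, rs1, rs2 writes rs1 to rd when rs2 is zero" (and the inverse for th.mvnez). ConNmv_branchless is likewise an invented name.

```c
#include <stdint.h>

/* Hypothetical model of th.mveqz rd, rs1, rs2 (assumed semantics):
   rd becomes rs1 when rs2 == 0, otherwise rd keeps its old value.  */
static int64_t th_mveqz(int64_t rd, int64_t rs1, int64_t rs2)
{
  return rs2 == 0 ? rs1 : rd;
}

/* Hypothetical model of th.mvnez rd, rs1, rs2 (assumed semantics):
   rd becomes rs1 when rs2 != 0, otherwise rd keeps its old value.  */
static int64_t th_mvnez(int64_t rd, int64_t rs1, int64_t rs2)
{
  return rs2 != 0 ? rs1 : rd;
}

/* Reference version, same body as the testcase in the patch mail.  */
int ConNmv_reg_reg_reg(int x, int y, int z, int n)
{
  if (x != y)
    return z;
  return n;
}

/* Step-by-step model of the branchless sequence from the mail.  */
int ConNmv_branchless(int x, int y, int z, int n)
{
  int64_t a0 = x, a1 = y, a2 = z, a3 = n;
  a1 = a0 - a1;               /* sub a1,a0,a1: zero iff x == y */
  a2 = th_mveqz(a2, 0, a1);   /* th.mveqz a2,zero,a1: clear z when x == y */
  a3 = th_mvnez(a3, 0, a1);   /* th.mvnez a3,zero,a1: clear n when x != y */
  a0 = a2 | a3;               /* or a0,a2,a3: exactly one side survives */
  return (int) a0;
}
```

Exactly one of a2 and a3 is cleared, so the final OR simply selects the surviving value; no branch is needed.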
Re: [PATCH 7/7] Expand directly for single bit test
On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote: Instead of creating trees for the expansion, just expand directly, which makes the code a little simpler and also reduces how much GC memory is used during the expansion. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Rename to ... (expand_single_bit_test): This and expand directly. (do_store_flag): Update for the rename function. OK. jeff
[PATCH] RISC-V: Add RVV comparison autovectorization
From: Juzhe-Zhong This patch enables RVV auto-vectorization of comparisons, including floating-point unordered and ordered comparisons. The testcases are borrowed from Richard, so Richard is included as co-author. Co-Authored-By: Richard Sandiford gcc/ChangeLog: * config/riscv/autovec.md (vcond): New pattern. (vcondu): Ditto. (vcond): Ditto. (vec_cmp): Ditto. (vec_cmpu): Ditto. (vcond_mask_): Ditto. * config/riscv/riscv-protos.h (expand_vec_cmp_int): New function. (expand_vec_cmp_float): New function. (expand_vcond): New function. (emit_merge_op): Adapt function. * config/riscv/riscv-v.cc (emit_pred_op): Ditto. (emit_pred_binop): Ditto. (emit_pred_unop): New function. (emit_len_binop): Adapt function. (emit_len_unop): New function. (emit_index_op): Adapt function. (emit_merge_op): Ditto. (expand_vcond): New function. (emit_pred_cmp): Ditto. (emit_len_cmp): Ditto. (expand_vec_cmp_int): Ditto. (expand_vec_cmp_float): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/rvv.exp: * gcc.target/riscv/rvv/autovec/cmp/vcond-1.c: New test. * gcc.target/riscv/rvv/autovec/cmp/vcond-2.c: New test. * gcc.target/riscv/rvv/autovec/cmp/vcond-3.c: New test. * gcc.target/riscv/rvv/autovec/cmp/vcond_run-1.c: New test. * gcc.target/riscv/rvv/autovec/cmp/vcond_run-2.c: New test. * gcc.target/riscv/rvv/autovec/cmp/vcond_run-3.c: New test. 
--- gcc/config/riscv/autovec.md | 141 + gcc/config/riscv/riscv-protos.h | 4 + gcc/config/riscv/riscv-v.cc | 482 -- .../riscv/rvv/autovec/cmp/vcond-1.c | 157 ++ .../riscv/rvv/autovec/cmp/vcond-2.c | 75 +++ .../riscv/rvv/autovec/cmp/vcond-3.c | 13 + .../riscv/rvv/autovec/cmp/vcond_run-1.c | 49 ++ .../riscv/rvv/autovec/cmp/vcond_run-2.c | 76 +++ .../riscv/rvv/autovec/cmp/vcond_run-3.c | 6 + gcc/testsuite/gcc.target/riscv/rvv/rvv.exp| 2 + 10 files changed, 970 insertions(+), 35 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond_run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond_run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cmp/vcond_run-3.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index ce0b46537ad..5d8ba66f0c3 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -180,3 +180,144 @@ NULL_RTX, mode); DONE; }) + +;; = +;; == Comparisons and selects +;; = + +;; - +;; [INT,FP] Compare and select +;; - +;; The patterns in this section are synthetic. +;; - + +;; Integer (signed) vcond. Don't enforce an immediate range here, since it +;; depends on the comparison; leave it to riscv_vector::expand_vcond instead. +(define_expand "vcond" + [(set (match_operand:V 0 "register_operand") + (if_then_else:V + (match_operator 3 "comparison_operator" + [(match_operand:VI 4 "register_operand") +(match_operand:VI 5 "nonmemory_operand")]) + (match_operand:V 1 "nonmemory_operand") + (match_operand:V 2 "nonmemory_operand")))] + "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode), + GET_MODE_NUNITS (mode))" + { +riscv_vector::expand_vcond (mode, operands); +DONE; + } +) + +;; Integer vcondu. 
Don't enforce an immediate range here, since it +;; depends on the comparison; leave it to riscv_vector::expand_vcond instead. +(define_expand "vcondu" + [(set (match_operand:V 0 "register_operand") + (if_then_else:V + (match_operator 3 "comparison_operator" + [(match_operand:VI 4 "register_operand") +(match_operand:VI 5 "nonmemory_operand")]) + (match_operand:V 1 "nonmemory_operand") + (match_operand:V 2 "nonmemory_operand")))] + "TARGET_VECTOR && known_eq (GET_MODE_NUNITS (mode), + GET_MODE_NUNITS (mode))" + { +riscv_vector::expand_vcond (mode, operands); +DONE; + } +) + +;; Floating-point vcond. Don't enforce an immediate range here, since it +;; depends on the comparison; leave it to riscv_vector::expand_vcond instead. +(define_expand "vcond" + [(set
Re: [PATCH 6/7] Use BIT_FIELD_REF inside fold_single_bit_test
On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote: Instead of depending on combine to do the extraction, let's create a tree which will expand directly into the extraction. This improves code generation on some targets. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Use BIT_FIELD_REF instead of shift/and. OK. jeff
Re: [PATCH 5/7] Simplify fold_single_bit_test with respect to code
On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote: Since we know that fold_single_bit_test is now only passed NE_EXPR or EQ_EXPR, we can simplify it and just use a gcc_assert to assert that is the code that is being passed. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Add an assert and simplify based on code being NE_EXPR or EQ_EXPR. OK. jeff
Re: [PATCH 4/7] Simplify fold_single_bit_test slightly
On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote: Now that the only use of fold_single_bit_test is in do_store_flag, we can change it to take the inner arg and bitnum instead of building a tree. There are no code generation changes due to this change, only a decrease in the GC memory that is produced during expansion. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Take inner and bitnum instead of arg0 and arg1. Update the code. (do_store_flag): Don't create a tree when calling fold_single_bit_test instead just call it with the bitnum and the inner tree. OK. jeff
Re: [PATCH 3/7] Use get_def_for_expr in fold_single_bit_test
On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote: The code in fold_single_bit_test checks whether the inner is a right shift and adjusts the bitnum based on that. But since the inner will always be an SSA_NAME at this point, the code is dead. Move it over to use the helper function get_def_for_expr instead. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Use get_def_for_expr instead of checking the inner's code. OK. jeff
Re: [PATCH 2/7] Inline and simplify fold_single_bit_test_into_sign_test into fold_single_bit_test
On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote: Since the last use of fold_single_bit_test_into_sign_test is fold_single_bit_test, we can inline it and even simplify the inlined version. This has no behavior change. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test_into_sign_test): Inline into ... (fold_single_bit_test): This and simplify. Just to be clear, based on the NFC assumption, this is OK for the trunk. jeff
Re: [PATCH 2/7] Inline and simplify fold_single_bit_test_into_sign_test into fold_single_bit_test
On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote: Since the last use of fold_single_bit_test_into_sign_test is fold_single_bit_test, we can inline it and even simplify the inlined version. This has no behavior change. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test_into_sign_test): Inline into ... (fold_single_bit_test): This and simplify. Going to trust the inlining and simplification is really NFC. It's not really obvious from the patch. jeff
Re: [PATCH 1/7] Move fold_single_bit_test to expr.cc from fold-const.cc
On 5/19/23 20:14, Andrew Pinski via Gcc-patches wrote: This is part 1 of an N-patch set that will change the expansion of `(A & C) != 0` from building trees to expanding directly, so that later on we can do some cost analysis. Since the only user of fold_single_bit_test is now expand, move it there. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * fold-const.cc (fold_single_bit_test_into_sign_test): Move to expr.cc. (fold_single_bit_test): Likewise. * expr.cc (fold_single_bit_test_into_sign_test): Move from fold-const.cc (fold_single_bit_test): Likewise and make static. * fold-const.h (fold_single_bit_test): Remove declaration. I'm assuming this is purely moving the bits around. OK. jeff
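As background for the series, the single-bit test `(A & C) != 0` can be written in a few value-equivalent ways at the source level. The sketch below is not GCC code; the function names are made up for illustration, a 32-bit operand is assumed, and bitnum is assumed to be in the range 0..31.

```c
#include <stdint.h>

/* (A & C) != 0 with C == 1 << bitnum; bitnum must be 0..31.  */
int test_mask(uint32_t a, unsigned bitnum)
{
  return (a & (UINT32_C(1) << bitnum)) != 0;
}

/* Shift-and-mask form: ((A >> C2) & 1) with C2 == log2 (C).  */
int test_shift(uint32_t a, unsigned bitnum)
{
  return (int) ((a >> bitnum) & 1u);
}

/* Sign-test form, only meaningful when the tested bit is the top bit.
   The conversion to int32_t is implementation-defined before C23 for
   values above INT32_MAX; GCC reduces modulo 2^32, which this relies on.  */
int test_sign(uint32_t a)
{
  return (int32_t) a < 0;
}
```

Which form is cheapest depends on the target, which is presumably why the series wants the choice made at expand time where costs are available.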
[Bug c/60090] For expression without ~, gcc -O1 emits "comparison of promoted ~unsigned with unsigned"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60090 --- Comment #7 from Andrew Pinski --- This one still happens on the trunk even with PR 107465 fixed. The reason is that even though a warning here is correct, it is not wanted because it requires constant folding. Note you can also get the incorrect warning wording at -O0 with constexpr in GCC 13+ (and -std=c2x).
[Bug c/107465] [10 Regression] Bogus warning: promoted bitwise complement of an unsigned value is always nonzero
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107465 Andrew Pinski changed: What|Removed |Added CC||jmattsson at dius dot com.au --- Comment #22 from Andrew Pinski --- *** Bug 59098 has been marked as a duplicate of this bug. ***
[Bug c/59098] Unwarranted warning: promoted ~unsigned is always non-zero [-Wsign-compare]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59098 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW |RESOLVED --- Comment #5 from Andrew Pinski --- Marking as a dup of bug 107465 as that is what fixed the issue here. *** This bug has been marked as a duplicate of bug 107465 ***
[Bug c/107465] [10 Regression] Bogus warning: promoted bitwise complement of an unsigned value is always nonzero
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107465 Andrew Pinski changed: What|Removed |Added CC||fredrik.hederstierna@securi ||tas-direct.com --- Comment #21 from Andrew Pinski --- *** Bug 38341 has been marked as a duplicate of this bug. ***
[Bug c/38341] Wrong warning comparison of promoted ~unsigned with unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38341 Andrew Pinski changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #13 from Andrew Pinski --- So this has been fixed on all of the active branches. Since PR 107465 was the one recorded in the changelog, closing as a dup of that one. *** This bug has been marked as a duplicate of bug 107465 ***
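For context on this group of bugs, the diagnostic concerns comparisons such as `a == ~b` where both operands are narrow unsigned types. The following is a minimal sketch with made-up function names, assuming an 8-bit unsigned char and an int wider than that; the complement is stored in an explicit int here so the promotion is visible.

```c
/* b is promoted to int before ~ is applied, so the complement lies in
   -256..-1 and can never equal a, which is promoted to 0..255.  */
int promoted_compare(unsigned char a, unsigned char b)
{
  int complement = ~b;
  return a == complement;
}

/* Truncating the complement back to unsigned char gives the comparison
   most authors intend.  */
int truncated_compare(unsigned char a, unsigned char b)
{
  unsigned char complement = (unsigned char) ~b;
  return a == complement;
}
```

With a = 0x0F and b = 0xF0 the first function returns 0 and the second returns 1, which is the difference the warning is meant to point out.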
Re: [PATCH] Mode-Switching: Fix local array maybe uninitialized warning
On 5/19/23 17:56, pan2...@intel.com wrote: From: Pan Li There are 2 local arrays in function optimize_mode_switching. They are initialized conditionally at the beginning but then always consumed in another loop. This may trigger the maybe-uninitialized warning, and may result in a build failure when -Werror (warnings as errors) is enabled. This patch initializes the local arrays to zero explicitly at declaration. Signed-off-by: Pan Li gcc/ChangeLog: * mode-switching.cc (entity_map): Initialize the array to zero. (bb_info): Ditto. OK. jeff
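The pattern described above can be illustrated in a few lines of C. This is not the code from mode-switching.cc; it is a hypothetical reduction in which only the name entity_map is borrowed from the ChangeLog, and N_ENTITIES, sum_marked and the bitmask argument are invented for the example.

```c
#define N_ENTITIES 4

/* The first loop fills entity_map only for enabled entries; the second
   loop reads every element.  Without the "= { 0 }" initializer the
   untouched elements would be indeterminate, which is what a
   maybe-uninitialized diagnostic complains about.  */
int sum_marked(unsigned enabled_mask)
{
  int entity_map[N_ENTITIES] = { 0 };   /* zeroed at declaration */
  int used = 0;

  for (int i = 0; i < N_ENTITIES; i++)
    if ((enabled_mask >> i) & 1u)
      entity_map[used++] = i + 1;       /* conditional initialization */

  int sum = 0;
  for (int i = 0; i < N_ENTITIES; i++)
    sum += entity_map[i];               /* unconditional consumption */
  return sum;
}
```

Zero-initializing at the declaration costs a few stores but makes every later read well defined regardless of which path filled the array.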
[Bug c/52050] Want an option to warn about a declaration inside a for/while/if statements.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52050 Andrew Pinski changed: What|Removed |Added Blocks||87403 --- Comment #7 from Andrew Pinski --- This now warns with -Wc90-c99-compat (since GCC 9), though it does not have its own option. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403 [Bug 87403] [Meta-bug] Issues that suggest a new warning
[Bug c++/66555] Fails to warn for if (j == 0 && i == i)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66555 Andrew Pinski changed: What|Removed |Added CC||trt at alumni dot duke.edu --- Comment #4 from Andrew Pinski --- *** Bug 17534 has been marked as a duplicate of this bug. ***
[Bug c/17534] gcc fails to diagnose suspect expressions that have incompatible bit masks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17534 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |6.0 Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #10 from Andrew Pinski --- Fixed for GCC 6 by r6-2453-g05b28fd6f91016 (aka PR 66555) so just marking as a dup of that bug. *** This bug has been marked as a duplicate of bug 66555 ***
[Bug middle-end/49617] gcc misses uninititialized variables in contained functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49617 Andrew Pinski changed: What|Removed |Added Last reconfirmed|2011-07-04 09:48:32 |2023-5-19 Component|c |middle-end --- Comment #2 from Andrew Pinski --- Hmm, for the C testcase with GCC 5, we do get a warning: : In function 'main': :11:6: warning: 'FRAME.0.y' is used uninitialized in this function [-Wuninitialized] x = y; ^ There is no warning in GCC 6+ though. Plus the diagnostic mentions FRAME.0. which is not in the original source.
[Bug c/20110] format checking and non-ASCII character sets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20110 Andrew Pinski changed: What|Removed |Added CC||bonzini at gnu dot org --- Comment #4 from Andrew Pinski --- *** Bug 33748 has been marked as a duplicate of this bug. ***
[Bug c/33748] format warnings don't take input charset into account
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33748 Andrew Pinski changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #2 from Andrew Pinski --- Dup of bug 20110. *** This bug has been marked as a duplicate of bug 20110 ***
Re: [PATCH v2] RISC-V: Add bext pattern for ZBS
On 5/8/23 08:11, Raphael Moreira Zinsly wrote: Changes since v1: - Removed name clash change. - Fix new pattern indentation. -- >8 -- When (a & (1 << bit_no)) is tested inside an IF we can use a bit extract. gcc/ChangeLog: * config/riscv/bitmanip.md (branch_bext): New split pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/zbs-bext-02.c: New test. I went ahead and pushed this. jeff
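To make the targeted source shape concrete, here is a small C sketch of a branch guarded by `a & (1 << bit_no)` with a variable bit number, next to an explicit bit-extract formulation. The function names are invented for this note, bit_no is assumed to be smaller than the width of unsigned long, and the comment about bext reflects my reading of the Zbs extension rather than anything stated in the mail.

```c
/* Branch on a single bit selected by a run-time bit number.  */
long select_on_bit(unsigned long a, unsigned bit_no, long if_set, long if_clear)
{
  if (a & (1UL << bit_no))
    return if_set;
  return if_clear;
}

/* Same decision written as an explicit one-bit extraction, roughly the
   value a Zbs bext instruction is understood to produce:
   (a >> bit_no) & 1.  */
long select_on_extracted_bit(unsigned long a, unsigned bit_no,
                             long if_set, long if_clear)
{
  unsigned long bit = (a >> bit_no) & 1UL;
  return bit ? if_set : if_clear;
}
```

Both functions return the same value for every input, so a compiler is free to pick whichever form maps best onto the available instructions.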
[Bug middle-end/55279] New pseudo registers aren't supported in CSE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55279 --- Comment #10 from Andrew Pinski --- (In reply to Andrew Pinski from comment #3) > I think combine was changed for a similar reason to support pseudos but I > cannot find the patch right now. Note combine was only fully fixed recently in GCC 12 with r12-8030-g61bee6aed26eb3.
[Bug tree-optimization/106888] [RISCV] Negative optimization that excess andi instructions are generated in gcc.dg/pr90838.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106888 Jeffrey A. Law changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #12 from Jeffrey A. Law --- Should be fixed with Raphael's patch on the trunk.
Re: [PATCH v2] RISC-V: Fix CTZ unnecessary sign extension [PR #106888]
On 5/8/23 08:12, Raphael Moreira Zinsly wrote: Changes since v1: - Remove subreg from operand 1. -- >8 -- We were not able to match the CTZ sign extend pattern on RISC-V because it gets optimized to zero extend and/or to ANDI patterns. For the ANDI case, combine scrambles the RTL and generates the extension by using subregs. gcc/ChangeLog: PR target/106888 * config/riscv/bitmanip.md (disi2): Match with any_extend. (disi2_sext): New pattern to match with sign extend using an ANDI instruction. gcc/testsuite/ChangeLog: PR target/106888 * gcc.target/riscv/pr106888.c: New test. * gcc.target/riscv/zbbw.c: Check for ANDI. Thanks. I went ahead and retested this against the trunk and pushed it. jeff
[Bug tree-optimization/106888] [RISCV] Negative optimization that excess andi instructions are generated in gcc.dg/pr90838.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106888 --- Comment #11 from CVS Commits --- The master branch has been updated by Jeff Law : https://gcc.gnu.org/g:9000da00dd70988f30d43806bae33b22ee6b9904 commit r14-1006-g9000da00dd70988f30d43806bae33b22ee6b9904 Author: Raphael Moreira Zinsly Date: Fri May 19 20:54:34 2023 -0600 RISC-V: Fix CTZ unnecessary sign extension [PR #106888] Changes since v1: - Remove subreg from operand 1. -- >8 -- We were not able to match the CTZ sign extend pattern on RISC-V because it gets optimized to zero extend and/or to ANDI patterns. For the ANDI case, combine scrambles the RTL and generates the extension by using subregs. gcc/ChangeLog: PR target/106888 * config/riscv/bitmanip.md (disi2): Match with any_extend. (disi2_sext): New pattern to match with sign extend using an ANDI instruction. gcc/testsuite/ChangeLog: PR target/106888 * gcc.target/riscv/pr106888.c: New test. * gcc.target/riscv/zbbw.c: Check for ANDI.
[Bug middle-end/31271] Missing simple optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31271 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |4.7.0 --- Comment #3 from Andrew Pinski --- (In reply to Andrew Pinski from comment #2)
>
> I think we could do slightly better
> ((~in_2(D)) & 224) == 0
>
> But only at expand time.
> This gives:
>   notl %edi
>   xorl %eax, %eax
>   testb $-32, %dil
>   setne %al
x86_64 produces that in GCC 13 with r13-792-g29ae455901ac71 .
>
> Or for aarch64:
>   mov w8, #224
>   bics wzr, w8, w0
>   cset w0, ne
>   ret
For aarch64, it could define an instruction to catch: (set (reg:CC_NZV 66 cc) (compare:CC_NZV (and:SI (not:SI (reg:SI 100)) (const_int 224 [0xe0])) (const_int 0 [0])))
Anyway, the original issue was fixed in GCC 4.7.0 and the small improvement for x86_64 is in GCC 13. The aarch64 code generation is currently:
  and w0, w0, 224
  cmp w0, 224
  cset w0, ne
  ret
Which is only slightly worse than what I proposed.
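The two code shapes discussed in this comment rest on a simple identity: all bits of a mask are set in a value exactly when none of them are set in its complement. A small C sketch of that identity for the mask 224 follows; the function names are invented for illustration, and forms_agree only samples the low 16 bits of the input.

```c
/* Complement-and-test form: ((~in) & 224) == 0.  */
int all_set_not_and(unsigned in)
{
  return (~in & 224u) == 0;
}

/* And-and-compare form: (in & 224) == 224.  */
int all_set_and_cmp(unsigned in)
{
  return (in & 224u) == 224u;
}

/* Returns 1 when both forms agree on every 16-bit input.  */
int forms_agree(void)
{
  for (unsigned in = 0; in <= 0xFFFFu; in++)
    if (all_set_not_and(in) != all_set_and_cmp(in))
      return 0;
  return 1;
}
```

The first form needs a single flag-setting instruction on targets with an and-not (bics-style) operation, while the second needs an and followed by a compare.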
[Bug middle-end/31631] Folding of A & (1 << B) pessimizes FRE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31631 --- Comment #2 from Andrew Pinski --- We do a LIM before PRE now which allows PRE to handle it.
[Bug middle-end/100798] a?~t:t and (-(!!a))^t don't produce the same assembly code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100798 --- Comment #1 from Andrew Pinski --- To produce the same code we could do a match pattern: (simplify (cond @0 (bit_not @1) @1) (bit_xor (neg (convert @0)) @1))
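The equivalence behind the proposed match pattern can be checked directly in C. The sketch below uses invented function names and an unsigned operand so that the negation and XOR are fully defined.

```c
/* a ? ~t : t  */
unsigned cond_form(int a, unsigned t)
{
  return a ? ~t : t;
}

/* (-(!!a)) ^ t: !!a is 0 or 1, so the negation is either 0 or all ones,
   and XOR with all ones is bitwise complement.  */
unsigned xor_form(int a, unsigned t)
{
  unsigned mask = 0u - (unsigned) !!a;
  return mask ^ t;
}
```

Both functions return the same value for every a and t, so ideally they would compile to the same instructions.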
[Bug middle-end/64334] Common .opt handling: Support flags which take a list of values (-fopt=a,b,c ...)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64334 Andrew Pinski changed: What|Removed |Added Keywords||internal-improvement Target Milestone|--- |12.0 --- Comment #2 from Andrew Pinski --- EnumBitSet was added with r12-6842-g0ebb09f5e49c8c . EnumSet/Set was added with r12-6839-g385196adb52d85 . So fixed with GCC 12. Note fsanitize= is still not using those for other reasons.
[PATCH 5/7] Simplify fold_single_bit_test with respect to code
Since we know that fold_single_bit_test is now only passed NE_EXPR or EQ_EXPR, we can simplify it and just use a gcc_assert to assert that is the code that is being passed. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Add an assert and simplify based on code being NE_EXPR or EQ_EXPR. --- gcc/expr.cc | 108 ++-- 1 file changed, 53 insertions(+), 55 deletions(-) diff --git a/gcc/expr.cc b/gcc/expr.cc index 67a9f82ca17..b5bc3fabb7e 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -12909,72 +12909,70 @@ fold_single_bit_test (location_t loc, enum tree_code code, tree inner, int bitnum, tree result_type) { - if ((code == NE_EXPR || code == EQ_EXPR)) -{ - tree type = TREE_TYPE (inner); - scalar_int_mode operand_mode = SCALAR_INT_TYPE_MODE (type); - int ops_unsigned; - tree signed_type, unsigned_type, intermediate_type; - tree one; - gimple *inner_def; + gcc_assert (code == NE_EXPR || code == EQ_EXPR); - /* First, see if we can fold the single bit test into a sign-bit -test. */ - if (bitnum == TYPE_PRECISION (type) - 1 - && type_has_mode_precision_p (type)) - { - tree stype = signed_type_for (type); - return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR, - result_type, - fold_convert_loc (loc, stype, inner), - build_int_cst (stype, 0)); - } + tree type = TREE_TYPE (inner); + scalar_int_mode operand_mode = SCALAR_INT_TYPE_MODE (type); + int ops_unsigned; + tree signed_type, unsigned_type, intermediate_type; + tree one; + gimple *inner_def; - /* Otherwise we have (A & C) != 0 where C is a single bit, -convert that into ((A >> C2) & 1). Where C2 = log2(C). -Similarly for (A & C) == 0. */ + /* First, see if we can fold the single bit test into a sign-bit + test. */ + if (bitnum == TYPE_PRECISION (type) - 1 + && type_has_mode_precision_p (type)) +{ + tree stype = signed_type_for (type); + return fold_build2_loc (loc, code == EQ_EXPR ? 
GE_EXPR : LT_EXPR, + result_type, + fold_convert_loc (loc, stype, inner), + build_int_cst (stype, 0)); +} - /* If INNER is a right shift of a constant and it plus BITNUM does -not overflow, adjust BITNUM and INNER. */ - if ((inner_def = get_def_for_expr (inner, RSHIFT_EXPR)) - && TREE_CODE (gimple_assign_rhs2 (inner_def)) == INTEGER_CST - && bitnum < TYPE_PRECISION (type) - && wi::ltu_p (wi::to_wide (gimple_assign_rhs2 (inner_def)), - TYPE_PRECISION (type) - bitnum)) - { - bitnum += tree_to_uhwi (gimple_assign_rhs2 (inner_def)); - inner = gimple_assign_rhs1 (inner_def); - } + /* Otherwise we have (A & C) != 0 where C is a single bit, + convert that into ((A >> C2) & 1). Where C2 = log2(C). + Similarly for (A & C) == 0. */ - /* If we are going to be able to omit the AND below, we must do our -operations as unsigned. If we must use the AND, we have a choice. -Normally unsigned is faster, but for some machines signed is. */ - ops_unsigned = (load_extend_op (operand_mode) == SIGN_EXTEND - && !flag_syntax_only) ? 0 : 1; + /* If INNER is a right shift of a constant and it plus BITNUM does + not overflow, adjust BITNUM and INNER. */ + if ((inner_def = get_def_for_expr (inner, RSHIFT_EXPR)) + && TREE_CODE (gimple_assign_rhs2 (inner_def)) == INTEGER_CST + && bitnum < TYPE_PRECISION (type) + && wi::ltu_p (wi::to_wide (gimple_assign_rhs2 (inner_def)), +TYPE_PRECISION (type) - bitnum)) +{ + bitnum += tree_to_uhwi (gimple_assign_rhs2 (inner_def)); + inner = gimple_assign_rhs1 (inner_def); +} - signed_type = lang_hooks.types.type_for_mode (operand_mode, 0); - unsigned_type = lang_hooks.types.type_for_mode (operand_mode, 1); - intermediate_type = ops_unsigned ? unsigned_type : signed_type; - inner = fold_convert_loc (loc, intermediate_type, inner); + /* If we are going to be able to omit the AND below, we must do our + operations as unsigned. If we must use the AND, we have a choice. + Normally unsigned is faster, but for some machines signed is. 
*/ + ops_unsigned = (load_extend_op (operand_mode) == SIGN_EXTEND + && !flag_syntax_only) ? 0 : 1; - if (bitnum != 0) - inner = build2 (RSHIFT_EXPR, intermediate_type, - inner, size_int (bitnum)); + signed_type = lang_hooks.types.type_for_mode (operand_mode, 0); + unsigned_type = lang_hooks.types.type_for_mode (operand_mode, 1); + intermediate_type =
[PATCH 7/7] Expand directly for single bit test
Instead of creating trees for the expansion, just expand directly, which makes the code a little simpler and also reduces how much GC memory is used during the expansion. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Rename to ... (expand_single_bit_test): This and expand directly. (do_store_flag): Update for the rename function. --- gcc/expr.cc | 63 - 1 file changed, 28 insertions(+), 35 deletions(-) diff --git a/gcc/expr.cc b/gcc/expr.cc index d04e8ed0204..6849c9627d0 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -12899,15 +12899,14 @@ maybe_optimize_sub_cmp_0 (enum tree_code code, tree *arg0, tree *arg1) } -/* If CODE with arguments INNER & (1<
[PATCH 4/7] Simplify fold_single_bit_test slightly
Now that the only use of fold_single_bit_test is in do_store_flag, we can change it to take the inner arg and bitnum instead of building a tree. There are no code generation changes due to this change, only a decrease in the GC memory that is produced during expansion. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Take inner and bitnum instead of arg0 and arg1. Update the code. (do_store_flag): Don't create a tree when calling fold_single_bit_test instead just call it with the bitnum and the inner tree. --- gcc/expr.cc | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/gcc/expr.cc b/gcc/expr.cc index a61772b6808..67a9f82ca17 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -12899,23 +12899,19 @@ maybe_optimize_sub_cmp_0 (enum tree_code code, tree *arg0, tree *arg1) } -/* If CODE with arguments ARG0 and ARG1 represents a single bit +/* If CODE with arguments INNER & (1<
[PATCH 6/7] Use BIT_FIELD_REF inside fold_single_bit_test
Instead of depending on combine to do the extraction, Let's create a tree which will expand directly into the extraction. This improves code generation on some targets. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Use BIT_FIELD_REF instead of shift/and. --- gcc/expr.cc | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/gcc/expr.cc b/gcc/expr.cc index b5bc3fabb7e..d04e8ed0204 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -12957,22 +12957,21 @@ fold_single_bit_test (location_t loc, enum tree_code code, intermediate_type = ops_unsigned ? unsigned_type : signed_type; inner = fold_convert_loc (loc, intermediate_type, inner); - if (bitnum != 0) -inner = build2 (RSHIFT_EXPR, intermediate_type, - inner, size_int (bitnum)); + tree bftype = build_nonstandard_integer_type (1, 1); + int bitpos = bitnum; - one = build_int_cst (intermediate_type, 1); + if (BYTES_BIG_ENDIAN) +bitpos = GET_MODE_BITSIZE (operand_mode) - 1 - bitpos; - if (code == EQ_EXPR) -inner = fold_build2_loc (loc, BIT_XOR_EXPR, intermediate_type, inner, one); + inner = build3_loc (loc, BIT_FIELD_REF, bftype, inner, + bitsize_int (1), bitsize_int (bitpos)); - /* Put the AND last so it can combine with more things. */ - inner = build2 (BIT_AND_EXPR, intermediate_type, inner, one); + one = build_int_cst (bftype, 1); - /* Make sure to return the proper type. */ - inner = fold_convert_loc (loc, result_type, inner); + if (code == EQ_EXPR) +inner = fold_build2_loc (loc, BIT_XOR_EXPR, bftype, inner, one); - return inner; + return fold_convert_loc (loc, result_type, inner); } /* Generate code to calculate OPS, and exploded expression -- 2.17.1
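A rough value-level model of what this patch builds may help when reading the diff. This is an illustrative sketch, not the GCC implementation: bit_field_pos mirrors the bitpos computation in the patch (including the BYTES_BIG_ENDIAN adjustment), and single_bit_result models the one-bit extraction followed by the XOR with 1 that the patch applies for EQ_EXPR. Both function names and the plain-int parameters are assumptions made for this example.

```c
#include <stdint.h>

/* Position passed to the one-bit BIT_FIELD_REF: the bit number itself on
   little-endian targets, counted from the other end when the target is
   big-endian.  */
int bit_field_pos(int bitnum, int mode_bitsize, int bytes_big_endian)
{
  return bytes_big_endian ? mode_bitsize - 1 - bitnum : bitnum;
}

/* Value of the test: the extracted bit for "!= 0", flipped for "== 0".
   bitnum must be 0..31.  */
unsigned single_bit_result(uint32_t a, int bitnum, int is_eq)
{
  unsigned bit = (a >> bitnum) & 1u;   /* the one-bit field */
  if (is_eq)
    bit ^= 1u;                         /* XOR with one for EQ_EXPR */
  return bit;
}
```

For example, bit 3 of a 32-bit operand is field position 3 on a little-endian target and position 28 on a big-endian one, while the extracted value is the same in both cases.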
[PATCH 3/7] Use get_def_for_expr in fold_single_bit_test
The code in fold_single_bit_test checks whether the inner is a right shift and adjusts the bitnum based on that. But since the inner will always be an SSA_NAME at this point, the code is dead. Move it over to use the helper function get_def_for_expr instead. OK? Bootstrapped and tested on x86_64-linux. gcc/ChangeLog: * expr.cc (fold_single_bit_test): Use get_def_for_expr instead of checking the inner's code. --- gcc/expr.cc | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/gcc/expr.cc b/gcc/expr.cc index 6221b6991c5..a61772b6808 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -12920,6 +12920,7 @@ fold_single_bit_test (location_t loc, enum tree_code code, int ops_unsigned; tree signed_type, unsigned_type, intermediate_type; tree one; + gimple *inner_def; /* First, see if we can fold the single bit test into a sign-bit test. */ @@ -12939,14 +12940,14 @@ fold_single_bit_test (location_t loc, enum tree_code code, /* If INNER is a right shift of a constant and it plus BITNUM does not overflow, adjust BITNUM and INNER. */ - if (TREE_CODE (inner) == RSHIFT_EXPR - && TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST + if ((inner_def = get_def_for_expr (inner, RSHIFT_EXPR)) + && TREE_CODE (gimple_assign_rhs2 (inner_def)) == INTEGER_CST && bitnum < TYPE_PRECISION (type) - && wi::ltu_p (wi::to_wide (TREE_OPERAND (inner, 1)), + && wi::ltu_p (wi::to_wide (gimple_assign_rhs2 (inner_def)), TYPE_PRECISION (type) - bitnum)) { - bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1)); - inner = TREE_OPERAND (inner, 0); + bitnum += tree_to_uhwi (gimple_assign_rhs2 (inner_def)); + inner = gimple_assign_rhs1 (inner_def); } /* If we are going to be able to omit the AND below, we must do our -- 2.17.1
[PATCH 2/7] Inline and simplify fold_single_bit_test_into_sign_test into fold_single_bit_test
Since the last use of fold_single_bit_test_into_sign_test is fold_single_bit_test, we can inline it and even simplify the inlined version. This causes no behavior change.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

	* expr.cc (fold_single_bit_test_into_sign_test): Inline into ...
	(fold_single_bit_test): This and simplify.
---
 gcc/expr.cc | 51 ++-
 1 file changed, 10 insertions(+), 41 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index f999f81af4a..6221b6991c5 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -12899,42 +12899,6 @@ maybe_optimize_sub_cmp_0 (enum tree_code code, tree *arg0, tree *arg1)
 }
 
 
-
-/* If CODE with arguments ARG0 and ARG1 represents a single bit
-   equality/inequality test, then return a simplified form of the test
-   using a sign testing.  Otherwise return NULL.  TYPE is the desired
-   result type.  */
-
-static tree
-fold_single_bit_test_into_sign_test (location_t loc,
-                                     enum tree_code code, tree arg0, tree arg1,
-                                     tree result_type)
-{
-  /* If this is testing a single bit, we can optimize the test.  */
-  if ((code == NE_EXPR || code == EQ_EXPR)
-      && TREE_CODE (arg0) == BIT_AND_EXPR && integer_zerop (arg1)
-      && integer_pow2p (TREE_OPERAND (arg0, 1)))
-    {
-      /* If we have (A & C) != 0 where C is the sign bit of A, convert
-         this into A < 0.  Similarly for (A & C) == 0 into A >= 0.  */
-      tree arg00 = sign_bit_p (TREE_OPERAND (arg0, 0), TREE_OPERAND (arg0, 1));
-
-      if (arg00 != NULL_TREE
-          /* This is only a win if casting to a signed type is cheap,
-             i.e. when arg00's type is not a partial mode.  */
-          && type_has_mode_precision_p (TREE_TYPE (arg00)))
-        {
-          tree stype = signed_type_for (TREE_TYPE (arg00));
-          return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR,
-                                  result_type,
-                                  fold_convert_loc (loc, stype, arg00),
-                                  build_int_cst (stype, 0));
-        }
-    }
-
-  return NULL_TREE;
-}
-
 /* If CODE with arguments ARG0 and ARG1 represents a single bit
    equality/inequality test, then return a simplified form of
    the test using shifts and logical operations.  Otherwise return
@@ -12955,14 +12919,19 @@ fold_single_bit_test (location_t loc, enum tree_code code,
       scalar_int_mode operand_mode = SCALAR_INT_TYPE_MODE (type);
       int ops_unsigned;
       tree signed_type, unsigned_type, intermediate_type;
-      tree tem, one;
+      tree one;
 
       /* First, see if we can fold the single bit test into a sign-bit
          test.  */
-      tem = fold_single_bit_test_into_sign_test (loc, code, arg0, arg1,
-                                                 result_type);
-      if (tem)
-        return tem;
+      if (bitnum == TYPE_PRECISION (type) - 1
+          && type_has_mode_precision_p (type))
+        {
+          tree stype = signed_type_for (type);
+          return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR,
+                                  result_type,
+                                  fold_convert_loc (loc, stype, inner),
+                                  build_int_cst (stype, 0));
+        }
 
       /* Otherwise we have (A & C) != 0 where C is a single bit,
          convert that into ((A >> C2) & 1).  Where C2 = log2(C).
-- 
2.17.1
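The sign-bit special case that this patch inlines can be illustrated at the C level. The sketch below is not taken from GCC; the function names are hypothetical, and the cast relies on the two's-complement conversion that GCC documents for out-of-range unsigned-to-signed conversions.

```c
#include <stdint.h>

/* (A & C) != 0 where C is the sign bit of a 32-bit A.  */
int sign_bit_set_mask(uint32_t a)
{
  return (a & UINT32_C(0x80000000)) != 0;
}

/* The same test expressed as A < 0 after a cast to the signed type.
   The conversion is implementation-defined in ISO C before C23;
   GCC defines it as reduction modulo 2^32.  */
int sign_bit_set_cmp(uint32_t a)
{
  return (int32_t)a < 0;
}

/* The EQ_EXPR variant: (A & C) == 0 becomes A >= 0.  */
int sign_bit_clear_cmp(uint32_t a)
{
  return (int32_t)a >= 0;
}
```

This mirrors the condition in the new code: the rewrite only applies when bitnum is the top bit of the type and the type fills its machine mode.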
[PATCH 1/7] Move fold_single_bit_test to expr.cc from fold-const.cc
This is part 1 of an N-patch set that changes the expansion of `(A & C) != 0` from building trees to expanding directly, so that later on we can do some cost analysis.

Since the only user of fold_single_bit_test is now expand, move it there.

OK? Bootstrapped and tested on x86_64-linux.

gcc/ChangeLog:

	* fold-const.cc (fold_single_bit_test_into_sign_test): Move to
	expr.cc.
	(fold_single_bit_test): Likewise.
	* expr.cc (fold_single_bit_test_into_sign_test): Move from fold-const.cc
	(fold_single_bit_test): Likewise and make static.
	* fold-const.h (fold_single_bit_test): Remove declaration.
---
 gcc/expr.cc       | 113 ++
 gcc/fold-const.cc | 112 -
 gcc/fold-const.h  |   1 -
 3 files changed, 113 insertions(+), 113 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 5ede094e705..f999f81af4a 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -12898,6 +12898,119 @@ maybe_optimize_sub_cmp_0 (enum tree_code code, tree *arg0, tree *arg1)
   *arg1 = treeop1;
 }
 
+
+
+/* If CODE with arguments ARG0 and ARG1 represents a single bit
+   equality/inequality test, then return a simplified form of the test
+   using a sign testing.  Otherwise return NULL.  TYPE is the desired
+   result type.  */
+
+static tree
+fold_single_bit_test_into_sign_test (location_t loc,
+                                     enum tree_code code, tree arg0, tree arg1,
+                                     tree result_type)
+{
+  /* If this is testing a single bit, we can optimize the test.  */
+  if ((code == NE_EXPR || code == EQ_EXPR)
+      && TREE_CODE (arg0) == BIT_AND_EXPR && integer_zerop (arg1)
+      && integer_pow2p (TREE_OPERAND (arg0, 1)))
+    {
+      /* If we have (A & C) != 0 where C is the sign bit of A, convert
+         this into A < 0.  Similarly for (A & C) == 0 into A >= 0.  */
+      tree arg00 = sign_bit_p (TREE_OPERAND (arg0, 0), TREE_OPERAND (arg0, 1));
+
+      if (arg00 != NULL_TREE
+          /* This is only a win if casting to a signed type is cheap,
+             i.e. when arg00's type is not a partial mode.  */
+          && type_has_mode_precision_p (TREE_TYPE (arg00)))
+        {
+          tree stype = signed_type_for (TREE_TYPE (arg00));
+          return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR,
+                                  result_type,
+                                  fold_convert_loc (loc, stype, arg00),
+                                  build_int_cst (stype, 0));
+        }
+    }
+
+  return NULL_TREE;
+}
+
+/* If CODE with arguments ARG0 and ARG1 represents a single bit
+   equality/inequality test, then return a simplified form of
+   the test using shifts and logical operations.  Otherwise return
+   NULL.  TYPE is the desired result type.  */
+
+static tree
+fold_single_bit_test (location_t loc, enum tree_code code,
+                      tree arg0, tree arg1, tree result_type)
+{
+  /* If this is testing a single bit, we can optimize the test.  */
+  if ((code == NE_EXPR || code == EQ_EXPR)
+      && TREE_CODE (arg0) == BIT_AND_EXPR && integer_zerop (arg1)
+      && integer_pow2p (TREE_OPERAND (arg0, 1)))
+    {
+      tree inner = TREE_OPERAND (arg0, 0);
+      tree type = TREE_TYPE (arg0);
+      int bitnum = tree_log2 (TREE_OPERAND (arg0, 1));
+      scalar_int_mode operand_mode = SCALAR_INT_TYPE_MODE (type);
+      int ops_unsigned;
+      tree signed_type, unsigned_type, intermediate_type;
+      tree tem, one;
+
+      /* First, see if we can fold the single bit test into a sign-bit
+         test.  */
+      tem = fold_single_bit_test_into_sign_test (loc, code, arg0, arg1,
+                                                 result_type);
+      if (tem)
+        return tem;
+
+      /* Otherwise we have (A & C) != 0 where C is a single bit,
+         convert that into ((A >> C2) & 1).  Where C2 = log2(C).
+         Similarly for (A & C) == 0.  */
+
+      /* If INNER is a right shift of a constant and it plus BITNUM does
+         not overflow, adjust BITNUM and INNER.  */
+      if (TREE_CODE (inner) == RSHIFT_EXPR
+          && TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST
+          && bitnum < TYPE_PRECISION (type)
+          && wi::ltu_p (wi::to_wide (TREE_OPERAND (inner, 1)),
+                        TYPE_PRECISION (type) - bitnum))
+        {
+          bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1));
+          inner = TREE_OPERAND (inner, 0);
+        }
+
+      /* If we are going to be able to omit the AND below, we must do our
+         operations as unsigned.  If we must use the AND, we have a choice.
+         Normally unsigned is faster, but for some machines signed is.  */
+      ops_unsigned = (load_extend_op (operand_mode) == SIGN_EXTEND
+                      && !flag_syntax_only) ? 0 : 1;
+
+      signed_type = lang_hooks.types.type_for_mode (operand_mode, 0);
+      unsigned_type = lang_hooks.types.type_for_mode (operand_mode, 1);
+
[PATCH 0/7] Improve do_store_flag
This patch set improves do_store_flag for the single-bit case. We go back to expanding the code directly rather than building some trees. Also, instead of using shift+and we use bit-field extraction directly; this improves code generation on avr.

Andrew Pinski (7):
  Move fold_single_bit_test to expr.cc from fold-const.cc
  Inline and simplify fold_single_bit_test_into_sign_test into
    fold_single_bit_test
  Use get_def_for_expr in fold_single_bit_test
  Simplify fold_single_bit_test slightly
  Simplify fold_single_bit_test with respect to code
  Use BIT_FIELD_REF inside fold_single_bit_test
  Expand directly for single bit test

 gcc/expr.cc       |  91 -
 gcc/fold-const.cc | 112 --
 gcc/fold-const.h  |   1 -
 3 files changed, 81 insertions(+), 123 deletions(-)

-- 
2.17.1
[Bug rtl-optimization/46943] Unnecessary ZERO_EXTEND
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46943

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2018-04-22 00:00:00         |2023-5-19
           Severity|normal                      |enhancement
[Bug middle-end/98961] Failure to optimize successive comparisons with 0 into clz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98961 --- Comment #5 from Andrew Pinski --- or could be a cost thing ...
[Bug middle-end/98961] Failure to optimize successive comparisons with 0 into clz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98961

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
          Component|rtl-optimization            |middle-end
   Last reconfirmed|                            |2023-05-20

--- Comment #4 from Andrew Pinski ---
Confirmed, I think this should happen at expand time and only if the target does not have conditional compares (e.g. like aarch64).
[Bug rtl-optimization/89680] Redundant moves with -march=skylake for long long shift on 32bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89680

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |10.1.0

--- Comment #2 from Andrew Pinski ---
Looks like this was fixed in GCC 10.
[Bug tree-optimization/109287] Optimizing sal shr pairs when inlining function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109287

--- Comment #2 from Andrew Pinski ---
Actually it is closer to:

unsigned f(unsigned t, unsigned b, unsigned *tt)
{
  if (b >= 16)
    __builtin_unreachable();
  t *= 16;
  t += b;
  *tt = t%16;
  unsigned ttt = t/16;
  return ttt;
}

As we know the range of b will be [0,15] due to the loop.
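To make the missed optimization concrete, here is a sketch of what the function could reduce to once b is known to be in [0,15]. This is an illustration rather than anything proposed in the bug; the function names are invented, and uint32_t is used on the assumption that unsigned is 32 bits wide.

```c
#include <stdint.h>

/* The reduced test case, written with a fixed-width type.
   The caller must guarantee b < 16.  */
uint32_t pack_then_split(uint32_t t, uint32_t b, uint32_t *low)
{
  t *= 16;         /* wraps modulo 2^32: the top 4 bits of t are lost */
  t += b;          /* no carry out of the low 4 bits since b < 16 */
  *low = t % 16;
  return t / 16;
}

/* Equivalent form under the same b < 16 precondition: the remainder is
   just b, and the quotient is t with its top 4 bits cleared.  */
uint32_t pack_then_split_simplified(uint32_t t, uint32_t b, uint32_t *low)
{
  *low = b;
  return t & UINT32_C(0x0FFFFFFF);
}
```

The masking in the second function is what remains of the shift-left and shift-right pair; it cannot be dropped entirely because the multiplication may wrap.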
[Bug tree-optimization/109287] Optimizing sal shr pairs when inlining function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109287

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-05-20
          Component|middle-end                  |tree-optimization
           Severity|normal                      |enhancement

--- Comment #1 from Andrew Pinski ---
Reduced down to:

unsigned f(unsigned t, unsigned b, unsigned *tt)
{
  t *= 16;
  t += b;
  unsigned ttt = t/16;
  *tt = t%16;
  return ttt;
}

Confirmed.
[Bug middle-end/108847] unnecessary bitwise AND on boolean types and shifting of the "sign" bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108847

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|x86_64-*-*                  |x86_64-*-* aarch64-*-*
             Status|NEW                         |ASSIGNED

--- Comment #2 from Andrew Pinski ---
I am messing around in this area
[Bug c/109912] #pragma GCC diagnostic ignored "-Wall" is ignored
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109912

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |diagnostic
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-05-20

--- Comment #1 from Andrew Pinski ---
So all of the "meta"-options have this issue, as shown by:
```
#pragma GCC diagnostic warning "-Wunused"
#pragma GCC diagnostic ignored "-Wunused"

static int f() {return 0;}
```

Confirmed.
[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #9 from Andrew Pinski ---
(In reply to Georg-Johann Lay from comment #6)
> Quite impressive improvement. Maybe the last step can be achieved with a
> combiner pattern that combines extzv with a bit flip.
>
> One problem is usually that there is no canonical form (sometimes
> zero_extract, sometimes shift+and, sometimes with subregs for extraction or
> paradoxical subregs for wider types, different behaviour for MSB, etc.).

Right. In this case combine tries:
(set (reg/i:QI 24 r24)
    (zero_extract:QI (xor:QI (reg:QI 54)
            (const_int 64 [0x40]))
        (const_int 1 [0x1])
        (const_int 6 [0x6])))

This even puts the xor inside the zero_extract, but I think you could handle that once my patch set goes in.
[PATCH] Mode-Switching: Fix local array maybe uninitialized warning
From: Pan Li

There are 2 local arrays in function optimize_mode_switching. They are initialized conditionally at the beginning but then always consumed in another loop. This may trigger the maybe-uninitialized warning, and may result in a build failure when -Werror (warnings as errors) is enabled.

This patch initializes the local arrays to zero explicitly at declaration.

Signed-off-by: Pan Li

gcc/ChangeLog:

	* mode-switching.cc (entity_map): Initialize the array to zero.
	(bb_info): Ditto.
---
 gcc/mode-switching.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/mode-switching.cc b/gcc/mode-switching.cc
index 2d2818f5674..64ae2bc29c3 100644
--- a/gcc/mode-switching.cc
+++ b/gcc/mode-switching.cc
@@ -499,8 +499,8 @@ optimize_mode_switching (void)
   bool need_commit = false;
   static const int num_modes[] = NUM_MODES_FOR_MODE_SWITCHING;
 #define N_ENTITIES ARRAY_SIZE (num_modes)
-  int entity_map[N_ENTITIES];
-  struct bb_info *bb_info[N_ENTITIES];
+  int entity_map[N_ENTITIES] = {};
+  struct bb_info *bb_info[N_ENTITIES] = {};
   int i, j;
   int n_entities = 0;
   int max_num_modes = 0;
-- 
2.34.1
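The pattern behind the warning can be shown in isolation. The following is a hypothetical stand-alone C sketch, not the mode-switching code: an array is filled only partially and then read in full. The patch uses the C++ form `= {}`; the sketch uses `= {0}`, the spelling accepted by all C standards.

```c
#include <stddef.h>

#define N_ENTITIES 4

/* Copy up to N_ENTITIES values from ACTIVE, then sum every slot.
   Without the initializer, the slots beyond n_active would be read
   uninitialized, which is what -Wmaybe-uninitialized can flag.  */
int sum_entities(const int *active, size_t n_active)
{
  int entity_map[N_ENTITIES] = {0};   /* zero every slot up front */
  int sum = 0;
  size_t i;

  for (i = 0; i < n_active && i < N_ENTITIES; i++)
    entity_map[i] = active[i];

  for (i = 0; i < N_ENTITIES; i++)    /* always reads all slots */
    sum += entity_map[i];

  return sum;
}
```

Zero-initializing at the declaration is a small, fixed cost and keeps the later loop well-defined regardless of how many slots were filled.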
[Bug tree-optimization/109038] Miss optimization to simplify bit_and + rotate to shift
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109038

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-05-19
     Ever confirmed|0                           |1
          Component|middle-end                  |tree-optimization

--- Comment #2 from Andrew Pinski ---
Confirmed.

(simplify
 (rrotate (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
 (if (@1 == (type)(~0) >> (typebits-@2))
  (lshift @0 { typebits - @2; })))
(simplify
 (lrotate (bit_and @0 INTEGER_CST@1) INTEGER_CST@2)
 (if (@1 == (type)(~0) >> (@2))
  (lshift @0 { @2; })))

There could be more patterns dealing with the result being a logical shift right.
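The two identities sketched in the comment can be checked with plain C. The helpers below are illustrative only (the names are made up, the width is fixed at 32 bits, and the rotate count n is assumed to be in 1..31 so that no shift count reaches the type width).

```c
#include <stdint.h>

/* Portable rotates; n must be in 1..31.  */
uint32_t rotr32(uint32_t x, unsigned n) { return (x >> n) | (x << (32 - n)); }
uint32_t rotl32(uint32_t x, unsigned n) { return (x << n) | (x >> (32 - n)); }

/* rrotate (x & mask, n) with mask == ~0 >> (32 - n): only the low n bits
   survive the AND, and the rotate moves them to the top.  */
uint32_t rrotate_masked(uint32_t x, unsigned n)
{
  uint32_t mask = UINT32_MAX >> (32 - n);
  return rotr32(x & mask, n);
}

/* Claimed equivalent: a plain left shift by 32 - n.  */
uint32_t rrotate_as_shift(uint32_t x, unsigned n) { return x << (32 - n); }

/* lrotate (x & mask, n) with mask == ~0 >> n: the top n bits are cleared,
   so nothing wraps around during the rotate.  */
uint32_t lrotate_masked(uint32_t x, unsigned n)
{
  uint32_t mask = UINT32_MAX >> n;
  return rotl32(x & mask, n);
}

/* Claimed equivalent: a plain left shift by n.  */
uint32_t lrotate_as_shift(uint32_t x, unsigned n) { return x << n; }
```

In both cases the mask removes exactly the bits that the rotate would have wrapped around, which is why the rotate degenerates into a shift.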
[Bug target/55181] [10/11/12/13/14 Regression] Expensive shift loop where a bit-testing instruction could be used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55181

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #26 from Andrew Pinski ---
So I guess this is mine too. With the patches I created to improve PR 109907 (attached there), the initial RTL now looks like:

;; _9 = (unsigned char) _8;

(insn 6 5 0 (set (reg/v:QI 46 [ ])
        (zero_extract:QI (subreg:QI (reg/v:SI 47 [ number ]) 3)
            (const_int 1 [0x1])
            (const_int 5 [0x5]))) "t2.c":4:6 -1
     (nil))

Where it was before:

;; _9 = (unsigned char) _8;

(insn 6 5 7 (set (reg:SI 48)
        (lshiftrt:SI (reg/v:SI 47 [ number ])
            (const_int 29 [0x1d]))) "t2.c":4:6 -1
     (nil))

(insn 7 6 0 (set (reg/v:QI 46 [ ])
        (and:QI (subreg:QI (reg:SI 48) 0)
            (const_int 1 [0x1]))) "t2.c":4:6 -1
     (nil))
Re: [V7][PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]
On Fri, 19 May 2023 20:49:47 + Qing Zhao via Gcc-patches wrote:

> GCC extension accepts the case when a struct with a flexible array member
> is embedded into another struct or union (possibly recursively).

Do you mean TYPE_TRAILING_FLEXARRAY()?

> diff --git a/gcc/tree.h b/gcc/tree.h
> index 0b72663e6a1..237644e788e 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -786,7 +786,12 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
>     (...) prototype, where arguments can be accessed with va_start and
>     va_arg), as opposed to an unprototyped function.  */
>  #define TYPE_NO_NAMED_ARGS_STDARG_P(NODE) \
> -  (TYPE_CHECK (NODE)->type_common.no_named_args_stdarg_p)
> +  (FUNC_OR_METHOD_CHECK (NODE)->type_common.no_named_args_stdarg_p)
> +
> +/* True if this RECORD_TYPE or UNION_TYPE includes a flexible array member
> +   at the last field recursively.  */
> +#define TYPE_INCLUDE_FLEXARRAY(NODE) \
> +  (RECORD_OR_UNION_CHECK (NODE)->type_common.no_named_args_stdarg_p)

Until I read the description above, I read TYPE_INCLUDE_FLEXARRAY as an option to include or not include something. The description hints more at TYPE_INCLUDES_FLEXARRAY (with an S): a type which has at least one member with a trailing flexible array, or which itself has a trailing flexible array.

>
>  /* In an IDENTIFIER_NODE, this means that assemble_name was called with
>     this string as an argument.  */
[Bug objc/109913] [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913 --- Comment #3 from Andrew Pinski --- Note for powerpc-darwin, VECTOR_TYPE_P might need to be defined too.
[Bug c++/99451] [plugin] cannot enable specific dump for plugin passes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99451

--- Comment #2 from CVS Commits ---
The trunk branch has been updated by Nathan Sidwell :

https://gcc.gnu.org/g:97a36b466ba1420210294f0a1dd7002054ba3b7e

commit r14-1004-g97a36b466ba1420210294f0a1dd7002054ba3b7e
Author: Nathan Sidwell
Date:   Wed May 17 19:27:13 2023 -0400

    Allow plugin dumps

    Defer dump option parsing until plugins are initialized.  This allows one
    to use plugin names for dumps.

            PR other/99451
            gcc/
            * opts.h (handle_deferred_dump_options): Declare.
            * opts-global.cc (handle_common_deferred_options): Do not
            handle dump options here.
            (handle_deferred_dump_options): New.
            * toplev.cc (toplev::main): Call it after plugin init.
[Bug objc/109913] [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913

--- Comment #2 from Andrew Pinski ---
Created attachment 55123
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55123&action=edit
Patch to test

Does this patch work? If so, assign it to me and I will apply it.
[Bug objc/109913] [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0
gcc-12-20230519 is now available
Snapshot gcc-12-20230519 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/12-20230519/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 12 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-12 revision a4d13e54822a4a53137c9f5e23770a798a0b

You'll find:

 gcc-12-20230519.tar.xz               Complete GCC

  SHA256=64fb521d2d038412618b78a00b2bbe74328e6e3ab8af8afbb88991afea74300e
  SHA1=6c870d3256a6c9fa566114922397f43eeb1d24ab

Diffs from 12-20230512 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-12
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
[Bug objc/109913] [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913 --- Comment #1 from Andrew Pinski --- The problem is ROUND_TYPE_ALIGN is used in libobjc and then RECORD_OR_UNION_TYPE_P is not defined there ...
[Bug middle-end/21161] [10/11/12/13/14 Regression] "clobbered by longjmp" warning ignores the data flow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21161

Paul Eggert changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eggert at cs dot ucla.edu

--- Comment #26 from Paul Eggert ---
Created attachment 55122
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55122&action=edit
GCC bug 21161 as triggered by GNU diffutils

I ran into a similar problem when compiling GNU diffutils with gcc (GCC) 13.1.1 20230511 (Red Hat 13.1.1-2) on x86-64. Here is a stripped-down illustration of the diffutils problem. Compile the attached program with:

gcc -O2 -W -S pr21161.i

The output, which is a false positive, is:

pr21161.i: In function ‘find_dir_file_pathname’:
pr21161.i:22:15: warning: variable ‘match’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
   22 |   char const *match = file;
      |               ^
Re: [PATCH 1/2] Improve do_store_flag for single bit comparison against 0
On Fri, May 19, 2023 at 9:40 AM Jeff Law via Gcc-patches wrote:
>
>
>
> On 5/18/23 20:14, Andrew Pinski via Gcc-patches wrote:
> > While working on something else, I noticed we could improve
> > the following function code generation:
> > ```
> > unsigned f(unsigned t)
> > {
> >    if (t & ~(1<<30)) __builtin_unreachable();
> >    return t != 0;
> > }
> > ```
> > Right now we just emit a comparison against 0 instead
> > of just a shift right by 30.
> > There is code in do_store_flag which already optimizes
> > `(t & 1<<30) != 0` to `(t >> 30) & 1`. This patch
> > extends it to handle the case where we know t has a
> > nonzero of just one bit set.
> >
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > gcc/ChangeLog:
> >
> >       * expr.cc (do_store_flag): Extend the one bit checking case
> >       to handle the case where we don't have an and but rather still
> >       one bit is known to be non-zero.
> So as we touched on in IRC, the concern is targets where the cost of the
> shift depends on the number of bits shifted.  Can we look at costing
> here to determine the initial RTL generation approach?
>
> Another approach that would work for some targets is a single bit
> extract.  In theory we should be discovering the extract idiom from the
> shift+and form, but I'm always concerned that it's going to be missed
> for one or more oddball reasons.

I now have a patch set which does the extraction directly rather than having combine try to combine it later on. This actually fixes an issue with the avr target, which expands the shift as a loop. Since we are using extract_bit_field, if a target does not have an extract pattern, it will expand using the shift+and form instead.

I will resubmit this and the other patch after this new patch set is completed.

Thanks,
Andrew Pinski

>
> jeff
>
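As a small illustration of the case discussed in this thread, the C sketch below compares the two forms under the stated precondition that only bit 30 of t can be set. The function names are invented for this example, and uint32_t stands in for a 32-bit unsigned.

```c
#include <stdint.h>

/* What the source says: compare against zero.
   Precondition (asserted by __builtin_unreachable in the thread's
   example): (t & ~(1u << 30)) == 0.  */
unsigned only_bit30_nonzero_cmp(uint32_t t)
{
  return t != 0;
}

/* What the expander can emit instead under that precondition:
   t is either 0 or 1 << 30, so the shift yields exactly 0 or 1.  */
unsigned only_bit30_nonzero_shift(uint32_t t)
{
  return t >> 30;
}
```

Whether the shift (or a bit extract) is actually cheaper than the comparison depends on the target, which is the costing concern raised in the quoted review.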
[Bug c/91093] Error on implicit int by default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91093

Martin Uecker changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |muecker at gwdg dot de

--- Comment #3 from Martin Uecker ---
*** Bug 106425 has been marked as a duplicate of this bug. ***
[Bug c/106425] implicit-int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106425

Martin Uecker changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #3 from Martin Uecker ---
Duplicate.

*** This bug has been marked as a duplicate of bug 91093 ***
[Bug tree-optimization/101770] -Wmaybe-uninitialized false alarm with only locals in GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101770 --- Comment #5 from Paul Eggert --- I can no longer reproduce the bug in bleeding-edge GNU diffutils, so this bug is not so important in its own right - that is, it's merely that GCC 13.1.1 still mishandles w.i.
Re: [PATCH] nvptx: Add suppport for __builtin_nvptx_brev instrinsic.
On 5/6/23 10:04, Roger Sayle wrote:
> This patch adds support for (a pair of) bit reversal intrinsics
> __builtin_nvptx_brev and __builtin_nvptx_brevll which perform 32-bit
> and 64-bit bit reversal (using nvptx's brev instruction) matching the
> __brev and __brevll intrinsics provided by NVidia's nvcc compiler.
> https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__INTRINSIC__INT.html
>
> This patch has been tested on nvptx-none with make and make -k check
> with no new failures.  Ok for mainline?
>
> 2023-05-06  Roger Sayle
>
> gcc/ChangeLog
>         * config/nvptx/nvptx.cc (nvptx_expand_brev): Expand target
>         builtin for bit reversal using brev instruction.
>         (enum nvptx_builtins): Add NVPTX_BUILTIN_BREV and
>         NVPTX_BUILTIN_BREVLL.
>         (nvptx_init_builtins): Define "brev" and "brevll".
>         (nvptx_expand_builtin): Expand NVPTX_BUILTIN_BREV and
>         NVPTX_BUILTIN_BREVLL via nvptx_expand_brev function.
>         * doc/extend.texi (Nvidia PTX Builtin-in Functions): New
>         section, document __builtin_nvptx_brev{,ll}.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/nvptx/brev-1.c: New 32-bit test case.
>         * gcc.target/nvptx/brev-2.c: Likewise.
>         * gcc.target/nvptx/brevll-1.c: New 64-bit test case.
>         * gcc.target/nvptx/brevll-2.c: Likewise.

OK

jeff
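For reference, the operation these builtins expose can be written portably. The sketch below is not the nvptx implementation (which maps to the hardware brev instruction); it is a generic swap-by-halves bit reversal with made-up function names, useful as a model of the expected results.

```c
#include <stdint.h>

/* Reverse the bit order of a 32-bit value by swapping progressively
   larger groups: single bits, pairs, nibbles, bytes, half-words.  */
uint32_t brev32(uint32_t x)
{
  x = ((x >> 1) & UINT32_C(0x55555555)) | ((x & UINT32_C(0x55555555)) << 1);
  x = ((x >> 2) & UINT32_C(0x33333333)) | ((x & UINT32_C(0x33333333)) << 2);
  x = ((x >> 4) & UINT32_C(0x0F0F0F0F)) | ((x & UINT32_C(0x0F0F0F0F)) << 4);
  x = ((x >> 8) & UINT32_C(0x00FF00FF)) | ((x & UINT32_C(0x00FF00FF)) << 8);
  return (x >> 16) | (x << 16);
}

/* 64-bit reversal: reverse each half and exchange the halves.  */
uint64_t brev64(uint64_t x)
{
  uint64_t lo = brev32((uint32_t)x);
  uint64_t hi = brev32((uint32_t)(x >> 32));
  return (lo << 32) | hi;
}
```

Applying either function twice returns the original value, which makes a convenient sanity check when testing the builtins against a host-side model.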
Re: [PATCH] Only use NO_REGS in cost calculation when !hard_regno_mode_ok for GENERAL_REGS and mode.
On 5/17/23 00:57, liuhongt via Gcc-patches wrote:
> r14-172-g0368d169492017 replaces GENERAL_REGS with NO_REGS in cost
> calculation when the preferred register class is not known yet.
> It regressed powerpc PR109610 and PR109858; it looks too aggressive to use
> NO_REGS when mode can be allocated with GENERAL_REGS.
> The patch takes a step back: still use GENERAL_REGS when
> hard_regno_mode_ok for mode and GENERAL_REGS, otherwise use NO_REGS.
> Kewen confirmed the patch fixed PR109858, and I verified it also fixed
> PR109610.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> No big performance impact for SPEC2017 on icelake server.
> Ok for trunk?
>
> gcc/ChangeLog:
>
>         * ira-costs.cc (scan_one_insn): Only use NO_REGS in cost
>         calculation when !hard_regno_mode_ok for GENERAL_REGS and
>         mode, otherwise still use GENERAL_REGS.

BTW, Vlad is on PTO right now. I'm sure he'll handle this after he returns and starts digging out of all the stuff that's piled up.

jeff
Re: [PATCH] configure: Implement --enable-host-bind-now
On 5/16/23 09:37, Marek Polacek via Gcc-patches wrote:
> As promised in the --enable-host-pie patch, this patch adds another
> configure option, --enable-host-bind-now, which adds -z now when linking
> the compiler executables in order to extend hardening.  BIND_NOW with RELRO
> allows the GOT to be marked RO; this prevents GOT modification attacks.
>
> This option does not affect linking of target libraries; you can use
> LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW.
>
> With this patch:
> $ readelf -Wd cc1{,plus} | grep FLAGS
>  0x001e (FLAGS) BIND_NOW
>  0x6ffb (FLAGS_1)Flags: NOW PIE
>  0x001e (FLAGS) BIND_NOW
>  0x6ffb (FLAGS_1)Flags: NOW PIE
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>
> c++tools/ChangeLog:
>
>         * configure.ac (--enable-host-bind-now): New check.
>         * configure: Regenerate.
>
> gcc/ChangeLog:
>
>         * configure.ac (--enable-host-bind-now): New check.  Add
>         -Wl,-z,now to LD_PICFLAG if --enable-host-bind-now.
>         * configure: Regenerate.
>         * doc/install.texi: Document --enable-host-bind-now.
>
> lto-plugin/ChangeLog:
>
>         * configure.ac (--enable-host-bind-now): New check.  Link with
>         -z,now.
>         * configure: Regenerate.

OK

jeff
[Bug middle-end/24639] [meta-bug] bug to track all Wuninitialized issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639
Bug 24639 depends on bug 101770, which changed state.

Bug 101770 Summary: -Wmaybe-uninitialized false alarm with only locals in GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101770

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---
[Bug tree-optimization/101770] -Wmaybe-uninitialized false alarm with only locals in GNU diffutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101770

Paul Eggert changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED
            Version|11.2.1                      |13.1.1

--- Comment #4 from Paul Eggert ---
I am seeing the bug with gcc (GCC) 13.1.1 20230511 (Red Hat 13.1.1-2) on x86-64 when compiling GNU diffutils, so although the bug was reported fixed on the trunk last year, it appears that the fix hasn't propagated to GCC 13 despite the Target Milestone being 13.0.

The symptoms are:

$ gcc -O2 -Wmaybe-uninitialized -S w.i
w.i: In function ‘edit’:
w.i:50:18: warning: ‘cmd1’ may be used uninitialized [-Wmaybe-uninitialized]
   50 |           return !cmd1;
      |                  ^
w.i:7:11: note: ‘cmd1’ was declared here
    7 |       int cmd1;
      |           ^~~~

This appears to be the same bug as before, so I am taking the liberty of reopening the bug report.
Re: [V7][PATCH 2/2] Update documentation to clarify a GCC extension [PR77650]
On Fri, 19 May 2023, Qing Zhao via Gcc-patches wrote:

> +GCC extension accepts a structure containing an ISO C99 @dfn{flexible array

"The GCC extension" or "A GCC extension".

> +@item
> +A structure containing a C99 flexible array member, or a union containing
> +such a structure, is the middle field of another structure, for example:

There might be more than one middle field, and I think this case also includes where it's the *first* field - any field other than the last.

> +@smallexample
> +struct flex  @{ int length; char data[]; @};
> +
> +struct mid_flex @{ int m; struct flex flex_data; int n; @};
> +@end smallexample
> +
> +In the above, @code{mid_flex.flex_data.data[]} has undefined behavior.

And it's not literally mid_flex.flex_data.data[] that has undefined behavior, but trying to access a member of that array.

> +Compilers do not handle such case consistently, Any code relying on

"such a case", and "," should be "." at the end of a sentence.

-- 
Joseph S. Myers
jos...@codesourcery.com
Re: [C PATCH] Remove dead code related to type compatibility across TUs.
On Fri, 19 May 2023, Martin Uecker via Gcc-patches wrote:

> Repost for stage 1.
>
>
> C: Remove dead code related to type compatibility across TUs.
>
> Code to detect struct/unions across the same TU is not needed
> anymore. Code for determining compatibility of tagged types is
> preserved as it will be used for C2X. Some errors in the unused
> code are fixed.
>
> Bootstrapped with no regressions for x86_64-pc-linux-gnu.
>
> gcc/c/
>         * c-decl.cc (set_type_context): Remove.
>         (pop_scope, diagnose_mismatched_decls, pushdecl):
>         Remove dead code.
>         * c-typeck.cc (comptypes_internal): Remove dead code.
>         (same_translation_unit_p): Remove.
>         (tagged_types_tu_compatible_p): Some fixes.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com
[Bug objc/109913] New: [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109913

            Bug ID: 109913
           Summary: [14 regression] r14-976-g9907413a3a6aa3 causes more than 300 objc/objc++ failures
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: objc
          Assignee: unassigned at gcc dot gnu.org
          Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:9907413a3a6aa30a4a6db4756c445b40f04597f3, r14-976-g9907413a3a6aa3

commit 9907413a3a6aa30a4a6db4756c445b40f04597f3 (HEAD)
Author: Bernhard Reutner-Fischer
Date:   Sun May 14 00:38:33 2023 +0200

    gcc/config/*: use _P() defines from tree.h

FAIL: obj-c++.dg/basic.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/bitfield-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/bitfield-2.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/bitfield-4.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/cxx-ivars-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/cxx-scope-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/defs.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/demangle-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/demangle-2.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-10.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-3.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-4.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-5.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-6.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/encode-9.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/except-1.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-class-meta.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-class.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-ivar.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-method.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-objc.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-objc_msg_lookup.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-object.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-property.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-protocol.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-resolve-method.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-api-2-sel.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/gnu-runtime-3.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/lookup-2.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/lto/trivial-1 obj_cpp_lto_trivial-1_0.o-obj_cpp_lto_trivial-1_0.o link, -O0 -flto -fgnu-runtime -Wno-objc-root-class
FAIL: obj-c++.dg/lto/trivial-1 obj_cpp_lto_trivial-1_0.o-obj_cpp_lto_trivial-1_0.o link, -O0 -flto -flto-partition=none -fgnu-runtime -Wno-objc-root-class
FAIL: obj-c++.dg/lto/trivial-1 obj_cpp_lto_trivial-1_0.o-obj_cpp_lto_trivial-1_0.o link, -O2 -flto -fgnu-runtime -Wno-objc-root-class
FAIL: obj-c++.dg/lto/trivial-1 obj_cpp_lto_trivial-1_0.o-obj_cpp_lto_trivial-1_0.o link, -O2 -flto -flto-partition=none -fgnu-runtime -Wno-objc-root-class
FAIL: obj-c++.dg/method-10.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/method-17.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/method-19.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/method-22.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/method-23.mm -fgnu-runtime (test for excess errors)
FAIL: obj-c++.dg/property/at-property-10.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-11.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-12.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-13.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-19.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-22.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-24.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-26.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-27.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-6.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-7.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL: obj-c++.dg/property/at-property-8.mm -fgnu-runtime -Wno-objc-root-class (test for excess errors)
FAIL:
[V7][PATCH 2/2] Update documentation to clarify a GCC extension [PR77650]
on a structure with a C99 flexible array member being nested in another structure.

"GCC extension accepts a structure containing an ISO C99 "flexible array member", or a union containing such a structure (possibly recursively) to be a member of a structure.

There are two situations:

* A structure containing a C99 flexible array member, or a union containing such a structure, is the last field of another structure, for example:

    struct flex { int length; char data[]; };
    union union_flex { int others; struct flex f; };
    struct out_flex_struct { int m; struct flex flex_data; };
    struct out_flex_union { int n; union union_flex flex_data; };

  In the above, both 'out_flex_struct.flex_data.data[]' and 'out_flex_union.flex_data.f.data[]' are considered as flexible arrays too.

* A structure containing a C99 flexible array member, or a union containing such a structure, is the middle field of another structure, for example:

    struct flex { int length; char data[]; };
    struct mid_flex { int m; struct flex flex_data; int n; };

  In the above, 'mid_flex.flex_data.data[]' has undefined behavior. Compilers do not handle such a case consistently. Any code relying on such a case should be modified to ensure that flexible array members only end up at the ends of structures.

Please use warning option '-Wflex-array-member-not-at-end' to identify all such cases in the source code and modify them. This warning will be on by default starting from GCC 15.
"

gcc/c-family/ChangeLog:

	* c.opt: New option -Wflex-array-member-not-at-end.

gcc/c/ChangeLog:

	* c-decl.cc (finish_struct): Issue warnings for new option.

gcc/ChangeLog:

	* doc/extend.texi: Document GCC extension on a structure containing a flexible array member to be a member of another structure.

gcc/testsuite/ChangeLog:

	* gcc.dg/variable-sized-type-flex-array.c: New test.
--- gcc/c-family/c.opt| 5 +++ gcc/c/c-decl.cc | 9 gcc/doc/extend.texi | 45 ++- .../gcc.dg/variable-sized-type-flex-array.c | 31 + 4 files changed, 89 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/variable-sized-type-flex-array.c diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index cddeece..c26d9801b63 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -737,6 +737,11 @@ Wformat-truncation= C ObjC C++ LTO ObjC++ Joined RejectNegative UInteger Var(warn_format_trunc) Warning LangEnabledBy(C ObjC C++ LTO ObjC++,Wformat=, warn_format >= 1, 0) IntegerRange(0, 2) Warn about calls to snprintf and similar functions that truncate output. +Wflex-array-member-not-at-end +C C++ Var(warn_flex_array_member_not_at_end) Warning +Warn when a structure containing a C99 flexible array member as the last +field is not at the end of another structure. + Wif-not-aligned C ObjC C++ ObjC++ Var(warn_if_not_aligned) Init(1) Warning Warn when the field in a struct is not aligned. 
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc index 2c620b681d9..9a48f28788d 100644 --- a/gcc/c/c-decl.cc +++ b/gcc/c/c-decl.cc @@ -9293,6 +9293,15 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes, TYPE_INCLUDE_FLEXARRAY (t) = is_last_field && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x)); + if (warn_flex_array_member_not_at_end + && !is_last_field + && RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)) + && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x))) + warning_at (DECL_SOURCE_LOCATION (x), + OPT_Wflex_array_member_not_at_end, + "structure containing a flexible array member" + " is not at the end of another structure"); + if (DECL_NAME (x) || RECORD_OR_UNION_TYPE_P (TREE_TYPE (x))) saw_named_field = true; diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index ed8b9c8a87b..6425ba57e88 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -1751,7 +1751,50 @@ Flexible array members may only appear as the last member of a A structure containing a flexible array member, or a union containing such a structure (possibly recursively), may not be a member of a structure or an element of an array. (However, these uses are -permitted by GCC as extensions.) +permitted by GCC as extensions, see details below.) +@end itemize + +GCC extension accepts a structure containing an ISO C99 @dfn{flexible array +member}, or a union containing such a structure (possibly recursively) +to be a member of a structure. + +There are two situations: + +@itemize @bullet +@item +A structure containing a C99 flexible array member, or a union containing +such a structure, is the last field of another structure, for example: + +@smallexample +struct flex @{ int length; char data[];
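For readers who want to see the first situation from the documentation text in use, here is a minimal sketch in C. It is not part of the patch: `make_outer` and its parameters are hypothetical names introduced only for illustration, and only `struct flex` and `struct out_flex_struct` come from the documentation wording. The sketch relies on the GCC extension being documented, so a strictly pedantic build is expected to complain about the nested member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Types taken from the documentation text in the patch.  */
struct flex { int length; char data[]; };
struct out_flex_struct { int m; struct flex flex_data; };

/* Hypothetical helper (not in the patch): allocate an outer struct whose
   trailing flexible array has room for N bytes copied from PAYLOAD.
   Returns NULL on allocation failure; the caller frees the result.  */
static struct out_flex_struct *
make_outer (int m, const char *payload, int n)
{
  assert (payload != NULL && n >= 0);

  /* sizeof does not count data[], so add the payload size explicitly.  */
  struct out_flex_struct *p = malloc (sizeof *p + (size_t) n);
  if (p == NULL)
    return NULL;

  p->m = m;
  p->flex_data.length = n;
  memcpy (p->flex_data.data, payload, (size_t) n);
  return p;
}
```

A call such as `make_outer (7, "abc", 3)` yields an object whose `flex_data.data` holds the three payload bytes. In the second situation (`struct mid_flex`), the same kind of access would typically share storage with the following field `n`, which is the case the new warning is meant to flag.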
[Bug testsuite/101528] [11 regression] gcc.target/powerpc/int_128bit-runnable.c fails after r11-8743
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101528

Carl Love changed:

What |Removed |Added
CC   |        |cel at us dot ibm.com

--- Comment #6 from Carl Love ---
I will look into this and see if the instruction counts have changed for some reason.
[Bug c++/109876] [10/11/12/13/14 Regression] initializer_list not usable in constant expressions in a template
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109876 --- Comment #7 from Marek Polacek --- // PR c++/109876 using size_t = decltype(sizeof 0); namespace std { template struct initializer_list { const int *_M_array; size_t _M_len; constexpr size_t size() const { return _M_len; } }; } // namespace std template struct Array {}; template void g() { static constexpr std::initializer_list num{2}; Array ctx; }
[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #8 from Georg-Johann Lay ---
avr.md has this:

> ;; ??? do_store_flag emits a hard-coded right shift to extract a bit without
> ;; even considering rtx_costs, extzv, or a bit-test. See PR55181 for an
> example.

And I already tried to work around it in that PR, but forgot about it...
[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #7 from Andrew Pinski ---
(In reply to Georg-Johann Lay from comment #6)
> (define_expand "extzv"
>   [(set (match_operand:QI 0 "register_operand")
>         (zero_extract:QI (match_operand:QI 1 "register_operand")
>                          (match_operand:QI 2 "const1_operand")
>                          (match_operand:QI 3 "const_0_to_7_operand")))])
>
> Maybe QI for op1 is not optimal, but it's not possible to use mode iterator
> because there's only one gen_extzv. Dunno if VOIDmode would help or is sane.

Note the extzv pattern has been deprecated since 4.8 with r0-120368-gd2eeb2d179a435 which added extzv and co as being supported. So maybe moving over to using that instead on avr backend might help here ...
[Bug c/70418] VM structure type specifier in list of parameter declarations within nested function definition ices.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70418 --- Comment #8 from Martin Uecker --- https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618911.html
Re: [C PATCH v2] Fix ICEs related to VM types in C [PR106465, PR107557, PR108423, PR109450]
On Fri, 19 May 2023, Martin Uecker via Gcc-patches wrote:

> Thanks Joseph!
>
> Revised version attached. Ok?

The C front-end changes and tests are OK.

> But I wonder whether we generally need to do something
> about
>
> sizeof *x
>
> when x is NULL or not initialized. This is quite commonly
> used in C code and if the type is not of variable size,
> it is also unproblematic. So the UB for variable size is
> unfortunate and certainly also affects existing code in
> the wild. In practice it does not seem to cause
> problems because there is no lvalue conversion and this
> then seems to work. Maybe we document this as an
> extension? (and make sure in the C FE that it
> works) This would also make this idiom valid:

There's certainly a tricky question of what exactly it means to evaluate *x as far as producing an lvalue but without converting it to an rvalue - but right now the C standard wording on unary '*' is clear that "if it points to an object, the result is an lvalue designating the object" and "If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.", i.e. it's the evaluation as far as producing an lvalue that produces undefined behavior, rather than the lvalue conversion (that doesn't happen in sizeof) that does so. And indeed we probably would be able to define semantics that avoid UB if desired.

-- 
Joseph S. Myers
jos...@codesourcery.com
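The distinction under discussion can be illustrated with a small C sketch. The function names `fixed_row_size` and `vla_row_size` are hypothetical and appear nowhere in the thread. The first function uses a pointee of fixed size, where C does not evaluate the operand of `sizeof`. The second uses a variably modified pointee, where the operand is evaluated; to stay clear of the undefined behavior Joseph describes, the sketch deliberately points `x` at a real object rather than leaving it null.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed-size pointee: the operand of sizeof is not evaluated, so the
   null pointer is never dereferenced.  */
static size_t
fixed_row_size (void)
{
  int (*x)[4] = NULL;
  return sizeof *x;
}

/* Variably modified pointee: the operand of sizeof is evaluated, so this
   sketch gives X a valid object to point at.  */
static size_t
vla_row_size (int n)
{
  assert (n > 0);
  int backing[n];
  int (*x)[n] = &backing;
  return sizeof *x;
}
```

Whether the second form could be made well defined for a null or uninitialized `x` is exactly the open question in the exchange above; the sketch takes no position on it.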
[Bug c/70418] VM structure type specifier in list of parameter declarations within nested function definition ices.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70418

Martin Uecker changed:

What |Removed |Added
CC   |        |muecker at gwdg dot de

--- Comment #7 from Martin Uecker ---
*** Bug 106465 has been marked as a duplicate of this bug. ***
[Bug c/106465] ICE for VLA in struct in parameter of nested function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106465

Martin Uecker changed:

What       |Removed |Added
Status     |NEW     |RESOLVED
Resolution |---     |DUPLICATE

--- Comment #6 from Martin Uecker ---
Was filed previously as PR70418

*** This bug has been marked as a duplicate of bug 70418 ***
[V7][PATCH 1/2] Handle component_ref to a structure/union field including flexible array member [PR101832]
GCC extension accepts the case when a struct with a flexible array member is embedded into another struct or union (possibly recursively). __builtin_object_size should treat such struct as flexible size. gcc/c/ChangeLog: PR tree-optimization/101832 * c-decl.cc (finish_struct): Set TYPE_INCLUDE_FLEXARRAY for struct/union type. gcc/lto/ChangeLog: PR tree-optimization/101832 * lto-common.cc (compare_tree_sccs_1): Compare bit TYPE_NO_NAMED_ARGS_STDARG_P or TYPE_INCLUDE_FLEXARRAY properly for its corresponding type. gcc/ChangeLog: PR tree-optimization/101832 * print-tree.cc (print_node): Print new bit type_include_flexarray. * tree-core.h (struct tree_type_common): Use bit no_named_args_stdarg_p as type_include_flexarray for RECORD_TYPE or UNION_TYPE. * tree-object-size.cc (addr_object_size): Handle structure/union type when it has flexible size. * tree-streamer-in.cc (unpack_ts_type_common_value_fields): Stream in bit no_named_args_stdarg_p properly for its corresponding type. * tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream out bit no_named_args_stdarg_p properly for its corresponding type. * tree.h (TYPE_INCLUDE_FLEXARRAY): New macro TYPE_INCLUDE_FLEXARRAY. gcc/testsuite/ChangeLog: PR tree-optimization/101832 * gcc.dg/builtin-object-size-pr101832.c: New test. --- gcc/c/c-decl.cc | 11 ++ gcc/lto/lto-common.cc | 5 +- gcc/print-tree.cc | 5 + .../gcc.dg/builtin-object-size-pr101832.c | 134 ++ gcc/tree-core.h | 2 + gcc/tree-object-size.cc | 23 ++- gcc/tree-streamer-in.cc | 5 +- gcc/tree-streamer-out.cc | 5 +- gcc/tree.h| 7 +- 9 files changed, 192 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc index b5b491cf2da..2c620b681d9 100644 --- a/gcc/c/c-decl.cc +++ b/gcc/c/c-decl.cc @@ -9282,6 +9282,17 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes, /* Set DECL_NOT_FLEXARRAY flag for FIELD_DECL x. 
*/ DECL_NOT_FLEXARRAY (x) = !is_flexible_array_member_p (is_last_field, x); + /* Set TYPE_INCLUDE_FLEXARRAY for the context of x, t. +when x is an array and is the last field. */ + if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE) + TYPE_INCLUDE_FLEXARRAY (t) + = is_last_field && flexible_array_member_type_p (TREE_TYPE (x)); + /* Recursively set TYPE_INCLUDE_FLEXARRAY for the context of x, t +when x is an union or record and is the last field. */ + else if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (x))) + TYPE_INCLUDE_FLEXARRAY (t) + = is_last_field && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x)); + if (DECL_NAME (x) || RECORD_OR_UNION_TYPE_P (TREE_TYPE (x))) saw_named_field = true; diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc index 537570204b3..35827aab075 100644 --- a/gcc/lto/lto-common.cc +++ b/gcc/lto/lto-common.cc @@ -1275,7 +1275,10 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map) if (AGGREGATE_TYPE_P (t1)) compare_values (TYPE_TYPELESS_STORAGE); compare_values (TYPE_EMPTY_P); - compare_values (TYPE_NO_NAMED_ARGS_STDARG_P); + if (FUNC_OR_METHOD_TYPE_P (t1)) + compare_values (TYPE_NO_NAMED_ARGS_STDARG_P); + if (RECORD_OR_UNION_TYPE_P (t1)) + compare_values (TYPE_INCLUDE_FLEXARRAY); compare_values (TYPE_PACKED); compare_values (TYPE_RESTRICT); compare_values (TYPE_USER_ALIGN); diff --git a/gcc/print-tree.cc b/gcc/print-tree.cc index ccecd3dc6a7..aaded53b1b1 100644 --- a/gcc/print-tree.cc +++ b/gcc/print-tree.cc @@ -632,6 +632,11 @@ print_node (FILE *file, const char *prefix, tree node, int indent, && TYPE_CXX_ODR_P (node)) fputs (" cxx-odr-p", file); + if ((code == RECORD_TYPE + || code == UNION_TYPE) + && TYPE_INCLUDE_FLEXARRAY (node)) + fputs (" include-flexarray", file); + /* The transparent-union flag is used for different things in different nodes. 
*/ if ((code == UNION_TYPE || code == RECORD_TYPE) diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c new file mode 100644 index 000..60078e11634 --- /dev/null +++ b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c @@ -0,0 +1,134 @@ +/* PR 101832: + GCC extension accepts the case when a struct with a C99 flexible array + member is embedded into another struct (possibly recursively). + __builtin_object_size will treat such struct as flexible size. + However, when a structure with
[V7][PATCH 0/2] Accept and Handle the case when a structure including a FAM nested in another structure
Hi,

This is the 7th version of the patch, rebased on the latest trunk.

This is an important patch needed by the Linux kernel security project. We have already had an extensive discussion on this issue, and I have gone through 6 revisions of the patches based on that discussion, resolving all the comments and suggestions raised along the way.

Compared to the 6th version, the major changes are:

1. Update the documentation to replace the mention of GCC14 with GCC15.
2. Update the documentation to replace the following wording:
   "A structure or a union with a C99 flexible array member"
   with:
   "A structure containing a C99 flexible array member, or a union containing such a structure,"

Everything else is the same as in the 6th version. The 6th version is here:

https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616312.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616313.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616314.html

Kees has tested the 6th version of the patch with the Linux kernel, and everything is good: it resolved many false positives for bounds checking.

Notes on the review history of these patches (2 patches):

1. The patch 1/2: Handle component_ref to a structure/union field including flexible array member [PR101832]

   The C front-end part has been approved by Joseph. For the middle-end, most of the change has been reviewed by Richard (and modified based on his comments and suggestions), except the change in tree-object-size.cc.

2. The patch 2/2: Update documentation to clarify a GCC extension

   This is basically a C FE and documentation change. I have updated it based on previous comments and suggestions. Joseph, could you review it to see whether this version is ready to go?

Bootstrapped and regression tested on aarch64 and x86.

Okay for commit?

Thanks a lot.

Qing

(For more details on the review history, I have listed other important notes below:

A. Richard Biener has reviewed the middle-end part of the first patch and raised some comments for the 4th version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613643.html
I updated it with his suggestions and Sandra’s comments as the 5th version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614100.html

B. The comments for the 5th version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614511.html
(In this one, Joseph approved the C FE change of the first patch.)
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614514.html
(In this one, Joseph raised two comments on the documentation wording for the 2nd patch, and I updated it based on his comments in the 6th version.)
)
[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #6 from Georg-Johann Lay ---
(In reply to Andrew Pinski from comment #4)
> For cset_32bit30_not with some patches which I will be posting, I get:
>         bst r25,6        ; 23 [c=4 l=3]  *extzv/4
>         clr r24
>         bld r24,0
>         ldi r25,lo8(1)   ; 24 [c=4 l=1]  movqi_insn/1
>         eor r24,r25      ; 25 [c=4 l=1]  *xorqi3
> /* epilogue start */
>         ret              ; 28 [c=0 l=1]  return
>
> Which is better than what was there before.

Quite impressive improvement. Maybe the last step can be achieved with a combiner pattern that combines extzv with a bit flip. One problem is usually that there is no canonical form (sometimes zero_extract, sometimes shift+and, sometimes with subregs for extraction or paradoxical subregs for wider types, different behaviour for MSB, etc.).

avr's extzv currently reads

(define_expand "extzv"
  [(set (match_operand:QI 0 "register_operand")
        (zero_extract:QI (match_operand:QI 1 "register_operand")
                         (match_operand:QI 2 "const1_operand")
                         (match_operand:QI 3 "const_0_to_7_operand")))])

Maybe QI for op1 is not optimal, but it's not possible to use mode iterator because there's only one gen_extzv. Dunno if VOIDmode would help or is sane.

> The first one I suspect load_extend_op for SImode returning SIGN_EXTEND for
> avr.

It's not implemented for avr, thus UNKNOWN as of defaults.h.
[C PATCH] Remove dead code related to type compatibility across TUs.
Repost for stage 1. C: Remove dead code related to type compatibility across TUs. Code to detect struct/unions across the same TU is not needed anymore. Code for determining compatibility of tagged types is preserved as it will be used for C2X. Some errors in the unused code are fixed. Bootstrapped with no regressions for x86_64-pc-linux-gnu. gcc/c/ * c-decl.cc (set_type_context): Remove. (pop_scope, diagnose_mismatched_decls, pushdecl): Remove dead code. * c-typeck.cc (comptypes_internal): Remove dead code. (same_translation_unit_p): Remove. (tagged_types_tu_compatible_p): Some fixes. diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc index f63c1108ab5..70345b4b019 100644 --- a/gcc/c/c-decl.cc +++ b/gcc/c/c-decl.cc @@ -1155,16 +1155,6 @@ update_label_decls (struct c_scope *scope) } } -/* Set the TYPE_CONTEXT of all of TYPE's variants to CONTEXT. */ - -static void -set_type_context (tree type, tree context) -{ - for (type = TYPE_MAIN_VARIANT (type); type; - type = TYPE_NEXT_VARIANT (type)) -TYPE_CONTEXT (type) = context; -} - /* Exit a scope. Restore the state of the identifier-decl mappings that were in effect when this scope was entered. Return a BLOCK node containing all the DECLs in this scope that are of interest @@ -1253,7 +1243,6 @@ pop_scope (void) case ENUMERAL_TYPE: case UNION_TYPE: case RECORD_TYPE: - set_type_context (p, context); /* Types may not have tag-names, in which case the type appears in the bindings list with b->id NULL. */ @@ -1364,12 +1353,7 @@ pop_scope (void) the TRANSLATION_UNIT_DECL. This makes same_translation_unit_p work. 
*/ if (scope == file_scope) - { DECL_CONTEXT (p) = context; - if (TREE_CODE (p) == TYPE_DECL - && TREE_TYPE (p) != error_mark_node) - set_type_context (TREE_TYPE (p), context); - } gcc_fallthrough (); /* Parameters go in DECL_ARGUMENTS, not BLOCK_VARS, and have @@ -2318,21 +2302,18 @@ diagnose_mismatched_decls (tree newdecl, tree olddecl, { if (DECL_INITIAL (olddecl)) { - /* If both decls are in the same TU and the new declaration -isn't overriding an extern inline reject the new decl. -In c99, no overriding is allowed in the same translation -unit. */ - if ((!DECL_EXTERN_INLINE (olddecl) - || DECL_EXTERN_INLINE (newdecl) - || (!flag_gnu89_inline - && (!DECL_DECLARED_INLINE_P (olddecl) - || !lookup_attribute ("gnu_inline", -DECL_ATTRIBUTES (olddecl))) - && (!DECL_DECLARED_INLINE_P (newdecl) - || !lookup_attribute ("gnu_inline", -DECL_ATTRIBUTES (newdecl - ) - && same_translation_unit_p (newdecl, olddecl)) + /* If the new declaration isn't overriding an extern inline +reject the new decl. In c99, no overriding is allowed +in the same translation unit. */ + if (!DECL_EXTERN_INLINE (olddecl) + || DECL_EXTERN_INLINE (newdecl) + || (!flag_gnu89_inline + && (!DECL_DECLARED_INLINE_P (olddecl) + || !lookup_attribute ("gnu_inline", + DECL_ATTRIBUTES (olddecl))) + && (!DECL_DECLARED_INLINE_P (newdecl) + || !lookup_attribute ("gnu_inline", + DECL_ATTRIBUTES (newdecl) { auto_diagnostic_group d; error ("redefinition of %q+D", newdecl); @@ -3360,18 +3341,11 @@ pushdecl (tree x) type to the composite of all the types of that declaration. After the consistency checks, it will be reset to the composite of the visible types only. */ - if (b && (TREE_PUBLIC (x) || same_translation_unit_p (x, b->decl)) - && b->u.type) + if (b && b->u.type) TREE_TYPE (b->decl) = b->u.type; - /* The point of the same_translation_unit_p check here is, -we want to detect a duplicate decl for a construct like -foo() { extern bar(); } ... static bar(); but not if -they are in different translation units. 
In any case, -the static does not go in the externals scope. */ - if (b - && (TREE_PUBLIC (x) || same_translation_unit_p (x, b->decl)) - && duplicate_decls (x, b->decl)) + /* the static does not go in the externals scope. */ + if (b && duplicate_decls (x, b->decl))
[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907 --- Comment #5 from Andrew Pinski --- Created attachment 55121 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55121&action=edit patch set Here is the patch set that improves cset_32bit30_not. I am still looking into improving the other one.
[Bug middle-end/109907] Missed optimization for bit extraction (uses shift instead of single bit-test)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109907

--- Comment #4 from Andrew Pinski ---
For cset_32bit30_not with some patches which I will be posting, I get:

        bst r25,6        ; 23 [c=4 l=3]  *extzv/4
        clr r24
        bld r24,0
        ldi r25,lo8(1)   ; 24 [c=4 l=1]  movqi_insn/1
        eor r24,r25      ; 25 [c=4 l=1]  *xorqi3
/* epilogue start */
        ret              ; 28 [c=0 l=1]  return

Which is better than what was there before. The way I get this is to use BIT_FIELD_REF inside fold_single_bit_test .

The first one I suspect load_extend_op for SImode returning SIGN_EXTEND for avr.
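The source of `cset_32bit30_not` is not quoted in this comment, so the following C is only an assumed reconstruction based on the name and on the generated code. The second function, `cset_32bit30_not_bytewise`, is a hypothetical helper added here to show why a single bit test on the top byte is enough; it matches the `bst r25,6` above under the assumption that r25 holds the most significant byte of the 32-bit argument.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of the test function: return 1 when bit 30 is clear.
   This is the shift-and-mask form that tends to produce a shift loop.  */
static uint8_t
cset_32bit30_not (uint32_t x)
{
  return (uint8_t) (((x >> 30) & 1u) ^ 1u);
}

/* Hypothetical equivalent written as a single-bit test on the top byte,
   which is roughly what a bst/bld pair followed by an xor computes.  */
static uint8_t
cset_32bit30_not_bytewise (uint32_t x)
{
  uint8_t top = (uint8_t) (x >> 24);      /* bits 24..31 */
  return (top & (1u << 6)) ? 0 : 1;       /* bit 6 of the top byte is bit 30 */
}
```

Both functions agree for every input, so a compiler is free to pick the cheaper form; the remaining `ldi`/`eor` pair in the listing is the `^ 1` step that the comment says is still open.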
[Bug other/109910] GCC prologue/epilogue saves/restores callee-saved registers that are never changed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109910 --- Comment #1 from Georg-Johann Lay --- Note that df_regs_ever_live_p may be used before reload_completed, for example in INITIAL_ELIMINATION_OFFSET. Hence, scanning the insns by hand using, say, note_stores, does not work because reload might still be in progress.
Re: [PATCH v4 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces.
On 5/16/23 06:35, Ajit Agarwal wrote: On 29/04/23 5:03 am, Jeff Law wrote: On 4/28/23 16:42, Hans-Peter Nilsson wrote: On Sat, 22 Apr 2023, Ajit Agarwal via Gcc-patches wrote: Hello All: This new version of patch 4 use improve ree pass for rs6000 target using defined ABI interfaces. Bootstrapped and regtested on power64-linux-gnu. Thanks & Regards Ajit ree: Improve ree pass for rs6000 target using defined abi interfaces For rs6000 target we see redundant zero and sign extension and done to improve ree pass to eliminate such redundant zero and sign extension using defines ABI interfaces. 2023-04-22 Ajit Kumar Agarwal gcc/ChangeLog: * ree.cc (combline_reaching_defs): Add zero_extend using defined abi interfaces. (add_removable_extension): use of defined abi interfaces for no reaching defs. (abi_extension_candidate_return_reg_p): New defined ABI function. (abi_extension_candidate_p): New defined ABI function. (abi_extension_candidate_argno_p): New defined ABI function. (abi_handle_regs_without_defs_p): New defined ABI function. gcc/testsuite/ChangeLog: * g++.target/powerpc/zext-elim-3.C --- gcc/ree.cc | 176 +++--- .../g++.target/powerpc/zext-elim-3.C | 16 ++ 2 files changed, 162 insertions(+), 30 deletions(-) create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C diff --git a/gcc/ree.cc b/gcc/ree.cc index 413aec7c8eb..0de96b1ece1 100644 --- a/gcc/ree.cc +++ b/gcc/ree.cc @@ -473,7 +473,8 @@ get_defs (rtx_insn *insn, rtx reg, vec *dest) break; } - gcc_assert (use != NULL); + if (use == NULL) + return NULL; ref_chain = DF_REF_CHAIN (use); @@ -514,7 +515,8 @@ get_uses (rtx_insn *insn, rtx reg) if (REGNO (DF_REF_REG (def)) == REGNO (reg)) break; - gcc_assert (def != NULL); + if (def == NULL) + return NULL; ref_chain = DF_REF_CHAIN (def); @@ -750,6 +752,103 @@ get_extended_src_reg (rtx src) return src; } +/* Return TRUE if the candidate insn is zero extend and regno is + an return registers. 
*/ + +static bool +abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno) +{ + rtx set = single_set (insn); + + if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND) + return false; + + if (FUNCTION_VALUE_REGNO_P (regno)) + return true; + + return false; +} + +/* Return TRUE if reg source operand of zero_extend is argument registers + and not return registers and source and destination operand are same + and mode of source and destination operand are not same. */ + +static bool +abi_extension_candidate_p (rtx_insn *insn) +{ + rtx set = single_set (insn); + + if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND) + return false; + + machine_mode ext_dst_mode = GET_MODE (SET_DEST (set)); + rtx orig_src = XEXP (SET_SRC (set),0); + + bool copy_needed + = (REGNO (SET_DEST (set)) != REGNO (XEXP (SET_SRC (set), 0))); + + if (!copy_needed && ext_dst_mode != GET_MODE (orig_src) + && FUNCTION_ARG_REGNO_P (REGNO (orig_src)) + && !abi_extension_candidate_return_reg_p (insn, REGNO (orig_src))) + return true; + + return false; +} + +/* Return TRUE if the candidate insn is zero extend and regno is + an argument registers. */ + +static bool +abi_extension_candidate_argno_p (rtx_code code, int regno) +{ + if (code != ZERO_EXTEND) + return false; + + if (FUNCTION_ARG_REGNO_P (regno)) + return true; + + return false; +} I don't see anything in those functions that checks if ZERO_EXTEND is actually a feature of the ABI, e.g. as opposed to no extension or SIGN_EXTEND. Do I miss something? I don't think you missed anything. That was one of the points I was making last week. Somewhere, somehow we need to describe what the ABI mandates and guarantees. So while what Ajit has done is a step forward, at some point the actual details of the ABI need to be described in a way that can be checked and consumed by REE. The ABI we need for ree pass are the argument registers and return registers. Based on that I have described interfaces that we need. Other than that we dont any other ABI hooks. 
I have used FUNCTION_VALUE_REGNO_P and FUNCTION_ARG_REGNO_P abi hooks. You're working with one of many ABIs, some of which have useful properties, some of which do not. Simply testing FUNCTION_VALUE_REGNO_P/FUNCTION_ARG_REGNO_P is not sufficient. You need to be able to query the ABI properties. jeff
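Jeff's objection can be made concrete with a small sketch. This is not GCC code and not a proposed interface: `abi_ext_kind`, `abi_desc`, and `extension_redundant_p` are hypothetical names, and the example ABIs in the usage note are invented. The only point is that knowing a register is an argument register says nothing about whether the caller already extended the value; that has to come from a description of what the ABI guarantees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: what an ABI promises about narrow values in argument
   registers on function entry.  */
enum abi_ext_kind { ABI_EXT_NONE, ABI_EXT_ZERO, ABI_EXT_SIGN };

/* Hypothetical per-ABI description that a pass could query.  */
struct abi_desc { enum abi_ext_kind arg_ext; };

/* An extension of an incoming argument is redundant only if the
   register is an argument register AND the ABI guarantees the same
   kind of extension the insn performs.  */
static bool
extension_redundant_p (const struct abi_desc *abi, bool is_arg_reg,
                       enum abi_ext_kind insn_ext)
{
  assert (abi != NULL);
  if (!is_arg_reg || insn_ext == ABI_EXT_NONE)
    return false;
  return abi->arg_ext == insn_ext;
}
```

With an invented ABI that zero-extends arguments, a zero_extend of an argument register is redundant but a sign_extend is not; with an ABI that makes no promise, neither is. A check based on the register number alone cannot tell these cases apart.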
Re: [PATCH] MIPS: don't expand large block move
On Fri, 19 May 2023, Jeff Law wrote:

> > diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> > index ca491b981a3..00f26d5e923 100644
> > --- a/gcc/config/mips/mips.cc
> > +++ b/gcc/config/mips/mips.cc
> > @@ -8313,6 +8313,12 @@ mips_expand_block_move (rtx dest, rtx src, rtx
> > length)
> > }
> > else if (optimize)
> > {
> > + /* When the length is big enough, the lib call has better performace
> > +than load/store insns.
> > +In most platform, the value is about 64-128.
> > +And in fact lib call may be optimized with SIMD */
> > + if (INTVAL(length) >= 64)
> > + return false;
> Just a formatting nit. Space between INTVAL and the open paren for its
> argument list.

This is oddly wrapped too. I'd move "performace" (typo there!) to the second line, to align better with the rest of the text. Plus s/platform/platforms/ and there's a full stop missing along with two spaces at the end. Also there's inconsistent style around <= and >=; the GNU Coding Standards ask for spaces around binary operators. And "don't" in the change heading ought to be capitalised.

In fact, I'd justify the whole paragraph as each sentence doesn't have to start on a new line, and the commit description could benefit from some reformatting too, as it's now odd to read.

> OK with that change.

I think the conditional would be better readable if it was flattened though:

  if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
    ...
  else if (INTVAL (length) >= 64)
    ...
  else if (optimize)
    ...

or even:

  if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
    ...
  else if (INTVAL (length) < 64 && optimize)
    ...

One just wouldn't write it as proposed if creating the whole piece from scratch rather than retrofitting this extra conditional.

Ultimately it may have to be tunable as LWL/LWR, etc. may be subject to fusion and may be faster after all.

Maciej
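Maciej's second, flattened form can be modelled outside the compiler as a plain decision function. The sketch is an illustration only: `choose_block_move`, `enum move_strategy`, and both macros are hypothetical names, and the value 32 for the straight-line limit is a placeholder, since the real `MIPS_MAX_MOVE_BYTES_STRAIGHT` depends on the target configuration. The threshold of 64 is the one from the patch under review.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels; in the backend the last case corresponds
   to the expander returning false so the copy is handled elsewhere
   (a library call, per the patch description).  */
enum move_strategy { MOVE_STRAIGHT, MOVE_LOOP, MOVE_LIBCALL };

/* Placeholder, not the real MIPS_MAX_MOVE_BYTES_STRAIGHT.  */
#define MAX_MOVE_BYTES_STRAIGHT 32
/* Threshold taken from the patch under review.  */
#define LIBCALL_THRESHOLD 64

/* Flattened conditional in the shape Maciej suggests.  */
static enum move_strategy
choose_block_move (long length, bool optimize)
{
  assert (length >= 0);
  if (length <= MAX_MOVE_BYTES_STRAIGHT)
    return MOVE_STRAIGHT;
  else if (length < LIBCALL_THRESHOLD && optimize)
    return MOVE_LOOP;
  return MOVE_LIBCALL;
}
```

Written this way, each length range maps to exactly one branch, and making the threshold tunable later would only mean replacing the constant.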
Re: [PATCH 08/14] fortran: use _P() defines from tree.h
On Thu, 18 May 2023 21:20:41 +0200 Mikael Morin wrote:

> On 18/05/2023 at 17:18, Bernhard Reutner-Fischer wrote:
> > I've fed gfortran.h into the script and found some CLASS_DATA spots,
> > see attached bootstrapped and tested patch.
> > Do we want to have that?
> Some of it makes sense, but not all of it.
>
> It is a macro to access the _data component of a class container.
> So for class-related stuff it makes sense to use CLASS_DATA, and
> typically there will be a check that the type is BT_CLASS before.
> But for cases where we loop over all of the components of a type that is
> not necessarily a class container, it doesn't make sense to use CLASS_DATA.
>
> So I suggest to only keep the following hunks.
[]
> OK for those hunks.

Pushed those as r14-1001-g05b7cc7daac8b3

Many thanks!

PS: I'm attaching the fugly script I used to do these macro replacements FYA.

use-defines.1.awk
Description: application/awk
Re: [PATCH] c++: mangle noexcept-expr [PR70790]
On Fri, 19 May 2023, Patrick Palka wrote: > This implements noexcept-expr mangling (and demangling) as per the > Itanium ABI. > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this > look OK for trunk? > > PR c++/70790 > > gcc/cp/ChangeLog: > > * mangle.cc (write_expression): Handle NOEXCEPT_EXPR. > > libiberty/ChangeLog: > > * cp-demangle.c (cplus_demangle_operators): Add the noexcept > operator. Oops, we should also make sure we print parens around the operand of noexcept. Otherwise we'd demangle the mangling of e.g. void f(A) instead as void f(A) Fixed in the following patch: -- >8 -- Subject: [PATCH] c++: mangle noexcept-expr [PR70790] This implements noexcept-expr mangling (and demangling) as per the Itanium ABI. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? PR c++/70790 gcc/cp/ChangeLog: * mangle.cc (write_expression): Handle NOEXCEPT_EXPR. libiberty/ChangeLog: * cp-demangle.c (cplus_demangle_operators): Add the noexcept operator. (d_print_comp_inner) : Always print parens around the operand of noexcept too. * testsuite/demangle-expected: Test noexcept operator demangling. gcc/testsuite/ChangeLog: * g++.dg/abi/mangle78.C: New test. 
---
 gcc/cp/mangle.cc                      |  5 +
 gcc/testsuite/g++.dg/abi/mangle78.C   | 14 ++
 libiberty/cp-demangle.c               |  5 +++--
 libiberty/testsuite/demangle-expected |  3 +++
 4 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/abi/mangle78.C

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 826c5e76c1d..7dab4e62bc9 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -3402,6 +3402,11 @@ write_expression (tree expr)
       else
         write_string ("tr");
     }
+  else if (code == NOEXCEPT_EXPR)
+    {
+      write_string ("nx");
+      write_expression (TREE_OPERAND (expr, 0));
+    }
   else if (code == CONSTRUCTOR)
     {
       bool braced_init = BRACE_ENCLOSED_INITIALIZER_P (expr);
diff --git a/gcc/testsuite/g++.dg/abi/mangle78.C b/gcc/testsuite/g++.dg/abi/mangle78.C
new file mode 100644
index 000..63c4d779e9f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/mangle78.C
@@ -0,0 +1,14 @@
+// PR c++/70790
+// { dg-do compile { target c++11 } }
+
+template
+struct A { };
+
+template
+void f(A);
+
+int main() {
+  f({});
+}
+
+// { dg-final { scan-assembler "_Z1fIiEv1AIXnxtlT_EEE" } }
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index f2b36bcad68..efada1c322b 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -1947,6 +1947,7 @@ const struct demangle_operator_info cplus_demangle_operators[] =
   { "ng", NL ("-"), 1 },
   { "nt", NL ("!"), 1 },
   { "nw", NL ("new"), 3 },
+  { "nx", NL ("noexcept"), 1 },
   { "oR", NL ("|="), 2 },
   { "oo", NL ("||"), 2 },
   { "or", NL ("|"), 2 },
@@ -5836,8 +5837,8 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
         if (code && !strcmp (code, "gs"))
           /* Avoid parens after '::'.  */
           d_print_comp (dpi, options, operand);
-        else if (code && !strcmp (code, "st"))
-          /* Always print parens for sizeof (type).  */
+        else if (code && (!strcmp (code, "st") || !strcmp (code, "nx")))
+          /* Always print parens for sizeof (type) or noexcept(expr).  */
           {
             d_append_char (dpi, '(');
             d_print_comp (dpi, options, operand);
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index d9bc7ed4b1f..52dff883a18 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -1659,3 +1659,6 @@ auto f()::{lambda(X<$T0>*, X*)#1}::operator()(X*,
 _ZZN1XIiE1FEvENKUliE_clEi
 X::F()::{lambda(int)#1}::operator()(int) const
+
+_Z1fIiEv1AIXnxtlT_EEE
+void f(A)
-- 
2.41.0.rc0.4.g004e0f790f

>     * testsuite/demangle-expected: Test noexcept operator
>     demangling.
>
> gcc/testsuite/ChangeLog:
>
>     * g++.dg/abi/mangle78.C: New test.
> ---
>  gcc/cp/mangle.cc                      |  5 +
>  gcc/testsuite/g++.dg/abi/mangle78.C   | 14 ++
>  libiberty/cp-demangle.c               |  1 +
>  libiberty/testsuite/demangle-expected |  3 +++
>  4 files changed, 23 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/abi/mangle78.C
>
> diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
> index 826c5e76c1d..7dab4e62bc9 100644
> --- a/gcc/cp/mangle.cc
> +++ b/gcc/cp/mangle.cc
> @@ -3402,6 +3402,11 @@ write_expression (tree expr)
>        else
>          write_string ("tr");
>      }
> +  else if (code == NOEXCEPT_EXPR)
> +    {
> +      write_string ("nx");
> +      write_expression (TREE_OPERAND (expr, 0));
> +    }
>    else if (code == CONSTRUCTOR)
>      {
>        bool braced_init = BRACE_ENCLOSED_INITIALIZER_P (expr);
> diff --git
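
As a rough illustration of the demangler side of this change, the sketch below models an operator-table lookup plus the rule that the operands of "st" and "nx" are always parenthesized. It is not libiberty code: `OperatorInfo`, `kOperators`, `find_operator` and `print_unary` are hypothetical names, the table holds only a handful of entries, and the non-parenthesized branch is simplified (it prints the operand unchanged).

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for one operator-table row: the two-letter mangled
// code, the printed spelling, and the number of operands.
struct OperatorInfo {
  const char *code;
  const char *name;
  int args;
};

// A tiny illustrative table; NOT the real cplus_demangle_operators array.
static const OperatorInfo kOperators[] = {
  { "ng", "-", 1 },
  { "nt", "!", 1 },
  { "nx", "noexcept", 1 },
  { "st", "sizeof ", 1 },
};

// Linear lookup by mangled code; returns nullptr when the code is unknown.
const OperatorInfo *find_operator(const char *code) {
  for (const OperatorInfo &op : kOperators)
    if (std::strcmp(op.code, code) == 0)
      return &op;
  return nullptr;
}

// Print a unary expression.  Operands of "st" and "nx" are always wrapped in
// parentheses; every other operator prints the operand as given (a
// simplification made for this sketch).
std::string print_unary(const char *code, const std::string &operand) {
  const OperatorInfo *op = find_operator(code);
  assert(op != nullptr && op->args == 1);
  std::string out = op->name;
  if (std::strcmp(code, "st") == 0 || std::strcmp(code, "nx") == 0)
    out += "(" + operand + ")";
  else
    out += operand;
  return out;
}
```

Without the extra "nx" check, the noexcept spelling and its operand would be concatenated with nothing between them, which is the problem the follow-up patch addresses.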
[PATCH v2] release the sorted FDE array when deregistering a frame [PR109685]
On 19.05.23 at 19:26, Jeff Law wrote:

See:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617245.html

I think this needs an update given the other changes in this space.

jeff

I have included the updated patch below.

The atomic fastpath bypasses the code that releases the sort
array which was lazily allocated during unwinding.
We now check after deregistering if there is an array to free.

libgcc/ChangeLog:
    * unwind-dw2-fde.c: Free sort array in atomic fast path.
---
 libgcc/unwind-dw2-fde.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
index a5786bf729c..32b9e64a1c8 100644
--- a/libgcc/unwind-dw2-fde.c
+++ b/libgcc/unwind-dw2-fde.c
@@ -241,6 +241,12 @@ __deregister_frame_info_bases (const void *begin)
   // And remove
   ob = btree_remove (&registered_frames, range[0]);
   bool empty_table = (range[1] - range[0]) == 0;
+
+  // Deallocate the sort array if any.
+  if (ob && ob->s.b.sorted)
+    {
+      free (ob->u.sort);
+    }
 #else
   init_object_mutex_once ();
   __gthread_mutex_lock (&object_mutex);
-- 
2.39.2
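
The pattern described here (an array that is allocated lazily on first use and therefore has to be released on every deregistration path) can be sketched in isolation as follows. This is a minimal model under stated assumptions, not the libgcc implementation: `FrameObject`, `ensure_sorted` and `deregister_fast_path` are hypothetical names, and the real object layout and locking are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical, much-simplified stand-in for a registered object.
struct FrameObject {
  bool sorted = false;        // set once the sort array has been built
  int *sort_array = nullptr;  // lazily allocated on first lookup
};

// Build the sort array on first use (the lazy allocation step).
// Returns false if the allocation fails.
bool ensure_sorted(FrameObject &ob, std::size_t count) {
  assert(count > 0);
  if (ob.sorted)
    return true;
  ob.sort_array = static_cast<int *>(std::malloc(count * sizeof(int)));
  if (ob.sort_array == nullptr)
    return false;
  ob.sorted = true;
  return true;
}

// Fast-path deregistration: once the object has been removed from the
// lookup structure, release the sort array if one was ever allocated.
void deregister_fast_path(FrameObject *ob) {
  if (ob && ob->sorted) {
    std::free(ob->sort_array);
    ob->sort_array = nullptr;
    ob->sorted = false;
  }
}
```

The null check mirrors the `ob &&` guard in the hunk: removal may find nothing, in which case there is nothing to free.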
[PATCH] c++: mangle noexcept-expr [PR70790]
This implements noexcept-expr mangling (and demangling) as per the
Itanium ABI.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this
look OK for trunk?

    PR c++/70790

gcc/cp/ChangeLog:

    * mangle.cc (write_expression): Handle NOEXCEPT_EXPR.

libiberty/ChangeLog:

    * cp-demangle.c (cplus_demangle_operators): Add the noexcept
    operator.
    * testsuite/demangle-expected: Test noexcept operator
    demangling.

gcc/testsuite/ChangeLog:

    * g++.dg/abi/mangle78.C: New test.
---
 gcc/cp/mangle.cc                      |  5 +
 gcc/testsuite/g++.dg/abi/mangle78.C   | 14 ++
 libiberty/cp-demangle.c               |  1 +
 libiberty/testsuite/demangle-expected |  3 +++
 4 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/abi/mangle78.C

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 826c5e76c1d..7dab4e62bc9 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -3402,6 +3402,11 @@ write_expression (tree expr)
       else
         write_string ("tr");
     }
+  else if (code == NOEXCEPT_EXPR)
+    {
+      write_string ("nx");
+      write_expression (TREE_OPERAND (expr, 0));
+    }
   else if (code == CONSTRUCTOR)
     {
       bool braced_init = BRACE_ENCLOSED_INITIALIZER_P (expr);
diff --git a/gcc/testsuite/g++.dg/abi/mangle78.C b/gcc/testsuite/g++.dg/abi/mangle78.C
new file mode 100644
index 000..a3647711604
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/mangle78.C
@@ -0,0 +1,14 @@
+// PR c++/70790
+// { dg-do compile { target c++11 } }
+
+template
+struct A { };
+
+template
+void f(A);
+
+int main() {
+  f({});
+}
+
+// { dg-final { scan-assembler "_Z1fIiEv1AIXnxcvT__EEE" } }
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index f2b36bcad68..341c66db919 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -1947,6 +1947,7 @@ const struct demangle_operator_info cplus_demangle_operators[] =
   { "ng", NL ("-"), 1 },
   { "nt", NL ("!"), 1 },
   { "nw", NL ("new"), 3 },
+  { "nx", NL ("noexcept"), 1 },
   { "oR", NL ("|="), 2 },
   { "oo", NL ("||"), 2 },
   { "or", NL ("|"), 2 },
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index d9bc7ed4b1f..7195cc39c19 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -1659,3 +1659,6 @@ auto f()::{lambda(X<$T0>*, X*)#1}::operator()(X*,
 _ZZN1XIiE1FEvENKUliE_clEi
 X::F()::{lambda(int)#1}::operator()(int) const
+
+_Z1fIiEv1AIXnxcvT__EEE
+void f(A)
-- 
2.41.0.rc0.4.g004e0f790f
[Bug c++/108788] Lookup of injected class name should be type-dependent
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108788

Patrick Palka changed:

           What    |Removed                     |Added
                 CC|                            |jason at gcc dot gnu.org,
                   |                            |ppalka at gcc dot gnu.org

--- Comment #2 from Patrick Palka ---
Partially fixed by r12-3643-g18b57c1d4a8777.  Reduced version of what we still
reject:

template struct templ_base { };

template
int get_templ_base(T&& v) {
  return v.templ_base::a; // fails in all gcc versions
}

: In function ‘int get_templ_base(T&&)’:
:7:14: error: ‘template struct templ_base’ used without template arguments
[Bug preprocessor/109912] New: #pragma GCC diagnostic ignored "-Wall" is ignored
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109912

            Bug ID: 109912
           Summary: #pragma GCC diagnostic ignored "-Wall" is ignored
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: preprocessor
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ed at catmur dot uk
  Target Milestone: ---

#pragma GCC diagnostic warning "-Wall"
#pragma GCC diagnostic ignored "-Wall"
int i = 0 | 1 & 2;

warning: suggest parentheses around arithmetic in operand of '|'
[-Wparentheses]
    3 | int i = 0 | 1 & 2;
      |             ~~^~~

The expected behavior would be for `diagnostic ignored "-Wall"` to suppress all
the warnings that were enabled by `diagnostic warning "-Wall"`. If this isn't
possible, it would be good to emit a diagnostic that `diagnostic ignored
"-Wall"` has no effect.

Clang does support this and appears to have always done so.
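
For comparison, a possible workaround while this is open is to name the individual warning that the diagnostic output reports ([-Wparentheses]) instead of the "-Wall" group, scoped with push/pop. The sketch below assumes GCC; `precedence_demo` is a hypothetical name, and the claim that the per-warning pragma silences the message is the usual documented behavior rather than something taken from this report.

```cpp
#include <cassert>

#pragma GCC diagnostic push
#pragma GCC diagnostic warning "-Wall"
// Name the specific warning; per the report, ignoring "-Wall" has no effect.
#pragma GCC diagnostic ignored "-Wparentheses"
int precedence_demo() {
  // '&' binds tighter than '|', so this parses as 0 | (1 & 2).
  int value = 0 | 1 & 2;
  assert(value == (0 | (1 & 2)));
  return value;
}
#pragma GCC diagnostic pop
```

The drawback is the one the report implies: every warning enabled by the group has to be listed by hand.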
[Bug target/109279] RISC-V: complex constants synthesized should be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #16 from Vineet Gupta ---
> Which is what this produces:
> ```
> long long f(void)
> {
>   unsigned t = 16843009;
>   long long t1 = t;
>   long long t2 = ((unsigned long long )t) << 32;
>   asm("":"+r"(t1));
>   return t1 | t2;
> }
> ```
> I suspect: 0x0080402010080400ULL should be done as two 32bit with a shift/or
> added too. Will definitely improve complex constants forming too.
>
> Right now the backend does (const<<16+const)<<16+const... which is just so
> bad.

Umm this testcase is a different problem. It used to generate the same output
but no longer after g2e886eef7f2b5a and the other related updates:
g0530254413f8 and gc104ef4b5eb1.

For the test above, the low and high words are created independently and then
stitched.

260r.dfinit

# lower word

(insn 6 2 7 2 (set (reg:DI 138)
        (const_int [0x101]))  {*movdi_64bit}

(insn 7 6 8 2 (set (reg:DI 137)
        (plus:DI (reg:DI 138)
            (const_int [0x101]))) {adddi3}
     (expr_list:REG_EQUAL (const_int [0x1010101]) )

(insn 5 8 9 2 (set (reg/v:DI 134 [ t1 ])
        (reg:DI 136 [ t1 ])) {*movdi_64bit}

# upper word created independently, no reuse from prior values)

(insn 9 5 10 2 (set (reg:DI 141)
        (const_int [0x101]))  {*movdi_64bit}

(insn 10 9 11 2 (set (reg:DI 142)
        (plus:DI (reg:DI 141)
            (const_int [0x101]))) {adddi3}

(insn 11 10 12 2 (set (reg:DI 140)
        (ashift:DI (reg:DI 142)
            (const_int 32 [0x20]))) {ashldi3}
     (expr_list:REG_EQUAL (const_int [0x1010101]))

# stitch them

(insn 12 11 13 2 (set (reg:DI 139)
        (ior:DI (reg/v:DI 134 [ t1 ])
            (reg:DI 140))) "const2.c":7:13 99 {iordi3}

cse1 matches the new "*mvconst_internal" pattern independently on each of them

(insn 7 6 8 2 (set (reg:DI 137)
        (const_int [0x1010101])) {*mvconst_internal}
     (expr_list:REG_EQUAL (const_int [0x1010101])))

(insn 11 10 12 2 (set (reg:DI 140)
        (const_int [0x1010101_])) {*mvconst_internal}
     (expr_list:REG_EQUAL (const_int [0x1010101_]) ))

This ultimately gets in the way, as otherwise it would find the equivalent reg
across the 2 snippets and reuse the reg.

It is interesting that, due to the same pattern, split1 undoes what cse1 did,
so in theory cse2 ? could redo it. Anyhow, this needs to be investigated.

But ATM we have the following codegen for the aforementioned test, which
clearly needs more work.

        li      a0,16842752
        addi    a0,a0,257
        li      a5,16842752
        slli    a0,a0,32
        addi    a5,a5,257
        or      a0,a5,a0
        ret
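
To make the arithmetic behind that instruction sequence concrete, the sketch below reproduces it in plain integer code: the quoted `li`/`addi` pair builds the 32-bit pattern once per half, and the two halves are then combined with a shift and an or. The function names are hypothetical, and the "reused" variant only illustrates the value-level equivalence that would make reuse of the first register possible; it says nothing about how the RISC-V backend would actually be changed.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors "li 16842752; addi 257": 0x1010000 + 0x101 == 0x01010101.
std::uint64_t build_word() {
  return std::uint64_t{16842752} + 257;
}

// Codegen as quoted: each half is materialized separately (two li/addi
// pairs), then one half is shifted left by 32 and or-ed with the other.
std::uint64_t stitch_independent() {
  std::uint64_t hi = build_word();
  std::uint64_t lo = build_word();
  return lo | (hi << 32);
}

// With reuse: materialize the word once, then shift and or the same value.
std::uint64_t stitch_reused() {
  std::uint64_t w = build_word();
  return w | (w << 32);
}

// Both routes must produce the same 64-bit constant.
void check_equivalence() {
  assert(stitch_independent() == stitch_reused());
}
```

Counted in instructions, the reused form would need one `li`/`addi` pair instead of two, which is the saving the comment is after.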