[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 YunQiang Su changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #26 from YunQiang Su --- Since we have 2 fixes both fixed this problem. Let's close it. Should we back port it to gcc13/gcc12?
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #25 from GCC Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:3ac58063114cf491891072be6205d32a42c6707d commit r14-6915-g3ac58063114cf491891072be6205d32a42c6707d Author: Roger Sayle Date: Thu Jan 4 10:49:33 2024 + Improved RTL expansion of field assignments into promoted registers. This patch fixes PR rtl-optmization/104914 by tweaking/improving the way the fields are written into a pseudo register that needs to be kept sign extended. The motivating example from the bugzilla PR is: extern void ext(int); void foo(const unsigned char *buf) { int val; ((unsigned char*))[0] = *buf++; ((unsigned char*))[1] = *buf++; ((unsigned char*))[2] = *buf++; ((unsigned char*))[3] = *buf++; if(val > 0) ext(1); else ext(0); } which at the end of the tree optimization passes looks like: void foo (const unsigned char * buf) { int val; unsigned char _1; unsigned char _2; unsigned char _3; unsigned char _4; int val.5_5; [local count: 1073741824]: _1 = *buf_7(D); MEM[(unsigned char *)] = _1; _2 = MEM[(const unsigned char *)buf_7(D) + 1B]; MEM[(unsigned char *) + 1B] = _2; _3 = MEM[(const unsigned char *)buf_7(D) + 2B]; MEM[(unsigned char *) + 2B] = _3; _4 = MEM[(const unsigned char *)buf_7(D) + 3B]; MEM[(unsigned char *) + 3B] = _4; val.5_5 = val; if (val.5_5 > 0) goto ; [59.00%] else goto ; [41.00%] [local count: 633507681]: ext (1); goto ; [100.00%] [local count: 440234144]: ext (0); [local count: 1073741824]: val ={v} {CLOBBER(eol)}; return; } Here four bytes are being sequentially written into the SImode value val. On some platforms, such as MIPS64, this SImode value is kept in a 64-bit register, suitably sign-extended. The function expand_assignment contains logic to handle this via SUBREG_PROMOTED_VAR_P (around line 6264 in expr.cc) which outputs an explicit extension operation after each store_field (typically insv) to such promoted/extended pseudos. The first observation is that there's no need to perform sign extension after each byte in the example above; the extension is only required after changes to the most significant byte (i.e. to a field that overlaps the most significant bit). The bug fix is actually a bit more subtle, but at this point during code expansion it's not safe to use a SUBREG when sign-extending this field. Currently, GCC generates (sign_extend:DI (subreg:SI (reg:DI) 0)) but combine (and other RTL optimizers) later realize that because SImode values are always sign-extended in their 64-bit hard registers that this is a no-op and eliminates it. The trouble is that it's unsafe to refer to the SImode lowpart of a 64-bit register using SUBREG at those critical points when temporarily the value isn't correctly sign-extended, and the usual backend invariants don't hold. At these critical points, the middle-end needs to use an explicit TRUNCATE rtx (as this isn't a TRULY_NOOP_TRUNCATION), so that the explicit sign-extension looks like (sign_extend:DI (truncate:SI (reg:DI)), which avoids the problem. 2024-01-04 Roger Sayle Jeff Law gcc/ChangeLog PR rtl-optimization/104914 * expr.cc (expand_assignment): When target is SUBREG_PROMOTED_VAR_P a sign or zero extension is only required if the modified field overlaps the SUBREG's most significant bit. On MODE_REP_EXTENDED targets, don't refer to the temporarily incorrectly extended value using a SUBREG, but instead generate an explicit TRUNCATE rtx.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #24 from GCC Commits --- The master branch has been updated by YunQiang Su : https://gcc.gnu.org/g:65d4b32dee2508f5bcdd999a132332cd46cf8a4d commit r14-6905-g65d4b32dee2508f5bcdd999a132332cd46cf8a4d Author: YunQiang Su Date: Sat Dec 30 00:17:52 2023 +0800 MIPS: Add pattern insqisi_extended and inshisi_extended This match pattern allows combination (zero_extract:DI 8, 24, QI) with an sign-extend to 32bit INS instruction on TARGET_64BIT. For SI mode, if the sign-bit is modified by bitops, we will need a sign-extend operation. Since 32bit INS instruction can be sure that result is sign-extended, and the QImode src register is safe for INS, too. (insn 19 18 20 2 (set (zero_extract:DI (reg/v:DI 200 [ val ]) (const_int 8 [0x8]) (const_int 24 [0x18])) (subreg:DI (reg:QI 205) 0)) "../xx.c":7:29 -1 (nil)) (insn 20 19 23 2 (set (reg/v:DI 200 [ val ]) (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "../xx.c":7:29 -1 (nil)) Combine try to merge them to: (insn 20 19 23 2 (set (reg/v:DI 200 [ val ]) (sign_extend:DI (ior:SI (and:SI (subreg:SI (reg/v:DI 200 [ val ]) 0) (const_int 16777215 [0xff])) (ashift:SI (subreg:SI (reg:QI 205 [ MEM[(const unsigned char *)buf_8(D) + 3B] ]) 0) (const_int 24 [0x18]) "../xx.c":7:29 18 {*insv_extended} (expr_list:REG_DEAD (reg:QI 205 [ MEM[(const unsigned char *)buf_8(D) + 3B] ]) (nil))) And do similarly for 16/16 pair: (insn 13 12 14 2 (set (zero_extract:DI (reg/v:DI 198 [ val ]) (const_int 16 [0x10]) (const_int 16 [0x10])) (subreg:DI (reg:HI 201 [ MEM[(const short unsigned int *)buf_6(D) + 2B] ]) 0)) "xx.c":5:30 286 {*insvdi} (expr_list:REG_DEAD (reg:HI 201 [ MEM[(const short unsigned int *)buf_6(D) + 2B] ]) (nil))) (insn 14 13 17 2 (set (reg/v:DI 198 [ val ]) (sign_extend:DI (subreg:SI (reg/v:DI 198 [ val ]) 0))) "xx.c":5:30 241 {extendsidi2} (nil)) > (insn 14 13 17 2 (set (reg/v:DI 198 [ val ]) (sign_extend:DI (ior:SI (ashift:SI (subreg:SI (reg:HI 201 [ MEM[(const short unsigned int *)buf_6(D) + 2B] ]) 0) (const_int 16 [0x10])) (zero_extend:SI (subreg:HI (reg/v:DI 198 [ val ]) 0) "xx.c":5:30 284 {*inshisi_extended} (expr_list:REG_DEAD (reg:HI 201 [ MEM[(const short unsigned int *)buf_6(D) + 2B] ]) (nil))) Let's accept these patterns, and set the cost to 1 instruction. gcc PR rtl-optimization/104914 * config/mips/mips.md (insqisi_extended): New patterns. (inshisi_extended): Ditto. gcc/testsuite * gcc.target/mips/pr104914.c: New test.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #23 from YunQiang Su --- I guess, we should drop TRULY_NOOP_TRUNCATION_MODES_P and TARGET_MODE_REP_EXTENDED for MIPS. In fact, it will only effect MIPS64 SI->DI. Then it may reduce the maintain workload for MIPS64. Let's have a try and run some benchmark.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #22 from YunQiang Su --- Any way, we should split the assert to another patch. I will try to find all the wrongly used TRULY_NOOP_TRUNCATION_MODES_P.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #21 from YunQiang Su --- Sorry, Roger. Your patch is correct. I misunderstood it.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #20 from YunQiang Su --- This patch has 2 problems: 1. We may need some more steps to add gcc_assert (outprec < inprec) Now, I met some ICE with it. 2. It doesn't solve the this problem: In combine.cc, jump_insn eats truncate and sign_extend. In fact that is the real problem: How to tell combine.cc: don't eat it; this truncate/sign_extend is really needed by us.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #19 from Roger Sayle --- Created attachment 56930 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56930=edit proposed patch And now for a patch that does (or should) work. This even contains an optimization, we middle-end knows we don't need to sign or zero extend if a insv doesn't modify the sign-bit. Testing on MIPS would be much appreciated. TIA.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #18 from Roger Sayle --- Please ignore comment #17, the above patch is completely bogus/broken.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #17 from Roger Sayle --- I think this patch might resolve the problem (or move it somewhere else): diff --git a/gcc/expr.cc b/gcc/expr.cc index 9fef2bf6585..218bca905f5 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -6274,10 +6274,7 @@ expand_assignment (tree to, tree from, bool nontemporal) result = store_expr (from, to_rtx, 0, nontemporal, false); else { - rtx to_rtx1 - = lowpart_subreg (subreg_unpromoted_mode (to_rtx), - SUBREG_REG (to_rtx), - subreg_promoted_mode (to_rtx)); + rtx to_rtx1 = gen_reg_rtx (subreg_unpromoted_mode (to_rtx)); result = store_field (to_rtx1, bitsize, bitpos, bitregion_start, bitregion_end, mode1, from, get_alias_set (to), The motivation/solution comes from a comment in expmed.cc: /* If the destination is a paradoxical subreg such that we need a truncate to the inner mode, perform the insertion on a temporary and truncate the result to the original destination. Note that we can't just truncate the paradoxical subreg as (truncate:N (subreg:W (reg:N X) 0)) is (reg:N X). */ The same caveat applies to extensions on MIPS, so we should use a new pseudo temporary register rather than update the SUBREG in place. If someone could confirm this fixes the issue on MIPS, I'll try to come up with a milder form of this fix that checks TARGET_MODE_REP_EXTENDED that'll limit the churn/impact on other targets.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #16 from YunQiang Su --- (In reply to Roger Sayle from comment #15) > Is MIPS64 actually a TRULY_NOOP_TRUNCATION_TARGET? If SImode is implicitly > assumed to be (sign?) extended, then an arbitrary DImode value/register > can't be used as an SImode value without appropriately setting/clearing the > upper bits. > i.e. thus this integer truncation isn't a no-op. > in gcc/config/mips/mips.cc, there are lines: static bool mips_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec) { return !TARGET_64BIT || inprec <= 32 || outprec > 32; } So for mips_truly_noop_truncation(64, 32), it is true, aka we can convert 32bit value to 64bit value without any insn. This setting is based on that most (if not all) word (32bit) operation insns are all sign-extend. For example, when we run these instructions on a MIPS64 CPU li $a1, 0x7fff add $a3, $a1, $a1 The result of $a3 will be: 0x fffe And for theses instructions: li $a1, 0x7fff dadd $a3, $a1, $a1 # note, add -> dadd Then the content of $a3 will be: 0x fffe And MIPS has the single instruction for: branch less than zero, for both MIPS32, MIPS64. Let me explain example 1: if the code is running on a 32bit CPU, the result of $a3 will be 0xfffe, which is -2. if the code is running on a 64bit CPU, since the result of $a3 will be sign-extend to 0x fffe, it is still -2. That's how MIPS make 32bit binaries run smoothly on a 64bit CPU without any mode switch. > I suspect that the underlying problem is that the backend is relying on > implicit invariants, not explicitly represented in the RTL, and then > surprised when valid RTL transformations don't preserve those > invariants/assumptions. > > I wonder why the zero_extract followed by sign_extend example mentioned in > https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626137.html isn't > already being considered as a try_combine candidate, allowing the backend to > simply recognize or split it. I'll investigate. Thanks.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 Roger Sayle changed: What|Removed |Added CC||roger at nextmovesoftware dot com --- Comment #15 from Roger Sayle --- Is MIPS64 actually a TRULY_NOOP_TRUNCATION_TARGET? If SImode is implicitly assumed to be (sign?) extended, then an arbitrary DImode value/register can't be used as an SImode value without appropriately setting/clearing the upper bits. i.e. thus this integer truncation isn't a no-op. I suspect that the underlying problem is that the backend is relying on implicit invariants, not explicitly represented in the RTL, and then surprised when valid RTL transformations don't preserve those invariants/assumptions. I wonder why the zero_extract followed by sign_extend example mentioned in https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626137.html isn't already being considered as a try_combine candidate, allowing the backend to simply recognize or split it. I'll investigate.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #14 from YunQiang Su --- New patch: diff --git a/gcc/expmed.cc b/gcc/expmed.cc index fbd4ce2d42f..66d45da67df 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -850,6 +850,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, since that case is valid for any mode. The following cases are only valid for integral modes. */ opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0)); + opt_scalar_int_mode str_mode = int_mode_for_mode (GET_MODE (str_rtx)); scalar_int_mode imode; if (!op0_mode.exists () || imode != GET_MODE (op0)) { @@ -881,8 +882,15 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, op0 = gen_lowpart (op0_mode.require (), op0); } - return store_integral_bit_field (op0, op0_mode, ibitsize, ibitnum, - bitregion_start, bitregion_end, + bool use_str_mode = false; + if (GET_MODE_CLASS(GET_MODE (str_rtx)) == MODE_INT + && GET_MODE_CLASS(GET_MODE (op0)) == MODE_INT + && known_gt (GET_MODE_SIZE(GET_MODE(op0)), GET_MODE_SIZE(GET_MODE(str_rtx +use_str_mode = true; + + return store_integral_bit_field (use_str_mode ? str_rtx : op0, + use_str_mode ? str_mode : op0_mode, + ibitsize, ibitnum, bitregion_start, bitregion_end, fieldmode, value, reverse, fallback_p); }
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #13 from YunQiang Su --- This tiny patch works will this small test case by replace with dins to ins. I have no idea whether it will have any side effects. Any idea? diff --git a/gcc/expmed.cc b/gcc/expmed.cc index fbd4ce2d42f..37f90912122 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -849,7 +849,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, if we aren't. This must come after the entire register case above, since that case is valid for any mode. The following cases are only valid for integral modes. */ - opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (op0)); + opt_scalar_int_mode op0_mode = int_mode_for_mode (GET_MODE (str_rtx)); scalar_int_mode imode; if (!op0_mode.exists () || imode != GET_MODE (op0)) {
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #12 from Andrew Pinski --- (In reply to Andrew Pinski from comment #11) > But I don't have any other notes on my change (nor a testcase). So I found some notes and it is similar but still different. We were expanding: ;; insn.j_format.target = D.21597_19; Into (insn 25 24 26 (set (reg:DI 220) (lshiftrt:DI (reg:DI 196 [ D.21584 ]) (const_int 2 [0x2]))) arch/mips/kernel/jump_label.c:56 -1 (nil)) (insn 26 25 0 (set (zero_extract:SI (reg/v:SI 208 [ insn ]) (const_int 26 [0x1a]) (const_int 6 [0x6])) (subreg:SI (reg:DI 220) 4)) arch/mips/kernel/jump_label.c:56 -1 (nil)) But the subreg there was incorrect. In this case of this bug, the reg is DI rather than SI. I wonder why we have that in the first place even though val is the size of SImode ...
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #11 from Andrew Pinski --- (In reply to Andrew Pinski from comment #10) > Created attachment 55496 [details] > old patch against GCC 4.7 > > I am trying to find my notes on this old patch but our internal bug system > has moved a few times and the project looks archived even. > But I am pretty sure this is related to the problem at hand. (note I had another patch before that which renamed store_bit_field_1 to store_bit_field_2). The code is now in store_bit_field_using_insv. Here: else { tmp = gen_lowpart_if_possible (op_mode, value1); if (! tmp) tmp = gen_lowpart (op_mode, force_reg (value_mode, value1)); } value1 = tmp; } But I don't have any other notes on my change (nor a testcase).
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #10 from Andrew Pinski --- Created attachment 55496 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55496=edit old patch against GCC 4.7 I am trying to find my notes on this old patch but our internal bug system has moved a few times and the project looks archived even. But I am pretty sure this is related to the problem at hand.
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=67736 --- Comment #9 from Andrew Pinski --- (In reply to YunQiang Su from comment #8) > (In reply to Andrew Pinski from comment #7) > > The initial RTL has a signed extend in there: > > > > > > (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ]) > > (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) > > "/app/example.cpp":7:29 -1 > > (nil)) > > (jump_insn 23 20 24 2 (set (pc) > > (if_then_else (le (subreg/s/u:SI (reg/v:DI 200 [ val+-4 ]) 4) > > (const_int 0 [0])) > > (label_ref 32) > > (pc))) "/app/example.cpp":8:5 -1 > > (int_list:REG_BR_PROB 440234148 (nil)) > > -> 32) > > > > > > Before combine also looks fine: > > (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ]) > > (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) > > "/app/example.cpp":7:29 235 {extendsidi2} > > (nil)) > > Yes. I noticed it. while in mips.md, extendsidi2 is expanded to no > instructions at all. Right then the le should had a truncation before the use of SI mode here ...
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #8 from YunQiang Su --- (In reply to Andrew Pinski from comment #7) > The initial RTL has a signed extend in there: > > > (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ]) > (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) > "/app/example.cpp":7:29 -1 > (nil)) > (jump_insn 23 20 24 2 (set (pc) > (if_then_else (le (subreg/s/u:SI (reg/v:DI 200 [ val+-4 ]) 4) > (const_int 0 [0])) > (label_ref 32) > (pc))) "/app/example.cpp":8:5 -1 > (int_list:REG_BR_PROB 440234148 (nil)) > -> 32) > > > Before combine also looks fine: > (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ]) > (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) > "/app/example.cpp":7:29 235 {extendsidi2} > (nil)) Yes. I noticed it. while in mips.md, extendsidi2 is expanded to no instructions at all. ``` ;; Extension insns. ;; Those for integer source operand are ordered widest source type first. ;; When TARGET_64BIT, all SImode integer and accumulator registers ;; should already be in sign-extended form (see TARGET_TRULY_NOOP_TRUNCATION ;; and truncdisi2). We can therefore get rid of register->register ;; instructions if we constrain the source to be in the same register as ;; the destination. ;; ;; Only the pre-reload scheduler sees the type of the register alternatives; ;; we split them into nothing before the post-reload scheduler runs. ;; These alternatives therefore have type "move" in order to reflect ;; what happens if the two pre-reload operands cannot be tied, and are ;; instead allocated two separate GPRs. We don't distinguish between ;; the GPR and LO cases because we don't usually know during pre-reload ;; scheduling whether an operand will be LO or not. (define_insn_and_split "extendsidi2" [(set (match_operand:DI 0 "register_operand" "=d,l,d") (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,0,m")))] "TARGET_64BIT" "@ # # lw\t%0,%1" "&& reload_completed && register_operand (operands[1], VOIDmode)" [(const_int 0)] { emit_note (NOTE_INSN_DELETED); DONE; } [(set_attr "move_type" "move,move,load") (set_attr "mode" "DI")]) ```
[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 Andrew Pinski changed: What|Removed |Added Component|target |rtl-optimization Ever confirmed|0 |1 Last reconfirmed||2023-07-06 Status|UNCONFIRMED |NEW Known to fail||11.2.0 --- Comment #7 from Andrew Pinski --- The initial RTL has a signed extend in there: (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ]) (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) "/app/example.cpp":7:29 -1 (nil)) (jump_insn 23 20 24 2 (set (pc) (if_then_else (le (subreg/s/u:SI (reg/v:DI 200 [ val+-4 ]) 4) (const_int 0 [0])) (label_ref 32) (pc))) "/app/example.cpp":8:5 -1 (int_list:REG_BR_PROB 440234148 (nil)) -> 32) Before combine also looks fine: (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ]) (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) "/app/example.cpp":7:29 235 {extendsidi2} (nil)) (jump_insn 23 20 24 2 (set (pc) (if_then_else (le (subreg/s/u:SI (reg/v:DI 200 [ val+-4 ]) 4) (const_int 0 [0])) (label_ref 32) (pc))) "/app/example.cpp":8:5 471 {*branch_ordersi} (expr_list:REG_DEAD (reg/v:DI 200 [ val+-4 ]) (int_list:REG_BR_PROB 440234148 (nil))) -> 32) But combine does the wrong thing: Trying 20 -> 23: 20: r200:DI=sign_extend(r200:DI#4) 23: pc={(r200:DI#4<=0)?L32:pc} REG_DEAD r200:DI REG_BR_PROB 440234148 Successfully matched this instruction: (set (pc) (if_then_else (le (subreg:SI (reg/v:DI 200 [ valD.1959+-4 ]) 4) (const_int 0 [0])) (label_ref 32) (pc))) allowing combination of insns 20 and 23 original costs 4 + 16 = 20 replacement cost 16 deferring deletion of insn with uid = 20. modifying insn i323: pc={(r200:DI#4<=0)?L32:pc} REG_BR_PROB 440234148 REG_DEAD r200:DI deferring rescan insn with uid = 23. Instead of a subreg here, it should have been a truncate.