On Fri, 28 Mar 2025, Jakub Jelinek wrote:
> Hi!
>
> The following testcase is miscompiled on x86_64-linux at -O2 by the combiner.
> We have from earlier combinations
> (insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
> (const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
> (nil))
> (insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
> (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
> (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
> (nil)))
> (note 24 23 25 4 NOTE_INSN_DELETED)
> (insn 25 24 26 4 (parallel [
> (set (reg:CCZ 17 flags)
> (compare:CCZ (neg:SI (reg:SI 104 [ _7 ]))
> (const_int 0 [0])))
> (set (reg/v:SI 116 [ e ])
> (neg:SI (reg:SI 104 [ _7 ])))
> ]) "pr119291.c":26:13 977 {*negsi_2}
> (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
> (nil)))
> (note 26 25 27 4 NOTE_INSN_DELETED)
> (insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
> (ne:DI (reg:CCZ 17 flags)
> (const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1}
> (expr_list:REG_DEAD (reg:CCZ 17 flags)
> (nil)))
> and try_combine is called on i3 25 and i2 22 (second time)
> and reach the hunk being patched with simplified i3
> (insn 25 24 26 4 (parallel [
> (set (pc)
> (pc))
> (set (reg/v:SI 116 [ e ])
> (const_int 0 [0]))
> ]) "pr119291.c":28:13 977 {*negsi_2}
> (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
> (nil)))
> and
> (insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
> (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
> (nil))
> Now, the try_combine code there attempts to split two independent
> sets in newpat by moving one of them to i2.
> And among other tests it checks
> !modified_between_p (SET_DEST (set1), i2, i3)
> which is certainly needed, if there would be say
> (set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))
> in between i2 and i3, we couldn't do that, as that set would overwrite
> the value set by set1 we want to move to the i2 position.
> But in this case pseudo 116 isn't set in between i2 and i3, but used
> (and additionally there is a REG_DEAD note for it).
>
> This is equally bad for the move, because while the i3 insn
> and later will see the pseudo value that we set, the insn in between
> which uses the value will see a different value from the one that
> it should see.
>
> As we don't check for that, in the end try_combine succeeds and
> changes the IL to:
> (insn 22 21 23 4 (set (reg/v:SI 116 [ e ])
> (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
> (nil))
> (insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
> (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
> (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
> (nil)))
> (note 24 23 25 4 NOTE_INSN_DELETED)
> (insn 25 24 26 4 (set (pc)
> (pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE}
> (nil))
> (note 26 25 27 4 NOTE_INSN_DELETED)
> (insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
> (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
> (nil))
> (note, the i3 got turned into a nop and try_combine also modified insn 27).
>
> The following patch replaces the modified_between_p
> tests with reg_used_between_p, my understanding is that
> modified_between_p is a subset of reg_used_between_p, so one
> doesn't need both.
>
> Bootstrapped/regtested on x86_64-linux, i686-linux, aarch64-linux,
> powerpc64le-linux and s390x-linux, ok for trunk?
>
> Looking at this some more today, I think we should special case
> set_noop_p because that can be put into i2 (except for the JUMP_P
> violations), currently both modified_between_p (pc_rtx, i2, i3)
> and reg_used_between_p (pc_rtx, i2, i3) returns false.
> I'll post a patch incrementally for that (but that feels like
> new optimization, so probably not something that should be backported).
Trying to review this I noticed that both the comment in combine suggests
that memory set is important and modified_between_p handles MEM_P
'x' and for non-REG 'x' possibly MEM operands (CONCAT?) while
reg_used_between_p only handles REG_P 'reg'. Also shouldn't
you use reg_referenced_p? At least that function seems to handle
SET_SRC/SET_DEST separately?
Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why
does the comment talk about memory?
> 2025-03-28 Jakub Jelinek <[email protected]>
>
> PR rtl-optimization/119291
> * combine.cc (try_combine): For splitting of PARALLEL with
> 2 independent SETs into i2 and i3 sets check reg_used_between_p
> of the SET_DESTs rather than just modified_between_p.
>
> * gcc.c-torture/execute/pr119291.c: New test.
>
> --- gcc/combine.cc.jj 2025-03-25 09:34:33.469102343 +0100
> +++ gcc/combine.cc 2025-03-27 09:50:15.768567383 +0100
> @@ -4012,18 +4012,18 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
> rtx set1 = XVECEXP (newpat, 0, 1);
>
> /* Normally, it doesn't matter which of the two is done first, but
> - one which uses any regs/memory set in between i2 and i3 can't
> - be first. The PARALLEL might also have been pre-existing in i3,
> - so we need to make sure that we won't wrongly hoist a SET to i2
> - that would conflict with a death note present in there, or would
> - have its dest modified between i2 and i3. */
> + one which uses any regs/memory set or used in between i2 and i3
> + can't be first. The PARALLEL might also have been pre-existing
> + in i3, so we need to make sure that we won't wrongly hoist a SET
> + to i2 that would conflict with a death note present in there, or
> + would have its dest modified or used between i2 and i3. */
> if (!modified_between_p (SET_SRC (set1), i2, i3)
> && !(REG_P (SET_DEST (set1))
> && find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
> && !(GET_CODE (SET_DEST (set1)) == SUBREG
> && find_reg_note (i2, REG_DEAD,
> SUBREG_REG (SET_DEST (set1))))
> - && !modified_between_p (SET_DEST (set1), i2, i3)
> + && !reg_used_between_p (SET_DEST (set1), i2, i3)
> /* If I3 is a jump, ensure that set0 is a jump so that
> we do not create invalid RTL. */
> && (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
> @@ -4038,7 +4038,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
> && !(GET_CODE (SET_DEST (set0)) == SUBREG
> && find_reg_note (i2, REG_DEAD,
> SUBREG_REG (SET_DEST (set0))))
> - && !modified_between_p (SET_DEST (set0), i2, i3)
> + && !reg_used_between_p (SET_DEST (set0), i2, i3)
> /* If I3 is a jump, ensure that set1 is a jump so that
> we do not create invalid RTL. */
> && (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
> --- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj 2025-03-27
> 09:48:01.917407084 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr119291.c 2025-03-27
> 09:47:48.020598094 +0100
> @@ -0,0 +1,33 @@
> +/* PR rtl-optimization/119291 */
> +
> +int a;
> +long c;
> +
> +__attribute__((noipa)) void
> +foo (int x)
> +{
> + if (x != 0)
> + __builtin_abort ();
> + a = 42;
> +}
> +
> +int
> +main ()
> +{
> + int e = 1;
> +lab:
> + if (a < 2)
> + {
> + int b = e;
> + _Bool d = a != 0;
> + _Bool f = b != 0;
> + unsigned long g = -(d & f);
> + unsigned long h = c & g;
> + unsigned long i = ~c;
> + e = -(i & h);
> + c = e != 0;
> + a = ~e + b;
> + foo (e);
> + goto lab;
> + }
> +}
>
> Jakub
>
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)