https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108803

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Besides this question whether it is ok to emit possibly out-of-bounds shifts
into code executed at runtime where source code didn't have any UB, provided
that their destination is only used in a conditional move that will not pick
that result up, I think we have a combiner bug here.

Copying what I wrote in
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612909.html:
I've added debug_bb_n after
every successful combine attempt,
--- gcc/combine.cc.jj   2023-01-02 09:32:43.720977011 +0100
+++ gcc/combine.cc      2023-02-27 19:27:28.151289055 +0100
@@ -4755,6 +4755,16 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
   if (added_notes_insn && DF_INSN_LUID (added_notes_insn) < DF_INSN_LUID
(ret))
     ret = added_notes_insn;

+{
+static int cnt = 0;
+char buf[64];
+sprintf (buf, "/tmp/combine.%d", cnt++);
+FILE *f = fopen (buf, "a");
+fprintf (f, "%d\n", combine_successes);
+basic_block bb = BASIC_BLOCK_FOR_FN (cfun, 2);
+dump_bb (f, bb, 0, dump_flags_t ());
+fclose (f);
+}
   return ret;
 }

and I see
(insn 52 48 53 2 (set (reg:CC 66 cc)
        (compare:CC (reg:SI 130)
            (const_int 0 [0]))) "pr108803.c":12:25 437 {cmpsi}
     (expr_list:REG_DEAD (reg:SI 130)
        (expr_list:REG_EQUAL (compare:CC (const_int -64 [0xffffffffffffffc0])
                (const_int 0 [0]))
            (nil))))
(insn 53 52 57 2 (set (reg:DI 152 [ _6+8 ])
        (if_then_else:DI (ge (reg:CC 66 cc)
                (const_int 0 [0]))
            (reg:DI 132)
            (const_int 0 [0]))) "pr108803.c":12:25 490 {*cmovdi_insn}
     (expr_list:REG_DEAD (reg:DI 132)
        (nil)))
(insn 57 53 59 2 (set (reg:DI 151 [ _6 ])
        (if_then_else:DI (ge (reg:CC 66 cc)
                (const_int 0 [0]))
            (const_int 0 [0])
            (reg:DI 126))) "pr108803.c":12:25 490 {*cmovdi_insn}
     (expr_list:REG_DEAD (reg:CC 66 cc)
        (nil)))
...
(insn 71 68 72 2 (set (reg:CC 66 cc)
        (compare:CC (reg:SI 137)
            (const_int 0 [0]))) "pr108803.c":12:42 437 {cmpsi}
     (expr_list:REG_DEAD (reg:SI 137)
        (expr_list:REG_EQUAL (compare:CC (const_int -64 [0xffffffffffffffc0])
                (const_int 0 [0]))
            (nil))))
(insn 72 71 76 2 (set (reg:DI 153 [ _8 ])
        (if_then_else:DI (ge (reg:CC 66 cc)
                (const_int 0 [0]))
            (reg:DI 139)
            (reg:DI 153 [ _8 ]))) "pr108803.c":12:42 490 {*cmovdi_insn}
     (expr_list:REG_DEAD (reg:DI 139)
        (nil)))
(insn 76 72 77 2 (set (reg:DI 154 [ _8+8 ])
        (if_then_else:DI (ge (reg:CC 66 cc)
                (const_int 0 [0]))
            (reg:DI 138)
            (reg:DI 127))) "pr108803.c":12:42 490 {*cmovdi_insn}
     (expr_list:REG_DEAD (reg:DI 138)
        (expr_list:REG_DEAD (reg:DI 127)
            (expr_list:REG_DEAD (reg:CC 66 cc)
                (nil)))))
(insn 77 76 78 2 (set (reg:DI 159 [ b ])
        (ior:DI (reg:DI 151 [ _6 ])
            (reg:DI 126))) "pr108803.c":12:12 537 {iordi3}
     (expr_list:REG_DEAD (reg:DI 126)
        (expr_list:REG_DEAD (reg:DI 151 [ _6 ])
            (nil))))
(insn 78 77 80 2 (set (reg:DI 160 [ b+8 ])
        (reg:DI 152 [ _6+8 ])) "pr108803.c":12:12 65 {*movdi_aarch64}
     (expr_list:REG_DEAD (reg:DI 152 [ _6+8 ])
        (nil)))
in one of the dumps which still looks correct from brief skimming,
r130 here is k - 64 aka -64 and it is the pair of conditional moves of the
first TImode shift, effectively (a + 63) << 0 but 0 is obfuscated at
expansion time, while the second pair of conditional moves are for the
second TImode shift, effectively (a + 63) >> (0 & 63) and then both ored
together.  It makes sense that the second set of conditional moves are
optimized
earlier because the shift count in that case is more constrained, it is masked
with 63,
so one can see (0 & 63) - 64 < 0, while r130 comparison isn't optimized so
early
because r130 has 2 users and one of them couldn't be merged.
Note, this is the exact spot why I think we want the other patch, we can't
merge
(insn 40 37 41 2 (set (reg:SI 130)
        (const_int -64 [0xffffffffffffffc0])) "pr108803.c":12:25 64
{*movsi_aarch64}
     (nil))
(insn 41 40 43 2 (set (reg:DI 132)
        (ashift:DI (reg:DI 126)
            (subreg:QI (reg:SI 130) 0))) "pr108803.c":12:25 781
{*aarch64_ashl_sisd_or_int_di3}
     (expr_list:REG_EQUAL (ashift:DI (reg:DI 126)
            (const_int -64 [0xffffffffffffffc0]))
        (nil)))
because the backend doesn't like out of bounds shift count immediates in the
instruction
directly.
When doing
Trying 53 -> 78:
   53: r152:DI={(cc:CC>=0)?r132:DI:0}
      REG_DEAD r132:DI
   78: r160:DI=r152:DI
      REG_DEAD r152:DI
Successfully matched this instruction:
(set (reg:DI 160 [ b+8 ])
    (if_then_else:DI (ge (reg:CC 66 cc)
            (const_int 0 [0]))
        (reg:DI 132)
        (const_int 0 [0])))
allowing combination of insns 53 and 78
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 53.
modifying insn i3    78: r160:DI={(cc:CC>=0)?r132:DI:0}
      REG_DEAD cc:CC
      REG_DEAD r132:DI
deferring rescan insn with uid = 78.
combination, the IL changes:
@@ -38,21 +38,15 @@
 (insn 52 48 53 2 (set (reg:CC 66 cc)
         (compare:CC (reg:SI 130)
             (const_int 0 [0]))) "pr108803.c":12:25 437 {cmpsi}
      (expr_list:REG_DEAD (reg:SI 130)
         (expr_list:REG_EQUAL (compare:CC (const_int -64 [0xffffffffffffffc0])
                 (const_int 0 [0]))
             (nil))))
-(insn 53 52 57 2 (set (reg:DI 152 [ _6+8 ])
-        (if_then_else:DI (ge (reg:CC 66 cc)
-                (const_int 0 [0]))
-            (reg:DI 132)
-            (const_int 0 [0]))) "pr108803.c":12:25 490 {*cmovdi_insn}
-     (expr_list:REG_DEAD (reg:DI 132)
-        (nil)))
+(note 53 52 57 2 NOTE_INSN_DELETED)
 (insn 57 53 59 2 (set (reg:DI 151 [ _6 ])
         (if_then_else:DI (ge (reg:CC 66 cc)
                 (const_int 0 [0]))
             (const_int 0 [0])
             (reg:DI 126))) "pr108803.c":12:25 490 {*cmovdi_insn}
      (expr_list:REG_DEAD (reg:CC 66 cc)
         (nil)))
@@ -72,17 +66,21 @@
 (insn 77 76 78 2 (set (reg:DI 159 [ b ])
         (ior:DI (reg:DI 151 [ _6 ])
             (reg:DI 126))) "pr108803.c":12:12 537 {iordi3}
      (expr_list:REG_DEAD (reg:DI 126)
         (expr_list:REG_DEAD (reg:DI 151 [ _6 ])
             (nil))))
 (insn 78 77 80 2 (set (reg:DI 160 [ b+8 ])
-        (reg:DI 152 [ _6+8 ])) "pr108803.c":12:12 65 {*movdi_aarch64}
-     (expr_list:REG_DEAD (reg:DI 152 [ _6+8 ])
-        (nil)))
+        (if_then_else:DI (ge (reg:CC 66 cc)
+                (const_int 0 [0]))
+            (reg:DI 132)
+            (const_int 0 [0]))) "pr108803.c":12:12 490 {*cmovdi_insn}
+     (expr_list:REG_DEAD (reg:CC 66 cc)
+        (expr_list:REG_DEAD (reg:DI 132)
+            (nil))))
 (insn 80 78 82 2 (parallel [
             (set (reg:CC_C 66 cc)
                 (compare:CC_C (plus:DI (reg:DI 155 [ a ])
                         (reg:DI 159 [ b ]))
                     (reg:DI 155 [ a ])))
             (set (reg:DI 144)
                 (plus:DI (reg:DI 155 [ a ])
but as you can see, because cc reg has been REG_DEAD before on insn 57
rather than on insn 53, nothing really removed REG_DEAD note from there
and just adds it on insn 78 (note, besides this REG_DEAD issue the
IL is otherwise still sane, the previous cc setter 71 and its previous
uses 72 and 76 in between the move have been optimized away already in
an earlier successful combination).
And things go wild with the next successful combination:
Trying 52 -> 57:
   52: cc:CC=cmp(0xffffffffffffffc0,0)
      REG_DEAD r130:SI
      REG_EQUAL cmp(0xffffffffffffffc0,0)
   57: r151:DI={(cc:CC>=0)?0:r126:DI}
      REG_DEAD cc:CC
Successfully matched this instruction:
(set (reg:DI 151 [ _6 ])
    (reg:DI 126))
allowing combination of insns 52 and 57
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 52.
modifying insn i3    57: r151:DI=r126:DI
deferring rescan insn with uid = 57.
where the IL changes:
@@ -29,27 +29,18 @@
 (insn 41 40 43 2 (set (reg:DI 132)
         (ashift:DI (reg:DI 126)
             (subreg:QI (reg:SI 130) 0))) "pr108803.c":12:25 781
{*aarch64_ashl_sisd_or_int_di3}
-     (expr_list:REG_EQUAL (ashift:DI (reg:DI 126)
-            (const_int -64 [0xffffffffffffffc0]))
-        (nil)))
+     (expr_list:REG_DEAD (reg:SI 130)
+        (expr_list:REG_EQUAL (ashift:DI (reg:DI 126)
+                (const_int -64 [0xffffffffffffffc0]))
+            (nil))))
 (note 43 41 46 2 NOTE_INSN_DELETED)
 (note 46 43 48 2 NOTE_INSN_DELETED)
 (note 48 46 52 2 NOTE_INSN_DELETED)
-(insn 52 48 53 2 (set (reg:CC 66 cc)
-        (compare:CC (reg:SI 130)
-            (const_int 0 [0]))) "pr108803.c":12:25 437 {cmpsi}
-     (expr_list:REG_DEAD (reg:SI 130)
-        (expr_list:REG_EQUAL (compare:CC (const_int -64 [0xffffffffffffffc0])
-                (const_int 0 [0]))
-            (nil))))
+(note 52 48 53 2 NOTE_INSN_DELETED)
 (note 53 52 57 2 NOTE_INSN_DELETED)
 (insn 57 53 59 2 (set (reg:DI 151 [ _6 ])
-        (if_then_else:DI (ge (reg:CC 66 cc)
-                (const_int 0 [0]))
-            (const_int 0 [0])
-            (reg:DI 126))) "pr108803.c":12:25 490 {*cmovdi_insn}
-     (expr_list:REG_DEAD (reg:CC 66 cc)
-        (nil)))
+        (reg:DI 126)) "pr108803.c":12:25 65 {*movdi_aarch64}
+     (nil))
 (note 59 57 60 2 NOTE_INSN_DELETED)
 (insn 60 59 61 2 (set (reg:DI 139)
         (const_int 0 [0])) "pr108803.c":12:42 65 {*movdi_aarch64}
This removes the cc setter that was used by insn 57 (which has been
updated) but is also needed for the former insn 53 which has been
merged into insn 78.

Reply via email to