On Mon, Dec 22, 2014 at 3:54 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Sat, Dec 20, 2014 at 8:18 PM, Eric Botcazou <ebotca...@adacore.com> wrote: >>> As described both in the PR and patch comments, this patch fixes PR62151 by >>> setting right value to ELIM_I0/ELIM_I1 when distributing REG_DEAD notes from >>> i0/i1. It is said that distribute_notes had caused many bugs in the past. >>> I think it still has bug in it, as noted in the PR. This patch doesn't >>> touch distribute_notes because we are in stage3 and I want to have more >>> discussion on it. >>> Bootstrap and test on x86_64. aarch64 is ongoing. So is it ok? >>> >>> 2014-12-11 Bin Cheng <bin.ch...@arm.com> >>> >>> PR rtl-optimization/62151 >>> * combine.c (try_combine): Reset elim_i0 and elim_i1 when >>> distributing notes from i0notes or i1notes, this time don't >>> check whether newi2pat sets i1dest or i0dest. >> >> The reasoning looks correct to me and the patch is certainly safe so it's OK >> on principle, but I think that we should avoid the duplication of predicates. >> >> Can you move the computation of the alternative elim_i1 & elim_i0 up to where >> the original ones are computed along with the explanation of why we care >> about >> newi2pat only for notes that were on I3 and I2? Something like: >> >> /* Compute which registers we expect to eliminate. newi2pat may be >> setting >> either i3dest or i2dest, so we must check it. */ >> rtx elim_i2 = ((newi2pat && reg_set_p (i2dest, newi2pat)) >> || i2dest_in_i2src || i2dest_in_i1src || i2dest_in_i0src >> || !i2dest_killed >> ? 0 : i2dest); >> /* For I1 we need to compute both local elimination and global elimination >> because i1dest may be the same as i3dest, in which case newi2pat may be >> setting i1dest. <big explanation of why this is needed> */ >> rtx local_elim_i1 = (i1 == 0 || i1dest_in_i1src || i1dest_in_i0src >> || !i1dest_killed >> ? 0 : i1dest); >> rtx elim_i1 = (local_elim_i1 == 0 >> || (newi2pat && reg_set_p (i1dest, newi2pat)) >> ? 0 : i1dest); >> /* Likewise for I0. */ >> rtx local_elim_i0 = (i0 == 0 || i0dest_in_i0src >> || !i0dest_killed >> ? 0 : i0dest); >> rtx elim_i0 = (local_elim_i0 == 0 >> || (newi2pat && reg_set_p (i0dest, newi2pat)) >> ? 0 : i0dest); >> >> -- >> Eric Botcazou > > Hi Eric, > Thanks for reviewing. Here comes the revised patch. Bootstrap and > test on x86_64, is it OK? > > Thanks, > bin > > > 2014-12-22 Bin Cheng <bin.ch...@arm.com> > > PR rtl-optimization/62151 > * combine.c (try_combine): New local variables local_elim_i1 > and local_elim_i0. Set elim_i1 and elim_i0 using the local > version variables. Distribute notes from i0notes or i1notes > using the local variavbles. > > gcc/testsuite/ChangeLog > 2014-12-22 Bin Cheng <bin.ch...@arm.com> > > PR rtl-optimization/62151 > * gcc.c-torture/execute/pr62151.c: New test.
Hmm, I further revised comment in the patch since in try_combine, i2 is always after i0/i1. The original comment is inaccurate about that. Thanks, bin
Index: gcc/testsuite/gcc.c-torture/execute/pr62151.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/pr62151.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/pr62151.c (revision 0) @@ -0,0 +1,41 @@ +/* PR rtl-optimization/62151 */ + +int a, c, d, e, f, g, h, i; +short b; + +int +fn1 () +{ + b = 0; + for (;;) + { + int j[2]; + j[f] = 0; + if (h) + d = 0; + else + { + for (; f; f++) + ; + for (a = 0; a < 1; a++) + for (;;) + { + i = b & ((b ^ 1) & 83647) ? b : b - 1; + g = 1 ? i : 0; + e = j[0]; + if (c) + break; + return 0; + } + } + } +} + +int +main () +{ + fn1 (); + if (g != -1) + __builtin_abort (); + return 0; +} Index: gcc/combine.c =================================================================== --- gcc/combine.c (revision 218855) +++ gcc/combine.c (working copy) @@ -4119,19 +4119,46 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn rtx midnotes = 0; int from_luid; /* Compute which registers we expect to eliminate. newi2pat may be setting - either i3dest or i2dest, so we must check it. Also, i1dest may be the - same as i3dest, in which case newi2pat may be setting i1dest. */ + either i3dest or i2dest, so we must check it. */ rtx elim_i2 = ((newi2pat && reg_set_p (i2dest, newi2pat)) || i2dest_in_i2src || i2dest_in_i1src || i2dest_in_i0src || !i2dest_killed ? 0 : i2dest); - rtx elim_i1 = (i1 == 0 || i1dest_in_i1src || i1dest_in_i0src + /* For i1, we need to compute both local elimination and global + elimination information with respect to newi2pat because i1dest + may be the same as i3dest, in which case newi2pat may be setting + i1dest. Global information is used when distributing REG_DEAD + note for i2 and i3, in which case it does matter if newi2pat sets + i1dest or not. + + Local information is used when distributing REG_DEAD note for i1, + in which case it doesn't matter if newi2pat sets i1dest or not. + See PR62151, if we have four insns combination: + i0: r0 <- i0src + i1: r1 <- i1src (using r0) + REG_DEAD (r0) + i2: r0 <- i2src (using r1) + i3: r3 <- i3src (using r0) + ix: using r0 + From i1's point of view, r0 is eliminated, no matter if it is set + by newi2pat or not. In other words, REG_DEAD info for r0 in i1 + should be discarded. + + Note local information only affects cases in forms like "I1->I2->I3", + "I0->I1->I2->I3" or "I0&I1->I2, I2->I3". For other cases like + "I0->I1, I1&I2->I3" or "I1&I2->I3", newi2pat won't set i1dest or + i0dest anyway. */ + rtx local_elim_i1 = (i1 == 0 || i1dest_in_i1src || i1dest_in_i0src + || !i1dest_killed + ? 0 : i1dest); + rtx elim_i1 = (local_elim_i1 == 0 || (newi2pat && reg_set_p (i1dest, newi2pat)) - || !i1dest_killed ? 0 : i1dest); - rtx elim_i0 = (i0 == 0 || i0dest_in_i0src + /* Same case as i1. */ + rtx local_elim_i0 = (i0 == 0 || i0dest_in_i0src || !i0dest_killed + ? 0 : i0dest); + rtx elim_i0 = (local_elim_i0 == 0 || (newi2pat && reg_set_p (i0dest, newi2pat)) - || !i0dest_killed ? 0 : i0dest); /* Get the old REG_NOTES and LOG_LINKS from all our insns and @@ -4300,10 +4327,10 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn elim_i2, elim_i1, elim_i0); if (i1notes) distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL, - elim_i2, elim_i1, elim_i0); + elim_i2, local_elim_i1, local_elim_i0); if (i0notes) distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL, - elim_i2, elim_i1, elim_i0); + elim_i2, elim_i1, local_elim_i0); if (midnotes) distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0);