https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85563

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com,
                   |                            |rguenth at gcc dot gnu.org
   Last reconfirmed|2021-04-15 00:00:00         |2023-3-30

--- Comment #23 from Richard Biener <rguenth at gcc dot gnu.org> ---
Re-confirmed.  On the testcase in comment#6 we have before EVRP

  <bb 2> :
  Vframe_list.0_1 = Vframe_list;
  a.1_17 = (long int) Vframe_list.0_1;
  _15 = (unsigned int) a.1_17;
  _13 = _15 & 7;
  _10 = _13 == 3;
  if (_13 != 3)
    goto <bb 3>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 3> :
  __builtin_unreachable ();

  <bb 4> :
  # frame1_6 = PHI <frame1_12(D)(2), frame1_16(5)>
  # tail_7 = PHI <Vframe_list.0_1(2), tail_18(5)>
  a.1_19 = (long int) tail_7;
  _20 = (unsigned int) a.1_19;
  _21 = _20 & 7;
  _22 = _21 == 3;
  if (_21 == 3)
    goto <bb 5>; [INV]
  else
    goto <bb 6>; [INV]

  <bb 5> :
  _25 = tail_7 + 18446744073709551613;
  _26 = __builtin_assume_aligned (_25, 8);
  frame1_16 = _26->u.s.car;
  tail_18 = _26->u.s.u.cdr;
  goto <bb 4>; [INV]

  <bb 6> :
  do_switch_frame (frame1_6);

and we fail to realize we are never exiting the loop in the first
iteration.  The diagnostic happens because we run into loop header
copying which then results in a jump around the loop with an obviously
uninitialized argument to frame1_6.  And that's because we have thrown
away the __builtin_unreachable before loop header copying (in VRP1,
after the first threadfull).

Re-ordering CH before VRP fails to optimize even when I also put a
copyprop between CH and VRP.
We then see

  <bb 2> [local count: 118111600]:
  # PT = nonlocal escaped null
  Vframe_list.0_1 = Vframe_list;
  a.1_12 = (long int) Vframe_list.0_1;
  _10 = (unsigned int) a.1_12;
  # RANGE [irange] Lisp_Type [0, 7] NONZERO 0x7
  _8 = _10 & 7;
  if (_8 != 3)
    goto <bb 3>; [0.00%]
  else
    goto <bb 4>; [100.00%]

  <bb 3> [count: 0]:
  __builtin_unreachable ();

  <bb 4> [local count: 118111600]:
  a.1_21 = (long int) Vframe_list.0_1;
  _22 = (unsigned int) a.1_21;
  # RANGE [irange] Lisp_Type [0, 7] NONZERO 0x7
  _23 = _22 & 7;
  if (_23 == 3)

but for some reason VRP doesn't optimize the second compare.  We'd
probably need to value-number here (loop header copying does, but only
parts of the function and not including this leading block).  So
somewhat of a pass ordering issue.  The next CSE is DOM and then PRE.
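
For readers without comment#6 at hand, here is a rough C sketch of the source shape that the dumps above suggest.  It is not the actual testcase: every name in it (obj, struct cons, is_cons, untag, make_cons, last_car) is made up for illustration, and I have not checked that it reproduces the diagnostic on any particular GCC version.  It only shows the three ingredients discussed: a low-bits tag test (& 7, == 3), a __builtin_unreachable on the "not a cons" path before the loop, and a variable that is assigned only inside the loop body.  The constant 18446744073709551613 in the dump is -3 modulo 2^64, i.e. the tag being subtracted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tagged-pointer layout: the low three bits hold a type
   tag and tag 3 marks a cons cell.  Modelled on the "& 7" / "== 3"
   tests in the dumps, not copied from the real testcase.  */
enum { TAG_MASK = 7, TAG_CONS = 3 };

typedef uintptr_t obj;

struct cons { obj car; obj cdr; } __attribute__ ((aligned (8)));

static int
is_cons (obj o)
{
  return ((unsigned int) o & TAG_MASK) == TAG_CONS;
}

/* Strip the tag; corresponds to "tail_7 + 18446744073709551613"
   followed by __builtin_assume_aligned in the dump.  */
static struct cons *
untag (obj o)
{
  return __builtin_assume_aligned ((void *) (o - TAG_CONS), 8);
}

static obj
make_cons (struct cons *c)
{
  assert (((uintptr_t) c & TAG_MASK) == 0);  /* tag bits must be free */
  return (obj) c + TAG_CONS;
}

/* Return the car of the last cell.  The caller promises LIST is a
   cons, so the loop body runs at least once and frame1 is always
   assigned before use.  */
obj
last_car (obj list)
{
  obj frame1;  /* deliberately left uninitialized */

  if (!is_cons (list))
    __builtin_unreachable ();

  for (obj tail = list; is_cons (tail); tail = untag (tail)->cdr)
    frame1 = untag (tail)->car;

  return frame1;
}
```

The loop-entry test repeats the test guarded by __builtin_unreachable, so on the first iteration it is known to be true.  If the unreachable block is removed before the two compares are recognized as equivalent, that knowledge is gone by the time loop header copying creates the path that skips the loop.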