https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100810

--- Comment #4 from Roger Sayle <roger at nextmovesoftware dot com> ---
I believe this bug occurs during the .195t.ccp4 pass that was introduced by the
commit identified above, where tree-ssa-propagate.c's
substitute_and_fold_engine appears not to correctly handle the situation of a
completely empty basic block (or the CFG flags describing it are incorrect).

Things are fine (but unusual) with the incoming GIMPLE:

  <bb 11> [local count: 69772953]:
  # i_30 = PHI <i_16(10)>
  # h_lsm.23_40 = PHI <h_lsm.23_29(10)>
  if (g.12_20 != 0)
    goto <bb 12>; [50.00%]
  else
    goto <bb 13>; [50.00%]

  <bb 12> [local count: 282263306]:

  <bb 13> [local count: 69772953]:
  # h_lsm.23_27 = PHI <h_lsm.23_40(11), 2(12)>
  _31 = i_30 != 0;
  _14 = _10 & _31;
  if (_14 != 0)
    goto <bb 14>; [25.00%]
  else
    goto <bb 15>; [75.00%]

Notice that the only purpose of the empty bb12 is to define the phi edges
into bb13: one edge sets h_lsm to 2, the other doesn't.  Alas, this logic gets
overlooked by ccp4, where TDF_DETAILS reports:

Folding PHI node: h_lsm.23_40 = PHI <h_lsm.23_29(10)>
No folding possible
Folding PHI node: h_lsm.23_27 = PHI <h_lsm.23_40(11), 2(12)>
Queued PHI for removal.  Folds to: 2
...
Removing basic block 12
;; basic block 12, loop depth 1
;;  pred:
;;  succ:       13
Merging blocks 11 and 13

And things go downhill from there.  Alas, I'm still trying to figure out
exactly how the "Folds to: 2" is (mis)deduced, but I thought I'd share my
analysis so far so that a real tree-ssa expert can confirm what's supposed to
happen.
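
For readers less familiar with constant propagation, here is a minimal sketch
of the textbook lattice "meet" that a CCP-style pass applies to PHI arguments.
It is not GCC's code: the names Kind, Value, PhiArg, meet and evaluate_phi are
hypothetical, and the lattice is simplified to three levels.  It only shows
the two generic ways a PHI such as PHI <h_lsm.23_40(11), 2(12)> can fold to
its constant argument (the other argument is UNDEFINED, or its incoming edge
is treated as not executable); whether either applies to this bug is
unconfirmed.

```cpp
#include <vector>

// Simplified three-level CCP lattice (illustrative names, not GCC's).
enum class Kind { Undefined, Constant, Varying };

struct Value {
  Kind kind;
  long constant;  // meaningful only when kind == Kind::Constant
};

// One PHI argument: its lattice value and whether the incoming edge
// is currently considered executable by the propagator.
struct PhiArg {
  Value value;
  bool edge_executable;
};

// Lattice meet: UNDEFINED is the identity, VARYING absorbs everything,
// and two constants agree only if they are equal.
Value meet(const Value &a, const Value &b) {
  if (a.kind == Kind::Undefined) return b;
  if (b.kind == Kind::Undefined) return a;
  if (a.kind == Kind::Varying || b.kind == Kind::Varying)
    return {Kind::Varying, 0};
  if (a.constant == b.constant) return a;
  return {Kind::Varying, 0};
}

// A PHI evaluates to the meet of its arguments, skipping arguments
// that arrive over edges not (yet) known to be executable.
Value evaluate_phi(const std::vector<PhiArg> &args) {
  Value result{Kind::Undefined, 0};
  for (const PhiArg &arg : args)
    if (arg.edge_executable)
      result = meet(result, arg.value);
  return result;
}
```

Under this model, PHI <varying(11), 2(12)> with both edges executable stays
VARYING, so folding to 2 requires the bb11 argument to be UNDEFINED or the
bb11->bb13 edge to be considered dead.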

A useful workaround for debugging is:
diff --git a/gcc/passes.def b/gcc/passes.def
index 26d86df..c2806bc 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -339,7 +339,9 @@ along with GCC; see the file COPYING3.  If not see
       /* Threading can leave many const/copy propagations in the IL.
         Clean them up.  Instead of just copy_prop, we use ccp to
         compute alignment and nonzero bits.  */
+#if 0
       NEXT_PASS (pass_ccp, true /* nonzero_p */);
+#endif
       NEXT_PASS (pass_warn_restrict);
       NEXT_PASS (pass_dse);
       NEXT_PASS (pass_cd_dce, true /* update_address_taken_p */);

Another workaround, for this particular testcase, is -fno-tree-loop-sm, but the
real culprit (as shown above) is inside pass_ccp (or in the invariants it's
relying on).
