https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100145
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |ipa
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-04-20
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW
            Version|unknown                     |11.0

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
At -O2 we optimize things in thread3, at -O3 we have a PHI less there and thus
do no backwards threading which is because 'c' wasn't PREd for some reason
(-fno-tree-vectorize or -fno-tree-partial-pre do not help)

 int main ()
 {
-  int D.2001;
-  int b.1_7;
-  int prephitmp_16;
+  int _1;
+  int b.1_6;

   <bb 2> [local count: 1073741824]:
   c = 0;
-  b.1_7 = b;
-  if (b.1_7 != 0)
+  b.1_6 = b;
+  if (b.1_6 != 0)
     goto <bb 3>; [34.00%]
   else
     goto <bb 4>; [66.00%]

-  <bb 3> [local count: 3318838410]:
+  <bb 3> [local count: 365072224]:
   c = 1;

   <bb 4> [local count: 1073741824]:
-  # prephitmp_16 = PHI <0(2), 1(3)>
   d = 1;
-  if (prephitmp_16 > 100)
+  _1 = c;
+  if (_1 > 100)
     goto <bb 5>; [33.00%]
   else
     goto <bb 6>; [67.00%]

the issue seems to be the guessed profile (but BB counts are the same!):

+Skipping partial redundancy for expression
{mem_ref<0B>,addr_expr<&c>}@.MEM_7 ( 0001), no redundancy on to be optimized for
speed edge

so that leaves the "global" hot count we IPA compute somehow?  Honza?
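
For orientation, the GIMPLE above corresponds roughly to source of the following shape. This is only a sketch reconstructed from the dump: the PR's actual testcase is not quoted in this comment, the dump shows `main` rather than a helper, and the names `run_once` and `dead_path_taken` as well as the body of the `c > 100` branch (`<bb 5>` is not shown) are assumptions made for illustration.

```c
/* Hypothetical reconstruction from the GIMPLE dump; NOT the PR's testcase. */
int b, c, d;

/* Assumed stand-in for whatever <bb 5> does; the dump does not show it. */
static int dead_path_taken;

/* The dump shows this logic inside main(); a helper is used here so the
   sketch can be called more than once. */
int run_once(void)
{
    c = 0;                    /* <bb 2>: c = 0 */
    if (b != 0)               /* <bb 2>: if (b.1 != 0) */
        c = 1;                /* <bb 3>: c = 1 */
    d = 1;                    /* <bb 4>: store to d before c is reloaded */
    if (c > 100)              /* <bb 4>: c is 0 or 1 here, so never true */
        dead_path_taken = 1;  /* <bb 5>: assumed body */
    return dead_path_taken;
}
```

Since `c` can only be 0 or 1 at the comparison, the `c > 100` branch is dead once the reload of `c` is replaced by the PHI `<0(2), 1(3)>`, as in the `-` lines of the dump. In the `+` lines the load `_1 = c` remains, so there is no PHI for the backwards threader to work with.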