[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||ro at gcc dot gnu.org --- Comment #8 from luoxhu at gcc dot gnu.org --- *** Bug 103041 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Richard Biener --- Should be fixed now.
[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 --- Comment #6 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:164bbf701ff10ff44e272525e8f462ed3ff1cf43 commit r12-4850-g164bbf701ff10ff44e272525e8f462ed3ff1cf43 Author: Richard Biener Date: Tue Nov 2 18:47:14 2021 +0100 tree-optimization/103029 - ensure vect loop versioning constraint on PHIs PHI nodes in vectorizer loop versioning need to maintain the same order of PHI arguments to not disturb SLP discovery. The following adds an assertion and mitigation in case loop versioning breaks this which happens more often after the recent reorg. 2021-11-02 Richard Biener PR tree-optimization/103029 * tree-vect-loop-manip.c (vect_loop_versioning): Ensure the PHI nodes in the loop maintain their original operand order.
[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 --- Comment #5 from Richard Biener --- The gimple_lv_adjust_loop_header_phi comment isn't relevant but what is is the association between PHI node argument and SLP node index from the analysis phase, in particular for backedges. See vect_schedule_scc which does /* Now fixup the backedge def of the vectorized PHIs in this SCC. */ slp_tree phi_node; FOR_EACH_VEC_ELT (phis_to_fixup, i, phi_node) { gphi *phi = as_a (SLP_TREE_REPRESENTATIVE (phi_node)->stmt); edge_iterator ei; edge e; FOR_EACH_EDGE (e, ei, gimple_bb (phi)->preds) { unsigned dest_idx = e->dest_idx; child = SLP_TREE_CHILDREN (phi_node)[dest_idx]; if (!child || SLP_TREE_DEF_TYPE (child) != vect_internal_def) continue; but here (gdb) p e $14 = 9)> (gdb) p e->dest_idx $12 = 0 (gdb) p debug (phi_node) pr82436.c:20:5: note: node 0x3448ba8 (max_nunits=2, refcnt=1) pr82436.c:20:5: note: op template: w_lsm.7_73 = PHI <_58(10), 0.0(25)> pr82436.c:20:5: note: stmt 0 w_lsm.7_73 = PHI <_58(10), 0.0(25)> pr82436.c:20:5: note: stmt 1 y_lsm.6_74 = PHI <_61(10), 0.0(25)> pr82436.c:20:5: note: children (nil) 0x34487f0 so in fact the entry edge edge now has the backedge SLP node. I think the former dance redirecting edges back and forth made the edge order consistent but nothing really forced it the same order as the original loop which could have either order as well. In fact the old code seems to reliably put the entry edge first (but we start with the entry edge second with the very first version) but the new code manages to swap the edges on the _old_ loop which I think is something we should try to avoid. That happens when we loop_redirect_edge (latch_edge, nloop->header); I suppose without making loop_version entirely SSA dependent we can for the moment mostly ensure we end up with the same order but not necessarily avoid changing the original IL order. diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index 8ed8c69b5b1..81b0dcb7006 100644 --- a/gcc/tree-cfg.c +++ b/gcc/tree-cfg.c @@ -9044,6 +9044,23 @@ gimple_lv_adjust_loop_header_phi (basic_block first, basic_block second, def = PHI_ARG_DEF (phi2, e2->dest_idx); add_phi_arg (phi1, def, e, gimple_phi_arg_location_from_edge (phi2, e2)); } + + /* If the edges into the old and new loop end up at different PHI arg + indices swap one by redirecting it to its own destination. */ + if (e->dest_idx != e2->dest_idx) +{ + if (e->dest_idx == 0) + { + ssa_redirect_edge (e, e->dest); + flush_pending_stmts (e); + } + else + { + gcc_assert (e2->dest_idx == 0); + ssa_redirect_edge (e2, e2->dest); + flush_pending_stmts (e2); + } +} } ensures this and fixes the testcase. As said it would be desirable to preserve the original loop PHI argument order. A way to have the vectorizer less fragile would of course be also appreciated. I am testing the above.
[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2021-11-02 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Target Milestone|--- |12.0 Ever confirmed|0 |1 --- Comment #4 from Richard Biener --- Let me have a look here.
[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 --- Comment #3 from luoxhu at gcc dot gnu.org --- This hack could restore the previous phi order to put nondfs phi args before dfs_edge args. But I am not sure whether this is the correct direction. At least it proves that the phi order matters for later vectorizer code. diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c index 455c3ef8db9..2ca256c15fa 100644 --- a/gcc/cfgloopmanip.c +++ b/gcc/cfgloopmanip.c @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. If not see #include "gimplify-me.h" #include "tree-ssa-loop-manip.h" #include "dumpfile.h" +#include "ssa.h" static void copy_loops_to (class loop **, int, class loop *); @@ -1577,6 +1578,41 @@ lv_adjust_loop_entry_edge (basic_block first_head, basic_block second_head, e1->probability = then_prob; e->probability = else_prob; + edge le, dfs = NULL, nondfs = NULL; + edge_iterator ei; + + if (EDGE_COUNT (e1->dest->preds) > 1) + { +FOR_EACH_EDGE (le, ei, e1->dest->preds) + { + if (le->flags & EDGE_DFS_BACK) + dfs = le; + else + nondfs = le; + } +if (dfs && nondfs && dfs->dest_idx < nondfs->dest_idx) + { + gphi_iterator psi; + gphi *phi; + tree dfsdef, nondfsdef; + for (psi = gsi_start_phis (e1->dest); !gsi_end_p (psi); gsi_next (&psi)) + { + phi = psi.phi (); + dfsdef = PHI_ARG_DEF (phi, dfs->dest_idx); + nondfsdef = PHI_ARG_DEF (phi, nondfs->dest_idx); + SET_PHI_ARG_DEF (phi, dfs->dest_idx, nondfsdef); + SET_PHI_ARG_DEF (phi, nondfs->dest_idx, dfsdef); + } + + EDGE_PRED (e1->dest, dfs->dest_idx) = nondfs; + EDGE_PRED (e1->dest, nondfs->dest_idx) = dfs; + + unsigned int temp = nondfs->dest_idx; + nondfs->dest_idx = dfs->dest_idx; + dfs->dest_idx = temp; + } + } +
[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org, ||rguenther at suse dot de --- Comment #2 from luoxhu at gcc dot gnu.org --- Confirmed. P7's extra option -mno-allow-movmisalign makes this ICE happens. If add this option on P9 also ICEs. Reason is the phi arguments order changes if switch the sequence of loopify and lv_adjust_loop_entry_edge. the constant input argument from bb 18 is in phi index 1 now makes the followed vectorize code fail to handle? if (_42 != 0) goto ; [80.00%] else goto ; [20.00%] [local count: 67276368]: [local count: 611603351]: # i_76 = PHI // here # y_lsm.6_74 = PHI <_61(10), 0.0(18)> // here # w_lsm.7_73 = PHI <_58(10), 0.0(18)> // here i.0_72 = (unsigned int) i_76; _70 = (long unsigned int) i.0_72; _69 = _70 * 80; x_68 = r_22(D) + _69; fpred_67 = x_68->f_pred; fexp_66 = x_68->f_exp; tem_65 = fpred_67 - fexp_66; _64 = x_68->f_sigma; _63 = tem_65 / _64; _62 = ABS_EXPR <_63>; _61 = _62 + y_lsm.6_74; _60 = tem_65 / fexp_66; _59 = ABS_EXPR <_60>; _58 = _59 + w_lsm.7_73; i_57 = i_76 + 1; if (n_19(D) > i_57) goto ; [89.00%] else goto ; [11.00%] [local count: 544326983]: goto ; [100.00%] It was: if (_42 != 0) goto ; [80.00%] else goto ; [20.00%] [local count: 67276368]: [local count: 611603351]: # i_76 = PHI <1(18), i_57(10)> # y_lsm.6_74 = PHI <0.0(18), _61(10)> # w_lsm.7_73 = PHI <0.0(18), _58(10)> i.0_72 = (unsigned int) i_76; _70 = (long unsigned int) i.0_72; _69 = _70 * 80; x_68 = r_22(D) + _69; fpred_67 = x_68->f_pred; fexp_66 = x_68->f_exp; tem_65 = fpred_67 - fexp_66; _64 = x_68->f_sigma; _63 = tem_65 / _64; _62 = ABS_EXPR <_63>; _61 = _62 + y_lsm.6_74; _60 = tem_65 / fexp_66; _59 = ABS_EXPR <_60>; _58 = _59 + w_lsm.7_73; i_57 = i_76 + 1; if (n_19(D) > i_57) goto ; [89.00%] else goto ; [11.00%] [local count: 544326983]: goto ; [100.00%] The comments in function gimple_lv_adjust_loop_header_phi says /* Browse all 'second' basic block phi nodes and add phi args to edge 'e' for 'first' head. PHI args are always in correct order. */ Any function to fix the phi order?
[Bug tree-optimization/103029] [12 regression] gcc.dg/vect/pr82436.c ICEs on r12-4818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029 seurer at gcc dot gnu.org changed: What|Removed |Added Target||powerpc64-linux-gnu CC||bergner at gcc dot gnu.org, ||luoxhu at cn dot ibm.com Build||powerpc64-linux-gnu Host||powerpc64-linux-gnu --- Comment #1 from seurer at gcc dot gnu.org --- commit f35af8df241a9eb9c2edf7da26d3c5f53d6e2511 Author: Xionghu Luo Date: Mon Nov 1 00:12:36 2021 -0500 Refactor loop_version