On Thu, May 11, 2017 at 2:15 PM, Richard Biener <richard.guent...@gmail.com> wrote: > On Thu, May 11, 2017 at 2:14 PM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Thu, May 11, 2017 at 1:15 PM, Robin Dapp <rd...@linux.vnet.ibm.com> wrote: >>> Included the requested changes in the patches (to follow). I removed >>> the alignment count check now altogether. >>> >>>> I'm not sure why you test for unlimited_cost_model here as I said >>>> elsewhere I'm not sure >>>> what not cost modeling means for static decisions. The purpose of >>>> unlimited_cost_model >>>> is to always vectorize when possible and omit the runtime >>>> profitability check. So for peeling >>>> I'd just always use the cost model. Thus please drop this check. >>> >>> Without that, I get one additional FAIL gcc.dg/vect/slp-25.c for x86. >>> It is caused by choosing no peeling (inside costs 0) over peeling for >>> known alignment with unlimited cost model (inside costs 0 as well). >>> Costs 0 for no peeling are caused by count == 0 or rather ncopies = vf / >>> nunits == 4 / 8 == 0 in record_stmt_costs (). Shouldn't always hold >>> ncopies > 0? Even 0.5 would have worked here to make no peeling more >>> expensive than 0. >> >> That's odd. I can't get >> >> Index: gcc/tree-vect-stmts.c >> =================================================================== >> --- gcc/tree-vect-stmts.c (revision 247882) >> +++ gcc/tree-vect-stmts.c (working copy) >> @@ -95,6 +96,7 @@ record_stmt_cost (stmt_vector_for_cost * >> enum vect_cost_for_stmt kind, stmt_vec_info stmt_info, >> int misalign, enum vect_cost_model_location where) >> { >> + gcc_assert (count > 0); >> if (body_cost_vec) >> { >> tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE; >> >> to ICE with the testcase (unpatched trunk) >> >> Where's that record_stmt_cost call done? You can't simply use vf/nunits >> for SLP. > > Ah, of course needs -fvect-cost-model. > > I'll investigate.
Ugh. The vect_peeling_hash_get_lowest_cost isn't handling SLP in any way, there's quite some refactoring necessary to fix that. I suggest (eh) to do Index: gcc/tree-vect-data-refs.c =================================================================== --- gcc/tree-vect-data-refs.c (revision 247734) +++ gcc/tree-vect-data-refs.c (working copy) @@ -1129,7 +1129,7 @@ vect_get_data_access_cost (struct data_r int nunits = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)); loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo); - int ncopies = vf / nunits; + int ncopies = MAX (1, vf / nunits); /* TODO: Handle SLP properly */ if (DR_IS_READ (dr)) vect_get_load_cost (dr, ncopies, true, inside_cost, outside_cost, > Richard. > >> Richard. >> >>> Test suite on s390x is clean. >>> >>> Regards >>> Robin >>>