On Sat, May 8, 2021 at 10:05 AM Kewen.Lin <li...@linux.ibm.com> wrote:
>
> Hi Richi,
>
> Thanks for the comments!
>
> on 2021/5/7 下午5:43, Richard Biener wrote:
> > On Fri, May 7, 2021 at 5:30 AM Kewen.Lin via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> Hi,
> >>
> >> When I was investigating density_test heuristics, I noticed that
> >> the current rs6000_density_test could be used for single scalar
> >> iteration cost calculation, through the call trace:
> >>   vect_compute_single_scalar_iteration_cost
> >>     -> rs6000_finish_cost
> >>          -> rs6000_density_test
> >>
> >> It looks unexpected as its desriptive function comments and Bill
> >> helped to confirm this needs to be fixed (thanks!).
> >>
> >> So this patch is to check the passed data, if it's the same as
> >> the one in loop_vinfo, it indicates it's working on vector version
> >> cost calculation, otherwise just early return.
> >>
> >> Bootstrapped/regtested on powerpc64le-linux-gnu P9.
> >>
> >> Nothing remarkable was observed with SPEC2017 Power9 full run.
> >>
> >> Is it ok for trunk?
> >
> > +  /* Only care about cost of vector version, so exclude scalar
> > version here.  */
> > +  if (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) != (void *) data)
> > +    return;
> >
> > Hmm, looks like a quite "random" test to me.  What about adding a
> > parameter to finish_cost () (or init_cost?) indicating the cost kind?
> >
>
> I originally wanted to change the hook interface, but noticed that
> the finish_cost in function vect_estimate_min_profitable_iters is
> the only invocation with LOOP_VINFO_TARGET_COST_DATA (loop_vinfo),
> it looks enough to differentiate the scalar version costing or
> vector version costing for loop.  Do you mean this observation/
> assumption easy to be broken sometime later?

Yes, this field is likely to become stale.

>
> The attached patch to add one new parameter to indicate the
> costing kind explicitly as you suggested.
>
> Does it look better?
>
> gcc/ChangeLog:
>
>         * doc/tm.texi: Regenerated.
>         * target.def (init_cost): Add new parameter costing_for_scalar.
>         * targhooks.c (default_init_cost): Adjust for new parameter.
>         * targhooks.h (default_init_cost): Likewise.
>         * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Likewise.
>         (vect_compute_single_scalar_iteration_cost): Likewise.
>         (vect_analyze_loop_2): Likewise.
>         * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise.
>         (vect_bb_vectorization_profitable_p): Likewise.
>         * tree-vectorizer.h (init_cost): Likewise.
>         * config/aarch64/aarch64.c (aarch64_init_cost): Likewise.
>         * config/i386/i386.c (ix86_init_cost): Likewise.
>         * config/rs6000/rs6000.c (rs6000_init_cost): Likewise.
>
> > OTOH we already pass scalar_stmt to individual add_stmt_cost,
> > so not sure whether the context really matters.  That said,
> > the density test looks "interesting" ... the intent was that finish_cost
> > might look at gathered data from add_stmt, not that it looks at
> > the GIMPLE IL ... so why are you not counting vector_stmt vs.
> > scalar_stmt entries in vect_body and using that for this metric?
> >
>
> Good to know the intention behind finish_cost, thanks!
>
> I'm afraid that the check on vector_stmt and scalar_stmt entries
> from add_stmt_cost doesn't work for the density test here.  The
> density test focuses on the vector version itself, there are some
> stmts whose relevants are marked as vect_unused_in_scope, IIUC
> they won't be passed down when costing for both versions.  But the
> existing density check would like to know the cost for the
> non-vectorized part.  The current implementation does:
>
>  vec_cost = data->cost[vect_body]
>
>           if (!STMT_VINFO_RELEVANT_P (stmt_info)
>               && !STMT_VINFO_IN_PATTERN_P (stmt_info))
>             not_vec_cost++
>
>  density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost);
>
> it takes those unrelevant stmts into account, and then has
> both costs from the non-vectorized part (not_vec_cost)
> and vectorized part (cost[vect_body]), it can calculate the
> vectorization code density ratio.

Yes, but then what "relevant" stmts are actually needed and what
not is missed by your heuristics.  It's really some GIGO one
I fear - each vectorized data reference will add a pointer IV
(eventually commoned by IVOPTs later) and pointer value updates
that are not accounted for in costing (the IVs and updates in the
scalar code are marked as not relevant).  Are those the stmts
this heuristic wants to look at?

The patch looks OK btw.

Thanks,
Richard.

>
> BR,
> Kewen

Reply via email to