On Tue, 19 Jun 2012, William J. Schmidt wrote:

> On Tue, 2012-06-19 at 12:08 +0200, Richard Guenther wrote:
> > On Mon, 18 Jun 2012, William J. Schmidt wrote:
> > 
> > > On Mon, 2012-06-11 at 13:40 +0200, Richard Guenther wrote:
> > > > On Fri, 8 Jun 2012, William J. Schmidt wrote:
> > > > 
> > > <snip>
> > > > 
> > > > Hmm.  I don't like this patch or its general idea too much.  Instead
> > > > I'd like us to move more of the cost model detail to the target, giving
> > > > it a chance to look at the whole loop before deciding on a cost.  ISTR
> > > > posting the overall idea at some point, but let me repeat it here 
> > > > instead
> > > > of trying to find that e-mail.
> > > > 
> > > > The basic interface of the cost model should be, in targetm.vectorize
> > > > 
> > > >   /* Tell the target to start cost analysis of a loop or a basic-block
> > > >      (if the loop argument is NULL).  Returns an opaque pointer to
> > > >      target-private data.  */
> > > >   void *init_cost (struct loop *loop);
> > > > 
> > > >   /* Add cost for N vectorized-stmt-kind statements in vector_mode.  */
> > > >   void add_stmt_cost (void *data, unsigned n,
> > > >                       vectorized-stmt-kind,
> > > >                       enum machine_mode vector_mode);
> > > > 
> > > >   /* Tell the target to compute and return the cost of the accumulated
> > > >      statements and free any target-private data.  */
> > > >   unsigned finish_cost (void *data);
> > > > 
> > > > with eventually slightly different signatures for add_stmt_cost
> > > > (like pass in the original scalar stmt?).
> > > > 
> > > > It allows the target, at finish_cost time, to evaluate things like
> > > > register pressure and resource utilization.
> > > > 
> > > > Thanks,
> > > > Richard.
> > > 
> > > I've been looking at this in between other projects.  I wanted to be
> > > sure I understood the SLP infrastructure and whether it would cause any
> > > problems.  It looks to me like it will be mostly ok.  One issue I
> > > noticed is a possible difference in the order in which SLP instructions
> > > are analyzed and the order in which the instructions are "issued" during
> > > transformation.
> > > 
> > > For both loop analysis and basic block analysis, SLP trees are
> > > constructed and analyzed prior to examining other vectorizable
> > > instructions.  Their costs are calculated and stored in the SLP trees at
> > > this time.  Later, when transforming statements to their vector
> > > equivalents, instructions in the block (or loop body) are processed in
> > > order until the first instruction that's part of an SLP tree is
> > > encountered.  At that point, every instruction that's part of any SLP
> > > tree is transformed; then the vectorizer continues with the remaining
> > > non-SLP vectorizable statements.
> > > 
> > > So if we do the natural and easy thing of placing calls to add_stmt_cost
> > > everywhere that costs are calculated today, the order that those costs
> > > are presented to the back end model will possibly be different than the
> > > order they are actually "emitted."
> > 
> > Interesting.  But I suppose this is similar to how pattern statements
> > are handled?  Thus, the whole pattern sequence is processed when
> > we encounter the "main" pattern statement?
> 
> Yes, but the difference is that both vect_analyze_stmt and
> vect_transform_loop handle the pattern statements in the same order
> (thankfully -- I would hate to have to deal with the pattern mess).
> With SLP, all SLP statements are analyzed ahead of time, but they aren't
> transformed until one of them is encountered in the statement walk.

Ah, ok.  I suppose we can simply declare that when we register
vectorized stmts with the backend they are in arbitrary oder.
After all this is not supposed to be another machine dependent reorg
phase (to quote David).

> > 
> > > For a first cut at this, I suggest ignoring the problem other than to
> > > document it as an opportunity for improvement.  Later we could improve
> > > it by using an add_stmt_slp_cost () interface (or adding an is_slp
> > > flag), and another interface to be called at the time during analysis
> > > when the SLP statements will be issued during transformation.  This
> > > would allow the back end model to queue up the SLP costs in a separate
> > > vector and later place them in its internal structures at the
> > > appropriate place.
> > >
> > > It should eventually be possible to remove these fields/accessors:
> > > 
> > >  * STMT_VINFO_{IN,OUT}SIDE_OF_LOOP_COST
> > >  * SLP_TREE_{IN,OUT}SIDE_OF_LOOP_COST
> > >  * SLP_INSTANCE_{IN,OUT}SIDE_OF_LOOP_COST
> > > 
> > > However, I think this should be delayed until we have the basic
> > > infrastructure in place for the new model and well-tested.
> > 
> > Indeed.
> > 
> > > The other issue is that we should have the model track both the inside
> > > and outside costs if we're going to get everything into the target
> > > model.  For a first pass we can ignore this and keep the existing logic
> > > for the outside costs.  Later we should add some interfaces analogous to
> > > add_stmt_cost such as add_stmt_prolog_cost and add_stmt_epilog_cost so
> > > the model can track this stuff as carefully as it wants to.
> > 
> > Outside costs are merely added to the niter * inner-cost metric to
> > be compared with the scalar cost niter * scalar-cost, right?  Thus
> > they would be tracked completely separate - eventually similar to
> > how we compute the cost of the scalar loop.
> 
> Yes, that's the way they're used today, and probably nobody will ever
> want to get fancier than that.  But as you say, the idea would be to let
> them be tracked similarly, but in two more pieces: prolog and epilog.

Ok.

Richard.

Reply via email to