On Tue, Nov 26, 2019 at 2:42 AM Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
>
> On Mon, Nov 25, 2019 at 11:08:47PM +0000, Richard Sandiford wrote:
> > Segher Boessenkool <seg...@kernel.crashing.org> writes:
> > > On Mon, Nov 25, 2019 at 09:16:52PM +0000, Richard Sandiford wrote:
> > >> Segher Boessenkool <seg...@kernel.crashing.org> writes:
> > >> > I am wondering the other way around :-)  Is what you do for combine2
> > >> > something that would be more generally applicable/useful?  That's what
> > >> > I'm trying to find out :-)
> > >> >
> > >> > What combine does could use some improvement, if you want to hear a
> > >> > more direct motivations.  LOG_LINKS just skip references we cannot
> > >> > handle (and some more), so we always have to do modified_between etc.,
> > >> > which hurts.
> > >>
> > >> The trade-offs behind the choice of representation are very specific
> > >> to the pass.
> > >
> > > Yes, but hopefully not so specific that every pass needs a completely
> > > different representation ;-)
> >
> > Well, it depends.  Most passes make do with df (without DU/UD-chains).
> > But since DU/UD-chains are naturally quadratic in the general case,
> > and are expensive to keep up to date, each DU/UD pass is going to have
> > make some compromises.  It doesn't seem too bad that passes make
> > different compromises based on what they're trying to do.  (combine:
> > single use per definition; fwprop.c: track all uses, but for dominating
> > definitions only; sched: fudged via a param; regrename: single
> > definition/multiple use chains optimised for renmaing; combine2: full
> > live range information, but limited use list; etc.)
>
> combine actually *calculates* DU chains almost completely, it just throws
> away most of that information (it wants to have LOG_LINKS, as it did ages
> ago).  The only thing stopping us from doing that right now is that not
> all uses are counted (some are skipped).
>
> Since combine works only within BBs, DU chains are linear to compute, and
> UD chains are trivial (and just linear to compute).

quadraticness appears for RTL DU/UD chains because of partial definitions,
that doesn't change for BBs so even there computing is them is quadratic
(because recording them is).  The situation is simply having N partial
defs all reaching M uses which gives you a chain of size N * M.

Now - for combine you don't want partial defs, so for simplicity we could
choose to _not_ record DU/UD chains whenever we see a partial def for
a pseudo (and mark those as "bad").  Or, slightly enhanced, we can
handle DU/UD chains for regions where there is no partial definition
and add a "fake" D denoting (there are [multiple] defs beyond that
might be partial).  Depending on the use-case that should suffice and
make the problem linear.

I think you want to ask sth like "is REG changed [partially] between
its use in insn A and the def in insn B" and you want to answer that by using
REGs UD chain for that.  If you only ever reached the def in insn B via the
"pruned" chain then this would work, likewise for the chain we do not compute
any UD chain for REG.

> Updating is quadratic in general, sure.  Luckily in most realistic cases
> it is cheap (most, sigh) (insns aren't combined to very far away).

Updating is linear as well if you can disregard partial defs.  Updating cannot
be quadratic if compute is linear ;)

> > So yeah, if passes want to make roughly the same compromises, it would
> > obviously be good if they shared a representation.  But since each pass
> > does something different, I don't think it's a bad sign that they make
> > different compromises and use different representations.
> >
> > So I don't think a new pass with a new representation is in itself a
> > sign of failure.
>
> Oh, I don't think so either.  I just wonder if it would be useful more
> generically :-)
>
> > >> >> >> Target                 Tests   Delta    Best   Worst  Median
> > >> >> >> avr-elf                 1341 -111401  -13824     680     -10
> > >> >> >
> > >> >> > Things like this are kind of suspicious :-)
> > >> >>
> > >> >> Yeah.  This mostly seems to come from mopping up the extra moves 
> > >> >> created
> > >> >> by make_more_copies.  So we have combinations like:
> > >> >>
> > >> >>    58: r70:SF=r94:SF
> > >> >>       REG_DEAD r94:SF
> > >> >>    60: r22:SF=r70:SF
> > >> >>       REG_DEAD r70:SF
> > >> >
> > >> > Why didn't combine do this?  A target problem?
> > >>
> > >> Seems to be because combine rejects hard-reg destinations whose classes
> > >> are likely spilled (cant_combine_insn_p).
> > >
> > > Ah, okay.  And that is required to prevent ICEs, in combine2 as well
> > > then -- ICEs in RA.
> >
> > Not in this case though.  The final instruction is a hardreg<-pseudo move
> > whatever happens.  There's nothing special about r70 compared to r94.
>
> So the target hook could be improved?  Or, this doesn't matter anyway,
> the extra register move does not prevent any combinations, and RA should
> get rid of it when that is beneficial.
>
> But you see smaller code in the end, hrm.
>
>
> Segher

Reply via email to