https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #36 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #35)
> (In reply to Richard Biener from comment #34)
> > The change itself looks reasonable given costs, though maybe 2 -> 2
> > combinations should not trigger when the cost remains the same?  In
> > this case it definitely doesn't look profitable, does it?  Well,
> > in theory it might hide latency and the 2nd instruction can issue
> > at the same time as the first.
> 
> No, it definitely should be done.  As I showed back then, it costs less
> than 1% extra compile time on *any platform* on average, and it reduced
> code size by 1%-2% everywhere.
> 
> It also cannot get stuck: any combination is attempted only once, and any
> combination that succeeds eats up a loglink.  It is finite, (almost)
> linear in fact.

So the slowness for the testcase comes from failed attempts.

> Any backend is free to say certain insns shouldn't combine at all.  This will
> lead to reduced performance though.
> 
> - ~ - ~ -
> 
> Something that is the real complaint here: it seems we do not GC often
> enough, only after processing a BB (or EBB)?  That adds up for artificial
> code like this, sure.

For memory use, if you know combine doesn't have "dangling" links to GC
memory, you can call ggc_collect at any point you like.  Or, when you
create throw-away RTL, ggc_free it explicitly (yeah, that only frees the
"toplevel").
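A minimal sketch of those two options, assuming GCC's internal headers and
that the pass keeps no live pointers into the freed memory (the function
itself is hypothetical, not existing combine code):

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "rtl.h"
#include "ggc.h"

/* Hypothetical cleanup hook: eagerly release a known-dead scratch rtx
   and then let the collector reclaim everything else unreachable.  */
static void
release_scratch_rtl (rtx throwaway)
{
  /* ggc_free releases only this toplevel node; the rtxes it points to
     stay allocated until the next collection.  */
  ggc_free (throwaway);

  /* Only safe if no "dangling" pointers into GC memory remain,
     otherwise live data may be reclaimed.  */
  ggc_collect ();
}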

> And the "param to give an upper limit to how many combination attempts are
> done
> (per BB)" offer is on the table still, too.  I don't think it would ever be
> useful (if you want your code to compile faster just write better code!),
> but :-)

Well, while you say the number of successful combinations is linear, the
number of combine attempts apparently isn't (well, of course, if we ever
combine from multi-use defs).  So yeah, a param might be useful here, but
instead of some constant limit on the number of combine attempts per
function or per BB it might make sense to instead limit it by the number
of DEFs?  I understand we work on the uses, so it'll be a bit hard to
apply this in a way to, say, combine a DEF only with the N nearest uses
(but not any farther out), and maintaining such a count per DEF would
cost.  So more practical might be to limit the number of attempts to
combine into an (unchanged?) insn?

Basically I would hope that with a hard limit in place we'd not stop after
the first half of a BB, leaving trivial combinations in the second half
unhandled, but instead somehow throttle the "expensive" cases, along the
lines of the counter sketched below?
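A hypothetical sketch of that throttle (nothing here is existing combine
code; MAX_ATTEMPTS_INTO_INSN is an invented parameter): count failed
attempts to combine into a given unchanged insn and refuse further ones
past a cap, so only the pathological insn is skipped while the rest of
the BB is still processed normally.

#include <stdbool.h>

/* Hypothetical throttle, not existing GCC code: cap the number of
   failed attempts to combine into one unchanged destination insn,
   instead of capping attempts per BB or per function.  */

#define MAX_ATTEMPTS_INTO_INSN 32   /* invented parameter value */

struct insn_attempt_count
{
  int failed_attempts;  /* failed tries to combine into this insn */
};

/* Return true if another attempt into this insn is allowed and record
   it.  Once the cap is hit only this insn is skipped, so trivial
   combinations later in the BB still get their chance.  */
static bool
attempt_allowed_p (struct insn_attempt_count *c)
{
  if (c->failed_attempts >= MAX_ATTEMPTS_INTO_INSN)
    return false;
  c->failed_attempts++;
  return true;
}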
