> > On Tue, Oct 20, 2020 at 1:02 PM Martin Jambor <mjam...@suse.cz> wrote:
> > >
> > > Hi,
> > >
> > > On Tue, Oct 20 2020, Richard Biener wrote:
> > > > On Mon, Oct 19, 2020 at 7:52 PM Gary Oblock <g...@amperecomputing.com> 
> > > > wrote:
> > > >>
> > > >> Richard,
> > > >>
> > > >> I guess that will work for me. However, since it
> > > >> was decided to remove an identical function,
> > > >> why weren't the calls to it adjusted to reflect it?
> > > >> If the call wasn't transformed that means it will
> > > >> be mapped at some later time. Is that mapping
> > > >> available to look at? Because using that would
> > > >> also be a potential solution (assuming call
> > > >> graph information exists for the deleted function.)
> > > >
> > > > I'm not sure what the transitional cgraph looks like
> > > > during WPA analysis (which is what we're talking about?),
> > > > but definitely the IL is unmodified in that state.
> > > >
> > > > Maybe Martin has an idea.
> > > >
> > >
> > > Exactly, the cgraph_edges are where the correct call information is
> > > stored until the inlining transformation phase calls
> > > cgraph_edge::redirect_call_stmt_to_callee on them - inlining is
> > > a special pass in this regard that performs this IPA-infrastructure
> > > function in addition to actual inlining.
> > >
> > > Call information in cgraph means the callee itself but also the
> > > information in e->callee->clone.param_adjustments, which might be
> > > interesting for any struct-reorg-like optimizations (...and in the
> > > future possibly in other transformation summaries).
> > >
> > > The late IPA passes are in a very unfortunate spot here since they run
> > > before the real-IPA transformation phases but after unreachable node
> > > removals and after clone materializations and so can see some but not
> > > all of the changes performed by real IPA passes.  The reason for that is
> > > good cache locality when late IPA passes are either not run at all or
> > > only look at a small portion of the compilation unit.  In such a case
> > > IPA transformations of a function are followed by all the late passes
> > > working on the same function.
> > >
> > > Late IPA passes are unfortunately second class citizens and I would
> > > strongly recommend not to use them since they do not fit into our
> > > otherwise robust IPA framework very well.  We could probably provide a
> > > mechanism that would allow late IPA passes to run all normal IPA
> > > transformations on a function so they could clearly see what they are
> > > looking at, but extensive use would slow compilation down so its use
> > > would be frowned upon at the very least.
> > 
> > So IPA PTA does get_body () on the nodes it wants to analyze and I
> > thought that triggers any pending IPA transforms?
> 
> Yes, it does (and get_untransformed_body does not)
And to correct Martin's explanation a bit: the late IPA passes are
intended to work, though I was mostly planning them for prototyping true
IPA passes and also possibly for implementing passes that inspect only
a few functions.

IPA transforms happen when get_body is called.  With LTO this also
triggers reading the body from disk.  So if you want to see all bodies
and work on them, you can simply call get_body on everything, but it will
result in increased memory use since everything will be loaded from disk
and expanded (by inlining) at once instead of doing it on a per-function
basis.

get_body is simply meant to arrange the body on demand.  The pass manager
uses it before late passes are executed and ipa-pta uses it before it
builds constraints (that is not good for reasons described above).
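
To make the trade-off concrete, here is a toy sketch in C++.  It is not
GCC code: ToyNode, resident, peak_all_at_once and peak_per_function are
hypothetical names, and the "body size" is just a number.  It only
models the behaviour described above, namely that get_body loads a body
and applies pending transforms while get_untransformed_body only loads
it, and that touching every body up front raises peak memory compared
with working one function at a time.

```cpp
// Toy model only: these types mimic the idea, not GCC's real cgraph_node.
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct ToyNode {
    std::string name;
    std::size_t body_size;     // pretend cost of keeping the body in memory
    bool loaded = false;       // body has been read "from disk"
    bool transformed = false;  // pending IPA transforms have been applied

    ToyNode(std::string n, std::size_t s)
        : name(std::move(n)), body_size(s) {}

    // Loads the body but leaves pending transforms alone.
    void get_untransformed_body() { loaded = true; }

    // Loads the body if needed, then applies pending transforms.
    void get_body() {
        get_untransformed_body();
        assert(loaded);
        transformed = true;
    }

    // Drops the body again, as if the function were fully compiled.
    void release() { loaded = false; }
};

// Total size of all bodies currently held in memory.
std::size_t resident(const std::vector<ToyNode>& nodes) {
    std::size_t total = 0;
    for (const ToyNode& n : nodes)
        if (n.loaded) total += n.body_size;
    return total;
}

// Strategy A: call get_body on everything first; nothing is released,
// so the final residency is also the peak.
std::size_t peak_all_at_once(std::vector<ToyNode> nodes) {
    for (ToyNode& n : nodes) n.get_body();
    return resident(nodes);
}

// Strategy B: get_body, work on the function, release, then move on.
std::size_t peak_per_function(std::vector<ToyNode> nodes) {
    std::size_t peak = 0;
    for (ToyNode& n : nodes) {
        n.get_body();
        std::size_t now = resident(nodes);
        if (now > peak) peak = now;
        n.release();
    }
    return peak;
}
```

In this model the all-at-once strategy peaks at the sum of all body
sizes, while the per-function strategy peaks at the largest single
body.  The real numbers depend on inlining and on what else is kept
alive, which the sketch does not attempt to model.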

Clone materialization is also triggered by get_body.  The clone
materialization pass mostly happens to remove unreachable function
bodies.  I plan to get rid of it since, now that we are better at doing
IPA transforms, it already brings in a lot of bodies.  For cc1plus it is
well over 1GB of memory.

Honza
> 
> Honza
> > 
> > Richard.
> > 
> > > Martin
> > >
