On Wed, Oct 21, 2020 at 5:21 AM Gary Oblock <g...@amperecomputing.com> wrote:
>
> >IPA transforms happen when get_body is called. With LTO this also
> >triggers reading the body from disk. So if you want to see all bodies
> >and work on them, you can simply call get_body on everything, but it will
> >result in increased memory use since everything will be loaded from disk
> >and expanded (by inlining) at once instead of doing it on a per-function
> >basis.
> Jan,
>
> Doing
>
> FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) node->get_body ();
>
> instead of
>
> FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) node->get_untransformed_body ();
>
> instantaneously breaks everything...

I think during WPA you cannot do ->get_body (), only
->get_untransformed_body (). But we don't know yet where in the IPA
process you're experiencing the issue.

Richard.

> Am I missing something?
>
> Gary
> ________________________________
> From: Jan Hubicka <hubi...@ucw.cz>
> Sent: Tuesday, October 20, 2020 4:34 AM
> To: Richard Biener <richard.guent...@gmail.com>
> Cc: GCC Development <gcc@gcc.gnu.org>; Gary Oblock <g...@amperecomputing.com>
> Subject: Re: Where did my function go?
>
> > > On Tue, Oct 20, 2020 at 1:02 PM Martin Jambor <mjam...@suse.cz> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Tue, Oct 20 2020, Richard Biener wrote:
> > > > > On Mon, Oct 19, 2020 at 7:52 PM Gary Oblock
> > > > > <g...@amperecomputing.com> wrote:
> > > > >>
> > > > >> Richard,
> > > > >>
> > > > >> I guess that will work for me. However, since it
> > > > >> was decided to remove an identical function,
> > > > >> why weren't the calls to it adjusted to reflect it?
> > > > >> If the call wasn't transformed that means it will
> > > > >> be mapped at some later time. Is that mapping
> > > > >> available to look at? Because using that would
> > > > >> also be a potential solution (assuming call
> > > > >> graph information exists for the deleted function.)
> > > > >
> > > > > I'm not sure what the transitional cgraph looks like
> > > > > during WPA analysis (which is what we're talking about?),
> > > > > but definitely the IL is unmodified in that state.
> > > > >
> > > > > Maybe Martin has an idea.
> > > > >
> > > >
> > > > Exactly, the cgraph_edges are where the correct call information is
> > > > stored until the inlining transformation phase calls
> > > > cgraph_edge::redirect_call_stmt_to_callee on them - inlining is
> > > > a special pass in this regard that performs this IPA-infrastructure
> > > > function in addition to actual inlining.
> > > >
> > > > Call information in the cgraph means the callee itself but also
> > > > information in e->callee->clone.param_adjustments which might be
> > > > interesting for any struct-reorg-like optimizations (...and in future
> > > > possibly in other transformation summaries).
> > > >
> > > > The late IPA passes are in a very unfortunate spot here since they run
> > > > before the real-IPA transformation phases but after unreachable node
> > > > removals and after clone materializations and so can see some but not
> > > > all of the changes performed by real IPA passes. The reason for that is
> > > > good cache locality when late IPA passes are either not run at all or
> > > > only look at a small portion of the compilation unit. In such a case IPA
> > > > transformations of a function are followed by all the late passes
> > > > working on the same function.
> > > >
> > > > Late IPA passes are unfortunately second class citizens and I would
> > > > strongly recommend not to use them since they do not fit into our
> > > > otherwise robust IPA framework very well. We could probably provide a
> > > > mechanism that would allow late IPA passes to run all normal IPA
> > > > transformations on a function so they could clearly see what they are
> > > > looking at, but extensive use would slow compilation down so its use
> > > > would be frowned upon at the very least.
> > >
> > > So IPA PTA does get_body () on the nodes it wants to analyze and I
> > > thought that triggers any pending IPA transforms?
> >
> > Yes, it does (and get_untransformed_body does not)
> And to correct Martin's explanation a bit: the late IPA passes are
> intended to work, though I was mostly planning them for prototyping true
> ipa passes and also possibly for implementing passes that inspect only
> a few functions.
>
> IPA transforms happen when get_body is called. With LTO this also
> triggers reading the body from disk. So if you want to see all bodies
> and work on them, you can simply call get_body on everything, but it will
> result in increased memory use since everything will be loaded from disk
> and expanded (by inlining) at once instead of doing it on a per-function
> basis.
>
> get_body is simply meant to arrange the body on demand. The pass manager
> uses it before late passes are executed and ipa-pta uses it before it
> builds constraints (that is not good for reasons described above).
>
> Clone materialization is also triggered by get_body. The clone
> materialization pass mostly happens to remove unreachable function
> bodies. I plan to get rid of it, since as we are now better at doing ipa
> transforms it brings in a lot of bodies already. For cc1plus it is well
> over 1GB of memory.
>
> Honza
> >
> > Honza
> > >
> > > Richard.
> > >
> > > > Martin
> > > >
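
For readers following the thread, the difference between the two accessors can be sketched with a toy model. This is not GCC code: ToyNode, ToyEdge and make_example are hypothetical names invented for illustration, and the "IL" is just a string. The sketch assumes only what the thread states: the body is loaded on demand, the call graph edge holds the correct callee, and the pending redirection is applied by get_body but not by get_untransformed_body.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only. These types are hypothetical and are NOT GCC's cgraph API.
struct ToyEdge {
  std::string stmt_callee;  // callee name as still written in the IL
  std::string real_callee;  // callee the call graph has settled on
};

struct ToyNode {
  std::string on_disk_body;    // stands in for the LTO-streamed body
  std::vector<ToyEdge> edges;  // pending call redirections
  std::string body;            // in-memory body, empty until loaded
  bool loaded = false;
  bool transformed = false;

  // Loads the body on demand but leaves pending redirections untouched.
  const std::string &get_untransformed_body() {
    if (!loaded) {
      body = on_disk_body;
      loaded = true;
    }
    return body;
  }

  // Loads the body, then applies every pending redirection exactly once.
  const std::string &get_body() {
    get_untransformed_body();
    if (!transformed) {
      for (const ToyEdge &e : edges) {
        std::string::size_type pos = body.find(e.stmt_callee);
        if (pos != std::string::npos)
          body.replace(pos, e.stmt_callee.size(), e.real_callee);
      }
      transformed = true;
    }
    return body;
  }
};

// A caller whose IL still names a function that was merged away.
ToyNode make_example() {
  ToyNode n;
  n.on_disk_body = "call foo_dup";
  n.edges.push_back({"foo_dup", "foo"});
  return n;
}
```

In this model, a pass that reads only the untransformed body still sees the call to the removed duplicate, while the edge already names the surviving function. That mirrors the situation described above, where the edge rather than the statement carries the correct callee until the transform runs.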