Thanks for the feedback! Let's have a video meeting. My schedule is flexible. What time is best for you?
Cheers, Csaba On Thu, Dec 16, 2021 at 5:29 PM Sebastian Graf <[email protected]> wrote: > Hey Csaba, > > After catching up on your talk and reflecting about it a bit, it seems > quite obvious that your tool is the right way to collect the data and > validate our analysis! > Even if meanwhile we decided that a "transparent stack frame" (which I > believe is something similar to what you are doing here > <https://github.com/grin-compiler/ghc-whole-program-compiler-project/blob/d3ed07d9f3167ad1afa01d1aa95aec2472b2708f/external-stg-interpreter/lib/Stg/Interpreter.hs#L130>, > with an explicit `argCount` which we do not know) is not the ideal solution > we've been looking for (for different reasons). > > Essentially, we have two things > > 1. An escape analysis, implemented as an STG-to-STG pass that attaches > a boolean flag to each Id whether it "escapes its scope" (for a suitable > definition of that). > We'd like to implement it in a way that it would be reusable within > GHC with moderate effort (e.g., renaming Binder to Id or accounting for > different fields), operating on a module at a time rather than whole > program. > 2. The instrumentation that tries to measure how many heap objects > could be allocated on the stack. E.g., set a closure-specific flag whenever > the closure is entered, unset that bit (once) when we "leave the scope" > that defines the closure. > If my understanding is right, we could just implement this > "instrumentation" as a simple extra field to Closure > > <https://github.com/grin-compiler/ghc-whole-program-compiler-project/blob/d3ed07d9f3167ad1afa01d1aa95aec2472b2708f/external-stg-interpreter/lib/Stg/Interpreter/Base.hs#L135>, > right? Neat! > > A bit tangential: I see that your interpreter currently allocates a fresh > closure for let-no-escapes > <https://github.com/grin-compiler/ghc-whole-program-compiler-project/blob/d3ed07d9f3167ad1afa01d1aa95aec2472b2708f/external-stg-interpreter/lib/Stg/Interpreter.hs#L606> > when it could just re-use the closure of its defining scope. That would > skew our numbers somewhat compared to instrumenting GHC-compiled programs, > but I think we'd be able to work around that. I also wonder if the > semantics of your let-no-escapes are actually as typically specified > (operationally), in that a jump to a let-no-escape should also reset the > stack pointer. It should hardly matter for the programs that GHC generates, > though. > > I would also be interested in knowing whether the +RTS -s "bytes allocated > in the heap" metric (very nearly) coincides with a similar metric you could > produce. It would be fantastic if that was the case! Theoretically, that > should be possible, right? > > I think your interpreter work is very valuable to collect data we > otherwise would only be able to measure with a TickyTicky-based approach. > Nice! > Another, similar use case would be to identify the fraction of closures > that are only entered once. I remember that there was a ticky-based patch > with which Joachim used to measure this fraction > <https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/compiler/demand#instrumentation> > (and similarly validate the analysis results), but unfortunately it > couldn't end up in master. Ah, yes, we have a whole open ticket about it: > #10613 <https://gitlab.haskell.org/ghc/ghc/-/issues/10613>. In fact, that > instrumentation is also somewhat similar (adding a field to every closure) > as what we want to do. > > Anyway, it seems like your work will be very valuable in replacing some of > the annoying ticky-based instrumentation ideas! > > Maybe we can have a call some time this or next week to discuss details, > once Sebastian and I are more familiar with the code base? > > Thanks for sticking with the project and doing all the hard work that can > build upon! > Sebastian > > ------ Originalnachricht ------ > Von: "Csaba Hruska" <[email protected]> > An: "Sebastian Graf" <[email protected]> > Cc: "ghc-devs" <[email protected]>; "Sebastian Scheper" < > [email protected]> > Gesendet: 15.12.2021 16:16:27 > Betreff: Re: Transparently hooking into the STG stack to validate an > escape analysis > > Hi, > > IMO the Cmm STG machine implementation is just too complex for student > projects. It's not fun to work with at all. > Why did you choose this approach? > IMO the escape analysis development and validation would be much smoother > and fun when you'd use the external STG interpreter. > When you have a solid and working design of your analysis and > transformations then you could implement it in GHC's native backend if it > needs any changes at all. > > What do you think? > Do you disagree? > > Have you seen my presentation about the stg interpreter? > https://www.youtube.com/watch?v=Ey5OFPkxF_w > > Cheers, > Csaba > > On Wed, Dec 8, 2021 at 11:20 AM Sebastian Graf <[email protected]> > wrote: > >> Hi Devs, >> >> my master's student Sebastian and I (also Sebastian :)) are working on an >> escape analysis in STG, see >> https://gitlab.haskell.org/ghc/ghc/-/issues/16891#note_347903. >> >> We have a prototype for the escape analysis that we want to >> validate/exploit now. >> The original plan was to write the transformation that allocates >> non-escaping objects on the STG stack. But that is quite tricky for many >> reasons, one of them being treatment of the GC. >> >> This mail is rather lengthy, so I suggest you skip to "What we hope could >> work" and see if you can answer it without the context I provide below. If >> you can't, I would be very grateful if you were willing to suffer through >> the exposition. >> >> # Instrumentation >> >> So instead we thought about doing a (easily changed and thus versatile) >> instrumentation-based approach: >> Assign a sequence number to every instantiation (a term I will use to >> mean "allocation of g's closure") of things that we our analysis >> determines as escaping or non-escaping, such as STG's let bindings >> (focusing on non-let-no-escape functions for now). >> (One sequence number *per allocation* of the let binding's closure, not >> based on its syntactic identity.) >> Then, we set a bit in a (dynamically growing) global bit vector whenever >> the let "RHS is entered" and then unset it when we "leave the let-body". >> Example: >> >> f = \x y -> >> let { >> g = [y] \z -> y + z; >> } in g x >> >> Here, our analysis could see that no instantiation (which I say instead >> of "allocation of g's closure") of g will ever escape its scope within f. >> Our validation would give a fresh sequence number to the instantiation of >> g whenever f is called and store it in g's closure (which we arrange by >> piggy-backing on -prof and adding an additional field to the profiling >> header). >> Then, when g's RHS is entered, we set the bit in the global bit vector, >> indicating "this instantiation of g might escape". >> After leaving the RHS of g, we also leave the body of the defining let, >> which means we unset the bit in the bit vector, meaning "every use so far >> wasn't in an escaping scenario". >> >> So far so good. Modifying the code upon entering g takes a bit of >> tinkering but can be done by building on TickyTicky in StgToCmm. >> But what is not done so easily is inserting the continuation upon >> entering the let that will unset the bit! >> >> # What doesn't work: Modifying the Sequel >> >> At first, we tried to modify the sequel >> <https://gitlab.haskell.org/ghc/ghc/-/blob/4c6985cc91b9cf89b393f69c9a603e8df0aea9e0/compiler/GHC/StgToCmm/Monad.hs#L210-222> >> of >> the let-body to an `AssignTo`. >> That requires us to know the registers in which the let-body will return >> its results, which in turn means we have to know the representation of >> those results, so we have to write a function `stgExprPrimRep :: GenStgExpr >> p -> [PrimRep]`. >> Urgh! We were very surprised that there was no such function. And while >> we tested our function, we very soon knew why. Consider the following >> pattern synonym matcher: >> >> GHC.Natural.$mNatJ# >> :: forall {rep :: GHC.Types.RuntimeRep} {r :: TYPE rep}. >> GHC.Num.Natural.Natural >> -> (GHC.Num.BigNat.BigNat -> r) -> ((# #) -> r) -> r >> = {} \r [scrut_sBE cont_sBF fail_sBG] >> case scrut_sBE of { >> GHC.Num.Natural.NS _ -> fail_sBG GHC.Prim.(##); >> GHC.Num.Natural.NB ds_sBJ -> >> let { >> sat_sBK :: GHC.Num.BigNat.BigNat >> = CCCS GHC.Num.BigNat.BN#! [ds_sBJ]; >> } in cont_sBF sat_sBK; >> }; >> >> Note how its result is representation-polymorphic! It only works because >> our machine implementation allows tail-calls. >> It's obvious in hindsight that we could never write `stgExprPrimRep` in >> such a way that it will work on the expression `cont_sBF sat_sBK`. >> So the sequel approach seems infeasible. >> >> # What we hope could work: A special stack frame >> >> The other alternative would be to insert a special continuation frame on >> the stack when we enter the let-body (inspired by stg_restore_cccs). >> This continuation frame would simply push all registers (FP regs, GP >> regs, Vec regs, ...) to the C stack, do its work (unsetting the bit), then >> pop all registers again and jump to the topmost continuation on the STG >> stack. >> Example: >> >> f :: forall rep (r :: TYPE rep). Int# -> (Int# -> r) -> r >> f = \x g -> >> let { >> h = [x] \a -> x + a; >> } in >> case h x of b { >> __DEFAULT -> g b >> } >> >> We are only interested in unsetting the bit for h here. Consider the >> stack when entering the body of h. >> >> caller_of_f_cont_info <- Sp >> >> Now push our special continuation frame: >> >> caller_of_f_cont_info >> seq_h >> unset_bit_stk_info <- Sp >> >> E.g., the stack frame contains the info pointer and the sequence number. >> (Btw., I hope I got the stack layout about right and this is even possible) >> Then, after we entered the continuation of the __DEFAULT alt, we do a >> jump to g. >> Plot twist: g returns an unboxed 8-tuple of `Int#`s (as caller_of_f_cont_info >> knows, but f certainly doesn't!), so before it returns it will push two >> args on the stack (correct?): >> >> caller_of_f_cont_info >> seq_h >> unset_bit_stk_info >> unboxed tuple component 7 >> unboxed tuple component 8 <- Sp >> >> And then `g` jumps directly to the entry code for `unset_bit_stk_info` >> (which does the register saving I mentioned), which absolutely can't figure >> out from Sp alone where seq_h is. >> Darn! I think Luite's recent work on the StgToByteCode had to work around >> similar issues, I found this wiki page >> <https://gitlab.haskell.org/ghc/ghc-wiki-mirror/-/blob/master/Unboxed-Tuples-and-Sums-in-GHCi-bytecode.md#returning-a-tuple>. >> But we aren't in a position where we know the representation of `r` *at >> all*! >> >> So our idea was to scan the stack, beginning from `Sp`, until I find >> `unset_bit_stk_info`, giving us the following situation: >> >> caller_of_f_cont_info >> seq_h >> unset_bit_stk_info <- Bp >> unboxed tuple component 7 >> unboxed tuple component 8 <- Sp >> >> I suggestively named the register in which we store the result Bp in >> analogy to the traditional base pointer. This information would be enough >> to unset the bit at index seq_h >> and then copy the unboxed tuple components 7 and 8 up by two words: >> >> caller_of_f_cont_info >> unboxed tuple component 7 >> unboxed tuple component 8 <- Sp >> >> Then jump to caller_of_f_cont_info, which knows what to make of the >> stack and the register state. >> >> The stack scanning is horrible and too brittle and slow for a production >> setting, as any of the unboxed tuple components could have the same bit >> pattern as `unset_bit_stk_info`. >> We simply hope that is never the case and it's fine for the purposes of a >> quick-and-dirty instrumentation. >> >> QUESTION 1: What do you think? Could this work? Can you anticipate pit >> falls of this approach? >> >> # What about Thunks and StgRhsCon? >> >> QUESTION 2: The instrumentation as described won't work for Thunks (which >> are only entered once) and constructor applications (like sat_sBK in the >> BigNat matcher above). Not sure how to change that without incurring huge >> performance hits (by deactivating memoisation). Ideas are welcome here. >> >> Thanks for reading this far. I hope you could follow and are now fizzing >> with excitement because you have a much better idea of how to do this and >> will let me know :) >> >> Sebastian >> _______________________________________________ >> ghc-devs mailing list >> [email protected] >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >
_______________________________________________ ghc-devs mailing list [email protected] http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
