Hi all, tl;dr: I want to try to implement native support for capturing slices of the RTS stack as a personal experiment; please tell me the obstacles I am likely to run into. Much more context follows.
--- I have been working on an implementation of an algebraic effect system that uses unsafe primops to be as performant as possible. However, the unavoidable need to CPS the entire program balloons heap allocation. Consider the following definition: f a b = g a >>= \c -> h (b + c) Assume `g` and `h` are not inlined. If the monad used is IO, this will be compiled efficiently: the result of `g` is returned on the stack, and no closure needs to be allocated for the lambda. However, if the monad supports capturing the continuation, the above definition must be CPS’d. After inlining, we end up with f a b = \k -> let lvl = \c -> h (b + c) k in g a lvl which must allocate a closure on the heap. This is frustrating, as it must happen for every call to a non-inlined monadic operation, even if that operation never captures the continuation. In an algebraic effect system, there are many shortcuts that avoid the need to capture the continuation, and under my implementation, large swaths of code never do so. I’ve managed to exploit that to get some savings, but I can’t escape the need to allocate all these closures. This motivates my question: how difficult would it be to allow capturing a portion of the RTS call stack directly? My requirements are fairly minimal, as continuations go: 1. Capturing a continuation is only legal from within a strict state thread (i.e. IO or strict ST). 2. The continuation is captured up to a prompt, which would be a new kind of RTS stack frame. Prompts are not tagged, so there is only ever exactly one prompt active at any time (which may be the root prompt). 3. Capturing a continuation is unsafe. The behavior of capturing a continuation is undefined if the current prompt was not created by the current state thread (and it is never legal to capture up to the root prompt). 4. Applying a continuation is unsafe. Captured continuations return `Any`, and type safety is the caller’s obligation. 5. Continuations are “functional,” which is to say applying them does not trigger any additional stack unwinding. This minimal support for primitive continuation capturing would be enough to support my efficient, safe delimited control implementation. In my ignorant mind, implementing this ought to be as simple as defining two new primops, reset# :: (State# s -> (# State# s, a #)) -> State# s -> (# State# s, a #) shift# :: ((a -> State# s -> (# State# s, Any #)) -> State# s -> (# State# s, Any #)) -> State# s -> (# State# s, a #) where reset# pushes a new prompt frame and shift# captures a slice of the RTS stack up to that frame and copies it into the heap. Restoring a continuation would copy all the captured frames onto the end of the current stack. Sounds simple enough! I would like to experiment with implementing something like this myself, just to see if it would really work, but somehow I doubt it is actually as simple as it sounds. Minor complications are fine, but what worries me are major obstacles I haven’t found in my limited attempts to learn about the RTS. So far, I’ve read the old “The New GHC/Hugs Runtime System” paper, which still seems mostly accurate from a high level, though I imagine many details have changed since then. I’ve also read the Commentary/RTS/Storage/Stack wiki page, and I’ve peeked at some of the RTS source code. I’ve also skimmed a handful of other papers and wiki pages, but it’s hard to know what I’m looking for. Can anyone point me to better resources or give me some insight into what will be hard? Thanks in advance, Alexis _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs