On Sat, Aug 3, 2013 at 12:38 AM, William ML Leslie < [email protected]> wrote:
> On 3 August 2013 03:52, Jonathan S. Shapiro <[email protected]> wrote: > > On Fri, Aug 2, 2013 at 8:20 AM, Matt Rice <[email protected]> wrote: > >> > >> On Sat, Jul 27, 2013 at 1:29 PM, Jonathan S. Shapiro <[email protected]> > >> wrote: > > However when region polymorphism is introduced at procedure parameters > and > > return values this becomes less clear. Objects in a caller region can > now be > > manipulated by a callee. The lifetime of a region is still bounded by the > > stack frame in which it is allocate [*], but it's no longer easy to > point at > > a single stack frame and say "that's the [single] frame in which this > region > > can be touched". > > You didn't add the footnote you mentioned there... So sorry. It's a pretty boring footnote, though. The lifetime of a region is bounded by the stack frame in which it [the region] is allocated, *except for the "bottom" region, which is the general GC heap, and is not associated with any particular frame*. It's a minor point about how the region ordering induction grounds out, that's all. Shap said: > > > The lifetime *of a region* is bounded by the stack frame in which *it* > is allocated > > This need not be the stack frame that the *value* was allocated in... > Indeed. The model is that a region is allocated in a let-like binding somewhere (therefore in a stack frame somewhere), and then objects are allocated in/from that region. Assuming we do not choose to GC the region, the lifetime of those objects is exactly the lifetime of the region. If we * do* choose to GC the region, we can exploit the region induction to reduce what has to be traced. I've said that second part several times, but maybe a more unpacked explanation might be helpful: At any given moment, we have an inductive chain of regions that have a stack-like ordering discipline: R0 (the GC Heap) > R1 > R2 > R3 > R4 ... > Rmax (the current frame) where '>' should be read as "has longer lifespan than". The usual use of regions imposes a subtype relation, such that an object in region Ri can only be pointed at from a pointer in Region Rj <= Ri. This is in contrast to generational collection, where pointers are permitted from tenured space to younger generations, and must be tracked specially. The length of the region relation chain |R| is of the same order as the depth of the stack in frames |S|. More precisely, the length |R| is strictly <= the total number of let bindings |LB| on the stack, and since there is a constant bound on let bindings per frame, |LB| ~ |S|. But in pragmatically, most "temporary" garbage is allocated in procedures close to the bottom of the call chain. What this means is that regions worth the bother of collecting tend to be very near Rnow, tend to be small, and tend to require a very limited amount of tracing. One way to think about it is that region-driven collection is a form of multi-generational collection in which the liveness of a GC generation is related to the liveness of stack frames, but there is an added benefit that we can often tell, statically, when a given region cannot usefully be collected. By "not usefully", I mean that we can show statically that collection will not meaningfully reduce the size of that region. > The return case is illustrative, because a function can return a > reference to a fresh list of dynamically determined length and still > have accurate lifetimes inferred by the region analysis! Clearly the > list itself can't be allocated on the stack, so what is going on? > Actually, it *can* be allocated on the stack. Or at least it *often* can. First, a side observation: you can think of stack frames themselves, and the unboxed regions they contain, as a "poor man's region system". The usual cross-region reference rule is enforced by prohibiting address-of on unboxed stack-allocated objects, or alternatively by allowing address-of but only when stored in borrowed pointer cells. So borrowed pointers are an extended from of "poor man's region system". If you can statically estimate the size of a region, or get a decently conservative bound, you can quite literally put the region on the stack. When you know that an object is non-escaping, you simply allocate that object from the stack frame using alloca(). That allocation approach breaks down if the region size is dynamic. At that point you either have to allocate the dynamically-allocated portion of the region from the GC heap, or leave yourself a conservatively sized arena, or some such other form of storage management. The most interesting case is the one where you are returning a dynamically allocated object. The problem is that you would like to make an alloca() call in the caller's call frame, but the callee's call frame is "in the way". For the same reason, it's hard to fuse regions with generational scavenging in this case, though there are some "partial compaction" techniques that you can use. A region system can tell us that references into a region do not > escape, may escape, or clearly escape via the return value. When a > region escapes via return, region inference requires that the caller > take responsibility for disposal of the region (usually, specifically > the value). Yes. > It is perfectly acceptible for the value itself to live > on the heap. > >From a "how to think about regions" perspective, I agree. From a "why are regions pragmatically useful" perspective, not so much. If you put the region-governed objects into the GC heap, then they are still subject to GC, and the main practical benefit of regions was GC reduction. > I find the most thorough introduction to regions to be Mads Tofte and > Lars Birkedal: "A Region Inference Algorithm", ACM Transactions on > Programming Languages and Systems, Vol 20, No 5, July 1998. > Yes. This is one of the better papers on the subject. One thing I *haven't* stumbled across is a paper or writeup that covers the same material in a more non-researcher-friendly sort of way. shap
_______________________________________________ bitc-dev mailing list [email protected] http://www.coyotos.org/mailman/listinfo/bitc-dev
