GC for pure functions -- implementation ideas

Don Fri, 15 Apr 2011 13:16:21 -0700

I noticed a lively discussion in Bugzilla about the GC, with speculation
about the impact of a precise GC on speed. But it seems to me that a
dedicated GC for pure functions has enormous unexplored potential, and
might be relatively easy to implement.


LEAKY FUNCTIONS


Define a 'leaky' pure function as a pure function which can return
heap-allocated memory to the caller, i.e., where the return value or a
parameter passed by reference has at least one pointer or reference
type. This can be determined simply by inspecting the signature. (Note
that the function does not need to be immutably pure.)
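
To make the distinction concrete, here is a small sketch of the three
cases. The function names are hypothetical and chosen only for
illustration; the classification follows the definition above and uses
nothing but the signature.

```d
// Non-leaky: value-type return, no by-reference parameters.
// Nothing allocated inside can be reached by the caller afterwards.
int sumOfSquares(size_t n) pure
{
    auto squares = new int[](n);      // temporary heap allocation
    foreach (i, ref s; squares)
        s = cast(int)(i * i);
    int total = 0;
    foreach (s; squares)
        total += s;
    return total;
}

// Leaky: the return type is a slice, so heap memory can escape.
int[] makeSquares(size_t n) pure
{
    auto squares = new int[](n);
    foreach (i, ref s; squares)
        s = cast(int)(i * i);
    return squares;
}

// Leaky: a parameter passed by reference contains a pointer (a slice),
// so heap memory can escape through it.
void fillSquares(ref int[] dst, size_t n) pure
{
    dst = new int[](n);
    foreach (i, ref s; dst)
        s = cast(int)(i * i);
}
```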

The interesting thing is that heap allocation inside non-leaky pure
functions behaves like stack allocation. When you return from that
function, *all* those variables are unreachable, and can be discarded
en masse. Here's an idea of how to exploit this.


THE PURE HEAP

Create a pure heap for each thread. This is a heap which can only be
used by pure functions. I present some simplistic code, with the
simplest possible implementation: just a big block of memory with a
thread-local 'stack pointer' which points to the first free slot.


static ubyte *heap; // initialized to big chunk of RAM.
static size_t stackptr = 0;
static size_t savedstackptr = 0;

For *non-leaky* pure functions: if any of the functions it calls are
leaky, or if it makes any memory allocations, then call a HeapEnter
function (in the druntime) at the start, and a HeapExit function at the
end. Leaky pure functions don't get this prologue and epilogue code.
Non-leaky pure functions that don't do memory allocation are simply
ignored. (Note that the compiler can determine if a function makes any
memory allocations, simply by inspecting its body -- it isn't any more
difficult than checking if it is nothrow).


void pureHeapEnter()
{
    *cast(size_t *)(heap + stackptr) = savedstackptr; // push the previous marker
    savedstackptr = stackptr;
    stackptr += size_t.sizeof;
}

void pureHeapExit()
{
    stackptr = savedstackptr;  // instant GC!!
    savedstackptr = *cast(size_t *)(heap + stackptr); // pop the previous marker
}

The pureHeapExit function has the effect of instantly (and precisely!)
collecting all of the memory allocated in the non-leaky pure function
and in every leaky function that it called.


In any pure function, leaky or non-leaky, when memory is allocated, call
pureMalloc instead of gcMalloc. (Non-leaky pure functions will of course
always allocate on the pure heap.)


void *pureMalloc(size_t nbytes)
{
    if (!stackptr)
        return gcMalloc(nbytes); // we're leaky, do a normal malloc
    // we can use the pure heap
    auto r = heap + stackptr;
    stackptr += nbytes;
    return r;
}
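
As a rough illustration of how the three functions interact, the sketch
below repeats them in compilable form and walks through two nested
non-leaky calls. Several things here are assumptions and not part of the
proposal: the heap size, allocating the block with GC.malloc in a
thread-local module constructor, using GC.malloc in place of gcMalloc,
and the nestedCallTrace helper. Like the code above, it does no
alignment or bounds checking, so the request sizes are kept to multiples
of size_t.sizeof.

```d
import core.memory : GC;

enum heapSize = 1024 * 1024;     // assumed size of the pure heap

ubyte* heap;                     // module-level variables are thread-local in D
size_t stackptr = 0;
size_t savedstackptr = 0;

static this()                    // runs once per thread
{
    heap = cast(ubyte*) GC.malloc(heapSize);
}

void pureHeapEnter()
{
    *cast(size_t*)(heap + stackptr) = savedstackptr; // push previous marker
    savedstackptr = stackptr;
    stackptr += size_t.sizeof;
}

void pureHeapExit()
{
    stackptr = savedstackptr;                        // discard the whole frame
    savedstackptr = *cast(size_t*)(heap + stackptr); // pop previous marker
}

void* pureMalloc(size_t nbytes)
{
    if (!stackptr)
        return GC.malloc(nbytes); // not inside a non-leaky function
    auto r = heap + stackptr;
    stackptr += nbytes;
    return r;
}

// Hypothetical helper: records stackptr at each step of two nested
// non-leaky calls, as the compiler-inserted prologue/epilogue would run.
size_t[] nestedCallTrace()
{
    size_t[] trace;
    trace ~= stackptr;   // outside any non-leaky function: 0
    pureHeapEnter();     // outer non-leaky function starts
    pureMalloc(64);
    trace ~= stackptr;   // outer frame in use
    pureHeapEnter();     // inner non-leaky function starts
    pureMalloc(128);
    trace ~= stackptr;   // inner frame stacked on top of the outer one
    pureHeapExit();      // inner returns: its 128 bytes are gone
    trace ~= stackptr;   // back to the outer frame's mark
    pureHeapExit();      // outer returns: pure heap is empty again
    trace ~= stackptr;   // 0
    return trace;
}
```

The saved marker written at the start of each frame is what lets the
exit of an inner function restore the outer function's frame without
any per-allocation bookkeeping.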

REFINEMENTS

We can make this scheme more generally applicable. If there is a leaky
return value which is cheap to copy, then we can pretend the function is
non-leaky: at exit, if we were called with stackptr == 0, then we copy
(deepdup) the return value to the gc heap, before calling pureHeapExit.
If stackptr was non-zero, we don't need to copy it.
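
One possible reading of this refinement is sketched below. The pure heap
functions are repeated only so that the example compiles by itself.
firstSquares is a hypothetical function, and two choices are
assumptions: a plain .dup stands in for the deep copy, and the
enter/exit pair is skipped when stackptr is already non-zero, so that
the result stays alive in the enclosing frame.

```d
import core.memory : GC;

enum heapSize = 1024 * 1024;     // assumed size of the pure heap

ubyte* heap;                     // thread-local by default in D
size_t stackptr = 0;
size_t savedstackptr = 0;

static this() { heap = cast(ubyte*) GC.malloc(heapSize); }

void pureHeapEnter()
{
    *cast(size_t*)(heap + stackptr) = savedstackptr;
    savedstackptr = stackptr;
    stackptr += size_t.sizeof;
}

void pureHeapExit()
{
    stackptr = savedstackptr;
    savedstackptr = *cast(size_t*)(heap + stackptr);
}

void* pureMalloc(size_t nbytes)
{
    if (!stackptr)
        return GC.malloc(nbytes);
    auto r = heap + stackptr;
    stackptr += nbytes;
    return r;
}

// Leaky by signature, but the result is cheap to copy, so the function
// is treated as non-leaky when called from outside the pure heap.
int[] firstSquares(size_t n)
{
    immutable topLevel = (stackptr == 0);
    if (topLevel)
        pureHeapEnter();

    // The result (and any scratch memory) is allocated on the pure heap.
    auto result = (cast(int*) pureMalloc(n * int.sizeof))[0 .. n];
    foreach (i, ref r; result)
        r = cast(int)(i * i);

    if (topLevel)
    {
        result = result.dup;     // copy the return value to the GC heap
        pureHeapExit();          // then discard everything else at once
    }
    return result;               // otherwise it lives in the caller's frame
}
```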


COMPLICATIONS

Classes with finalizers are an annoying complication. But again, we can
look at all the functions we call, and all the 'new' operations we
perform, to see if any finalizers exist. Maybe we could even have a
separate finalizer heap?

Exceptions are the biggest nuisance, since they can also leak
heap-allocated memory. A catch handler in a non-pure function would need
to check to see if the pure heap 'stack pointer' is non-zero, and if so,
it would need to do a deep dup of the exception, then clear the pure
heap. Any pure function (leaky or not) which contains a catch handler
would need to record the value of the savedstackptr at entry to the
function, and the catch handler would need to unwind the pure heap until
we get back to it.


In reality, things are going to be a bit more complicated than this. But
it seems to me that conceptually, something like this could still stay
fairly simple and be very, very fast. With no changes required to the
language, and not even any changes required to existing code.
