[Felix-language] GC reimplementation

skaller Fri, 25 May 2007 22:07:16 -0700

The current Felix GC is rather slow and heavy. Now I have Judy arrays
working, I'm considering changing it. Felix current has something
like a 6 word overhead on heap blocks, and, interior pointers
use a special Felix pointer type:


        struct { void *data; offset_t offset; }

so that pointers interior to a block don't hide the block
address from the garbage collector.

A JudyL array is an map word -> word, where word is
unsigned long/void*. JudyL provides insert, delete,
find, find_greater, find_greater_equal, find_less,
find_less_equal.

STAGE 1a REDESIGN
****************

The first approximation to a new GC design
is to use a Judy array:

        heap_address -> block_length

instead of the current doubly linked list of heap blocks,
saving two words per block at the cost of a JudyL array,
and throw out Felix pointers altogether: just use C
pointers.

When a block is allocated we put its address and length
in the JudyL array.

Now, when the gc gets a pointer p, it does this:

        1. get the block p0 less_equal to p
        2. Lookup its length n
        3. If p is interior to the block (p0<=p<p0+n)
             follow the pointer p0

So we can check if ANY raw pointer is a Felix heap pointer
OR interior to any Felix heap block in O(1) time.

This is SLOWER than the current technique, where every pointer
MUST be a pointer to a heap block (and, not interior).

However pointer deref is faster (no need to add any offset),
and foreign C pointers can ALSO be used anywhere a Felix pointer
was required before.

The GC can use the JudyL iterators to find all the blocks.
Again, this is slower than the linked list.

The JudyL array also supports fast manual deletion -- which was the
main reason to use a doubly linked list instead of a singly
linked list.

STAGE 1b REDESIGN
*****************

The above design still costs a word for the gc flags.
A mark/sweep collector first marks all words, then sweeps
from the roots clearing everything it finds, any blocks
remaining marked are unreachable, and thus garbage.

We can save the current flag word -- 64 bits on a amd64,
but only 1 bit is used -- by stealing 1 bit from the
length. This bit is actually free anyhow, since all allocated
store is going to be at least have even alignment.

We can go further if we're willing to either 

(a) increase the alignment
(b) decrease the maximum block size

to give us more bits. In particular we could number each
RTTTI type object 0, 1, 2 .. etc instead of using an address.
Felix can generate an array of RTTI pointers. We store the
index in the 'length' word.

This means the type of a heap object, which is used to
locate the pointers in it, can be encoded in the JudyL array,
which means we save another word in the heap object.

In fact .. we just saved ALL the words in a non-array object:
the overhead is now one word in the JudyL array, plus the 
Judy cost of storing the heap block address as a key.
Judy specialises in reducing that cost, hopefully to less
than one word per key, whilst still providing O(1) access.

Note that the access is still 8 lookups on a 64 bit machine,
which is NOT fast, even if the higher level blocks are
locked in the CPU cache. Accessing contiguous store is much
faster as such, however the extra space, which is used
even when there's no need to access any of the GC stuff,
will fill the CPU cache more quickly during normal application
processing.

So .. in summary, JudyL array based GC will be a bit slower
collecting that the current design, but applications may
run faster, and, the Felix/C pointer dichotomy will disappear.

Another interesting quirk is the 'Unknown Type' type.
This is the type where we don't know where the pointers are.
So we assume EVERY word in the heap block is a pointer ..
and we have a conservative garbage collector.

Also note: Felix can be 'given' foreign pointers to manage,
provided they're allocated with our allocator, which currently
is just malloc().

Just about the only thing to comment now .. is that perhaps
we might as well just switch to the Boehm-Reiser-Detlefs collector
and be done with it :)


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
Felix-language mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/felix-language

[Felix-language] GC reimplementation

Reply via email to