On Fri, 30 Oct 2009 08:08:10 +0300, dsimcha <dsim...@yahoo.com> wrote:
After a few evenings of serious hacking, I've integrated precise heap
scanning into the GC. Right now, I still need to test it better and
debug it, but it at least basically works. I also still need to write
the templates to generate bit masks at compile time, but this is a
simple matter of programming.
A few things:
1. Who knows how to write some good stress tests to make sure this
works?
2. I'm thinking about how to write the bitmask templates. In the next
release of DMD, when static arrays are value types and returnable from
functions, will they be returnable from functions in CTFE?
3. new only takes RTTI. It is not a template. Unless RTTI gets bitmasks
in the format I created (which I'll document once I clean things up and
release, but has only deviated slightly from what I had talked about
here), stuff allocated using it won't be able to take advantage of
precise heap scanning. The default bitmask, if none is provided, uses
good (bad) old-fashioned conservative scanning unless the entire block
has no pointers, in which case it isn't scanned. This means that we have
all the more incentive to replace new with a template of some kind.
4. I solved the static array problem, but the solution required using up
one of the high-order bits. We have at least one more to play with in my
bitmask scheme, because I'm storing things by word offsets, not byte
offsets, since the GC isn't supposed to work with misaligned pointers
anyhow. This leaves one more bit before we start limiting T.sizeof to
less than full address space (on 32-bit, where a word is 4 bytes). I
think it needs to be reserved for pinning, in case a copying collector
ever gets implemented. If we're willing to not let any precisely scanned
object have a T.sizeof of more than half the address space (a
ridiculously minor limitation; this does not limit the size of arrays,
only the size of classes and the elements of an array), we could throw
in a third bit for weak references.
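
The bit budget in point 4 can be checked with a little arithmetic. What
follows is only a rough sketch, written in Python for brevity (the real
masks would be generated by D templates). The actual entry format has
not been published in this thread, so the flag positions, the names
FLAG_STATIC_ARRAY and FLAG_PINNED, and the helpers pack, unpack and
max_sizeof are all assumptions made for illustration.

```python
# Hypothetical layout of one 32-bit bitmask entry. The real format is not
# documented in this thread; every name and bit position is an assumption.
WORD_SIZE = 4          # bytes per word on a 32-bit target
ADDRESS_BITS = 32      # width of one entry, same as a pointer

FLAG_STATIC_ARRAY = 1 << 31   # assumed: top bit marks a static array
FLAG_PINNED = 1 << 30         # assumed: next bit reserved for pinning
OFFSET_MASK = (1 << 30) - 1   # remaining 30 bits hold a word offset


def pack(byte_offset, static_array=False, pinned=False):
    """Store a pointer's position as a word offset plus flag bits."""
    if byte_offset % WORD_SIZE:
        raise ValueError("misaligned pointer offset")
    word_offset = byte_offset // WORD_SIZE
    if word_offset > OFFSET_MASK:
        raise ValueError("offset does not fit in the remaining bits")
    entry = word_offset
    if static_array:
        entry |= FLAG_STATIC_ARRAY
    if pinned:
        entry |= FLAG_PINNED
    return entry


def unpack(entry):
    """Return (byte_offset, static_array, pinned) for one entry."""
    byte_offset = (entry & OFFSET_MASK) * WORD_SIZE
    return byte_offset, bool(entry & FLAG_STATIC_ARRAY), bool(entry & FLAG_PINNED)


def max_sizeof(flag_bits):
    """Largest T.sizeof (in bytes) addressable with `flag_bits` bits taken."""
    offset_bits = ADDRESS_BITS - flag_bits
    return (1 << offset_bits) * WORD_SIZE
```

Under these assumptions, max_sizeof(2) is 2**32 bytes, so two flag bits
cost nothing on 32-bit because word offsets only need 30 bits, while
max_sizeof(3) is 2**31 bytes, which is the "half the address space"
limit mentioned above for a third (weak reference) bit.
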
Blaze (http://www.dsource.org/projects/blaze) is often suggested for
stress-testing the GC, probably because it performs a huge number of
dynamic allocations while the total amount of memory in use stays about
the same. Note that it's written for D1/Tango, but you said you were
going to port the precise-scanning GC to Tango, too, so it may be better
to start with Tango (there is a lot more code written against Tango, and
you'd get instant user feedback) and then port to druntime. Even if it
isn't a good performance test, it may be a good correctness test
(checking that you don't collect memory which is still referenced).
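
The allocation pattern described above (heavy churn, roughly constant
live memory, then a check that nothing still referenced was lost) can be
sketched as follows. This is Python, used only to show the shape of such
a test; a real one would be written in D against the modified GC. The
function names churn and verify and all parameters are made up for this
sketch and are not taken from Blaze.

```python
import random


def verify(live):
    """Check that every live block still holds the pattern written to it."""
    for tag, payload in live:
        if any(x != tag for x in payload):
            raise RuntimeError("live block %d was corrupted or reused" % tag)


def churn(live_slots=1000, iterations=100_000, max_len=64, seed=0):
    """Allocate constantly while keeping the live set at a fixed size.

    Returns the total number of blocks allocated.
    """
    rng = random.Random(seed)

    def fresh(tag):
        # Each block is filled with its own tag so corruption is detectable.
        return (tag, [tag] * rng.randint(1, max_len))

    live = [fresh(i) for i in range(live_slots)]
    next_tag = live_slots
    for step in range(iterations):
        # Overwriting a slot turns the old block into garbage.
        live[rng.randrange(live_slots)] = fresh(next_tag)
        next_tag += 1
        if step % 10_000 == 0:
            verify(live)
    verify(live)
    return next_tag
```

A D version would presumably also force collections between rounds and
mix in pointer-free blocks, so that both the precise and the
"no pointers, don't scan" paths get exercised.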