Re: Hashtable+GC problems

Mike Lambert Sun, 26 May 2002 15:25:46 -0700

> Ok, I'll finish off the original conversion to indexed access that I
> began once, before giving up in disgust. The problem is not just that
> you have to use indices instead of pointers; it's also that you have
> to constantly go back to the buffer header before you can get
> anywhere. That needs to be hidden by a macro or (my preference) an
> inline function, and slows down the common case. Also, you lose the
> clean sentinel value NULL (index 0 is definitely valid; index -1
> introduces signedness problems.)


Dan says it won't be slow. So nyah! :P

> Let me know if you've already started a rewrite, though, so I don't
> just redo it.

Sorry, I forgot to reply earlier...no I hadn't started work on a rewrite.

> Something about the whole setup just feels wrong. GC shouldn't be this
> intrusive. And it definitely shouldn't slow down the common case by
> making the compiler defensively reload a buffer pointer every few
> instructions (it'll be cached, but it fouls up reordering.)

Alright. Today I discovered tracked headers. :)

What are wrong with these for hashtable buckets?  These are headers, and
so are immobile. You can allocate lots of them without having to pay much
of a price in terms of instructions.

One drawback of tracked headers is the loss of cache coherency over time
as the tracked headers end up getting spread out over memory, and then
large allocations get interspersed into the various holes, with no
locality. Hopefully we can get away with this due to the studies which
have shown that objects tend to live and die in groups (and thus allocate
and free up lots of memory all at once).

Another problem is that these tracked headers aren't DOD'ed at all. This
means you have to explicitly free them with add_to_free_pool (I'm not sure
what the design of tracked headers is supposed to be....who is the
'tracked' referring to? Is user code or GC code supposed to be tracking
them?). Since all of the buckets in your hashtable should be available
from the hash itself, it should be easy to manage them yourself.

In addition to not being dod'ed themselves, they don't mark other objects
as live themselves. So you'd have to handle all your tracked headers in
your PMC, going through them yourself and handling any buffer/pmcs they
might point at.

Now you have immobile memory that's efficient to allocate, good at
avoiding memory fragmentation, and good for you to do with what you
please.

Once we figure out how hashes are implemented well, we should probably
write up some guidelines on when to use what kinds of headers, et.c

Thoughts?
Mike Lambert

Re: Hashtable+GC problems

Reply via email to