At 1:30 PM +0100 1/6/03, Leopold Toetsch wrote:
> Attached test program shows some additional effects of PMC size and
> timing. A PMC is 32 bytes, a SPMC is 16 bytes, matching the current
> and minimal PMC sizes on i386 (or a typical 32-bit system).

> 1) For linear access, half-sized PMCs double the mark speed. This
> agrees well with stress.pasm.
Not at all surprising, as it doubles the cache density. OTOH, this also argues for moving the mark bookkeeping out of the PMC struct entirely if we can manage it. A region of the PMC pool that's entirely mark area would raise the cache density during marking by a factor of three or four.

> 2) As we have only one free_list per pool, memory use gets more and
> more scattered when, e.g., parts of arrays are rewritten with new
> values taken from the free_list. The worst-case behaviour is totally
> random access across the pool's data.
This is a very good point. It might be worth having an "allocate X PMCs into buffer Y" call that could let us do more optimization here, since if we're allocating a full pool's worth of PMCs it may be more prudent to just allocate a full new pool and stuff them all into the single array.

> For PMCs, worst-case random access takes about double the time of
> linear access. SPMCs take almost 3 times as long, but are still
> faster than PMCs. The advantage of smaller PMCs is almost gone,
> though.

> 3) Conclusion
> To keep PMCs tightly together when reusing them, these numbers seem
> to indicate that we need a free_list per pool->arena, which would
> require a pool->arena pointer in the PMC.
We could avoid this with properly aligned PMC pools, if we don't mind playing pointer masking games. Whether this would be faster is an open question, though.

> 4) With a pool->arena pointer in the PMC, we could also try a
> separate flags field, sized at 1 byte per PMC.
I don't think a single byte's enough for PMC flags, though it could be enough for internal GC use. (Heck, two bits may be enough, in which case it may be worth having a separate mark bitstring for live/dead marks.)

> 5) On my system, pools of different sizes end up in totally
> different regions of memory (sbrk vs. mmap). This makes the quick
> bit-mask test in trace_mem_block totally unusable.
Yeah, but I'm not sure that's a problem as such--we're going to have to deal with multiple memory blocks anyway, so...
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
