Here are some timings on my system with your basic stats patch. These results were taken at the point where the first command input is expected, having keyed ahead the N to avoid delays.
CVS + COW (using your original cow patch):
  Took 36.080085 seconds.
  A total of 2412496 bytes were allocated
  A total of 18 DOD runs were made
  A total of 279 collection runs were made
  Copying a total of 449220208 bytes
  There are 10775 active Buffer structs
  There are 88064 total Buffer structs

Pure CVS:
  Took 8.239516 seconds.
  A total of 105072 bytes were allocated
  A total of 165 DOD runs were made
  A total of 168 collection runs were made
  Copying a total of 12004304 bytes
  There are 2470 active Buffer structs
  There are 22528 total Buffer structs

Old 'local' (this is my last version before your refactoring exercise):
  Took 4.734133 seconds.
  A total of 352256 bytes were allocated
  A total of 257 DOD runs were made
  A total of 6 collection runs were made
  Copying a total of 97761 bytes
  There are 1311 active Buffer structs
  There are 1760 total Buffer structs

New 'grey' (basically the same as CVS plus the grey1 patch):
  Took 4.458817 seconds.
  A total of 221184 bytes were allocated
  A total of 44 DOD runs were made
  A total of 42 collection runs were made
  Copying a total of 2355521 bytes
  There are 2008 active Buffer structs
  There are 22528 total Buffer structs

This is with the random behaviour, but I doubt that makes any really significant differences.

I am rather concerned about the total Buffer structs numbers: your cow version allocates 4x as many as cvs. It doesn't seem to be anything that should be affected by the cow logic. Does it get better with your reclaimable changes? From my previous benchmarks I remember that DOD is one of our most expensive operations, and that is dependent on the number of allocated objects.

Note that I get half the memory usage with grey that you do, even though we should be running the same code; but it is still double current cvs.

Out of interest, try adding a sweep and collect in instructions.pasm at label getout, so the reported active buffers and memory use are as accurate as we can make them.
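As a rough cross-check of the ratios quoted above, here is a minimal Python sketch. The figures are copied from the output above; the dictionary layout and the ratio() helper are purely illustrative names and not part of Parrot or the stats patch.

```python
# Figures copied from the benchmark output above.
# The keys and the ratio() helper are illustrative, not Parrot code.
runs = {
    "cvs+cow": {"seconds": 36.080085, "allocated": 2412496, "total_buffers": 88064},
    "cvs":     {"seconds": 8.239516,  "allocated": 105072,  "total_buffers": 22528},
    "local":   {"seconds": 4.734133,  "allocated": 352256,  "total_buffers": 1760},
    "grey":    {"seconds": 4.458817,  "allocated": 221184,  "total_buffers": 22528},
}


def ratio(a, b, key):
    """Return runs[a][key] divided by runs[b][key]."""
    return runs[a][key] / runs[b][key]


if __name__ == "__main__":
    # Total Buffer structs: cow version relative to plain CVS.
    print("cow/cvs total buffers: %.2f" % ratio("cvs+cow", "cvs", "total_buffers"))
    # Bytes allocated: grey relative to plain CVS.
    print("grey/cvs bytes allocated: %.2f" % ratio("grey", "cvs", "allocated"))
    # Elapsed time: cow version relative to plain CVS.
    print("cow/cvs elapsed time: %.2f" % ratio("cvs+cow", "cvs", "seconds"))
```

The first two ratios come out close to the "4x" and "double" quoted above, so the concerns are not an artefact of rounding.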
Which makes me think of something - grey is ignoring reclaimable to get the size to allocate for the post-compaction pool, therefore the memory usage is always going to be higher than is actually needed - are we simply looking at excess allocation here, rather than excess usage? If so, grey will fix it in the next release with paged memory allocation; and I'm sure you'll think of a solution also.

--
Peter Gibbs
EmKel Systems