Sounds promising. How does it effect other cases? Some typical GC-heavy 
benchmark? Lots of smaller no scan objects that are just under your 
optimization threshold?

dsimcha Wrote:

> http://d.puremagic.com/issues/show_bug.cgi?id=5623
> 
> I've found a way to speed up the GC massively on large heaps without 
> excessive ripple effects.  Technically it's still O(N), but with about a 
> hundred fold smaller constant in the case of large heaps with most stuff 
> not scanned.  Now, I think the O(N) (where N is the total size of the 
> heap) term has such a small constant that it's for almost all practcal 
> purposes the GC is O(S) (where S is the size of the scanned portion of 
> the heap).  It also no longer has any O(N^2) pathological case (which I 
> had discovered while reading the code).
> 
> So far all unittests for Phobos, dstats and 
> std.parallelism/parallelfuture pass with this enabled.  Please test some 
> other code so we can wring out the corner cases in time for the next 
> release.
> 
> Basically all I did was diverge the Pool struct slightly into large and 
> small object sub-varieties.  The large object sub-variety is used to 
> allocate objects of at least a page.  It only stores gcbits at page-size 
> offsets, and tracks the offsets of B_PAGEPLUS bins from the nearest 
> B_PAGE bin so that they can be found in O(1).
> 
> I also added a field to the Pool struct so that the number of free pages 
> in a pool can be tracked in O(1).  This should drastically lessen the 
> time it takes to perform large allocations on large heaps.  Right now a 
> free memory region is found by a linear search through the pools in the 
> case of large allocations.  Unfortunately, I don't see any easy way to 
> fix this.  This patch at least allows short circuiting a large number of 
> pools, if there isn't enough free space in the whole pool, let alone 
> contiguous space.
> 
> Here are the benchmarks with this patch enabled.
> 
> Collected a 10 megabyte heap in 0 milliseconds.
> Collected a 50 megabyte heap in 0 milliseconds.
> Collected a 250 megabyte heap in 1 milliseconds.
> Collected a 500 megabyte heap in 0 milliseconds.
> Collected a 1000 megabyte heap in 1 milliseconds.
> Collected a 5000 megabyte heap in 3 milliseconds.
> Collected a 10000 megabyte heap in 6 milliseconds.
> Collected a 30000 megabyte heap in 16 milliseconds.
> Collected a 50000 megabyte heap in 26 milliseconds.

Reply via email to