Sounds promising. How does it effect other cases? Some typical GC-heavy benchmark? Lots of smaller no scan objects that are just under your optimization threshold?
dsimcha Wrote: > http://d.puremagic.com/issues/show_bug.cgi?id=5623 > > I've found a way to speed up the GC massively on large heaps without > excessive ripple effects. Technically it's still O(N), but with about a > hundred fold smaller constant in the case of large heaps with most stuff > not scanned. Now, I think the O(N) (where N is the total size of the > heap) term has such a small constant that it's for almost all practcal > purposes the GC is O(S) (where S is the size of the scanned portion of > the heap). It also no longer has any O(N^2) pathological case (which I > had discovered while reading the code). > > So far all unittests for Phobos, dstats and > std.parallelism/parallelfuture pass with this enabled. Please test some > other code so we can wring out the corner cases in time for the next > release. > > Basically all I did was diverge the Pool struct slightly into large and > small object sub-varieties. The large object sub-variety is used to > allocate objects of at least a page. It only stores gcbits at page-size > offsets, and tracks the offsets of B_PAGEPLUS bins from the nearest > B_PAGE bin so that they can be found in O(1). > > I also added a field to the Pool struct so that the number of free pages > in a pool can be tracked in O(1). This should drastically lessen the > time it takes to perform large allocations on large heaps. Right now a > free memory region is found by a linear search through the pools in the > case of large allocations. Unfortunately, I don't see any easy way to > fix this. This patch at least allows short circuiting a large number of > pools, if there isn't enough free space in the whole pool, let alone > contiguous space. > > Here are the benchmarks with this patch enabled. > > Collected a 10 megabyte heap in 0 milliseconds. > Collected a 50 megabyte heap in 0 milliseconds. > Collected a 250 megabyte heap in 1 milliseconds. > Collected a 500 megabyte heap in 0 milliseconds. > Collected a 1000 megabyte heap in 1 milliseconds. > Collected a 5000 megabyte heap in 3 milliseconds. > Collected a 10000 megabyte heap in 6 milliseconds. > Collected a 30000 megabyte heap in 16 milliseconds. > Collected a 50000 megabyte heap in 26 milliseconds.