I've been trying out D's new 64-bit compiler and a serious barrier to using it effectively seems to be abysmal garbage collection performance with large heaps. It seems like the time for a garbage collection to run scales linearly with the size of the heap *even if most of the heap is marked as NO_SCAN*. I'm running a program with a heap size of ~6GB, almost all of which is strings (DNA sequences), which are not scanned by the GC. It's spending most of its time in GC, based on pausing it every once in a while in GDB and seeing what's at the top of the stack.
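For context on why the string data is unscanned: the runtime sets NO_SCAN automatically for arrays whose element type contains no pointers, and the attribute can also be requested explicitly. A minimal sketch (sizes are illustrative):

```d
import core.memory;

void example() {
    // Plain character data: the element type contains no pointers,
    // so the runtime allocates the block with NO_SCAN on its own.
    auto seq = new char[](1_000_000);

    // The same attribute can be requested explicitly for raw GC memory.
    auto raw = GC.malloc(1_000_000, GC.BlkAttr.NO_SCAN);
}
```

Either way, the collector should never have to look inside these blocks, which is what makes the timings below surprising.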

Here's a test program and the results for a few runs.

import std.stdio, std.datetime, core.memory, std.conv;

void main(string[] args) {
    if(args.length < 2) {
        stderr.writeln("Need size.");
        return;
    }

    immutable mul = to!size_t(args[1]);
    // Allocate mul megabytes of GC memory that the collector should
    // never need to scan for pointers.
    auto ptr = GC.malloc(mul * 1_048_576, GC.BlkAttr.NO_SCAN);

    // Time a single full collection over this heap.
    auto sw = StopWatch(AutoStart.yes);
    GC.collect();
    immutable msec = sw.peek.msecs;
    writefln("Collected a %s megabyte heap in %s milliseconds.",
        mul, msec);
}

Outputs for various sizes:

Collected a 10 megabyte heap in 1 milliseconds.
Collected a 50 megabyte heap in 4 milliseconds.
Collected a 200 megabyte heap in 16 milliseconds.
Collected a 500 megabyte heap in 41 milliseconds.
Collected a 1000 megabyte heap in 80 milliseconds.
Collected a 5000 megabyte heap in 397 milliseconds.
Collected a 10000 megabyte heap in 801 milliseconds.
Collected a 30000 megabyte heap in 2454 milliseconds.
Collected a 50000 megabyte heap in 4096 milliseconds.

Note that these tests were run on a server with over 100 GB of physical RAM, so a shortage of physical memory isn't the problem. Shouldn't GC be O(1) with respect to the size of the unscanned portion of the heap?
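Until this is addressed, the only workaround I can see is to keep the big string data off the GC heap entirely, e.g. via the C allocator, since the collector neither scans nor sweeps memory it didn't allocate. A rough sketch (the size is illustrative):

```d
import core.stdc.stdlib : malloc, free;

void main() {
    enum size_t bytes = 6UL * 1024 * 1024 * 1024;  // ~6 GB of sequence data

    // Memory from the C heap is invisible to the GC: it is neither scanned
    // nor swept, so collection time no longer depends on this allocation.
    auto buf = cast(char*) malloc(bytes);
    scope(exit) free(buf);

    // ... fill buf with DNA sequences ...
}
```

This is safe here only because the buffer holds pure character data; if it stored references into the GC heap, it would have to be registered with GC.addRange so those references keep their targets alive.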
