On Monday, 24 September 2018 at 14:31:45 UTC, Steven
Schveighoffer wrote:
Why is the overhead so big for a single allocation of an array
with elements containing no indirections (which the GC doesn't
need to scan for pointers)?
It's not scanning the blocks. But it is scanning the stack.
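The point that the array's block is not scanned can be checked directly. The following is only a minimal sketch, assuming the default druntime GC: it asks the GC for the attributes of the block behind an array and tests the NO_SCAN bit. The helper name isNoScan and the array length 64 are made up for illustration; GC.getAttr, GC.addrOf and GC.BlkAttr.NO_SCAN are from core.memory.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical helper (not from the post): true if the GC block that
// contains p carries the NO_SCAN attribute.
bool isNoScan(const(void)* p)
{
    // addrOf maps a possibly interior pointer to the base of its block.
    return (GC.getAttr(GC.addrOf(p)) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    // byte has no indirections, so this block should be flagged NO_SCAN.
    auto bytes = new byte[](64);
    writeln("byte[]  NO_SCAN: ", isNoScan(bytes.ptr));

    // An array of pointers has to be scanned during a collection.
    auto ptrs = new void*[](64);
    writeln("void*[] NO_SCAN: ", isNoScan(ptrs.ptr));
}
```

If the byte array reports NO_SCAN, the time spent in GC.collect() cannot come from walking the array's contents, which fits the remark about the stack being scanned instead.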
OK, I modified the code to be:
import std.stdio;

/// Allocates and immediately frees `byteCount` bytes on the C heap.
void* mallocAndFreeBytes(size_t byteCount)()
{
    import core.memory : pureMalloc, pureFree;
    void* ptr = pureMalloc(byteCount);
    pureFree(ptr);
    return ptr; // for side-effects
}

void main(string[] args)
{
    import std.datetime.stopwatch : benchmark;
    import core.time : Duration;
    import core.memory : GC;

    immutable benchmarkCount = 1;

    // one measurement per power-of-two size, from 1 byte up to 1 GiB
    static foreach (const i; 0 .. 31)
    {
        {
            enum byteCount = 2^^i;

            // C heap: malloc followed by free of byteCount bytes
            const Duration[1] resultsC =
                benchmark!(mallocAndFreeBytes!(byteCount))(benchmarkCount);
            writef("%s bytes: mallocAndFreeBytes: %s nsecs",
                   byteCount,
                   cast(double)resultsC[0].total!"nsecs"/benchmarkCount);

            // GC: keep one pointer-free array alive, then time a full collection
            auto dArray = new byte[byteCount];
            const Duration[1] resultsD =
                benchmark!(GC.collect)(benchmarkCount);
            writefln(" GC.collect(): %s nsecs after %s",
                     cast(double)resultsD[0].total!"nsecs"/benchmarkCount, dArray.ptr);
            dArray = null;
        }
    }
}
I still believe these numbers are absolutely horrible:
1 bytes: mallocAndFreeBytes: 400 nsecs GC.collect(): 21600 nsecs after 7F1ECC0B1000
2 bytes: mallocAndFreeBytes: 300 nsecs GC.collect(): 20800 nsecs after 7F1ECC0B1010
4 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 20500 nsecs after 7F1ECC0B1000
8 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 20300 nsecs after 7F1ECC0B1010
16 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 23200 nsecs after 7F1ECC0B2000
32 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 19600 nsecs after 7F1ECC0B1000
64 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 17800 nsecs after 7F1ECC0B2000
128 bytes: mallocAndFreeBytes: 300 nsecs GC.collect(): 16600 nsecs after 7F1ECC0B1000
256 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 16200 nsecs after 7F1ECC0B2000
512 bytes: mallocAndFreeBytes: 300 nsecs GC.collect(): 15900 nsecs after 7F1ECC0B1000
1024 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 15700 nsecs after 7F1ECC0B2000
2048 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 14600 nsecs after 7F1ECC0B1010
4096 bytes: mallocAndFreeBytes: 300 nsecs GC.collect(): 14400 nsecs after 7F1ECC0B2010
8192 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 14200 nsecs after 7F1ECC0B4010
16384 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 14100 nsecs after 7F1ECC0B7010
32768 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 14200 nsecs after 7F1ECC0BC010
65536 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 14200 nsecs after 7F1ECC0C5010
131072 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 14200 nsecs after 7F1ECC0D6010
262144 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 14200 nsecs after 7F1ECC0F7010
524288 bytes: mallocAndFreeBytes: 300 nsecs GC.collect(): 17500 nsecs after 7F1ECAC14010
1048576 bytes: mallocAndFreeBytes: 200 nsecs GC.collect(): 18000 nsecs after 7F1ECAC95010
2097152 bytes: mallocAndFreeBytes: 500 nsecs GC.collect(): 18700 nsecs after 7F1ECAD96010
4194304 bytes: mallocAndFreeBytes: 300 nsecs GC.collect(): 20000 nsecs after 7F1ECA514010
8388608 bytes: mallocAndFreeBytes: 400 nsecs GC.collect(): 61000 nsecs after 7F1EC9913010
16777216 bytes: mallocAndFreeBytes: 24900 nsecs GC.collect(): 27100 nsecs after 7F1EC8112010
33554432 bytes: mallocAndFreeBytes: 800 nsecs GC.collect(): 36600 nsecs after 7F1EC5111010
67108864 bytes: mallocAndFreeBytes: 600 nsecs GC.collect(): 57900 nsecs after 7F1EBF110010
134217728 bytes: mallocAndFreeBytes: 500 nsecs GC.collect(): 98300 nsecs after 7F1EB310F010
268435456 bytes: mallocAndFreeBytes: 700 nsecs GC.collect(): 175700 nsecs after 7F1E9B10E010
536870912 bytes: mallocAndFreeBytes: 600 nsecs GC.collect(): 326900 nsecs after 7F1E6B10D010
1073741824 bytes: mallocAndFreeBytes: 900 nsecs GC.collect(): 641500 nsecs after 7F1E0B04B010
How is it possible for GC.collect() to be typically tens to
hundreds of times slower (about 700 times at 1 GiB) than a
malloc-free call, when the only live allocation is a single array
of bytes with no indirections?
I really don't understand this...
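The figures above come from a single run per size (benchmarkCount = 1) and are printed in steps of 100 nsecs, so individual rows can be noisy, as the 16777216-byte row suggests. A hedged sketch of a steadier measurement follows: it does one untimed warm-up collection and then averages GC.collect() over several runs. The helper name meanCollectNsecs, the repeat count of 100 and the 1 MiB size are assumptions chosen for illustration, not values from the program above.

```d
import core.memory : GC;
import core.time : Duration;
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

// Hypothetical helper: mean time of one GC.collect() call in nsecs,
// averaged over `runs` calls.
double meanCollectNsecs(uint runs)
{
    GC.collect(); // warm-up collection, excluded from the timing
    const Duration[1] results = benchmark!(GC.collect)(runs);
    return cast(double) results[0].total!"nsecs" / runs;
}

void main()
{
    enum runs = 100;          // assumed repeat count
    enum byteCount = 2 ^^ 20; // 1 MiB, one of the sizes measured above

    // Keep one pointer-free array alive while collecting.
    auto dArray = new byte[](byteCount);

    writefln("%s bytes: GC.collect(): %s nsecs per call (mean of %s runs) after %s",
             byteCount, meanCollectNsecs(runs), runs, dArray.ptr);
}
```

Averaging does not explain the cost of a collection, but it helps separate a fixed per-collection overhead from one-off effects of the first call.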