On Thursday, 18 February 2016 at 13:10:28 UTC, Dicebot wrote:
> On 02/18/2016 02:00 PM, Witek wrote:
>> So, the question is: why is the D / DMD allocator so slow under
>> heavy multithreading? The working set is pretty small (a few
>> megabytes at most), so I do not think this is an issue with GC
>> scanning itself. Can I plug in tcmalloc / jemalloc to be
>> used as the underlying allocator instead of glibc's? Or
>> is the D runtime using mmap/sbrk/etc. directly?
>
> DMD/druntime use a stop-the-world GC implementation. That means
> every time anything is allocated, there is a possibility
> of a collection cycle, which will pause execution of all threads.
> It doesn't matter whether there is much garbage; the pause itself
> is the problem.
>
> Using allocations not controlled by the GC (even plain malloc)
> should change the situation noticeably.

The working set is pretty small, and I do not think a GC collection
is triggered at all. I think the allocations themselves are slow /
not scalable to multiple threads.
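
A rough way to test that suspicion is to time the same allocation loop through the GC and through plain malloc at increasing thread counts. This is only a sketch: the names gcWorker, mallocWorker and timeIt, the block size, and the iteration count are all made up for illustration, and it assumes a compiler recent enough to ship std.datetime.stopwatch. If the GC column grows with the thread count while the malloc column stays roughly flat, contention inside the GC allocation path (rather than collection pauses) would be a plausible explanation.

```d
import core.stdc.stdlib : malloc, free;
import core.thread : Thread;
import std.datetime.stopwatch : StopWatch, AutoStart;
import std.stdio : writefln;

enum allocsPerThread = 1_000_000; // hypothetical workload size
enum blockSize = 64;              // hypothetical allocation size in bytes

// Allocate through the GC; each block becomes garbage immediately.
void gcWorker()
{
    foreach (i; 0 .. allocsPerThread)
    {
        auto p = new ubyte[](blockSize);
        p[0] = 1; // touch the memory so the allocation is actually used
    }
}

// Allocate through the C heap, bypassing the GC entirely.
void mallocWorker()
{
    foreach (i; 0 .. allocsPerThread)
    {
        auto p = cast(ubyte*) malloc(blockSize);
        if (p is null)
            continue; // out of memory: skip this iteration
        p[0] = 1;
        free(p);
    }
}

// Run `worker` on `nThreads` threads; return the wall-clock time in ms.
long timeIt(void function() worker, size_t nThreads)
{
    auto sw = StopWatch(AutoStart.yes);
    auto threads = new Thread[](nThreads);
    foreach (ref t; threads)
    {
        t = new Thread(worker);
        t.start();
    }
    foreach (t; threads)
        t.join();
    return sw.peek.total!"msecs";
}

void main()
{
    foreach (n; [1, 2, 4, 8])
        writefln("%s threads: GC %s ms, malloc %s ms",
                 n, timeIt(&gcWorker, n), timeIt(&mallocWorker, n));
}
```

The per-thread work is constant, so with perfect scaling the time per row would stay about the same as the thread count grows (up to the number of cores).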

I am going to check some GC stats, or disable GC collections
completely, add some explicit 'scope(exit) delete ...;' where
necessary, and see if anything changes.
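
A minimal sketch of that experiment, assuming a druntime that provides core.memory.GC.stats (older releases may not have it). The function name churn and the sizes are hypothetical. Because 'delete' has been deprecated in later compiler releases, the sketch frees each block with GC.free instead.

```d
import core.memory : GC;
import std.stdio : writefln;

// Allocate `count` blocks of `blockSize` bytes and free each one explicitly,
// so nothing depends on a collection cycle to reclaim it.
void churn(size_t count, size_t blockSize)
{
    foreach (i; 0 .. count)
    {
        auto buf = new ubyte[](blockSize);
        scope (exit) GC.free(buf.ptr); // explicit free at scope exit
        buf[0] = 1;
    }
}

void main()
{
    GC.disable();             // suppress automatic collections
    scope (exit) GC.enable(); // restore normal behaviour on the way out

    auto before = GC.stats();
    churn(100_000, 64);
    auto after = GC.stats();

    writefln("used: %s -> %s bytes", before.usedSize, after.usedSize);
    writefln("free: %s -> %s bytes", before.freeSize, after.freeSize);
}
```

Note that GC.disable only suppresses automatic collections; the runtime may still collect if it would otherwise run out of memory. If the program is still slow under multiple threads with collections disabled, that would support the theory that the allocation path itself is the bottleneck.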