== Quote from dsimcha (dsim...@yahoo.com)'s article
> 2.  gcx.d appears to cache the last block size query.  This means that
> repeatedly querying the same block in a single threaded program (where
> thread_needLock() returns false and no lock is necessary) is very fast.  This
> is true in both the old Phobos GC and the druntime GC.  I wonder if this was
> somehow bypassed by the ~= operator when druntime was integrated with DMD in
> its early days.

On further examination, this is clearly related to caching somehow. Here is a
very similar test program that appends to two arrays instead of one:

    import std.stdio, std.perf;

    void main() {
        scope pc = new PerformanceCounter;
        pc.start;
        uint[] foo, bar;
        foreach(i; 0..1_000_000) {
            foo ~= i;
            bar ~= i;
        }
        pc.stop;
        writeln(pc.milliseconds);
    }

Timings:

DMD 2.019:  ~1800 ms
DMD 2.029:  ~2300 ms  (Note: still slower, but not by as much, even in absolute terms)
DMD 2.029 (using Appender instead of ~=):  49 ms

By appending to two arrays, we defeat the caching scheme: each append queries a
different block than the previous one, so the cached block size never matches.
Hence the much poorer performance. However, most use cases probably involve
appending to only one array at a time. Since this is a clear regression, I'll
file a bug report.
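
For reference, here is a rough sketch of what the Appender variant of the two-array test could look like. It is not the exact code behind the 49 ms figure above. It assumes a more recent Phobos, where std.datetime.stopwatch stands in for std.perf, and appendBoth is a made-up helper name used only for this illustration.

```d
import std.array : appender;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Hypothetical helper (not from the original test): fills two arrays
// through Appender rather than the built-in ~= operator.
uint[][2] appendBoth(uint n)
{
    auto foo = appender!(uint[])();
    auto bar = appender!(uint[])();

    foreach (i; 0 .. n)
    {
        // Each Appender keeps track of its own capacity, so alternating
        // between the two does not depend on the GC's last-block cache.
        foo.put(i);
        bar.put(i);
    }

    uint[][2] result;
    result[0] = foo.data;
    result[1] = bar.data;
    return result;
}

void main()
{
    // Assumption: StopWatch replaces the PerformanceCounter used above.
    auto sw = StopWatch(AutoStart.yes);
    auto arrays = appendBoth(1_000_000);
    sw.stop();

    writeln(sw.peek.total!"msecs", " ms");
}
```

The point of the sketch is only that Appender stores its capacity alongside the array, so it should not need to ask the GC for the block size on every append.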