https://issues.dlang.org/show_bug.cgi?id=17881
--- Comment #4 from Steven Schveighoffer <schvei...@yahoo.com> ---
Stevens-MacBook-Pro:testd steves$ cat testpreallocate.d
struct S
{
    S *next;
}

void main()
{
    version(freelist)
    {
        S* head = null;
        foreach(i; 0 .. 20_000_000)
            head = new S(head);
    }
    version(preallocate)
    {
        S* head = null;
        auto x = new S[20_000_000];
        foreach(i; 0 .. 20_000_000)
        {
            x[i] = S(head);
            head = &(x[i]);
        }
    }
}

Stevens-MacBook-Pro:testd steves$ dmd -O -release testpreallocate.d -version=freelist
Stevens-MacBook-Pro:testd steves$ time ./testpreallocate

real    0m1.869s
user    0m1.750s
sys     0m0.114s
Stevens-MacBook-Pro:testd steves$ dmd -O -release testpreallocate.d -version=preallocate
Stevens-MacBook-Pro:testd steves$ time ./testpreallocate

real    0m0.111s
user    0m0.045s
sys     0m0.062s

The point is that the GC is not well-equipped to handle the tight allocation loop.

The second version has the drawback that all 20 million elements will remain in memory as long as there is one element still alive.

What I'm looking for is something that has the performance (or similar, I realize it won't be as good) of the second, but can collect each block individually like the first.
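
To make the trade-off concrete, here is a rough sketch of one possible middle ground: allocating the nodes in fixed-size chunks instead of one at a time or all at once. This is only an illustration, not something proposed in this report. The chunk size of 4096 and the helper name buildList are assumptions made up for the sketch, and it has not been timed.

```d
struct S
{
    S* next;
}

// Hypothetical chunk size, chosen only for illustration.
enum size_t chunkSize = 4096;

// Builds a singly linked list of `count` nodes, taking the nodes from
// GC-allocated arrays of `chunkSize` elements rather than calling `new S`
// once per node.
S* buildList(size_t count)
{
    S* head = null;
    S[] chunk;       // chunk currently being filled (starts empty)
    size_t used = 0; // number of elements already handed out from `chunk`

    foreach (i; 0 .. count)
    {
        // Current chunk exhausted (or none allocated yet): get a new one.
        if (used == chunk.length)
        {
            chunk = new S[chunkSize];
            used = 0;
        }
        chunk[used] = S(head);
        head = &chunk[used];
        ++used;
    }
    return head;
}

void main()
{
    S* head = buildList(20_000_000);

    // Walk the list to check that every node is linked in.
    size_t n = 0;
    for (auto p = head; p !is null; p = p.next)
        ++n;
    assert(n == 20_000_000);
}
```

This cuts the number of GC allocations by a factor of chunkSize, but a pointer to any element keeps its whole chunk alive, so memory is reclaimed per chunk rather than per element. It sits between the two versions above and does not give the per-element collection of the first one.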