Among the new things in Python 2.7 (currently in alpha) there is an improved GC:
http://bugs.python.org/issue4074
http://bugs.python.org/issue4688

There they show a small Python program named "tuple_gc_hell.py"; this is a reduced version:

    import time, gc

    def list_of_tuples(n):
        m = n // 10
        l = []
        prev_time = time.time()
        for i in xrange(n):
            if i % m == 0:
                cur_time = time.time()
                print i, round(cur_time - prev_time, 2)
                prev_time = cur_time
            l.append((i % 2, i % 12))

    #gc.disable()
    #import psyco; psyco.full()
    list_of_tuples(10 * 1000 * 1000)

Its output:

    0 0.0
    1000000 1.2
    2000000 2.58
    3000000 3.98
    4000000 4.89
    5000000 6.64
    6000000 8.05
    7000000 9.44
    8000000 9.84
    9000000 12.2

Total running time: 73.3 s

Disabling the GC and using the Psyco JIT compiler:

    0 0.0
    1000000 0.17
    2000000 0.19
    3000000 0.17
    4000000 0.19
    5000000 0.16
    6000000 0.17
    7000000 0.17
    8000000 0.17
    9000000 0.19

Total running time: 2.67 s

Despite D being a very different language, with a very different GC (CPython mainly uses reference counting, plus a cyclic collector on top of it), I have tried a similar D program (compiled with DMD v2.029):

    import std.c.time: clock; // where's CLOCKS_PER_SEC?
    import std.stdio: writeln;
    //import std.gc: disable;
    //import core.memory: disable; // where's the disable?

    void array_of_pairs(int n) {
        class S {
            int x, y;
            this(int x, int y) {
                this.x = x;
                this.y = y;
            }
        }

        int m = n / 10;
        S[] l;
        auto prev_time = clock();
        for (int i; i < n; i++) {
            if (i % m == 0) {
                auto cur_time = clock();
                writeln(i, " ", (cur_time - prev_time) / 1000.0);
                prev_time = cur_time;
            }
            l ~= new S(i % 2, i % 12);
        }
    }

    void main() {
        //disable();
        array_of_pairs(10_000_000);
    }

Its output is not good:

    0 0
    1000000 1.594
    2000000 3.563
    3000000 6.156
    4000000 8.5
    5000000 9.469
    6000000 10.047
    7000000 13.453
    8000000 18.703
    9000000 24.203

Total running time: 122.3 s

In DMD 2.029 array appends are very slow, so in a later version I have created the array up-front. I have also seen that moving the class out of the array_of_pairs function speeds up the program significantly.
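For reference, the optimization in the two Python issues linked above works by "untracking" tuples whose elements can never be part of a reference cycle, so later GC passes skip them entirely. This can be observed directly with gc.is_tracked(), which was added along with those patches; a minimal sketch, using Python 3 syntax rather than the Python 2 of the benchmark:

```python
import gc

# Build the tuple at runtime so we get a fresh object
# (a literal like (1, 2) may be a shared compile-time constant).
t = tuple([1, 2])
print(gc.is_tracked(t))  # True: new tuples start out tracked by the cyclic GC

gc.collect()             # during a collection the GC inspects t, sees only
                         # atomic values inside, and untracks it
print(gc.is_tracked(t))  # False: later collections no longer traverse it
```

This is why the loop of tuple_gc_hell.py stops slowing down in Python 2.7: the millions of (int, int) tuples drop out of the collector's working set after the first collection that examines them.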
To improve the situation even more I have used a struct instead of a class:

    import std.c.time: clock;
    import std.stdio: writeln;

    struct S {
        int x, y;
        this(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    void array_of_pairs(int n) {
        int m = n / 10;
        auto l = new S*[n];
        auto prev_time = clock();
        for (int i; i < n; i++) {
            if (i % m == 0) {
                auto cur_time = clock();
                writeln(i, " ", (cur_time - prev_time) / 1000.0);
                prev_time = cur_time;
            }
            l[i] = new S(i % 2, i % 12);
        }
    }

    void main() {
        //disable();
        array_of_pairs(10_000_000);
    }

Its output:

    0 0
    1000000 0.266
    2000000 0.625
    3000000 0.547
    4000000 0.641
    5000000 0.765
    6000000 0.891
    7000000 1.031
    8000000 1.172
    9000000 1.328

Total running time: 9.46 s

Now it's much faster than Python (with the GC enabled and no JIT), but the GC clearly still has trouble. Just to be sure it's not a D2 problem I have done the same test with DMD v1.042 and Phobos 1:

    import std.c.time: clock, CLOCKS_PER_SEC;
    import std.stdio: writefln;
    import std.gc: disable;

    struct S {
        int x, y;
    }

    void list_of_tuples(int n) {
        int m = n / 10;
        auto l = new S*[n];
        auto prev_time = clock();
        for (int i; i < n; i++) {
            if (i % m == 0) {
                auto cur_time = clock();
                writefln(i, " ", (cur_time - prev_time) / cast(float)CLOCKS_PER_SEC);
                prev_time = cur_time;
            }
            l[i] = new S;
            l[i].x = i % 2;
            l[i].y = i % 12;
        }
    }

    void main() {
        disable();
        list_of_tuples(10_000_000);
    }

Its output:

    0 0
    1000000 0.297
    2000000 0.344
    3000000 0.359
    4000000 0.391
    5000000 0.421
    6000000 0.438
    7000000 0.484
    8000000 0.5
    9000000 0.532

Total running time: 4.96 s

About twice as fast. Now that Phobos 2 has Tuple people will use it, and many such tuples will not contain references, so I think an optimization similar to the one done for Python 2.7 may become useful.

Bye,
bearophile
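P.S. A quick way to see what "tuples that don't contain references" means on the CPython side: after a collection, only tuples that hold other containers remain tracked by the cyclic GC. A small sketch (Python 3 syntax; the variable names are just illustrative):

```python
import gc

plain = tuple([1, 2.5, "abc"])  # only atomic values: int, float, str
mixed = ([1], 2)                # holds a list, which is itself a container

gc.collect()  # give the collector a chance to examine both tuples

print(gc.is_tracked(plain))  # False: it can never close a reference cycle
print(gc.is_tracked(mixed))  # True: the inner list could, so it stays tracked
```

The first case corresponds to something like a Tuple of plain value types in Phobos 2, the second to a Tuple holding class references.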