First of all, thank you Raymond for your answer, and a happy new year to you and to all the Python group :-)
Raymond:
>#3 is a fine choice. It is memory efficient -- the repeat() itertool takes up
>only a few bytes. It doesn't need psyco, you already have two fast C routines
>talking to each other without having to go through the interpreter loop.<

In my code I have found otherwise, so to be more sure I have just run a few benchmarks on a Pentium3 CPU, 256 MB RAM, ActivePython 2.5.1.1, on Windows. It may sound unfair, but I think it's useful to have some basic reference timings, so I took timings from a program written in the D language too (it's like a better C++ mixed with some Python and other things). This is the D code:

    import std.stdio, std.conv, std.string, std.c.stdlib, std.c.time;

    void main(string[] args) {
        int n = toInt(args[1].replace("_", ""));
        auto t0 = clock();
        if (args[2].strip() == "1") {
            auto a = new int[n];
        } else {
            auto a = cast(int*)calloc(n, int.sizeof);
        }
        auto t1 = clock();
        writefln("%.2f s", (t1 - t0) / cast(double)CLOCKS_PER_SEC);
    }

As input it takes n and 1 or 2. "Warm" timings (seconds):

    n             time-1   time-2
     1_000_000     0.04     0.04
    10_000_000     0.44     0.42
    30_000_000     1.32     1.26

In both cases the array a is initialized to all zeros. A D "int" is always 4 bytes. The memory used by this little program is essentially what you ask for, so it is 4 MB for n = 1 million, 40 MB for n = 10 million, and 120 MB for n = 30 million (plus a few other small things, like the stack, etc.; all the other extra memory is probably less than 800 KB).
This is the Python code; as input it takes n and 1, 2 or 3:

    from array import array
    from itertools import repeat
    from time import clock
    from sys import argv

    def main():
        n = int(argv[1].replace("_", ""))
        choice = int(argv[2])
        assert choice in [1, 2, 3]
        if choice == 1:
            t0 = clock()
            a = array("l", xrange(n))
            t1 = clock()
        if choice == 2:
            t0 = clock()
            a = array("l", [0] * n)
            t1 = clock()
        if choice == 3:
            t0 = clock()
            a = array("l", repeat(0, n))
            t1 = clock()
        print round(t1 - t0, 2), "s"
        assert len(a) == n

    main()

"Warm" timings (seconds):

    n              #1       #2       #3
     1_000_000      1.98     0.79     1.91
    10_000_000     21.73     7.97    21.13
    30_000_000     70.      28.55    65.3

Approximate peak MB of RAM used:

    n              #1       #2       #3
     1_000_000      9       10.3      6.6
    10_000_000     60       81       64
    30_000_000    184      210+     200+

At runtime the Python interpreter plus other little things take some memory, about 1.7 MB. In the cases marked with a "+" sign the RAM was not enough and I got swapping, so those timings would probably differ with more RAM (but I think this benchmark is significant anyway, because IMHO allocating a 120 MB array of integers isn't supposed to make the OS swap on a machine with 256 MB of memory if the array is allocated properly).

To me it seems that array("l", repeat(0, n)) eats a lot of memory, and it's quite a bit slower than array("l", [0] * n) too. (I presume array("l", repeat(0, n)) converts the Python int 0 to a signed 4-byte int over and over again...) So I still believe an allocate-like method would be useful for the array object; it could probably allocate the data structure for n = 30 million in less than 2 seconds. In the meantime, for such arrays I presume I'll use an external numerical lib :-)

Bye,
bearophile