Abrahams, Max <[EMAIL PROTECTED]> wrote:
>  I've looked into pickle, dump, load, save, readlines(), etc.
>
>  Which is the best method?  Fastest?  My lists tend to be around a
>  thousand to a million items.
>
>  Binary and text files are both okay, text would be preferred in
>  general unless there's a significant speed boost from something
>  binary.

You could try the marshal module, which is very fast, lightweight and
built in.

  http://www.python.org/doc/current/lib/module-marshal.html

It produces a binary format, though, and it will only dump "simple"
objects - see the page above.  It is what Python uses internally to
make .pyc files from .py, I believe.

------------------------------------------------------------
#!/usr/bin/python
# Python 2 script: uses the print statement, and range() returns a list.

import os
from marshal import dump, load
from timeit import Timer

def write(N, file_name = "z.marshal"):
    # Dump a list of N ints to file_name and report the file size
    L = range(N)
    out = open(file_name, "wb")
    dump(L, out)
    out.close()
    print "Written %d bytes for list size %d" % (os.path.getsize(file_name), N)

def read(N):
    # Load the list back from z.marshal and check its length
    inp = open("z.marshal", "rb")
    L = load(inp)
    inp.close()
    assert len(L) == N

# Time the read for list sizes 1, 10, ..., 1000000 (average of 10 loops)
for log_N in range(7):
    N = 10**log_N
    loops = 10
    write(N)
    print "Read back %d items in" % N, Timer("read(%d)" % N, "from __main__ import read").repeat(1, loops)[0]/loops, "s"
------------------------------------------------------------

Produces

$ ./test-marshal.py
Written 10 bytes for list size 1
Read back 1 items in 4.14133071899e-05 s
Written 55 bytes for list size 10
Read back 10 items in 4.31060791016e-05 s
Written 505 bytes for list size 100
Read back 100 items in 8.23020935059e-05 s
Written 5005 bytes for list size 1000
Read back 1000 items in 0.000352478027344 s
Written 50005 bytes for list size 10000
Read back 10000 items in 0.00165479183197 s
Written 500005 bytes for list size 100000
Read back 100000 items in 0.0175776958466 s
Written 5000005 bytes for list size 1000000
Read back 1000000 items in 0.175704598427 s

-- 
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list
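
Since text files were the stated preference, it may help to time the
marshal round trip against a plain one-integer-per-line text file on
your own data.  The following is only a rough sketch, written for
Python 3 rather than the Python 2 of the script above; the helper names
(write_marshal, read_marshal, write_text, read_text) and the temporary
file names are made up for illustration, and it assumes the list holds
plain integers.

------------------------------------------------------------
```python
import marshal
import os
import tempfile
from timeit import timeit

def write_marshal(items, file_name):
    """Dump a list of simple objects in marshal's binary format."""
    with open(file_name, "wb") as out:
        marshal.dump(items, out)
    return os.path.getsize(file_name)

def read_marshal(file_name):
    """Load back whatever write_marshal() stored."""
    with open(file_name, "rb") as inp:
        return marshal.load(inp)

def write_text(items, file_name):
    """Write one integer per line (assumes the items are ints)."""
    with open(file_name, "w") as out:
        out.writelines("%d\n" % item for item in items)
    return os.path.getsize(file_name)

def read_text(file_name):
    """Parse the one-integer-per-line file back into a list."""
    with open(file_name) as inp:
        return [int(line) for line in inp]

if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    bin_name = os.path.join(tmp_dir, "z.marshal")  # hypothetical file names
    txt_name = os.path.join(tmp_dir, "z.txt")
    # In Python 3, range() is not a list and marshal cannot dump it directly
    data = list(range(10**5))
    print("marshal: %d bytes" % write_marshal(data, bin_name))
    print("text:    %d bytes" % write_text(data, txt_name))
    loops = 10
    for label, reader, name in (("marshal", read_marshal, bin_name),
                                ("text", read_text, txt_name)):
        seconds = timeit(lambda: reader(name), number=loops) / loops
        print("Read back via %s in %g s" % (label, seconds))
```
------------------------------------------------------------

No timings are quoted here because they depend on the machine and the
Python version.  Bear in mind as well that the marshal format is not
guaranteed to stay the same between Python versions, so it suits
scratch files better than long-term storage.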