Caleb Hattingh wrote:

> I tried this:
>
> from timeit import *
>
> #Try readlines
> print Timer('import
> gzip;lines=gzip.GzipFile("gztest.txt.gz").readlines();[i+"1" for i in
> lines]').timeit(200) # This is one line
>
>
> # Try file object - uses buffering?
> print Timer('import gzip;[i+"1" for i in
> gzip.GzipFile("gztest.txt.gz")]').timeit(200) # This is one line
>
> Produces:
>
> 3.90938591957
> 3.98982691765
>
> Doesn't seem much difference, probably because the test file easily
> gets into memory, and so disk buffering has no effect. The file
> "gztest.txt.gz" is a gzipped file with 1000 lines, each being "This is
> a test file".

$ python -c"file('tmp.txt', 'w').writelines('%d This is a test\n' % n for n in range(1000))"
$ gzip tmp.txt

Now, if you follow Martin's advice:

$ python -m timeit -s"from gzip import GzipFile" "GzipFile('tmp.txt.gz').readlines()"
10 loops, best of 3: 20.4 msec per loop
$ python -m timeit -s"from gzip import GzipFile" "GzipFile('tmp.txt.gz').read().splitlines(True)"
1000 loops, best of 3: 534 usec per loop

Factor 38. Not bad, I'd say :-)

Peter
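
For anyone who wants to repeat the comparison from a script rather than the shell, here is a minimal sketch in Python 3 syntax (the commands above are Python 2). The helper names make_test_file, via_readlines, via_read_splitlines and compare are made up for illustration, and the file goes into a temporary directory instead of tmp.txt.gz. It only reproduces the two calls being timed; the ratio on a current interpreter may differ considerably from the factor 38 measured above.

```python
import gzip
import os
import tempfile
import timeit


def make_test_file(path, n_lines=1000):
    """Write n_lines numbered lines to a gzip-compressed file (hypothetical helper)."""
    with gzip.open(path, "wb") as f:
        f.writelines(("%d This is a test\n" % n).encode("ascii")
                     for n in range(n_lines))


def via_readlines(path):
    """Let the GzipFile object split the decompressed data into lines."""
    with gzip.open(path, "rb") as f:
        return f.readlines()


def via_read_splitlines(path):
    """Decompress everything at once, then split; True keeps the line endings."""
    with gzip.open(path, "rb") as f:
        return f.read().splitlines(True)


def compare(path, number=20):
    """Return total seconds for `number` runs of each approach."""
    t_readlines = timeit.timeit(lambda: via_readlines(path), number=number)
    t_split = timeit.timeit(lambda: via_read_splitlines(path), number=number)
    return t_readlines, t_split


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "tmp.txt.gz")
        make_test_file(path)
        t_readlines, t_split = compare(path)
        print("readlines():            %.4f s" % t_readlines)
        print("read().splitlines(True): %.4f s" % t_split)
```

One caveat: on bytes, splitlines() also breaks on a bare \r, while readlines() on a binary file only breaks on \n, so the two give identical lists only when the data has no lone carriage returns (true for the test file here).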