Felipe Almeida Lessa wrote:
> def readlines(self, sizehint=None):
>     if sizehint is None:
>         return self.read().splitlines(True)
>     # ...
>
> Is it okay? Or is there any embedded problem I couldn't see?
It's dangerous, if the file is really large - it might exhaust
your mem
On Wed, 2006-03-22 at 00:47 +0100, "Martin v. Löwis" wrote:
> Caleb Hattingh wrote:
> > What does ".readlines()" do differently that makes it so much slower
> > than ".read().splitlines(True)"? To me, the "one obvious way to do it"
> > is ".readlines()".
[snip]
> Anyway, decompressing the entir
Caleb Hattingh wrote:
> What does ".readlines()" do differently that makes it so much slower
> than ".read().splitlines(True)"? To me, the "one obvious way to do it"
> is ".readlines()".
readlines reads 100 bytes (at most) at a time. I'm not sure why it
does that (probably in order to not read fu
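
A possible middle ground between the two extremes discussed in this thread (one read() of the whole file versus many small reads) is to read fixed-size chunks and split them into lines yourself. What follows is only a sketch for Python 3, not a description of how GzipFile works internally; the name iter_lines_chunked and the 64 KiB default are made up for illustration.

```python
import io


def iter_lines_chunked(fileobj, chunk_size=64 * 1024):
    # Hypothetical helper: yields lines (bytes, split on b"\n" only)
    # from any object with a binary read(size) method.
    pending = b""  # text after the last newline seen so far
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        pieces = (pending + chunk).split(b"\n")
        # The last piece has no newline yet; keep it for the next round.
        pending = pieces.pop()
        for piece in pieces:
            yield piece + b"\n"
    if pending:
        # Final line without a trailing newline.
        yield pending


if __name__ == "__main__":
    sample = io.BytesIO(b"0 This is a test\n1 This is a test\n")
    for line in iter_lines_chunked(sample, chunk_size=8):
        print(line)
```

Memory use is bounded by the chunk size plus the longest line, and the same function should accept a gzip stream opened with gzip.open(path, "rb").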
Hi Peter
Clearly I misunderstood what Martin was saying :) I was comparing
operations on lines via the file generator against first loading the
file's lines into memory, and then performing the concatenation.
What does ".readlines()" do differently that makes it so much slower
than ".read().sp
Bill wrote:
> Is there something that can be improved in the Python version?
Seems like GzipFile.readlines is not optimized, file.readline works
better:
C:\py>python -c "file('tmp.txt', 'w').writelines('%d This is a test\n'
% n for n in range(1))"
C:\py>python -m timeit "open('tmp.txt').read
Bill wrote:
> I've written a small program that, in part, reads in a file and parses
> it. Sometimes, the file is gzipped. The code that I use to get the
> file object is like so:
>
> if filename.endswith(".gz"):
>     file = GzipFile(filename)
> else:
>     file = open(filename)
>
> Then I par
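
For anyone reading this in the archive, the suffix check quoted above can be wrapped in a small function. This is a sketch for Python 3 under my own assumptions: open_text is a hypothetical name, and it opens both kinds of file in text mode so the parsing code sees str lines either way.

```python
import gzip
import os
import tempfile


def open_text(filename):
    # Hypothetical helper: dispatch on the ".gz" suffix, as in the
    # quoted snippet, and always return a text-mode file object.
    if filename.endswith(".gz"):
        return gzip.open(filename, "rt")
    return open(filename, "r")


if __name__ == "__main__":
    # Self-check: the same text written plain and gzipped reads back
    # through the same code path.
    workdir = tempfile.mkdtemp()
    plain = os.path.join(workdir, "sample.txt")
    packed = plain + ".gz"
    with open(plain, "w") as out:
        out.write("first\nsecond\n")
    with gzip.open(packed, "wt") as out:
        out.write("first\nsecond\n")
    for name in (plain, packed):
        with open_text(name) as handle:
            print(name, list(handle))
```

Using it in a with block also closes the file when parsing is done, which the bare assignment in the quoted code does not do.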
Caleb Hattingh wrote:
> I tried this:
>
> from timeit import *
>
> #Try readlines
> print Timer('import
> gzip;lines=gzip.GzipFile("gztest.txt.gz").readlines();[i+"1" for i in
> lines]').timeit(200) # This is one line
>
>
> # Try file object - uses buffering?
> print Timer('import gzip;[i+"1"
I tried this:
from timeit import *
#Try readlines
print Timer('import
gzip;lines=gzip.GzipFile("gztest.txt.gz").readlines();[i+"1" for i in
lines]').timeit(200) # This is one line
# Try file object - uses buffering?
print Timer('import gzip;[i+"1" for i in
gzip.GzipFile("gztest.txt.gz")]').time
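
The same comparison can be written for Python 3 with callables instead of statement strings, which avoids the line-wrapping problem in the posted code. This is a sketch only: the function names, the sample file and its size are my own assumptions, and it makes no claim about which variant is faster on a given Python version.

```python
import gzip
import os
import tempfile
import timeit


def via_readlines(path):
    # Load every line into a list first, then process the list.
    with gzip.open(path, "rt") as handle:
        lines = handle.readlines()
    return [line + "1" for line in lines]


def via_iteration(path):
    # Process lines while iterating over the file object.
    with gzip.open(path, "rt") as handle:
        return [line + "1" for line in handle]


def compare(path, number=200):
    # Time both variants with the same repeat count as the post above.
    return {
        "readlines": timeit.timeit(lambda: via_readlines(path), number=number),
        "iteration": timeit.timeit(lambda: via_iteration(path), number=number),
    }


if __name__ == "__main__":
    # Hypothetical test file; the original gztest.txt.gz is not available.
    path = os.path.join(tempfile.mkdtemp(), "gztest.txt.gz")
    with gzip.open(path, "wt") as out:
        for n in range(1000):
            out.write("%d This is a test\n" % n)
    for name, seconds in compare(path, number=20).items():
        print(name, seconds)
```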
Bill wrote:
> The Java version of this code is roughly 2x-3x faster than the Python
> version. I can get around this problem by replacing the Python
> GzipFile object with an os.popen call to gzcat, but then I sacrifice
> portability. Is there something that can be improved in the Python
> version