
Re: Python vs. Java gzip performance

2006-03-22 Thread Martin v. Löwis

Felipe Almeida Lessa wrote:
> def readlines(self, sizehint=None):
>     if sizehint is None:
>         return self.read().splitlines(True)
>     # ...
>
> Is it okay? Or is there any embedded problem I couldn't see?

It's dangerous, if the file is really large - it might exhaust your mem
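
The memory trade-off raised here can be illustrated with a minimal sketch. It assumes a current Python 3 interpreter rather than the 2006-era one the thread discusses, and the helper names `read_all_lines` and `iter_lines` are hypothetical; only `gzip.open` and `bytes.splitlines` come from the standard library.

```python
import gzip


def read_all_lines(path):
    """Decompress the whole file into memory, then split it into lines.

    Simple, but peak memory grows with the decompressed size of the file.
    """
    with gzip.open(path, "rb") as f:
        return f.read().splitlines(True)  # True keeps the line endings


def iter_lines(path):
    """Yield lines one at a time, so the whole file is never held at once."""
    with gzip.open(path, "rb") as f:
        for line in f:
            yield line
```

The two helpers agree on files that use `\n` line endings; `splitlines` additionally breaks on a bare `\r`, so the results can differ on such input.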

Re: Python vs. Java gzip performance

2006-03-22 Thread Felipe Almeida Lessa

On Wed, 2006-03-22 at 00:47 +0100, "Martin v. Löwis" wrote:
> Caleb Hattingh wrote:
> > What does ".readlines()" do differently that makes it so much slower
> > than ".read().splitlines(True)"? To me, the "one obvious way to do it"
> > is ".readlines()".
[snip]
> Anyway, decompressing the entir

Re: Python vs. Java gzip performance

2006-03-21 Thread Martin v. Löwis

Caleb Hattingh wrote:
> What does ".readlines()" do differently that makes it so much slower
> than ".read().splitlines(True)"? To me, the "one obvious way to do it"
> is ".readlines()".

readlines reads 100 bytes (at most) at a time. I'm not sure why it does that (probably in order to not read fu
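
To make the effect of the read size concrete, here is a minimal sketch of a line reader whose chunk size is a parameter. This is not the gzip module's actual implementation; the function name `iter_lines_chunked` and the default of 64 KiB are assumptions chosen for illustration. A very small `chunk_size` means many more `read` calls for the same data, which is the kind of overhead described above.

```python
def iter_lines_chunked(fileobj, chunk_size=64 * 1024):
    """Yield b"\\n"-terminated lines from a binary file object.

    Data is pulled in with fileobj.read(chunk_size); a smaller chunk_size
    means more read calls for the same amount of data.
    """
    pending = b""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last newline is complete; the tail may be
        # a partial line, so hold it back until the next chunk arrives.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line + b"\n"
    if pending:
        # Final line without a trailing newline.
        yield pending
```

Any binary file object works as input, including the one returned by `gzip.open(path, "rb")`.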

Re: Python vs. Java gzip performance

2006-03-21 Thread Caleb Hattingh

Hi Peter

Clearly I misunderstood what Martin was saying :) I was comparing operations on lines via the file generator against first loading the file's lines into memory, and then performing the concatenation.

What does ".readlines()" do differently that makes it so much slower than ".read().sp

Re: Python vs. Java gzip performance

2006-03-17 Thread Serge Orlov

Bill wrote:
> Is there something that can be improved in the Python version?

Seems like GzipFile.readlines is not optimized, file.readline works better:

C:\py>python -c "file('tmp.txt', 'w').writelines('%d This is a test\n' % n for n in range(1))"
C:\py>python -m timeit "open('tmp.txt').read

Re: Python vs. Java gzip performance

2006-03-17 Thread Andrew MacIntyre

Bill wrote:
> I've written a small program that, in part, reads in a file and parses
> it. Sometimes, the file is gzipped. The code that I use to get the
> file object is like so:
>
> if filename.endswith(".gz"):
>     file = GzipFile(filename)
> else:
>     file = open(filename)
>
> Then I par
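
The file-selection logic quoted above could be wrapped in a small helper. This is a sketch assuming Python 3; the name `open_maybe_gzip` is hypothetical, and binary mode is an assumption made so that both branches return bytes.

```python
import gzip


def open_maybe_gzip(filename):
    """Return a binary file object, decompressing transparently for .gz names."""
    if filename.endswith(".gz"):
        return gzip.open(filename, "rb")
    return open(filename, "rb")
```

Callers can then iterate over the returned object the same way in both cases, for example `with open_maybe_gzip(name) as f: for line in f: ...`.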

Re: Python vs. Java gzip performance

2006-03-17 Thread Peter Otten

Caleb Hattingh wrote:
> I tried this:
>
> from timeit import *
>
> #Try readlines
> print Timer('import
> gzip;lines=gzip.GzipFile("gztest.txt.gz").readlines();[i+"1" for i in
> lines]').timeit(200) # This is one line
>
>
> # Try file object - uses buffering?
> print Timer('import gzip;[i+"1"

Re: Python vs. Java gzip performance

2006-03-17 Thread Caleb Hattingh

I tried this:

from timeit import *

#Try readlines
print Timer('import gzip;lines=gzip.GzipFile("gztest.txt.gz").readlines();[i+"1" for i in lines]').timeit(200) # This is one line

# Try file object - uses buffering?
print Timer('import gzip;[i+"1" for i in gzip.GzipFile("gztest.txt.gz")]').time
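
For readers who want to repeat this kind of comparison today, here is a rough Python 3 sketch of the same idea. The function names and the `compare` helper are hypothetical, and timings on a modern interpreter should not be expected to match the 2006 figures discussed in the thread.

```python
import gzip
import timeit


def via_readlines(path):
    """Load all lines with readlines(), then append to each."""
    with gzip.open(path, "rb") as f:
        return [line + b"1" for line in f.readlines()]


def via_iteration(path):
    """Iterate over the file object directly."""
    with gzip.open(path, "rb") as f:
        return [line + b"1" for line in f]


def via_read_splitlines(path):
    """Decompress everything at once, then split into lines."""
    with gzip.open(path, "rb") as f:
        return [line + b"1" for line in f.read().splitlines(True)]


def compare(path, number=200):
    """Return a dict mapping each strategy name to its total time in seconds."""
    strategies = (via_readlines, via_iteration, via_read_splitlines)
    return {
        fn.__name__: timeit.timeit(lambda fn=fn: fn(path), number=number)
        for fn in strategies
    }
```

Calling `compare("gztest.txt.gz")` on a file of your own gives the total seconds per strategy for 200 runs each.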

Re: Python vs. Java gzip performance

2006-03-17 Thread Martin v. Löwis

Bill wrote:
> The Java version of this code is roughly 2x-3x faster than the Python
> version. I can get around this problem by replacing the Python
> GzipFile object with an os.popen call to gzcat, but then I sacrifice
> portability. Is there something that can be improved in the Python
> version

9 matches
