Thanks, guys, for all your posts. I am a bit confused, though. Fuzzy, the code I saw looks like it decompresses as a stream (i.e. per byte). Is this the case, or are you just compressing for file storage, so that the actual data set still has to be exploded in memory?
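To make the question concrete, here is a minimal sketch of what I mean by decompressing as a stream. It assumes zlib-compressed data, which may not be what Fuzzy's code actually uses, and the helper name `iter_decompressed` and the chunk size are made up for illustration. Only one compressed chunk and its decompressed output are held at a time, so the full data set is never exploded in memory.

```python
import io
import zlib


def iter_decompressed(fileobj, chunk_size=64 * 1024):
    """Yield decompressed pieces of a zlib-compressed file object.

    Hypothetical helper: reads chunk_size compressed bytes at a time
    instead of loading and decompressing the whole file at once.
    """
    decomp = zlib.decompressobj()
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        data = decomp.decompress(chunk)
        if data:
            yield data
    # Emit whatever is still buffered inside the decompressor.
    tail = decomp.flush()
    if tail:
        yield tail


# Usage: an in-memory stand-in for a compressed file on disk.
original = b"<row>some repetitive data</row>\n" * 10000
compressed = io.BytesIO(zlib.compress(original))
total = sum(len(piece) for piece in iter_decompressed(compressed, chunk_size=1024))
print(total == len(original))
```

The alternative I am asking about would be a single `zlib.decompress(fileobj.read())`, which needs both the compressed and the fully decompressed data in memory at once.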

fuzzylollipop wrote:
> Fredrik Lundh wrote:
> > fuzzylollipop wrote:
> >
> > > you got no idea what you are talking about, anyone knows that something
> > > like this is IO bound.
> >
> > which of course explains why some XML parsers for Python are a 100 times
> > faster than other XML parsers for Python...
> >
>
> depends on the CODE and the SIZE of the file, in this case
> processing 10GB of file, unless that file is heavily encrypted or
> compressed, the process will be IO bound PERIOD!
>
> And in the case of XML unless the PARSER is extremely inefficient, and
> I assume, that would be an edge case, the parser is NOT the bottleneck
> in this case.
>
> The relative performance of Python XML parsers is irrelevant in
> relationship to this being an IO bound process, even the slowest parser
> could only process the data as fast as it can be read off the disk.
>
> Anyone saying that using C instead of Python will be faster when 99% of
> the time in this case is just waiting on the disk to feed a buffer, has
> no idea what they are talking about.
>
> I work with TeraBytes of files, and all our Python code is just as fast
> as equivalent C code for IO bound processes.