Did you consider the mmap library? Perhaps it is possible to avoid holding these big strings in memory at all. BTW: AFAIK it is not possible on 32-bit Windows for an ordinary program to allocate more than 2 GB. That restriction goes back to the Jurassic MIPS processors, which reserved the upper 2 GB for the OS.
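For example, a minimal sketch (the input file name d:\strSize500MB.dat is my own invention, and I assume the raw data already sits on disk as in the test described below): memory-map the file and feed it to the bz2 compressor in slices, so the full 500 MByte never has to exist as one Python string:

    import bz2
    import mmap

    # Sketch only: stream a memory-mapped file through bz2 in 1 MByte
    # slices; the OS pages the data in as needed, so the whole file
    # never has to be held in a single Python string.
    fIn = open(r'd:\strSize500MB.dat', 'rb')            # assumed input file
    fOut = open(r'd:\strSize500MBCompressed.bz2', 'wb')
    mm = mmap.mmap(fIn.fileno(), 0, access=mmap.ACCESS_READ)
    objBZ2Compressor = bz2.BZ2Compressor()
    for offset in range(0, len(mm), 1048576):
        fOut.write(objBZ2Compressor.compress(mm[offset:offset + 1048576]))
    fOut.write(objBZ2Compressor.flush())
    mm.close()
    fIn.close()
    fOut.close()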
HTH,
Gerald

Claudio Grondi schrieb:
> "Fredrik Lundh" <[EMAIL PROTECTED]> schrieb im Newsbeitrag
> news:[EMAIL PROTECTED]
>
>> Claudio Grondi wrote:
>>
>>> What started as a simple test of whether it is better to load
>>> uncompressed data directly from the hard disk, or to load compressed
>>> data and uncompress it (Windows XP SP 2, Pentium 4 3.0 GHz system with
>>> 3 GByte RAM), seems to show that none of the compression libraries
>>> available in Python really works for large (i.e. 500 MByte) strings.
>>>
>>> Test the provided code and see yourself.
>>>
>>> At least on my system:
>>>   zlib fails to decompress, raising a memory error
>>>   pylzma fails to decompress, running endlessly consuming 99% of CPU time
>>>   bz2 fails to compress, running endlessly consuming 99% of CPU time
>>>
>>> The same works with a 10 MByte string without any problem.
>>>
>>> So what? Is there no compression support for large strings in Python?
>>
>> you're probably measuring Windows' memory management rather than the
>> compression libraries themselves (Python delegates all memory
>> allocations >256 bytes to the system).
>>
>> I suggest using incremental (streaming) processing instead; from what
>> I can tell, all three libraries support that.
>>
>> </F>
>
> Have solved the problem with bz2 compression the way Fredrik suggested:
>
> import bz2
> fObj = file(r'd:\strSize500MBCompressed.bz2', 'wb')
> objBZ2Compressor = bz2.BZ2Compressor()
> lstCompressBz2 = []
> for indx in range(0, len(strSize500MB), 1048576):
>     lowerIndx = indx
>     upperIndx = indx + 1048576
>     if upperIndx > len(strSize500MB): upperIndx = len(strSize500MB)
>     lstCompressBz2.append(objBZ2Compressor.compress(strSize500MB[lowerIndx:upperIndx]))
> #:for
> lstCompressBz2.append(objBZ2Compressor.flush())
> strSize500MBCompressed = ''.join(lstCompressBz2)
> fObj.write(strSize500MBCompressed)
> fObj.close()
>
> :-)
>
> so I suppose that the decompression problems can also be solved that
> way, but:
>
> This still doesn't answer for me what the core of the problem was, how
> to avoid it, and what memory limits should be considered when working
> with large strings.
> Is it actually so that on systems other than Windows 2000/XP there is
> no problem with the original code I have provided?
> Maybe a good reason to go for Linux instead of Windows? Does e.g. Suse
> or Mandriva Linux also have a limit on the memory a single Python
> process can use?
> Please let me know about your experience.
>
> Claudio
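P.S. Regarding the decompression question quoted above: bz2 has a streaming counterpart, bz2.BZ2Decompressor, so the read side can be chunked the same way. A sketch only (the decompressed-output file name is my own invention):

    import bz2

    # Sketch only: stream the compressed file through a BZ2Decompressor
    # in 1 MByte chunks, so neither the compressed nor the decompressed
    # data is ever held in memory as a single huge string.
    fIn = open(r'd:\strSize500MBCompressed.bz2', 'rb')
    fOut = open(r'd:\strSize500MBDecompressed.dat', 'wb')   # assumed name
    objBZ2Decompressor = bz2.BZ2Decompressor()
    while True:
        chunk = fIn.read(1048576)
        if not chunk:
            break
        fOut.write(objBZ2Decompressor.decompress(chunk))
    fIn.close()
    fOut.close()

zlib exposes the same style of interface through zlib.decompressobj(), which should sidestep the MemoryError seen with the one-shot zlib.decompress().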