On Fri, 15 Aug 2014, Jan Hubicka wrote:

> > > patched:
> > > real    6m12.437s
> > > user    51m18.829s
> > > sys     4m30.809s
> > >
> > > WPA is 129s, stream in 29.23s, stream out 12.16s.
> > >
> > > Patched + fast compression
> > > real    6m4.383s
> > > user    49m15.123s
> > > sys     4m31.166s
> > >
> > > WPA is 124s, stream in 29.39s, stream out 7.33s.
> > >
> > > So I guess the difference is close to the noise factor now.  I am sure
> > > there are better compression backends than zlib for this purpose, but
> > > it seems to work well enough.
> >
> > Yeah, we might want to pursue that lz4 thing at some point.
> >
> > I'll take the above as an ok to go forward with this change
> > (moving compression from the section level to the "stream" level).
>
> Yep, I would go with fast compression for wpa->ltrans objects.  Those are
> going to be consumed just once, and the increase from the lower compression
> level is probably not terrible (i.e. not as bad as the current growth
> caused by not compressing strings :)
>
> The 3-fold decrease in /tmp usage is nice, but +-10% does not matter much.
> BTW, if I remember well, the zlib algorithm works on 64Kb blocks
> independently, so perhaps having a 2MB buffer is unnecessarily large.
Yeah, the 2MB was just a "guess"; I'll change it to 64k blocks.  Note that
the original code exponentially increased the block size to avoid having
too many blocks (for whatever reason).  An 800MB compressed decl section
would need 12800 64k blocks, but in the end all that matters is that the
block allocations are "efficient" for the memory allocator (so don't
allocate 1-byte blocks).  Our internal overhead is one pointer per block
(to point to the next buffer).

Of course in the end I want to implement streaming right into the file
rather than queuing up the whole compressed data (or mmapping it).

Btw, I'll first try to get rid of the separate string section, which would
also make strings compressed again and stop awkwardly abusing the
data-streamer.

Richard.
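As a side note, the fast-vs-default tradeoff discussed above is just zlib's
compression-level knob.  A minimal sketch (in Python, purely illustrative and
not GCC code; the sample data is made up) showing that the fast level still
round-trips losslessly and only trades ratio for speed:

```python
import zlib

# Repetitive, LTO-decl-like sample data (hypothetical).
data = b"tree_code decl_minimal tree_base " * 20000

fast = zlib.compress(data, 1)     # Z_BEST_SPEED
default = zlib.compress(data, 6)  # Z_DEFAULT_COMPRESSION

# Both levels decompress back to the original bytes.
assert zlib.decompress(fast) == data
assert zlib.decompress(default) == data

print(len(data), len(fast), len(default))
```

On data like this both levels compress well; the fast level typically
produces somewhat larger output in exchange for much less CPU time, which is
the tradeoff that makes it attractive for wpa->ltrans objects consumed only
once.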
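To illustrate the buffering scheme described above (fixed 64k blocks chained
by a single next pointer, rather than exponentially growing buffers), here
is a small sketch.  It is an assumed structure for illustration only, not
the actual GCC implementation; class and method names are made up:

```python
BLOCK_SIZE = 64 * 1024  # fixed 64k blocks, matching zlib's granularity

class Block:
    """One fixed-size buffer; the only overhead is the 'next' pointer."""
    __slots__ = ("data", "next")
    def __init__(self):
        self.data = bytearray()
        self.next = None

class OutputStream:
    """Queues written bytes across a chain of 64k blocks."""
    def __init__(self):
        self.head = self.tail = Block()

    def write(self, payload):
        # Split the payload across blocks, allocating a new block
        # only when the current tail is full.
        view = memoryview(payload)
        while view:
            room = BLOCK_SIZE - len(self.tail.data)
            if room == 0:
                self.tail.next = Block()
                self.tail = self.tail.next
                room = BLOCK_SIZE
            self.tail.data += view[:room]
            view = view[room:]

    def blocks(self):
        # Walk the chain in write order.
        b = self.head
        while b is not None:
            yield bytes(b.data)
            b = b.next
```

For example, writing 200000 bytes yields four blocks of sizes 65536, 65536,
65536, and 3392, and an 800MB section would need 12800 full blocks, as noted
above.  Each allocation is a uniform 64k, which keeps the memory allocator
happy without the exponential growth of the old scheme.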