Re: [Xmldatadumps-l] [Wikitech-l] Compressing full-history dumps faster

2014-03-08 Thread Federico Leva (Nemo)
Randall Farmer, 21/01/2014 23:26: Trying to get quick-and-dirty long-range matching into LZMA isn't feasible for me personally, and there may be inherent technical difficulties. Still, I left a note on the 7-Zip boards as folks suggested; feel free to add anything there: [...]


2014-03-08 Thread Randall Farmer
I see you got more pointers there. :) Did you manage to explore them? The blocker is that I didn't hear much interest from the dump folks in a non-7z archive format, even one that boosted compression speed a lot. Of the packers Bulat replied with (zpaq, exdupe, pcompress, his own srep), exdupe and srep [...]


2014-01-21 Thread Randall Farmer
That does not sound like much economically. Do keep in mind the cost of porting, deploying, maintaining, obtaining, and so on, for new tools. Briefly: yes, CPU-hours don't cost too much, but I don't think the potential win is limited to the direct CPU-hours saved. In more detail: for Wikimedia, a [...]


2014-01-21 Thread Randall Farmer
Ack, sorry for the (no subject); again in the right thread: For external uses like the XML dumps, integrating the compression strategy into LZMA would, however, be very attractive. This would also benefit other users of LZMA compression, such as HBase. For dumps or other uses, 7za -mx=3 / xz -3 is your [...]
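The `7za -mx=3` / `xz -3` recommendation above maps to an invocation like the following. This is a minimal sketch, not from the thread itself: the sample file and path are placeholders, and it assumes `xz` is installed.

```shell
# Placeholder stand-in for a real dump file (not from the thread).
printf '<page><revision>example revision text</revision></page>\n' > /tmp/dump_sample.xml

# xz -3 selects a fast LZMA2 preset (roughly the speed/ratio tradeoff
# that 7za -mx=3 gives for .7z archives).
# -k keeps the original file; -f overwrites any earlier .xz output.
xz -3 -k -f /tmp/dump_sample.xml

# The compressed file appears alongside the original.
ls /tmp/dump_sample.xml.xz
```

Lower presets like `-3` trade some compression ratio for much faster packing, which is the point being made for bulk jobs like the full-history dumps.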