Hi Steffen,
Steffen Nurpmeso wrote:
> #?0|kent:plzip-1.11$ cp /x/balls/gcc-13.2.0.tar.xz X1
> #?0|kent:plzip-1.11$ cp X1 X2
> [...]
> -rw-r----- 1 steffen steffen 89049959 May 7 22:14 X1.lz
> -rw-r----- 1 steffen steffen 89079463 May 7 22:14 X2.lz
Note that if you use incompressible files as input, you'll always obtain
similar compressed sizes, no matter the compression level or the dictionary
size. Try the test with gcc-13.2.0.tar and you'll see the difference. (As in
your other test with /x/doc/coding/austin-group/202x_d4.txt).
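The effect is easy to demonstrate on any system. A minimal sketch, using
gzip only because it is universally available (lzip behaves the same way):
random data stays at about the input size at every level, while redundant
data shrinks dramatically at -9:

```shell
# Incompressible input: compression level barely changes the result.
head -c 100000 /dev/urandom > rand.bin
gzip -1 -c rand.bin | wc -c    # about the input size
gzip -9 -c rand.bin | wc -c    # nearly the same as -1

# Highly redundant input: level (and, for lzip, dictionary size) matters.
yes 'the quick brown fox' | head -c 100000 > text.txt
gzip -9 -c text.txt | wc -c    # a small fraction of the input size
```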
> I think dynamically scaling according to the processors, taking
> into account the dictionary size, as you said above, is the sane
> approach for "saturating" with plzip; in the above job there are
> quite a lot of files of varying size (the spam DB being very
> large), and one recipe is not good for them all.
Maybe there is a better way (almost optimal for many files) to compress the
spam DB that does not require a parallel compressor, but uses all the
processors in your machine. (And, as a bonus, achieves maximum compression
on files of any size and produces reproducible files).
ls | xargs -n1 -P4 lzip -9
The command above should produce better results than a saturated plzip.
'ls' may be replaced by any way to generate a list of the files to be
compressed. See
http://www.gnu.org/software/findutils/manual/html_node/find_html/xargs-options.html
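One caveat with plain 'ls' is that filenames containing spaces or newlines
break the pipe into xargs. A sketch of the same pattern using null-terminated
names, which is safe for any filename (gzip stands in for lzip here only so
the example runs anywhere; substitute 'lzip -9' in real use):

```shell
# Scratch directory with a couple of files, purely for demonstration.
dir=$(mktemp -d)
printf 'hello\n' > "$dir/a.txt"
printf 'world\n' > "$dir/b c.txt"    # filename containing a space

# find -print0 / xargs -0 pass names null-terminated; -n1 gives each
# compressor process one file, and -P4 runs up to four in parallel.
find "$dir" -type f -print0 | xargs -0 -n1 -P4 gzip -9
```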
Hope this helps,
Antonio.