Re: [developer] Re: 2nd March 2022 OpenZFS Leadership Meeting

2022-04-11 Thread Rich
I do have branches with other zlibs.

I don't have shiny graphs for them handy on this computer, but zlib-ng
wasn't the first, just the one that someone specifically asked about on a
bug on Github so I had some graphs for it.

It was mostly an experiment because gzip producing different results on
different OSes annoyed me.

- Rich


On Mon, Apr 11, 2022 at 6:21 PM René J.V. Bertin 
wrote:

> On Sunday April 10 2022 18:52:07 Matthew Ahrens via openzfs-developer
> wrote:
> >gzip update
> >Kidding…
> >…but I really did experiment with it.
> >Tried benchmarking a few different zlib forks instead of Linux builtin,
> none were compellingly different so far
> >Quick and dirty graphs of zlib-ng, which was the most different, here
> >Seems like at best maybe the decompressor might be worth examining, and
> the SIMD improvements don’t seem to make much difference for our use cases.
>
> It's been a while since I compared standard zlib with zlib-ng but my
> findings were largely the same. IIRC zlib-ng doesn't aim for better
> performance but for more features. A faster-performing zlib is
> CloudFlare turf.
>
> R
>
> --
> openzfs: openzfs-developer
> Permalink:
> https://openzfs.topicbox.com/groups/developer/Tde9acb4c64be171e-Me510ffde47f6cd17558e8bbb
> Delivery options:
> https://openzfs.topicbox.com/groups/developer/subscription
>

--
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/Tde9acb4c64be171e-M0a6b71c18adc38d857340163
Delivery options: https://openzfs.topicbox.com/groups/developer/subscription


Re: [developer] Re: 2nd March 2022 OpenZFS Leadership Meeting

2022-04-11 Thread René J . V . Bertin
On Sunday April 10 2022 18:52:07 Matthew Ahrens via openzfs-developer wrote:
>gzip update
>Kidding…
>…but I really did experiment with it.
>Tried benchmarking a few different zlib forks instead of Linux builtin, none 
>were compellingly different so far
>Quick and dirty graphs of zlib-ng, which was the most different, here
>Seems like at best maybe the decompressor might be worth examining, and the 
>SIMD improvements don’t seem to make much difference for our use cases.

It's been a while since I compared standard zlib with zlib-ng but my findings 
were largely the same. IIRC zlib-ng doesn't aim for better performance but for 
more features. A faster-performing zlib is CloudFlare turf.

R

--
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/Tde9acb4c64be171e-Me510ffde47f6cd17558e8bbb
Delivery options: https://openzfs.topicbox.com/groups/developer/subscription


[developer] Re: 2nd March 2022 OpenZFS Leadership Meeting

2022-04-11 Thread Matthew Ahrens via openzfs-developer
Thanks to everyone who participated, and to Christian for these detailed
notes!

Meeting recording 

   - Update on compression (Rich)
      - I promise I’ll stop eventually
      - Design review requested for PR #13244 zstd “early abort”
         - Goal was more or less “make higher zstd levels more generally
           useful”. They currently aren’t in scenarios with mixed
           compressibility data due to high CPU cost on incompressible
           parts.
         - Approach: use cheap compression algorithms as heuristics to
           determine whether a large block (>= 128k) is compressible. Only
           use the high zstd level if heuristic matches. Currently,
           heuristics are LZ4 first, then zstd-1.
         - Benchmark results:
            - The Pi 4 compression chart on incompressible data is
              pretty…drastic, going from over 2 hours to 15 minutes to recv
              a 41 GB dataset at zstd-15. (It’s probably even more drastic
              for 18, but that might take over a day to get the baseline
              for…)
            - The Ryzen 5900X chart isn’t bad either, 4+ minutes to under 1
              at zstd-18.
            - Peak losses observed in compression savings were under 100 MB
              across a 41 GB fairly incompressible and 54 GB very
              compressible dataset, which seems more than acceptable to me -
              and there’s an easy tunable if you’d like to not use it.
         - Regarding correctness: Since it doesn’t change the compression
           parameters, it doesn’t produce different compressed results: a
           block ends up either with the same compressed result as before
           or uncompressed.
         - Could use reviewers, not that it’s a particularly invasive change
           (I have an even less invasive version, just want to benchmark
           that the additional overhead of it isn’t noticeable)
         - Could have a dataset parameter to control whether it’s tried or
           not, if desired
         - Probably also works with LZ4+gzip-1 on gzip, though I haven’t
           tried the experiment yet.
         - Aspects discussed during the meeting that need follow-up:
            - Add kstats to keep track of heuristic’s decisions. Will help
              characterize different datasets and determine when this is
              useful.
            - Need to determine the worst-case
              CPU-time-saved/compression-lost ratio due to this feature.
               - Synthetic benchmark where we always run lz4, zstd-1, and
                 zstd-4. (zstd-4 is the first level at which this mechanism
                 kicks in.)
               - A pathological dataset where lz4 and zstd-1 heuristic
                 match, but the data is not zstd-4 compressible.
            - Consider publishing the scripts that create the datasets, run
              the benchmarks, and post-process results.
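
To make the "early abort" control flow above concrete, here is a minimal Python sketch of one reading of "LZ4 first, then zstd-1". It is not the code from PR #13244. Everything in it is an assumption for illustration: zlib levels 1 and 3 stand in for LZ4 and zstd-1, zlib level 9 stands in for the high zstd level, the 12.5% savings threshold is assumed, and the `stats` counter is only a stand-in for the kstats proposed in the follow-ups.

```python
import os
import zlib
from collections import Counter

# All names and thresholds here are hypothetical, for illustration only.
MIN_BLOCK = 128 * 1024   # notes: heuristic applies to large blocks (>= 128k)
MIN_SAVINGS = 0.125      # assumed: a pass must save at least 1/8 of the block

stats = Counter()        # stand-in for kstats tracking the heuristic's decisions


def saves_enough(block: bytes, level: int) -> bool:
    """True if compressing at `level` shrinks the block by MIN_SAVINGS."""
    return len(zlib.compress(block, level)) <= len(block) * (1 - MIN_SAVINGS)


def compress_early_abort(block: bytes, cheap=1, second=3, expensive=9):
    """Return (is_compressed, payload).

    cheap/second/expensive are zlib levels standing in for
    LZ4 / zstd-1 / the requested high zstd level.
    """
    if len(block) >= MIN_BLOCK:
        if saves_enough(block, cheap):
            stats["passed_first"] += 1
        elif saves_enough(block, second):
            stats["passed_second"] += 1
        else:
            # Both cheap passes say "incompressible": skip the expensive one.
            stats["early_abort"] += 1
            return False, block
    else:
        stats["below_min_block"] += 1

    # The expensive pass runs with unchanged parameters, so the output is
    # either what it would have been without the heuristic, or uncompressed.
    out = zlib.compress(block, expensive)
    if len(out) > len(block) * (1 - MIN_SAVINGS):
        stats["expensive_not_worth_it"] += 1
        return False, block
    return True, out
```

Feeding it one highly repetitive 128 KiB block and one block from `os.urandom` exercises both paths: the first goes through the expensive pass unchanged, the second is returned uncompressed without ever running it, and `stats` records which branch each block took.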
      - (Below this is just negative results, feel free to skip if people
        aren’t interested in why updating LZ4/zstd/gzip doesn’t look like a
        win currently)
      - LZ4 update:
         - Branch works, lz4_version property lets you swap between 1.9.3
           and legacy at will per dataset
         - Want to do more stress testing, add more tests, teach ztest to
           randomly swap that property while doing its other games, but…
         - Not convinced the compressor’s a win, really, performance or
           savings wise
            - 5900X incompressible chart
            - 5900X compressible chart
         - No feature flag needed, older LZ4 happily decompresses it, even
           with a version number hidden past where the LZ4 size “header”
           thinks the stream ends to avoid breaking older readers
         - Send/recv prints an annoying message about not knowing the
           lz4_version property if you send -p it to someone who doesn’t
           know it, but it doesn’t error out
         - PR: If I find a compelling reason to use the new version
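
The "version number hidden past where the size header thinks the stream ends" idea can be illustrated with a small Python sketch. The notes do not describe how the branch actually encodes the version, so the one-byte trailer, the function names, and the use of zlib as a stand-in for the LZ4 stream are all hypothetical. The only layout assumed from ZFS itself is a 4-byte big-endian length prefix in front of the compressed payload.

```python
import struct
import zlib


def pack_block(data: bytes, version: int) -> bytes:
    """Length-prefixed payload followed by a hypothetical 1-byte version tag."""
    payload = zlib.compress(data)               # stand-in for the LZ4 stream
    header = struct.pack(">I", len(payload))    # size covers the payload only
    return header + payload + bytes([version])  # tag sits past the stated end


def old_reader(buf: bytes) -> bytes:
    """A reader that trusts the size header and never looks beyond it."""
    (size,) = struct.unpack_from(">I", buf, 0)
    return zlib.decompress(buf[4:4 + size])


def new_reader(buf: bytes):
    """A reader that also checks for the trailing version tag."""
    (size,) = struct.unpack_from(">I", buf, 0)
    trailer = buf[4 + size:]
    version = trailer[0] if trailer else None   # None: no tag was written
    return zlib.decompress(buf[4:4 + size]), version
```

Because the old reader stops at the length given in the header, the extra byte is invisible to it, which is the property that would make a feature flag unnecessary in this sketch.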
      - zstd update
         - Branch works, swapping compressor with zstd_version property
         - Plays nice with the above lz4_version property too
         - Benchmarking 1.5.0/1.5.1/1.5.2/1.5.0+(SSE|NEON)/1.5.2+(SSE|NEON)
           continues
         - Pre