Hi Ludovic,

Ludovic Courtès <l...@gnu.org> writes:

[...]

> To isolate the problem, you could allocate the 4 MiB buffer outside of
> the loop and use ‘get-bytevector-n!’, and also remove code that writes
> to ‘output’.

I've adjusted the benchmark like so:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 binary-ports)
             (ice-9 match)
             (rnrs bytevectors)
             (zstd))

(define MiB (expt 2 20))
(define block-size (* 4 MiB))
(define bv (make-bytevector block-size))
(define input-file "/tmp/chromium-98.0.4758.102.tar.zst")

(define (run)
  (call-with-input-file input-file
    (lambda (port)
      (call-with-zstd-input-port port
        (lambda (input)
          (while (not (eof-object?
                       (get-bytevector-n! input bv 0 block-size)))))))))

(run)
--8<---------------cut here---------------end--------------->8---

It now runs much faster:

--8<---------------cut here---------------start------------->8---
$ time+ zstd -cdk /tmp/chromium-98.0.4758.102.tar.zst > /dev/null
cpu: 98%, mem: 10560 KiB, wall: 0:09.56, sys: 0.37, usr: 9.06
--8<---------------cut here---------------end--------------->8---

--8<---------------cut here---------------start------------->8---
$ time+ guile ~/src/guile-zstd/benchmark.scm
cpu: 100%, mem: 25152 KiB, wall: 0:11.69, sys: 0.38, usr: 11.30
--8<---------------cut here---------------end--------------->8---

So guile-zstd was only about 20% slower than the 'zstd' command, which
is not too far behind.

For completeness, here's the same benchmark adjusted for guile-zlib:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 binary-ports)
             (ice-9 match)
             (rnrs bytevectors)
             (zlib))

(define MiB (expt 2 20))
(define block-size (* 4 MiB))
(define bv (make-bytevector block-size))
(define input-file "/tmp/chromium-98.0.4758.102.tar.gz")

(define (run)
  (call-with-input-file input-file
    (lambda (port)
      (call-with-gzip-input-port port
        (lambda (input)
          (while (not (eof-object?
                       (get-bytevector-n! input bv 0 block-size)))))))))

(run)
--8<---------------cut here---------------end--------------->8---

--8<---------------cut here---------------start------------->8---
$ time+ guile ~/src/guile-zstd/benchmark-zlib.scm
cpu: 86%, mem: 14552 KiB, wall: 0:23.50, sys: 1.09, usr: 19.15
--8<---------------cut here---------------end--------------->8---

--8<---------------cut here---------------start------------->8---
$ time+ gunzip -ck /tmp/chromium-98.0.4758.102.tar.gz > /dev/null
cpu: 98%, mem: 2304 KiB, wall: 0:35.99, sys: 0.60, usr: 34.99
--8<---------------cut here---------------end--------------->8---

Surprisingly, guile-zlib here appears to be faster than the 'gunzip'
command, and guile-zstd is about twice as fast as guile-zlib at
decompressing this roughly 4 GiB archive (compressed with zstd at
level 19).  So, it seems the foundation we're building on is sane
after all.

This suggests that decompression is not the bottleneck when generating
the man pages database, probably because it only needs to read the
first few bytes of each compressed man page to gather the information
it needs, and the rest of the processing (such as string-tokenize'ing
the lines read to parse the data) is more expensive by comparison.

To be continued...

Thanks!

Maxim
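
P.S. To make the "read only the first few bytes" idea concrete, here is
a minimal sketch reusing the same (zstd) API as the benchmark above.
The 'read-compressed-head' name is made up for illustration; it is not
part of guile-zstd:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 binary-ports)
             (zstd))

;; Hypothetical helper: decompress only the first N bytes of FILE,
;; enough to reach e.g. a man page's .TH header line.  Everything past
;; those N bytes is never decompressed.
(define (read-compressed-head file n)
  (call-with-input-file file
    (lambda (port)
      (call-with-zstd-input-port port
        (lambda (input)
          (get-bytevector-n input n))))))

;; e.g. (utf8->string (read-compressed-head "ls.1.zst" 80))
--8<---------------cut here---------------end--------------->8---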