Re: [PATCH] Add a gzip fastpath for the xmalloc readers, v2

2014-11-30 Thread Denys Vlasenko
On Fri, Nov 28, 2014 at 10:44 AM, Lauri Kasanen cur...@operamail.com wrote:
> v2: Add missing check on open
>
> The performance and number of processes for a depmod -a with gzipped
> modules was abysmal. This patch adds a fast path without fork for well-
> behaved gzip files, benefiting all users of xmalloc_open_zipped_read_close.
>
> modinfo radeon.ko.gz, a single-file reader, got 30% faster.
>
> depmod -a, which used to fork over 800 times, got 20% faster. And of course
> a whole lot less processes - much saved RAM.
>
> function                             old     new   delta
> inflate_get_next_window                -    1877   +1877
> xmalloc_unpack_gz                      -     356    +356
> check_header_gzip                      -     298    +298
> xmalloc_inflate_unzip_internal         -     223    +223
> xmalloc_open_zipped_read_close        73     176    +103
> inflate_init                           -      97     +97
> inflate_store_unused                   -      35     +35
> unpack_gz_stream                     567     299    -268
> inflate_unzip_internal              2304     172   -2132
> --------------------------------------------------------
> (add/remove: 6/0 grow/shrink: 1/2 up/down: 2989/-2400) Total: 589 bytes


This feels somewhat big.

Looking at the code, you have significant code duplication.

I think the following approach can work:

* Extend transformer_aux_data_t so that it can specify
  "I want decompressed data to go into a mem buffer".
  Say, size_t aux->mem_output_size. If > 0, it's the maximum
  number of bytes you allow to be decompressed into the buffer
  (0 means decompress into dst_fd).
  The result is char *aux->mem_output_buf.

* Pass aux pointer into inflate_unzip_internal().

* There, modify the decompression loop so that it stores
  the result in aux->mem_output_buf if aux->mem_output_size > 0
  (xrealloc'ing it so that it grows with decompression).

* In xmalloc_open_zipped_read, you only need to construct
  a suitable aux, call unpack_gz_stream(), and get
  aux->mem_output_buf as your result buffer.

This way, you don't need xmalloc_unpack_gz()
and xmalloc_inflate_unzip_internal().
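
To make the buffer-growing step in the list above concrete, here is a minimal sketch. It is not BusyBox code: aux_sketch_t, its mem_output_used field and sink_window() are hypothetical names invented for illustration, and plain realloc()/write() stand in for whatever helpers the real decompression loop would use. Only the mem_output_size and mem_output_buf fields come from the proposal.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the fields proposed for transformer_aux_data_t. */
typedef struct {
	size_t mem_output_size; /* 0: write to dst_fd; > 0: max bytes kept in memory */
	char *mem_output_buf;   /* result buffer, grown as data arrives */
	size_t mem_output_used; /* bytes stored so far (assumed bookkeeping field) */
} aux_sketch_t;

/* Consume one decompressed window.
 * Returns the number of bytes consumed, or -1 on error. */
static ssize_t sink_window(aux_sketch_t *aux, int dst_fd, const void *win, size_t len)
{
	size_t room;
	char *p;

	assert(aux != NULL);

	/* Size 0 keeps the existing behaviour: data goes to the descriptor. */
	if (aux->mem_output_size == 0)
		return write(dst_fd, win, len);

	/* Refuse to grow past the caller's limit (or to overflow the +1 below). */
	room = aux->mem_output_size - aux->mem_output_used;
	if (len > room || aux->mem_output_used + len == SIZE_MAX)
		return -1;

	/* Grow the buffer; the extra byte holds a trailing NUL. */
	p = realloc(aux->mem_output_buf, aux->mem_output_used + len + 1);
	if (p == NULL)
		return -1;

	memcpy(p + aux->mem_output_used, win, len);
	aux->mem_output_used += len;
	p[aux->mem_output_used] = '\0';
	aux->mem_output_buf = p;
	return (ssize_t)len;
}
```

A caller in the spirit of xmalloc_open_zipped_read would zero-initialize the struct, set mem_output_size to its limit, run the decompressor, and take mem_output_buf as the result. Whether the real buffer should be NUL-terminated, and how errors are reported, are assumptions of this sketch.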

The special-casing of GZ in xmalloc_open_zipped_read looks ugly,
but demanding a generic mechanism for in-memory unpacking
is too much for one patch...
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


[PATCH] Add a gzip fastpath for the xmalloc readers, v2

2014-11-28 Thread Lauri Kasanen
v2: Add missing check on open

The performance and number of processes for a depmod -a with gzipped
modules was abysmal. This patch adds a fast path without fork for well-
behaved gzip files, benefiting all users of xmalloc_open_zipped_read_close.

modinfo radeon.ko.gz, a single-file reader, got 30% faster.
depmod -a, which used to fork over 800 times, got 20% faster. And of course
a whole lot less processes - much saved RAM.

function                             old     new   delta
inflate_get_next_window                -    1877   +1877
xmalloc_unpack_gz                      -     356    +356
check_header_gzip                      -     298    +298
xmalloc_inflate_unzip_internal         -     223    +223
xmalloc_open_zipped_read_close        73     176    +103
inflate_init                           -      97     +97
inflate_store_unused                   -      35     +35
unpack_gz_stream                     567     299    -268
inflate_unzip_internal              2304     172   -2132
--------------------------------------------------------
(add/remove: 6/0 grow/shrink: 1/2 up/down: 2989/-2400) Total: 589 bytes

From e5a58da5d54b8a8e2eef657ac6be60231683a662 Mon Sep 17 00:00:00 2001
From: Lauri Kasanen cur...@operamail.com
Date: Thu, 27 Nov 2014 14:48:17 +0200
Subject: [PATCH] Add a gzip fastpath for the xmalloc readers

v2: Add missing check on open

The performance and number of processes for a depmod -a with gzipped
modules was abysmal. This patch adds a fast path without fork for well-
behaved gzip files, benefiting all users of xmalloc_open_zipped_read_close.

modinfo radeon.ko.gz, a single-file reader, got 30% faster.
depmod -a, which used to fork over 800 times, got 20% faster. And of course
a whole lot less processes - much saved RAM.

function                             old     new   delta
inflate_get_next_window                -    1877   +1877
xmalloc_unpack_gz                      -     356    +356
check_header_gzip                      -     298    +298
xmalloc_inflate_unzip_internal         -     223    +223
xmalloc_open_zipped_read_close        73     176    +103
inflate_init                           -      97     +97
inflate_store_unused                   -      35     +35
unpack_gz_stream                     567     299    -268
inflate_unzip_internal              2304     172   -2132
--------------------------------------------------------
(add/remove: 6/0 grow/shrink: 1/2 up/down: 2989/-2400) Total: 589 bytes

Signed-off-by: Lauri Kasanen cur...@operamail.com
---
 archival/libarchive/decompress_gunzip.c | 155 
 archival/libarchive/open_transformer.c  |  20 +
 include/bb_archive.h                    |   1 +
 3 files changed, 160 insertions(+), 16 deletions(-)

diff --git a/archival/libarchive/decompress_gunzip.c b/archival/libarchive/decompress_gunzip.c
index 7c6f38e..1f68ebd 100644
--- a/archival/libarchive/decompress_gunzip.c
+++ b/archival/libarchive/decompress_gunzip.c
@@ -968,19 +968,12 @@ static int inflate_get_next_window(STATE_PARAM_ONLY)
/* Doesnt get here */
 }
 
-
-/* Called from unpack_gz_stream() and inflate_unzip() */
-static IF_DESKTOP(long long) int
-inflate_unzip_internal(STATE_PARAM int in, int out)
+static void inflate_init(STATE_PARAM_ONLY)
 {
-   IF_DESKTOP(long long) int n = 0;
-   ssize_t nwrote;
-
/* Allocate all global buffers (for DYN_ALLOC option) */
gunzip_window = xmalloc(GUNZIP_WSIZE);
gunzip_outbuf_count = 0;
gunzip_bytes_out = 0;
-   gunzip_src_fd = in;
 
/* (re) initialize state */
method = -1;
@@ -994,6 +987,31 @@ inflate_unzip_internal(STATE_PARAM int in, int out)
gunzip_crc = ~0;
 
error_msg = "corrupted data";
+}
+
+static void inflate_store_unused(STATE_PARAM_ONLY)
+{
+   /* Store unused bytes in a global buffer so calling applets can access it */
+   if (gunzip_bk >= 8) {
+       /* Undo too much lookahead. The next read will be byte aligned
+        * so we can discard unused bits in the last meaningful byte. */
+       bytebuffer_offset--;
+       bytebuffer[bytebuffer_offset] = gunzip_bb & 0xff;
+       gunzip_bb >>= 8;
+       gunzip_bk -= 8;
+   }
+}
+
+/* Called from unpack_gz_stream() and inflate_unzip() */
+static IF_DESKTOP(long long) int
+inflate_unzip_internal(STATE_PARAM int in, int out)
+{
+   IF_DESKTOP(long long) int n = 0;
+   ssize_t nwrote;
+
+   gunzip_src_fd = in;
+   inflate_init(PASS_STATE_ONLY