Re: [PATCH] Add a gzip fastpath for the xmalloc readers, v3

2014-12-07 Thread Lauri Kasanen
On Sun, Dec 7, 2014, at 01:51, Denys Vlasenko wrote:
  It misses the plus one zero byte behavior of the non-compressed path
  (xmalloc_read). I bet some callers depend on it, using str* functions on
  the returned buffer.
 
 Fixed, and committed to git. Please try it.

Results are correct and speed is equal to my patch. All good, thanks.

- Lauri

-- 
http://www.fastmail.com - Send your email first class

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: [PATCH] Add a gzip fastpath for the xmalloc readers, v3

2014-12-06 Thread Denys Vlasenko
Please see attached patch (only compile-tested so far).
It builds on your work, implementing in-memory compression
for all unpackers we have.

Unfortunately, bloatcheck is not inspiring:

function old new   delta
transformer_write  - 133+133
setup_transformer_on_fd- 133+133
complete_cmd_dir_file773 895+122
open_transformer_state -  92 +92
xmalloc_open_zipped_read_close63 112 +49
xtransformer_write -  24 +24
unpack_lzma_stream  27172737 +20
unpack_xz_stream23932407 +14
unpack_Z_stream 11731183 +10
unpack_gz_stream 693 696  +3
decode_one_format726 729  +3
refresh  665 667  +2
inflate_unzip111 113  +2
unpack_bz2_stream359 360  +1
swap_on_off_main 420 418  -2
scan_recursive   380 378  -2
inflate_unzip_internal  19451924 -21
open_zipped   89  47 -42
setup_unzip_on_fd142  53 -89
--
(add/remove: 4/0 grow/shrink: 10/5 up/down: 608/-156) Total: 452 bytes
   textdata bss dec hex filename
 923218 928   17684  941830   e5f06 busybox_old
 923686 928   17684  942298   e60da busybox_unstripped
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: [PATCH] Add a gzip fastpath for the xmalloc readers, v3

2014-12-06 Thread Denys Vlasenko
On Sat, Dec 6, 2014 at 5:11 PM, Denys Vlasenko vda.li...@googlemail.com wrote:
 Please see attached patch (only compile-tested so far).
 It builds on your work, implementing in-memory compression
 for all unpackers we have.


1.patch
Description: Binary data
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox

Re: [PATCH] Add a gzip fastpath for the xmalloc readers, v3

2014-12-06 Thread Lauri Kasanen
On Sat, Dec 6, 2014, at 18:11, Denys Vlasenko wrote:
 Please see attached patch (only compile-tested so far).
 It builds on your work, implementing in-memory compression
 for all unpackers we have.

It misses the plus one zero byte behavior of the non-compressed path
(xmalloc_read). I bet some callers depend on it, using str* functions on
the returned buffer.

- Lauri

-- 
http://www.fastmail.com - Same, same, but different...

___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: [PATCH] Add a gzip fastpath for the xmalloc readers, v2

2014-11-30 Thread Denys Vlasenko
On Fri, Nov 28, 2014 at 10:44 AM, Lauri Kasanen cur...@operamail.com wrote:
 v2: Add missing check on open

 The performance and number of processes for a depmod -a with gzipped
 modules was abysmal. This patch adds a fast path without fork for well-
 behaved gzip files, benefiting all users of xmalloc_open_zipped_read_close.

 modinfo radeon.ko.gz, a single-file reader, got 30% faster.

 depmod -a, which used to fork over 800 times, got 20% faster. And of course
 a whole lot less processes - much saved RAM.

 function old new   delta
 inflate_get_next_window-1877   +1877
 xmalloc_unpack_gz  - 356+356
 check_header_gzip  - 298+298
 xmalloc_inflate_unzip_internal - 223+223
 xmalloc_open_zipped_read_close73 176+103
 inflate_init   -  97 +97
 inflate_store_unused   -  35 +35
 unpack_gz_stream 567 299-268
 inflate_unzip_internal  2304 172   -2132
 --
 (add/remove: 6/0 grow/shrink: 1/2 up/down: 2989/-2400)Total: 589 bytes


This feels somewhat big.

Looking at the code, you have significant code duplication.

I think a following approach can work:

* Extend transformer_aux_data_t so that it can specify
  I want decompressed data to go into a mem buffer.
  Say, size_t aux-mem_output_size. If 0, it's the maximum
  amount of bytes you allow to decompress into it
  (0 means decompress into dst_fd).
  The result is char *aux-mem_output_buf.

* Pass aux pointer into inflate_unzip_internal().

* There, modify a decompression loop so that it stores
  result in aux-mem_output_buf if aux-mem_output_size 0
  (xreallocing it so that it grows with decompression)

* In xmalloc_open_zipped_read, you only need to construct
  a suitable aux, call unpack_gz_stream(), and get
  aux-mem_output_buf as your result buffer.

This way, you don't need xmalloc_unpack_gz()
and xmalloc_inflate_unzip_internal().

The special-casing of GZ in xmalloc_open_zipped_read looks ugly,
but demanding a generic mechanism for in-memory unpacking
is too much for one patch...
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


[PATCH] Add a gzip fastpath for the xmalloc readers, v3

2014-11-30 Thread Lauri Kasanen
v3: Use the aux struct as requested by Denys
v2: Add missing check on open

The performance and number of processes for a depmod -a with gzipped
modules was abysmal. This patch adds a fast path without fork for well-
behaved gzip files, benefiting all users of xmalloc_open_zipped_read_close.

modinfo radeon.ko.gz, a single-file reader, got 30% faster.
depmod -a, which used to fork over 800 times, got 20% faster. And of course
a whole lot less processes - much saved RAM.

function old new   delta
inflate_unzip_internal  23042521+217
xmalloc_open_zipped_read_close73 201+128
unpack_gz_stream 567 570  +3
--
(add/remove: 0/0 grow/shrink: 3/0 up/down: 348/0) Total: 348 bytes

-- 
http://www.fastmail.com - Does exactly what it says on the tin

From 5ad6804ed4485eae176da45524ea848a00b11929 Mon Sep 17 00:00:00 2001
From: Lauri Kasanen cur...@operamail.com
Date: Sun, 30 Nov 2014 21:37:10 +0200
Subject: [PATCH] Add a gzip fastpath for the xmalloc readers, v3

v3: Use the aux struct as requested by Denys
v2: Add missing check on open

The performance and number of processes for a depmod -a with gzipped
modules was abysmal. This patch adds a fast path without fork for well-
behaved gzip files, benefiting all users of xmalloc_open_zipped_read_close.

modinfo radeon.ko.gz, a single-file reader, got 30% faster.
depmod -a, which used to fork over 800 times, got 20% faster. And of course
a whole lot less processes - much saved RAM.

function old new   delta
inflate_unzip_internal  23042521+217
xmalloc_open_zipped_read_close73 201+128
unpack_gz_stream 567 570  +3
--
(add/remove: 0/0 grow/shrink: 3/0 up/down: 348/0) Total: 348 bytes

Signed-off-by: Lauri Kasanen cur...@operamail.com
---
 archival/libarchive/decompress_gunzip.c | 38 -
 archival/libarchive/open_transformer.c  | 31 ++-
 include/bb_archive.h|  2 ++
 3 files changed, 65 insertions(+), 6 deletions(-)

diff --git a/archival/libarchive/decompress_gunzip.c 
b/archival/libarchive/decompress_gunzip.c
index 7c6f38e..938d21f 100644
--- a/archival/libarchive/decompress_gunzip.c
+++ b/archival/libarchive/decompress_gunzip.c
@@ -971,10 +971,11 @@ static int inflate_get_next_window(STATE_PARAM_ONLY)
 
 /* Called from unpack_gz_stream() and inflate_unzip() */
 static IF_DESKTOP(long long) int
-inflate_unzip_internal(STATE_PARAM int in, int out)
+inflate_unzip_internal(STATE_PARAM transformer_aux_data_t *aux, int in, int 
out)
 {
IF_DESKTOP(long long) int n = 0;
ssize_t nwrote;
+   size_t bufsize = 0;
 
/* Allocate all global buffers (for DYN_ALLOC option) */
gunzip_window = xmalloc(GUNZIP_WSIZE);
@@ -1002,16 +1003,43 @@ inflate_unzip_internal(STATE_PARAM int in, int out)
 
while (1) {
int r = inflate_get_next_window(PASS_STATE_ONLY);
-   nwrote = full_write(out, gunzip_window, gunzip_outbuf_count);
+
+   if (aux-mem_output_size) {
+   nwrote = gunzip_outbuf_count;
+   if (gunzip_outbuf_count + n  bufsize) {
+   // increase by four blocks each time
+   const size_t newsize = bufsize + 4 * 
gunzip_outbuf_count + 1;
+   aux-mem_output_buf = 
xrealloc(aux-mem_output_buf, newsize);
+   bufsize = newsize;
+   }
+
+   if (bufsize  aux-mem_output_size) {
+   free(aux-mem_output_buf);
+   aux-mem_output_buf = NULL;
+   n = -1;
+   goto ret;
+   }
+
+   memcpy(aux-mem_output_buf + n, gunzip_window, 
gunzip_outbuf_count);
+   } else {
+   nwrote = full_write(out, gunzip_window, 
gunzip_outbuf_count);
+   }
if (nwrote != (ssize_t)gunzip_outbuf_count) {
bb_perror_msg(write);
n = -1;
goto ret;
}
-   IF_DESKTOP(n += nwrote;)
+   n += nwrote;
if (r == 0) break;
}
 
+   /* Final realloc, plus zero byte */
+   if (aux-mem_output_size) {
+   aux-mem_output_buf = xrealloc(aux-mem_output_buf, n + 1);
+   aux-mem_output_size = n;
+   aux-mem_output_buf[n] = '\0

[PATCH] Add a gzip fastpath for the xmalloc readers, v2

2014-11-28 Thread Lauri Kasanen
v2: Add missing check on open

The performance and number of processes for a depmod -a with gzipped
modules was abysmal. This patch adds a fast path without fork for well-
behaved gzip files, benefiting all users of xmalloc_open_zipped_read_close.

modinfo radeon.ko.gz, a single-file reader, got 30% faster.
depmod -a, which used to fork over 800 times, got 20% faster. And of course
a whole lot less processes - much saved RAM.

function old new   delta
inflate_get_next_window-1877   +1877
xmalloc_unpack_gz  - 356+356
check_header_gzip  - 298+298
xmalloc_inflate_unzip_internal - 223+223
xmalloc_open_zipped_read_close73 176+103
inflate_init   -  97 +97
inflate_store_unused   -  35 +35
unpack_gz_stream 567 299-268
inflate_unzip_internal  2304 172   -2132
--
(add/remove: 6/0 grow/shrink: 1/2 up/down: 2989/-2400)Total: 589 bytes

-- 
http://www.fastmail.com - The way an email service should be

From e5a58da5d54b8a8e2eef657ac6be60231683a662 Mon Sep 17 00:00:00 2001
From: Lauri Kasanen cur...@operamail.com
Date: Thu, 27 Nov 2014 14:48:17 +0200
Subject: [PATCH] Add a gzip fastpath for the xmalloc readers

v2: Add missing check on open

The performance and number of processes for a depmod -a with gzipped
modules was abysmal. This patch adds a fast path without fork for well-
behaved gzip files, benefiting all users of xmalloc_open_zipped_read_close.

modinfo radeon.ko.gz, a single-file reader, got 30% faster.
depmod -a, which used to fork over 800 times, got 20% faster. And of course
a whole lot less processes - much saved RAM.

function old new   delta
inflate_get_next_window-1877   +1877
xmalloc_unpack_gz  - 356+356
check_header_gzip  - 298+298
xmalloc_inflate_unzip_internal - 223+223
xmalloc_open_zipped_read_close73 176+103
inflate_init   -  97 +97
inflate_store_unused   -  35 +35
unpack_gz_stream 567 299-268
inflate_unzip_internal  2304 172   -2132
--
(add/remove: 6/0 grow/shrink: 1/2 up/down: 2989/-2400)Total: 589 bytes

Signed-off-by: Lauri Kasanen cur...@operamail.com
---
 archival/libarchive/decompress_gunzip.c | 155 
 archival/libarchive/open_transformer.c  |  20 +
 include/bb_archive.h|   1 +
 3 files changed, 160 insertions(+), 16 deletions(-)

diff --git a/archival/libarchive/decompress_gunzip.c 
b/archival/libarchive/decompress_gunzip.c
index 7c6f38e..1f68ebd 100644
--- a/archival/libarchive/decompress_gunzip.c
+++ b/archival/libarchive/decompress_gunzip.c
@@ -968,19 +968,12 @@ static int inflate_get_next_window(STATE_PARAM_ONLY)
/* Doesnt get here */
 }
 
-
-/* Called from unpack_gz_stream() and inflate_unzip() */
-static IF_DESKTOP(long long) int
-inflate_unzip_internal(STATE_PARAM int in, int out)
+static void inflate_init(STATE_PARAM_ONLY)
 {
-   IF_DESKTOP(long long) int n = 0;
-   ssize_t nwrote;
-
/* Allocate all global buffers (for DYN_ALLOC option) */
gunzip_window = xmalloc(GUNZIP_WSIZE);
gunzip_outbuf_count = 0;
gunzip_bytes_out = 0;
-   gunzip_src_fd = in;
 
/* (re) initialize state */
method = -1;
@@ -994,6 +987,31 @@ inflate_unzip_internal(STATE_PARAM int in, int out)
gunzip_crc = ~0;
 
error_msg = corrupted data;
+}
+
+static void inflate_store_unused(STATE_PARAM_ONLY)
+{
+   /* Store unused bytes in a global buffer so calling applets can access 
it */
+   if (gunzip_bk = 8) {
+   /* Undo too much lookahead. The next read will be byte aligned
+* so we can discard unused bits in the last meaningful byte. */
+   bytebuffer_offset--;
+   bytebuffer[bytebuffer_offset] = gunzip_bb  0xff;
+   gunzip_bb = 8;
+   gunzip_bk -= 8;
+   }
+}
+
+/* Called from unpack_gz_stream() and inflate_unzip() */
+static IF_DESKTOP(long long) int
+inflate_unzip_internal(STATE_PARAM int in, int out)
+{
+   IF_DESKTOP(long long) int n = 0;
+   ssize_t nwrote;
+
+   gunzip_src_fd = in;
+   inflate_init(PASS_STATE_ONLY

[PATCH] Add a gzip fastpath for the xmalloc readers

2014-11-27 Thread Lauri Kasanen
Hi,

The performance and number of processes for a depmod -a with gzipped
modules was abysmal. This patch adds a fast path without fork for well-
behaved gzip files, benefiting all users of
xmalloc_open_zipped_read_close.

modinfo radeon.ko.gz, a single-file reader, got 30% faster.
depmod -a, which used to fork over 800 times, got 20% faster. And of
course
a whole lot less processes - much saved RAM.

function old new   delta
inflate_get_next_window-1877   +1877
xmalloc_unpack_gz  - 356+356
check_header_gzip  - 298+298
xmalloc_inflate_unzip_internal - 223+223
inflate_init   -  97 +97
xmalloc_open_zipped_read_close73 159 +86
inflate_store_unused   -  35 +35
unpack_gz_stream 567 299-268
inflate_unzip_internal  2304 172   -2132
--
(add/remove: 6/0 grow/shrink: 1/2 up/down: 2972/-2400)Total: 572
bytes

--

It's currently guarded by CONFIG_DESKTOP. If you'd like a new config
option instead, please say so.

- Lauri

-- 
http://www.fastmail.com - Choose from over 50 domains or use your own

From 4eaee06a748ce07b004b40aa10fa1bcc3e5e8928 Mon Sep 17 00:00:00 2001
From: Lauri Kasanen cur...@operamail.com
Date: Thu, 27 Nov 2014 14:48:17 +0200
Subject: [PATCH] Add a gzip fastpath for the xmalloc readers

The performance and number of processes for a depmod -a with gzipped
modules was abysmal. This patch adds a fast path without fork for well-
behaved gzip files, benefiting all users of xmalloc_open_zipped_read_close.

modinfo radeon.ko.gz, a single-file reader, got 30% faster.
depmod -a, which used to fork over 800 times, got 20% faster. And of course
a whole lot less processes - much saved RAM.

function old new   delta
inflate_get_next_window-1877   +1877
xmalloc_unpack_gz  - 356+356
check_header_gzip  - 298+298
xmalloc_inflate_unzip_internal - 223+223
inflate_init   -  97 +97
xmalloc_open_zipped_read_close73 159 +86
inflate_store_unused   -  35 +35
unpack_gz_stream 567 299-268
inflate_unzip_internal  2304 172   -2132
--
(add/remove: 6/0 grow/shrink: 1/2 up/down: 2972/-2400)Total: 572 bytes

Signed-off-by: Lauri Kasanen cur...@operamail.com
---
 archival/libarchive/decompress_gunzip.c | 155 
 archival/libarchive/open_transformer.c  |  16 
 include/bb_archive.h|   1 +
 3 files changed, 156 insertions(+), 16 deletions(-)

diff --git a/archival/libarchive/decompress_gunzip.c 
b/archival/libarchive/decompress_gunzip.c
index 7c6f38e..1f68ebd 100644
--- a/archival/libarchive/decompress_gunzip.c
+++ b/archival/libarchive/decompress_gunzip.c
@@ -968,19 +968,12 @@ static int inflate_get_next_window(STATE_PARAM_ONLY)
/* Doesnt get here */
 }
 
-
-/* Called from unpack_gz_stream() and inflate_unzip() */
-static IF_DESKTOP(long long) int
-inflate_unzip_internal(STATE_PARAM int in, int out)
+static void inflate_init(STATE_PARAM_ONLY)
 {
-   IF_DESKTOP(long long) int n = 0;
-   ssize_t nwrote;
-
/* Allocate all global buffers (for DYN_ALLOC option) */
gunzip_window = xmalloc(GUNZIP_WSIZE);
gunzip_outbuf_count = 0;
gunzip_bytes_out = 0;
-   gunzip_src_fd = in;
 
/* (re) initialize state */
method = -1;
@@ -994,6 +987,31 @@ inflate_unzip_internal(STATE_PARAM int in, int out)
gunzip_crc = ~0;
 
error_msg = corrupted data;
+}
+
+static void inflate_store_unused(STATE_PARAM_ONLY)
+{
+   /* Store unused bytes in a global buffer so calling applets can access 
it */
+   if (gunzip_bk = 8) {
+   /* Undo too much lookahead. The next read will be byte aligned
+* so we can discard unused bits in the last meaningful byte. */
+   bytebuffer_offset--;
+   bytebuffer[bytebuffer_offset] = gunzip_bb  0xff;
+   gunzip_bb = 8;
+   gunzip_bk -= 8;
+   }
+}
+
+/* Called from unpack_gz_stream() and inflate_unzip() */
+static IF_DESKTOP(long long) int
+inflate_unzip_internal(STATE_PARAM int in, int out)
+{
+   IF_DESKTOP(long long) int n = 0;
+   ssize_t nwrote

Re: [PATCH] Add a gzip fastpath for the xmalloc readers

2014-11-27 Thread walter harms
--- a/archival/libarchive/open_transformer.c
+++ b/archival/libarchive/open_transformer.c
@@ -213,6 +213,22 @@ void* FAST_FUNC xmalloc_open_zipped_read_close(const char 
*fname, size_t *maxsz_
int fd;
char *image;

+   /* Fast path for well-behaved gzip files, avoiding forks. */
+   if (ENABLE_FEATURE_SEAMLESS_GZ  ENABLE_DESKTOP  BB_MMU) {
+   uint16_t magic;
+   fd = open(fname, O_RDONLY);
+   xread(fd, magic, 2);
+

are you sure that open will succeed at this place ?

re.
 wh


Am 27.11.2014 15:27, schrieb Lauri Kasanen:
 Hi,
 
 The performance and number of processes for a depmod -a with gzipped
 modules was abysmal. This patch adds a fast path without fork for well-
 behaved gzip files, benefiting all users of
 xmalloc_open_zipped_read_close.
 
 modinfo radeon.ko.gz, a single-file reader, got 30% faster.
 depmod -a, which used to fork over 800 times, got 20% faster. And of
 course
 a whole lot less processes - much saved RAM.
 
 function old new   delta
 inflate_get_next_window-1877   +1877
 xmalloc_unpack_gz  - 356+356
 check_header_gzip  - 298+298
 xmalloc_inflate_unzip_internal - 223+223
 inflate_init   -  97 +97
 xmalloc_open_zipped_read_close73 159 +86
 inflate_store_unused   -  35 +35
 unpack_gz_stream 567 299-268
 inflate_unzip_internal  2304 172   -2132
 --
 (add/remove: 6/0 grow/shrink: 1/2 up/down: 2972/-2400)Total: 572
 bytes
 
 --
 
 It's currently guarded by CONFIG_DESKTOP. If you'd like a new config
 option instead, please say so.
 
 - Lauri
 
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox