   Hi!

On Mon, Nov 28, 2011 at 05:03:03PM +0200, Lasse Collin wrote:
> On 2011-11-28 Stefan Westerfeld wrote:
> > Now the problem is that for those files I cannot predict the size.
> > Often they will be quite small, but they also could be 100 MB in size
> > or more. So I use xz -9 to get the best compression.
> >
> > The problem is now that xz takes a lot of time to start:
> >
> > stefan@ubuntu:/tmp$ time echo "foo" | xz -9 >/dev/null
> >
> > real 0m0.155s
> > user 0m0.052s
> > sys 0m0.096s
>
> The match finder hash table has to be initialized. It cannot be avoided.
> The bigger the dictionary, the bigger the hash table. It's about 64 MiB
> when using 64 MiB dictionary (xz -9). With 8 MiB dictionary (xz -6)
> it's about 16 MiB. So at a lower setting the initialization is faster.
>
> xz allocates much more memory for other things. Most of that memory
> isn't initialized beforehand. Uninitialized memory doesn't cause a
> significant speed penalty because many kernels don't physically allocate
> large allocations before the memory will actually be used.
Just a thought: could performance be improved if xz requested the memory
via mmap(), like

  char *buffer = (char *) mmap (NULL, 64 * 1024 * 1024, PROT_READ|PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

I wrote a little test program which seems to indicate that mmap() is much
faster for getting zero-initialized memory than malloc() + memset(). But
that's for the case where the application does not access the memory. For
xz the question is how much of the memory will be accessed, and how much
not having to zero-initialize the memory will save.

   Cu... Stefan

------ (call with malloc or mmap as argument) -----
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <assert.h>
#include <stdio.h>

int
main (int argc, char **argv)
{
  const size_t size = 64 * 1024 * 1024;

  assert (argc == 2);
  if (strcmp (argv[1], "malloc") == 0)
    {
      /* malloc() + memset(): every page of the buffer is written here */
      void *buffer = malloc (size);
      assert (buffer != NULL);
      memset (buffer, 0, size);
    }
  else
    {
      /* anonymous mmap(): the mapping reads as zeros; typically no page
         is physically allocated until it is first accessed */
      char *buffer = (char *) mmap (NULL, size, PROT_READ|PROT_WRITE,
                                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      assert (buffer != MAP_FAILED);
    }
  return 0;
}
-- 
Stefan Westerfeld, Hamburg/Germany, http://space.twc.de/~stefan
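
The test program above never touches the mmap()ed buffer, so it cannot
answer the open question of how much is saved once part of the memory is
actually used. A possible extension is sketched below. It is only a sketch
under stated assumptions, not a measurement of xz itself: the helper names
get_zeroed() and time_touch() are made up for illustration, the 4096 byte
page size is an assumption, and clock_gettime() with CLOCK_MONOTONIC is
assumed to be available.

```c
#define _DEFAULT_SOURCE            /* MAP_ANONYMOUS, clock_gettime on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#define BUF_SIZE  ((size_t) 64 * 1024 * 1024)
#define PAGE_STEP ((size_t) 4096)  /* assumed page size, not queried */

/* Hypothetical helper: return BUF_SIZE zeroed bytes, either from
   malloc() + memset() or from an anonymous mmap(). NULL on failure. */
static char *
get_zeroed (int use_mmap)
{
  if (use_mmap)
    {
      void *p = mmap (NULL, BUF_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      return p == MAP_FAILED ? NULL : (char *) p;
    }
  char *p = malloc (BUF_SIZE);
  if (p != NULL)
    memset (p, 0, BUF_SIZE);
  return p;
}

/* Hypothetical helper: allocate, write one byte into every page of the
   first touch_bytes bytes, release the buffer, and return the elapsed
   wall-clock time in seconds (negative if the allocation failed). */
static double
time_touch (int use_mmap, size_t touch_bytes)
{
  struct timespec t0, t1;

  assert (touch_bytes <= BUF_SIZE);
  clock_gettime (CLOCK_MONOTONIC, &t0);

  char *buf = get_zeroed (use_mmap);
  if (buf == NULL)
    return -1.0;

  /* volatile keeps the compiler from dropping the page-touching stores */
  volatile char *v = buf;
  for (size_t i = 0; i < touch_bytes; i += PAGE_STEP)
    v[i] = 1;

  clock_gettime (CLOCK_MONOTONIC, &t1);

  if (use_mmap)
    munmap (buf, BUF_SIZE);
  else
    free (buf);

  return (double) (t1.tv_sec - t0.tv_sec)
         + (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_touch (0, n) and time_touch (1, n) for a few values of n
between 0 and BUF_SIZE from a small main() and printing the results would
show how quickly the mmap() advantage shrinks as more of the buffer is
used. It is probably best to compile without optimization, since some
compilers may turn malloc() + memset() into a single calloc() call, which
would change what is being measured.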