On 2011-11-28 Stefan Westerfeld wrote:
> Just a thought: could performance be improved if xz requested the
> memory via mmap(), like
> 
>   char *buffer = (char *) mmap (NULL, 64 * 1024 * 1024,
> PROT_READ|PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 
> I wrote a little test program which seems to indicate that mmap() is
> much faster for getting zero initialized memory than malloc() +
> memset(). But that's for the case where the application does not
> access the memory. For xz the question is how much of the memory will
> be accessed, and how much not having to zero-initialize the memory
> will save.

With tiny input, the memory won't be accessed much. With the BT4 match
finder, there is one read and one write per uncompressed input byte,
and each read and write is a 32-bit integer. Since it's a hash table,
the accesses are effectively random. BT4 actually has three hash
tables, which are allocated at the same time, but the other two are
small.

If you do a few thousand random 32-bit reads and writes, the mmap
method can still be faster, but the difference is not as huge as your
test makes it look.
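
To make that access pattern concrete, here is a rough sketch of the two
allocation methods plus a loop that does one 32-bit read and one 32-bit
write per simulated input byte at a pseudo-random slot. This is not
code from xz: the names table_malloc, table_mmap and touch_random, the
xorshift generator and the 64 MiB size (taken from your example) are
all just assumptions for illustration.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical table size, matching the 64 MiB in the quoted example. */
#define TABLE_BYTES ((size_t)64 * 1024 * 1024)

/* malloc() + memset(): every page is written up front. */
static uint32_t *table_malloc(void)
{
    uint32_t *t = malloc(TABLE_BYTES);
    if (t != NULL)
        memset(t, 0, TABLE_BYTES);
    return t;
}

/* Anonymous mmap(): the kernel supplies zero-filled pages on first touch. */
static uint32_t *table_mmap(void)
{
    void *p = mmap(NULL, TABLE_BYTES, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * One 32-bit read and one 32-bit write per simulated input byte, at a
 * pseudo-random slot. Returns the sum of the values read so that the
 * compiler cannot drop the reads.
 */
static uint32_t touch_random(uint32_t *t, size_t input_bytes)
{
    const size_t slots = TABLE_BYTES / sizeof(uint32_t); /* power of two */
    uint32_t x = 2463534242u;   /* xorshift32 state, arbitrary nonzero seed */
    uint32_t sum = 0;

    for (size_t i = 0; i < input_bytes; i++) {
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        size_t slot = x & (slots - 1);
        sum += t[slot];              /* read the old entry */
        t[slot] = (uint32_t)i + 1;   /* write the new position */
    }
    return sum;
}
```

Timing table_malloc() followed by touch_random(t, n) against
table_mmap() followed by touch_random(t, n), for a few values of n,
should show how the gap changes as more of the table gets touched.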

-- 
Lasse Collin  |  IRC: Larhzu @ IRCnet & Freenode
