[Inada Naoki]
>> Increasing pool size is one obvious way to fix these problems.
>> I think 16KiB pool size and 2MiB (huge page size of x86) arena size is
>> a sweet spot for recent web servers (typically, about 32 threads, and
>> 64GiB), but there is no evidence about it.

[Antoine]
> Note that the OS won't give a huge page automatically, because memory
> management becomes much more inflexible then.
>
> For example, the Linux madvise() man page has this to say about
> MADV_HUGEPAGE:
>
>               This feature is primarily aimed at  applications  that
>               use large mappings of data and access large regions of
>               that memory at a time  (e.g.,  virtualization  systems
>               such as QEMU).  It can very easily waste memory (e.g.,
>               a 2 MB mapping that only ever  accesses  1  byte  will
>               result  in  2 MB  of  wired memory instead of one 4 KB
>               page).  See the Linux kernel  source  file  Documenta‐
>               tion/vm/transhuge.txt for more details.
>
> I'm not sure a small objects allocator falls into the right use case
> for huge pages.
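
(For concreteness: on Linux an application has to opt a mapping into
transparent huge pages explicitly, roughly like the sketch below.  The
names are made up, and madvise() is only a hint - the kernel promotes
just the 2 MiB-aligned portions of the range, if it promotes anything.)

    #include <stddef.h>
    #include <sys/mman.h>

    /* Map `size` bytes and *hint* that the kernel should back the
       2 MiB-aligned portions with transparent huge pages. */
    static void *map_with_thp_hint(size_t size)
    {
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        (void)madvise(p, size, MADV_HUGEPAGE);  /* advisory only; may be ignored */
        return p;
    }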

The SuperMalloc paper I recently pointed at notes that it uses huge
pages only for "huge" requests.  Not for "small", "medium", or "large"
requests.

But it carves up 2 MiB chunks, aligned at 2 MiB addresses, for each
size class anyway (which use 4K pages).
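
(The usual trick for getting 2 MiB-aligned chunks out of mmap is to
over-allocate and trim.  A sketch with made-up names - not
SuperMalloc's actual code:)

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>

    #define CHUNK_SIZE ((size_t)2 * 1024 * 1024)   /* 2 MiB */

    /* mmap only promises page alignment, so map a chunk's worth of slop,
       then unmap the misaligned head and tail. */
    static void *alloc_aligned_chunk(void)
    {
        size_t total = 2 * CHUNK_SIZE;
        char *raw = mmap(NULL, total, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (raw == MAP_FAILED)
            return NULL;
        uintptr_t start = ((uintptr_t)raw + CHUNK_SIZE - 1)
                          & ~(uintptr_t)(CHUNK_SIZE - 1);
        size_t head = start - (uintptr_t)raw;
        size_t tail = total - head - CHUNK_SIZE;
        if (head) munmap(raw, head);
        if (tail) munmap((char *)start + CHUNK_SIZE, tail);
        return (void *)start;
    }

One payoff of the alignment is that the owning chunk of any interior
pointer is then a single mask: (uintptr_t)p & ~(CHUNK_SIZE - 1).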

There's a mix of reasons for that.  Partly for the same reasons I
want bigger pools and arenas:  to stay in the fastest code paths.
Hitting page/arena/chunk boundaries costs cycles for computation and
conditional branches, and clobbers cache lines to access & mutate
bookkeeping info that the fast paths don't touch.
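
(Schematically - made-up names, not pymalloc's or SuperMalloc's actual
code - the fast path is a bump and a compare; everything else only runs
when a boundary is hit, and the bigger the pool, the rarer that is:)

    #include <stddef.h>

    /* Hypothetical per-size-class allocation state. */
    typedef struct {
        char *bump;       /* next free byte in the current pool */
        char *pool_end;   /* one past the end of the current pool */
    } sizeclass;

    static void *slow_alloc(sizeclass *sc, size_t size);  /* new pool, headers, free lists... */

    static void *fast_alloc(sizeclass *sc, size_t size)
    {
        char *p = sc->bump;
        if (size <= (size_t)(sc->pool_end - p)) {  /* common case: stays inside the pool */
            sc->bump = p + size;
            return p;
        }
        return slow_alloc(sc, size);               /* boundary crossed: the expensive path */
    }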

Also to reduce the fraction of allocator space "wasted" on bookkeeping
info.  48 header bytes out of a 4K pool is a bigger percentage hit
than, say, two 4K pages (to hold fancier allocator bookkeeping data
structures) out of a 2M chunk.
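
(Concretely:

    48 / 4096           ~= 1.2%  of each 4 KiB pool eaten by its header
    (2 * 4096) / 2 MiB  ~= 0.4%  of each 2 MiB chunk eaten by bookkeeping
)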

And partly for the same reason Neil is keen for bigger arenas in his
branch:  to reduce the size of data structures to keep track of other
bookkeeping info (in Neil's case, a radix tree, which can effectively
shift away the lowest ARENA_BITS bits of addresses it needs to store).
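
(The idea in a line - a sketch, not Neil's actual code:)

    #include <stdint.h>

    /* With 2 MiB arenas ARENA_BITS is 21; the radix tree never needs to
       distinguish addresses within an arena, so its keys lose those bits. */
    #define ARENA_BITS 21

    static uintptr_t arena_key(const void *p)
    {
        return (uintptr_t)p >> ARENA_BITS;
    }

Doubling the arena size shaves another bit off every key, which in turn
means a smaller and/or shallower tree.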

Which hints at much of why it wants "huge" chunks, but doesn't explain
why it doesn't want huge pages except to satisfy huge requests.
That's because it strives to be able to release physical RAM back to
the system on a page basis (which is also part of why it needs fancier
bookkeeping data structures to manage its chunks - it needs to keep
track of which pages are in use, and apply page-based heuristics to
push toward freeing pages).
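
(On Linux that boils down to madvise() again - a sketch; deciding
*when* a page is worth giving back is where the heuristics and the
fancier bookkeeping come in:)

    #include <sys/mman.h>

    #define PAGE_SIZE 4096

    /* Tell the kernel this page's contents are garbage: the physical RAM
       can be reclaimed, but the virtual address stays mapped and reusable. */
    static void release_page(void *page)   /* must be page-aligned */
    {
        (void)madvise(page, PAGE_SIZE, MADV_DONTNEED);  /* or MADV_FREE, lazier */
    }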

So that combines very much larger "pools" (2M v 4K) with better
chances of actually returning no-longer-used pages to the system (on a
4K basis rather than a 256K basis).  But it's built on piles of
platform-specific code, and isn't suitable at all for 32-bit boxes
(it relies on virtual address space being an abundant resource on
64-bit boxes - reserving 2M of address space is close to trivial, and
could potentially be done millions of times without getting in
trouble).
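
(Reserving address space without committing RAM is roughly this cheap -
a sketch; a PROT_NONE, MAP_NORESERVE mapping costs the kernel a little
bookkeeping and no physical pages until parts of it are made accessible
and actually touched:)

    #include <stddef.h>
    #include <sys/mman.h>

    /* Claim 2 MiB of virtual address space; no RAM is committed yet. */
    static void *reserve_2m(void)
    {
        void *p = mmap(NULL, (size_t)2 * 1024 * 1024, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        return (p == MAP_FAILED) ? NULL : p;
    }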