On Feb 9, 2016, at 11:07 AM, Steve Cobb <[email protected]> wrote:
> Are there any statistics available on the meta-data overhead for using 
> jemalloc? What is the usual percentage of meta-data overhead to user memory - 
> is that number available? Are there any comparisons available - say comparing 
> to other "popular" malloc implementations?

jemalloc 4.x directly reports metadata statistics via the stats.metadata mallctl 
(http://canonware.com/download/jemalloc/jemalloc-latest/doc/jemalloc.html#stats.metadata); 
in earlier versions it can be roughly inferred with some knowledge of how 
jemalloc works.  Note that some sparsely touched global data structures are 
fully accounted for in this statistic, so the actual physical memory impact is 
lower than the number suggests.

> Is the amount of meta-data tuneable in any way - any compilation 
> flags/configurations?

Some features affect metadata overhead, e.g. heap profiling, redzones, and 
quarantine, and for that matter, the stats functionality.  In general, the 
fewer optional features in use, the lower the metadata overhead.  In practice, 
though, you don't need to worry about this for large-scale applications; it 
only warrants serious consideration for embedded systems.
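As a sketch of what a minimal-footprint build might look like, here are some 
of the relevant configure-time switches (flag names are from jemalloc's 
INSTALL document; verify them against your version, since defaults differ 
between releases):

```shell
# Hypothetical stripped-down build: drop stats bookkeeping and the
# fill/redzone/quarantine machinery.  Heap profiling is off unless
# explicitly enabled with --enable-prof.
./configure \
    --disable-stats \
    --disable-fill
make && make install
```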

> Right now, we are using Jemalloc 3.6, but will be moving to the latest 
> release - is there any difference in the amount of meta-data between those 
> releases?

Metadata overhead is generally proportional to the number of chunks being used 
for small/large allocations, so the biggest determinant of metadata overhead is 
fragmentation.  Per-chunk overhead did not change much between 3.6 and 4.x 
despite substantial restructuring, but fragmentation tends to be lower with 
4.x, so metadata overhead also tends to be lower with 4.x.

> Right now, for a general purpose allocator, it looks like the overhead is 
> very much higher than Glibc malloc. We are looking very seriously at jemalloc 
> for some large-scale applications on Linux - in particular, we are interested 
> in the ability of jemalloc to return/unmap memory. But if the overhead is 
> very high, this will be a real problem. Can any guidance be given on the 
> typical overhead?

I seriously doubt that the metadata overhead for jemalloc is higher than that 
for glibc under typical operating conditions.  glibc uses per-object headers to 
store metadata, whereas jemalloc uses more centralized metadata storage, and 
less than two bits per allocation even for e.g. 8-byte objects.  If you are 
using virtual memory as your metric, look at resident memory usage instead.

Yes, jemalloc can return unused dirty memory to the OS.  It uses munmap() 
and/or madvise(), though on Linux it does not use munmap() by default due to 
kernel VM fragmentation issues.  Watch 
https://github.com/jemalloc/jemalloc/issues/325 to keep tabs on an in-progress 
feature that I expect to further improve unused dirty memory management.
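In the meantime, purging aggressiveness can be tuned at runtime via the 
lg_dirty_mult option (present in 3.x/4.x; smaller values purge dirty pages 
more aggressively, -1 disables purging entirely), e.g.:

```shell
# Config fragment: keep at most a 2^3:1 active:dirty page ratio.
# Check the opt.lg_dirty_mult entry in the manual for your version's default.
MALLOC_CONF="lg_dirty_mult:3" ./your_app
```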

Jason
_______________________________________________
jemalloc-discuss mailing list
[email protected]
http://www.canonware.com/mailman/listinfo/jemalloc-discuss