On Feb 15, 2015, at 10:58 PM, Vasily Galkin <[email protected]> wrote:
> I'm trying to implement a technique that would allow storing array limits 
> and a specific array position in a single pointer, calculating the array 
> limits from the pointer's alignment.
> So I want to allocate arrays of 2**N bytes at a 2**N boundary.
> The idea is quite generic, so N's value may vary from 8 to huge values like 
> 34 (for a 64-bit system).

Note that e.g. a 4 MiB naturally aligned allocation happens to also be 
naturally aligned at all lesser alignments (2 MiB, 1 MiB, ..., 16 B, etc.), so 
I don't understand how you're going to make this work unless you prevent e.g. 
16-byte allocations from being page-aligned.

> Analyzing the source of the glibc allocator for this case, I found that for 
> an aligned_alloc of a 2**N block on a 2**N boundary, it may introduce 
> worst-case 2x overhead by allocating a size+align block, finding an aligned 
> address in it, and freeing the unused part.
> jemalloc is more complex, so I decided to ask about it.
> 
> What is jemalloc's worst-case overhead for allocating 2**N bytes at a 2**N 
> boundary?
> I think for jemalloc-huge objects "no overhead" can be achieved via 
> mmap with a fixed aligned address. Does jemalloc do it that way?
> For jemalloc-small and jemalloc-large allocations: are typical 2**N allocs 
> always aligned to 2**N boundaries, or will this requirement change the way 
> jemalloc handles these allocations?

I think you're asking about metadata overhead.  jemalloc stores all metadata 
separately from allocations, so there are no inherent issues as there would be 
if allocations had metadata headers.  Also, if you specify 
--with-lg-size-class-group=0 while configuring the dev version of jemalloc, all 
size classes will be powers of 2, and all allocations will be naturally aligned.

Regarding worst case fragmentation, the answer depends on size class category 
and allocation size mixture:

- Small: All 2^n-aligned allocations of size 2^n will incur no additional 
overhead, due to how small allocations are aligned and packed.

- Large: The worst case is an allocation half the chunk size, in which case 
only one such allocation fits per chunk.  If the remaining (nearly) half of the 
chunk isn't otherwise useful for smaller allocations, the overhead will 
essentially be 50%.  However, assuming you use a diverse mixture of size 
classes, the actual overhead shouldn't be a significant issue in practice.

- Huge: Extra virtual memory is mapped, then the excess is trimmed and 
unmapped.  This can leave virtual memory holes, but it incurs no physical 
memory overhead.  Earlier versions of jemalloc heuristically attempted to 
optimistically map chunks without excess that would need to be trimmed, but it 
didn't save much system call overhead in practice, and I ripped the code out 
during a simplification pass.

 Jason
_______________________________________________
jemalloc-discuss mailing list
[email protected]
http://www.canonware.com/mailman/listinfo/jemalloc-discuss
