When creating large arrays, Arrow uses realloc quite intensively. I have an example where I read a gzipped Parquet column (strings) that expands from 8 MB to 100+ MB when loaded into Arrow. Of course, jemalloc cannot anticipate this, and every realloc call above 1 MB (the most critical ones) ends up being a copy.
Knowing that we like using realloc in Arrow, I think we could come up with an allocator for large objects that would behave a lot better than jemalloc. For smaller objects, this allocator could simply let jemalloc handle the request; I'm not trying to outsmart the brilliant guys from Facebook and co ;-) But for larger objects, we could adopt a custom strategy:

- If an allocation or reallocation larger than 1 MB (or maybe even 512 KB) is made on our memory pool, call mmap with a size of X GB, X being slightly smaller than the total physical memory on the system. This is OK because mmap will not physically allocate this memory as long as it is not touched.
- We keep track of every allocation created this way by storing the pointer plus the actual used size within our X GB reservation in a map.
- When growing an allocation mmapped this way, we will always have contiguous memory available (otherwise we would already have OOMed, because X is the physical memory size).
- When reducing the allocation size, we can release pages with madvise. (Optional: if the allocation becomes small enough, we might copy it back into a jemalloc allocation.)

I am not an expert in these matters, and I just learned what an allocator really is, so my approach might be naive. In that case, feel free to enlighten me! Please note that I'm not sure about the level of portability of this solution.

Have a nice day!
Remi