On Apr 25, 2014, at 2:55 AM, Daniel Micay <[email protected]> wrote:
> This option was originally disabled by default due to fragmentation
> issues. It provides a significant performance win for Rust's vectors at
> very large sizes, so I'm curious about the severity of this issue and
> whether it is still around in the latest Linux kernel releases.
As far as I know, this problem still exists in Linux. The problem is that
Linux doesn't have a reliable way to find the first fit for an mmap() request
other than linear scan, so it uses heuristics to decide where to start the
scan. It's quite easy to trigger pathological behavior where a chunk of memory
is unmapped, but the kernel doesn't revise its scan start point, and the VM map
hole remains indefinitely. The more holes there are, the more mapped regions
there are to linearly scan. I don't remember what the common triggers of
linear scans are, but they definitely happen enough to cause a performance
issue, at least for some of the heavily loaded network server applications
Facebook runs.
One way to reduce the impact of huge reallocs would be to use exponential size
class increases, rather than linear increases. jemalloc will always round up
to the nearest multiple of the chunk size, but if it were instead to use e.g.
[4, 8, 16, 32, 64, ...] MiB as size classes, the realloc overhead would
amortize away. I've been thinking about exploring this strategy for large size
classes, [4 KiB .. 4 MiB), and I just wrote up a tracking issue that also keeps
your use case in mind:
https://github.com/jemalloc/jemalloc/issues/77
Thanks,
Jason
_______________________________________________
jemalloc-discuss mailing list
[email protected]
http://www.canonware.com/mailman/listinfo/jemalloc-discuss