On Thu, 25 Nov 2021 at 12:25, Willy Tarreau <w...@1wt.eu> wrote:
>
> On Thu, Nov 25, 2021 at 04:38:27PM +0500, ???? ??????? wrote:
> > > Thus I think that instead of focusing on the OS we ought to continue
> > > to focus on the allocator and improve runtime detection:
> > >
> > >   - glibc (currently detected using detect_allocator)
> > >     => use malloc_trim()
> > >   - jemalloc at build time (mallctl != NULL)
> > >     => use mallctl() as you did
> > >   - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL)
> > >     => use mallctl() as you did
> > >   - others
> > >     => no trimming
> > >
> > I never imagined earlier that high-level applications (such as a reverse
> > https/tcp proxy) care about such low-level things as allocator behaviour.
> > No jokes, really.
>
> Yes, it does count a lot. That's also why we spent a lot of time optimizing
> the pools, to limit the number of calls to the system's allocator for
> everything that uses a fixed size. I've seen some performance graphs in
> our internal ticket tracker showing the memory consumption before and
> after the switch to jemalloc, and the CPU usage as well, and sometimes
> the difference was very important.
>
> Glibc improved quite a bit recently (2.28 or 2.33, I don't remember) by
> implementing a per-thread cache in its ptmalloc. But in our case it's
> still not as good as jemalloc, and neither performs as well as our
> thread-local pools for fixed sizes.
>
> I'm seeing in a paper about snmalloc that it performs exceptionally well
> for small allocations. I just don't know how this degrades depending on
> the access patterns. For example, some allocators are fast when you free()
> in the exact reverse allocation order, but can start to fragment or have
> more work to do finding holes if you don't free() in the exact same order.
>
If you're curious, there is also mimalloc (with a pretty rich C API) from
Microsoft too.

> But that's something to keep an eye on in the future.
>
> Willy