Thanks for the reply. I had disabled jemalloc via ARROW_DEFAULT_MEMORY_POOL so that was not the issue.
The issue was (I think) that the Arrow library I was using was built with compiler builtins (such as __builtin_posix_memalign), so that even calls to the system default allocator could not be intercepted. One way to solve this is to build Arrow with -fno-builtin, but unfortunately that disables many builtins that a person may still want. Since allocation is a whole family of functions and not just a few, it is somewhat difficult to determine which builtins to selectively disable. It would be nice if some project (Arrow? mimalloc?) documented, for popular compilers, which builtins are substituted for allocation routines. I opened an issue on mimalloc asking for this documentation, or at least for a warning about builtins for those using interception techniques such as LD_PRELOAD.

-John

On Tue, Jun 14, 2022 at 3:40 PM Sutou Kouhei <k...@clear-code.com> wrote:

> Hi,
>
> posix_memalign() in memory_pool.cc of libarrow-dev uses
> jemalloc's posix_memalign() (je_posix_memalign()). Because
> it's built with ARROW_JEMALLOC=ON (default) and
> JEMALLOC_MANGLE
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/memory_pool.cc#L53
> . So we can't use mimalloc with LD_PRELOAD.
>
> The comment for JEMALLOC_MANGLE in
> memory_pool.c said "Needed to support jemalloc 3 and 4" but
> we bundle jemalloc 5.2.1 now. So we can remove JEMALLOC_MANGLE.
>
> Could you open an issue on Jira
> https://issues.apache.org/jira/browse/ARROW to add support
> for overriding system memory pool's allocator by LD_PRELOAD?
> (Do you want to work on this?)
>
>
> Thanks,
> --
> kou
>
> In <cack8hr5ltedfwrat3flsdp1hq5bsoj+dcilvqjdzpdome29...@mail.gmail.com>
> "Custom default C++ memory pool on Linux, and/or interception/auditing
> of system pool" on Tue, 14 Jun 2022 09:06:51 -0500,
> John Muehlhausen <j...@jgm.org> wrote:
>
> > Hello,
> >
> > This comment is regarding installation with `apt` on ubuntu 18.04 ...
> > `libarrow-dev/bionic,now 8.0.0-1 amd64`
> >
> > I'm a bit confused about the memory pool situation:
> >
> > * I run with `ARROW_DEFAULT_MEMORY_POOL=system` and check that
> > `arrow::default_memory_pool()->backend_name() ==
> > arrow::system_memory_pool()->backend_name()`
> >
> > * I then LD_PRELOAD a customized (*) mimalloc according to the directions
> > at the mimalloc git repo and things like `strm->Reset(INT32_MAX);` seem not
> > to be hitting it... I figured that is a big enough chunk to jostle it into
> > doing something... `BufferOutputStream::Create(INT32_MAX)` is also not
> > intercepted by mimalloc. Is the "system" pool somehow going around the
> > typical allocation interfaces on linux? I built my own .so and linked it
> > to the app and malloc() is getting intercepted.
> >
> > * `arrow::mimalloc_memory_pool(&mmmp);` does return something... but
> > apparently not "my" mimalloc ... statically linked?
> >
> > * what is going on in Arrow with constructor (pre-main()) allocations?
> > Some of this does hit my LD_PRELOADed mimalloc
> >
> > * any way to get symbols for the apt-installed libs or would I need to
> > build from source to get backtrace with symbols? (for chasing down sources
> > of allocations)
> >
> > * what is the C++ lib equivalent of the following from the Python code? I
> > figure I could stop trying to understand the built-in/default allocators if
> > I could just replace them... but this may also intersect with my question
> > about constructors. Maybe I'd have to make sure my constructor runs first
> > to perform the switch-a-roo before anything else tries to use the default
> > pool?
> >
> > ```
> > namespace py {
> >
> > static std::mutex memory_pool_mutex;
> > static MemoryPool* default_python_pool = nullptr;
> >
> > void set_default_memory_pool(MemoryPool* pool) {
> >   std::lock_guard<std::mutex> guard(memory_pool_mutex);
> >   default_python_pool = pool;
> > }
> > ```
> >
> >
> > (*) the mimalloc customization: the main app has a weak reference that ends
> > up defined by the LD_PRELOAD mimalloc, where the function so-supplied
> > allows the app to install a function pointer (back to the main app) that
> > gets called (if defined) at various interesting points in mimalloc
> >
> >
> > Thanks,
> > John
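
For readers following the (*) footnote above, here is a minimal sketch of the weak-reference hook idea on the application side. It is only an illustration under stated assumptions: `alloc_install_hook`, the callback signature, and the counter are hypothetical names made up for this example (they are not part of mimalloc or Arrow), and the `weak` attribute is GCC/Clang-specific on ELF platforms such as Linux.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical installer that the customized, LD_PRELOADed allocator would
// export. Declared weak so the app still links and runs when no such library
// is loaded; in that case the symbol's address is null.
extern "C" void alloc_install_hook(void (*callback)(std::size_t size))
    __attribute__((weak));

// Counts events reported by the allocator. The callback must not allocate,
// otherwise it could re-enter the allocator that is calling it.
static std::atomic<std::size_t> g_alloc_events{0};

static void on_alloc_event(std::size_t /*size*/) {
  g_alloc_events.fetch_add(1, std::memory_order_relaxed);
}

// Returns true if the hook was installed, false if the preload is absent.
bool install_hook_if_available() {
  if (alloc_install_hook == nullptr) {
    return false;
  }
  alloc_install_hook(&on_alloc_event);
  return true;
}

// Number of allocator events seen so far.
std::size_t alloc_event_count() {
  return g_alloc_events.load(std::memory_order_relaxed);
}
```

Calling `install_hook_if_available()` early in `main()` (or from a static constructor, if pre-main() allocations matter) tells the app whether the preloaded allocator is actually present. If it returns true but `alloc_event_count()` stays at zero across a large Arrow allocation, that suggests the allocation is not going through the interposed functions.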