Thanks for the reply.  I had already bypassed jemalloc by setting
ARROW_DEFAULT_MEMORY_POOL=system, so that was not the issue.

The issue was (I think) that the arrow lib I was using was built with
compiler builtins (such as __builtin_posix_memalign), so that even calls
into the system default allocator could not be intercepted.
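
As one illustration of how builtin knowledge can hide an allocation from an
interposer (a sketch of the general effect, not necessarily what happened
inside the Arrow build): because the compiler knows what malloc and free
mean, an optimizing build may remove a matched pair altogether, so an
LD_PRELOADed allocator never sees the call.  The function name below is made
up for the example.

```cpp
#include <cstdlib>

// Hypothetical example function. With builtins enabled and optimization on,
// GCC and Clang are allowed to treat malloc/free as known functions and may
// delete this allocation entirely; the returned value is the same either way.
int roundtrip_through_heap(int value) {
    int* slot = static_cast<int*>(std::malloc(sizeof(int)));
    if (slot == nullptr) {
        return value;  // allocation failed; fall back to the input
    }
    *slot = value;
    int result = *slot;
    std::free(slot);
    return result;
}
```

Comparing the generated assembly at -O0 and -O2 is one way to see whether
the call to malloc survives.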

One way to solve this is to build Arrow with -fno-builtin, but
unfortunately that disables many builtins that one may still want.
Since allocation is a whole family of functions and not just a few,
it is somewhat difficult to determine which builtins to selectively
disable.  It would be nice if some project (arrow? mimalloc?) documented,
for popular compilers, which builtins get substituted for allocation
routines.
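
A narrower option than a blanket -fno-builtin is GCC's per-function form,
-fno-builtin-<function>, e.g. -fno-builtin-malloc -fno-builtin-calloc
-fno-builtin-realloc -fno-builtin-free.  Whether such a list is complete for
a given compiler and library is exactly the open question here, so treat it
as a starting point.  The sketch below (helper name hypothetical) calls
several members of the allocation family so that, built with the flags under
test and run under LD_PRELOAD, one can check which calls actually reach the
replacement allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <stdlib.h>  // posix_memalign (POSIX, not ISO C++)

// Hypothetical probe: issues one request through each of four allocation
// entry points and returns how many succeeded. The family has more members
// (aligned_alloc, memalign, valloc, ...) that are not covered here.
int exercise_allocators(std::size_t bytes) {
    assert(bytes > 0);
    int ok = 0;

    void* a = std::malloc(bytes);
    if (a != nullptr) {
        std::memset(a, 1, bytes);  // touch the block so it is really used
        ++ok;
    }

    void* b = std::calloc(1, bytes);
    if (b != nullptr) ++ok;

    // On success realloc takes ownership of the old block, so drop `a`.
    void* c = std::realloc(a, bytes * 2);
    if (c != nullptr) {
        a = nullptr;
        ++ok;
    }

    void* d = nullptr;
    if (posix_memalign(&d, 64, bytes) == 0) {
        ++ok;
    } else {
        d = nullptr;
    }

    std::free(a);  // no-op if realloc succeeded
    std::free(b);
    std::free(c);
    std::free(d);
    return ok;
}
```

Calling exercise_allocators(1 << 20) from main() and watching the
interposer's own statistics or logging shows which entry points were seen.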

I opened an issue on mimalloc for this documentation... or at least a
warning about builtins for those using the interception techniques such as
LD_PRELOAD.

-John

On Tue, Jun 14, 2022 at 3:40 PM Sutou Kouhei <k...@clear-code.com> wrote:

> Hi,
>
> posix_memalign() in memory_pool.cc of libarrow-dev uses
> jemalloc's posix_memalign() (je_posix_memalign()), because
> it's built with ARROW_JEMALLOC=ON (the default) and
> JEMALLOC_MANGLE defined:
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/memory_pool.cc#L53
>
> So we can't use mimalloc with LD_PRELOAD.
>
> The comment for JEMALLOC_MANGLE in
> memory_pool.cc says "Needed to support jemalloc 3 and 4", but
> we bundle jemalloc 5.2.1 now. So we can remove JEMALLOC_MANGLE.
>
> Could you open an issue on Jira
> https://issues.apache.org/jira/browse/ARROW to add support
> for overriding system memory pool's allocator by LD_PRELOAD?
> (Do you want to work on this?)
>
>
> Thanks,
> --
> kou
>
> In <cack8hr5ltedfwrat3flsdp1hq5bsoj+dcilvqjdzpdome29...@mail.gmail.com>
>   "Custom default C++ memory pool on Linux, and/or interception/auditing
> of system pool" on Tue, 14 Jun 2022 09:06:51 -0500,
>   John Muehlhausen <j...@jgm.org> wrote:
>
> > Hello,
> >
> > This comment is regarding installation with `apt` on ubuntu 18.04 ...
> > `libarrow-dev/bionic,now 8.0.0-1 amd64`
> >
> > I'm a bit confused about the memory pool situation:
> >
> > * I run with `ARROW_DEFAULT_MEMORY_POOL=system` and check that
> > `arrow::default_memory_pool()->backend_name() ==
> > arrow::system_memory_pool()->backend_name()`
> >
> > * I then LD_PRELOAD a customized (*) mimalloc according to the directions
> > at the mimalloc git repo and things like `strm->Reset(INT32_MAX);` seem not
> > to be hitting it... I figured that is a big enough chunk to jostle it into
> > doing something... `BufferOutputStream::Create(INT32_MAX)` is also not
> > intercepted by mimalloc.  Is the "system" pool somehow going around the
> > typical allocation interfaces on linux?  I built my own .so and linked it
> > to the app and malloc() is getting intercepted.
> >
> > * `arrow::mimalloc_memory_pool(&mmmp);` does return something... but
> > apparently not "my" mimalloc ... statically linked?
> >
> > * what is going on in Arrow with constructor (pre-main()) allocations?
> > Some of this does hit my LD_PRELOADed mimalloc
> >
> > * any way to get symbols for the apt-installed libs or would I need to
> > build from source to get backtrace with symbols? (for chasing down sources
> > of allocations)
> >
> > * what is the C++ lib equivalent of the following from the Python code?  I
> > figure I could stop trying to understand the built-in/default allocators if
> > I could just replace them... but this may also intersect with my question
> > about constructors.  Maybe I'd have to make sure my constructor runs first
> > to perform the switch-a-roo before anything else tries to use the default
> > pool?
> >
> > ```
> > namespace py {
> >
> > static std::mutex memory_pool_mutex;
> > static MemoryPool* default_python_pool = nullptr;
> >
> > void set_default_memory_pool(MemoryPool* pool) {
> >   std::lock_guard<std::mutex> guard(memory_pool_mutex);
> >   default_python_pool = pool;
> > }
> > ```
> >
> >
> > (*) the mimalloc customization: the main app has a weak reference that ends
> > up defined by the LD_PRELOAD mimalloc, where the function so-supplied
> > allows the app to install a function pointer (back to the main app) that
> > gets called (if defined) at various interesting points in mimalloc
> >
> >
> > Thanks,
> > John
>
