Hi Ivan,

Earlier we added some instructions for profiling memory allocations in the
memory pools ("big allocations", as described by Sasha above). Docs are here
[1]. If you come up with some other method, it would be great to document it
in an adjacent section :)

Another suggestion I heard a while ago was to use OpenTelemetry to collect
memory usage / allocation metrics. I'm not super close to those efforts,
but I believe there's already been some work to integrate OTel and Acero. I
recorded the issue here [2].

I hope that's helpful info!

Best,

Will Jones

[1] https://arrow.apache.org/docs/cpp/memory.html#memory-profiling
[2] https://issues.apache.org/jira/browse/ARROW-15512

On Wed, Jul 6, 2022 at 1:17 PM Sasha Krassovsky <krassovskysa...@gmail.com>
wrote:

> Hi Ivan,
> Inside Acero, we can think of allocations as coming in two classes:
> - "Big" allocations, which go through `MemoryPool` via `Buffer`. These are
> used for representing columns of input data and hash tables.
> - "Small" allocations, which typically come from local STL containers like
> std::vector and std::unordered_map. These go through `operator new` (in
> more detail: the containers are templated on `std::allocator`, whose
> `allocate` ends up calling `operator new`).
>
> You’ll need to do different things to track these two classes.
>
> For big allocations, you can make your own implementation of the
> MemoryPool interface which performs all of the statistics gathering you’d
> need to do (you can see `LoggingMemoryPool` as an example, which just
> prints to stdout every time there is an allocation). You can then pass this
> memory pool in via the `ExecPlan`’s `ExecContext`.
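> A minimal (untested) sketch of such a pool, wrapping whichever pool you'd
> otherwise use, might look like this. Note the exact virtual signatures
> depend on your Arrow version; newer releases add an alignment parameter to
> Allocate/Reallocate:
>
>     #include <atomic>
>     #include <cstdint>
>     #include <string>
>     #include "arrow/memory_pool.h"
>     #include "arrow/status.h"
>
>     // Counts allocation calls while delegating the real work to another pool.
>     class CountingMemoryPool : public arrow::MemoryPool {
>      public:
>       explicit CountingMemoryPool(arrow::MemoryPool* wrapped) : wrapped_(wrapped) {}
>
>       arrow::Status Allocate(int64_t size, uint8_t** out) override {
>         num_allocs_.fetch_add(1, std::memory_order_relaxed);
>         return wrapped_->Allocate(size, out);
>       }
>       arrow::Status Reallocate(int64_t old_size, int64_t new_size, uint8_t** ptr) override {
>         num_reallocs_.fetch_add(1, std::memory_order_relaxed);
>         return wrapped_->Reallocate(old_size, new_size, ptr);
>       }
>       void Free(uint8_t* buffer, int64_t size) override { wrapped_->Free(buffer, size); }
>
>       // Byte-level stats are already tracked by the wrapped pool.
>       int64_t bytes_allocated() const override { return wrapped_->bytes_allocated(); }
>       int64_t max_memory() const override { return wrapped_->max_memory(); }
>       std::string backend_name() const override { return wrapped_->backend_name(); }
>
>       int64_t num_allocs() const { return num_allocs_.load(); }
>
>      private:
>       arrow::MemoryPool* wrapped_;
>       std::atomic<int64_t> num_allocs_{0};
>       std::atomic<int64_t> num_reallocs_{0};
>     };
>
> You'd then construct the plan's ExecContext with this pool, e.g. something
> like `arrow::compute::ExecContext ctx(&counting_pool);`.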
>
> For small allocations, I think you should just be able to implement your
> own `operator new` and `operator delete` inside your benchmark file. This
> will replace the default `operator new` and `operator delete` and let you
> gather statistics. One note: you'll have to call `malloc` and `free` in
> your implementations, since the default `operator new` and `operator
> delete` will be inaccessible.
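> Something like the following (again, just a rough sketch) is enough to
> count them; just make sure these replacement definitions are linked into
> the benchmark executable exactly once:
>
>     #include <atomic>
>     #include <cstdlib>
>     #include <new>
>
>     static std::atomic<std::size_t> g_num_allocs{0};
>     static std::atomic<std::size_t> g_bytes_requested{0};
>
>     void* operator new(std::size_t size) {
>       g_num_allocs.fetch_add(1, std::memory_order_relaxed);
>       g_bytes_requested.fetch_add(size, std::memory_order_relaxed);
>       // Must go through malloc: the default operator new is replaced by this one.
>       if (void* p = std::malloc(size)) return p;
>       throw std::bad_alloc();
>     }
>
>     void operator delete(void* p) noexcept { std::free(p); }
>     void operator delete(void* p, std::size_t) noexcept { std::free(p); }
>
> (For completeness you may also want the array and aligned overloads.)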
>
> Sasha
>
> > On Jul 6, 2022, at 12:42 PM, Ivan Chau <ivan.m.c...@gmail.com> wrote:
> >
> > Hi all,
> >
> >
> > My name is Ivan -- some of you may know me from my contributions
> > benchmarking node performance in Acero. Thank you for all the help so
> > far!
> >
> >
> >
> > In addition to my runtime benchmarking, I am interested in pursuing some
> > method of memory profiling to further assess our streaming capabilities.
> > I've taken a short look at Google Benchmark's memory profiling; the most
> > salient example usage I could find is
> > https://github.com/google/benchmark/issues/1217. It lets you plug in your
> > own memory manager and specify what to report at the beginning and end of
> > every benchmark.
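> > For illustration, here's roughly what that hook looks like (untested
> > sketch; the exact `Stop` signature and `Result` fields vary between
> > google/benchmark versions, and the two counter functions here are
> > hypothetical placeholders, e.g. fed by a custom arrow::MemoryPool):
> >
> >     #include <cstdint>
> >     #include <benchmark/benchmark.h>
> >
> >     // Hypothetical counters supplied elsewhere in the benchmark.
> >     extern int64_t CurrentNumAllocs();
> >     extern int64_t PeakBytesAllocated();
> >
> >     class ArrowMemoryManager : public benchmark::MemoryManager {
> >      public:
> >       void Start() override { start_allocs_ = CurrentNumAllocs(); }
> >       void Stop(Result& result) override {
> >         result.num_allocs = CurrentNumAllocs() - start_allocs_;
> >         result.max_bytes_used = PeakBytesAllocated();
> >       }
> >      private:
> >       int64_t start_allocs_ = 0;
> >     };
> >
> >     int main(int argc, char** argv) {
> >       ArrowMemoryManager mm;
> >       benchmark::RegisterMemoryManager(&mm);
> >       benchmark::Initialize(&argc, argv);
> >       benchmark::RunSpecifiedBenchmarks();
> >       benchmark::RegisterMemoryManager(nullptr);
> >       return 0;
> >     }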
> >
> >
> >
> > To my understanding, we would need to rework our existing memory pool /
> > execution context to aggregate the number_of_allocs and bytes_used that
> > are reported to Google Benchmark, but I'd imagine there could be better
> > tools for the job which might yield more interesting information
> > (line-by-line analysis, time plots, peak stats, and other metrics).
> >
> >
> >
> > Do you have any advice on what direction I should take for this, or know
> > someone who does? I've run some one-off tests using Valgrind, but I am
> > wondering if I could help implement something more general (and helpful)
> > for the Arrow C++ codebase.
> >
> >
> >
> > Best,
> >
> > Ivan
>
>
