Hi Arrow community,

My team operates an Arrow Flight service implemented in Java, and we also make
additional allocations in the service via Arrow's BufferAllocator API.
We've recently been struggling with OOM issues in both heap and off-heap
memory.
One potential issue we've noticed from a heap dump is that we have Java
DirectByteBuffer objects that are eligible for GC but still hold a significant
volume of underlying direct memory, which can't be freed until the owning
objects are GC'd.
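
To illustrate the behaviour we mean, here is a minimal sketch that uses only
the JDK (no Arrow or Netty). It reads the JVM's "direct" buffer pool through
BufferPoolMXBean before and after dropping the last reference to a buffer.
The class name DirectPoolProbe and the 1 MiB size are made up for the example,
and this is not our production code. We are also assuming, without having
verified it, that whether Arrow/Netty allocations show up in this pool depends
on how Netty is configured to allocate direct memory.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectPoolProbe {

    // Look up the JVM's built-in "direct" buffer pool, which tracks memory
    // reserved through ByteBuffer.allocateDirect.
    static BufferPoolMXBean directPool() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool;
            }
        }
        throw new IllegalStateException("no 'direct' buffer pool found");
    }

    // Bytes the JVM currently accounts to direct buffers.
    static long directMemoryUsed() {
        return directPool().getMemoryUsed();
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();

        // Reserve 1 MiB of off-heap memory behind a small on-heap object.
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);
        long whileHeld = directMemoryUsed();

        // Dropping the reference only makes the buffer *eligible* for GC.
        // The native memory is released when the collector actually
        // processes the object, which may be much later.
        buf = null;
        long afterDrop = directMemoryUsed();

        // System.gc() is only a hint; the JVM may ignore or delay it.
        System.gc();
        long afterGcHint = directMemoryUsed();

        System.out.println("before:        " + before);
        System.out.println("while held:    " + whileHeld);
        System.out.println("after drop:    " + afterDrop);
        System.out.println("after gc hint: " + afterGcHint);
    }
}
```

The gap between "after drop" and "after gc hint" is the effect we are worried
about: the on-heap wrapper objects are tiny, so the heap may not come under
enough pressure to trigger a collection even while a lot of direct memory is
still reserved behind them.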

To that end, we'd like to understand more about how Arrow's memory management 
works.
My understanding is that the Arrow library uses Netty for allocation, and
that the NettyAllocationManager internally uses java.nio.DirectByteBuffer and
performs some pooling.
When we call VectorSchemaRoot::close, will the underlying direct memory buffers 
owned by the root be immediately freed?

On the write path, we batch incoming VSRs into a larger VSR that we allocate 
and use as a buffer. We append the smaller VSRs to this buffer via 
VectorSchemaRootAppender::append and then immediately close the smaller VSRs. 
We're wondering whether this extra allocation can increase our memory usage if
the buffers held by the smaller VSRs aren't freed immediately when we call
close.

For context, the Arrow version we are using is 7.0.0. Thank you for your help!

Sincerely,
Hrishi Dharam

