LiaCastaneda commented on issue #8938: URL: https://github.com/apache/arrow-rs/issues/8938#issuecomment-3738907488
> My confusion remains how this is practically any different from users registering the memory allocation after it emerges from the kernel.

The main advantage is that callers creating Arrays in multiple locations of their project won't need to remember to claim data manually at each location.

This also helps track intermediate allocations. For instance, if a kernel merges two arrays, it will probably need to allocate intermediate buffers (like null buffers or offset buffers). Without automatic claiming, the pool will only report that memory is exhausted after the kernel returns and intermediate allocations are complete, rather than detecting exhaustion while attempting the intermediate allocation itself.

Another option I considered is exposing the MemoryPool API at the `RecordBatch` level. However, this would miss any Array creations that happen within the operator itself (for example, Accumulators in DataFusion), giving incomplete accounting.
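To make the intermediate-allocation point concrete, here is a minimal sketch of the two accounting styles. Everything in it is hypothetical and for illustration only: `ToyPool`, `Exhausted`, `alloc_tracked`, `merge_auto`, and `merge_manual` are stand-ins, not the arrow-rs `MemoryPool` API or a real kernel, and the buffer sizes are made up.

```rust
use std::cell::Cell;

/// Hypothetical fixed-capacity pool (NOT the arrow-rs `MemoryPool` API).
/// Releasing claims on drop is omitted to keep the sketch short.
struct ToyPool {
    capacity: usize,
    used: Cell<usize>,
}

/// Error returned when a claim does not fit in the remaining capacity.
#[derive(Debug, PartialEq)]
struct Exhausted {
    requested: usize,
    available: usize,
}

impl ToyPool {
    fn new(capacity: usize) -> Self {
        Self { capacity, used: Cell::new(0) }
    }

    /// Reserve `bytes`, or fail without changing the pool's state.
    fn claim(&self, bytes: usize) -> Result<(), Exhausted> {
        let available = self.capacity - self.used.get();
        if bytes > available {
            return Err(Exhausted { requested: bytes, available });
        }
        self.used.set(self.used.get() + bytes);
        Ok(())
    }
}

/// Claim first, then allocate: the allocation never happens if the claim fails.
fn alloc_tracked(pool: &ToyPool, bytes: usize) -> Result<Vec<u8>, Exhausted> {
    pool.claim(bytes)?;
    Ok(vec![0u8; bytes])
}

/// Stand-in for a merge kernel with automatic claiming: every intermediate
/// buffer (offsets, nulls, values) is claimed as it is allocated.
fn merge_auto(pool: &ToyPool, sizes: [usize; 3]) -> Result<Vec<Vec<u8>>, Exhausted> {
    sizes.iter().map(|&n| alloc_tracked(pool, n)).collect()
}

/// Stand-in for manual claiming: the kernel allocates everything untracked,
/// and the caller registers the total only after the kernel returns.
fn merge_manual(pool: &ToyPool, sizes: [usize; 3]) -> Result<Vec<Vec<u8>>, Exhausted> {
    let buffers: Vec<Vec<u8>> = sizes.iter().map(|&n| vec![0u8; n]).collect();
    let total: usize = buffers.iter().map(|b| b.len()).sum();
    pool.claim(total)?;
    Ok(buffers)
}

fn main() {
    let sizes = [64, 64, 512]; // offsets, nulls, values (made-up sizes)

    // Automatic: fails at the third buffer, before its 512 bytes are allocated.
    let pool = ToyPool::new(256);
    println!("auto:   {:?}", merge_auto(&pool, sizes).unwrap_err());

    // Manual: all 640 bytes already exist by the time the pool can object.
    let pool = ToyPool::new(256);
    println!("manual: {:?}", merge_manual(&pool, sizes).unwrap_err());
}
```

In this sketch both paths end in an error, but `merge_auto` stops before the oversized buffer is allocated, while `merge_manual` has already allocated all 640 bytes against a 256-byte budget when the pool first sees the request.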
