Re: [I] [DISCUSSION] Memory accounting model discussion [datafusion]

via GitHub Sat, 22 Nov 2025 05:18:12 -0800


alamb commented on issue #16841:
URL: https://github.com/apache/datafusion/issues/16841#issuecomment-3566702940


   For the GroupedHashAggregate stream in particular, another potential 
solution would be to implement a GroupsAccumulator for whatever aggregate you 
are working on, rather than relying on ScalarValue (which can hold pointers 
into shared buffers).
   
   They are tricky to implement, but are often faster and could be much more 
precise in reporting their allocations.
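
   To illustrate why a groups-style accumulator can report memory precisely, 
here is a minimal sketch. It assumes a simplified, hypothetical trait 
(`SimpleGroupsAccumulator`) over plain `i64` slices; it is not DataFusion's 
actual `GroupsAccumulator` trait, which operates on Arrow arrays and has more 
methods. The point is only that state for all groups lives in one flat `Vec` 
owned by the accumulator, so `size()` can be computed from its capacity 
without following pointers into buffers owned by someone else.

```rust
use std::mem::size_of;

/// Hypothetical, simplified stand-in for a groups-style accumulator.
/// This is NOT DataFusion's `GroupsAccumulator` trait.
pub trait SimpleGroupsAccumulator {
    /// Fold `values[i]` into the state of group `group_indices[i]`.
    fn update_batch(&mut self, values: &[i64], group_indices: &[usize], total_num_groups: usize);
    /// Return one result per group, in group-index order.
    fn evaluate(&self) -> Vec<i64>;
    /// Bytes owned by this accumulator.
    fn size(&self) -> usize;
}

/// SUM over i64 values, with one slot per group in a flat vector.
pub struct SumGroupsAccumulator {
    sums: Vec<i64>,
}

impl SumGroupsAccumulator {
    pub fn new() -> Self {
        Self { sums: Vec::new() }
    }
}

impl SimpleGroupsAccumulator for SumGroupsAccumulator {
    fn update_batch(&mut self, values: &[i64], group_indices: &[usize], total_num_groups: usize) {
        assert_eq!(values.len(), group_indices.len());
        // Grow the state so that every group seen so far has a slot.
        if self.sums.len() < total_num_groups {
            self.sums.resize(total_num_groups, 0);
        }
        for (&value, &group) in values.iter().zip(group_indices) {
            self.sums[group] += value;
        }
    }

    fn evaluate(&self) -> Vec<i64> {
        self.sums.clone()
    }

    fn size(&self) -> usize {
        // All state is exclusively owned, so the allocation is exactly
        // the struct itself plus the vector's reserved capacity.
        size_of::<Self>() + self.sums.capacity() * size_of::<i64>()
    }
}
```

   Because nothing in this state is shared, the reported size neither 
over-counts nor misses memory, which is harder to guarantee when each group 
holds a ScalarValue.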
   
   I think the challenge of any sort of arrow_pool based solution is that the 
arrow buffers themselves are shared, so it will be very hard to know when the 
memory is "reclaimed" and to avoid double counting it.
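
   The double-counting problem can be sketched with plain reference-counted 
buffers. This is an illustrative model only: `SliceView`, `naive_size`, and 
`deduplicated_size` are hypothetical names, and `Arc<Vec<u8>>` stands in for 
a shared Arrow buffer. Two views over the same allocation are counted twice 
if each reports the buffer it points into, and de-duplicating requires 
tracking buffer identity across every holder.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical stand-in for an array slice that views a shared buffer.
pub struct SliceView {
    buffer: Arc<Vec<u8>>,
    offset: usize,
    len: usize,
}

impl SliceView {
    pub fn new(buffer: Arc<Vec<u8>>, offset: usize, len: usize) -> Self {
        assert!(offset + len <= buffer.len());
        Self { buffer, offset, len }
    }

    /// The bytes this view actually exposes.
    pub fn bytes(&self) -> &[u8] {
        &self.buffer[self.offset..self.offset + self.len]
    }
}

/// Each view reports the whole buffer it keeps alive, so a buffer
/// shared by N views is counted N times.
pub fn naive_size(views: &[SliceView]) -> usize {
    views.iter().map(|v| v.buffer.capacity()).sum()
}

/// Count each distinct allocation once, keyed by its pointer identity.
/// This only works if one place can see every view at once.
pub fn deduplicated_size(views: &[SliceView]) -> usize {
    let mut seen = HashSet::new();
    views
        .iter()
        .filter(|v| seen.insert(Arc::as_ptr(&v.buffer)))
        .map(|v| v.buffer.capacity())
        .sum()
}
```

   Even with de-duplication, the allocation is only freed when the last 
holder drops its reference, so no single view can tell on its own when the 
memory has actually been reclaimed.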




