alamb commented on issue #16841: URL: https://github.com/apache/datafusion/issues/16841#issuecomment-3566702940
For the GroupedHashAggregate stream in particular, another potential solution would be to implement a GroupsAccumulator for whatever aggregate you are working on, rather than rely on ScalarValue (which can hold pointers into shared buffers). They are tricky to write, but are often faster and could be much more precise about reporting their allocations.

I think the challenge with any sort of arrow_pool-based solution is that the arrow buffers themselves are shared, so it will be very hard to know when the memory is "reclaimed" and how to avoid double counting it.
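
To make the two points concrete, here is a minimal, std-only Rust sketch. It is not DataFusion's actual `GroupsAccumulator` trait: `SumGroups`, its method signatures, and `naive_shared_size` are hypothetical names chosen for illustration. The first part shows why an accumulator that owns flat per-group state can report its size directly; the second shows how naive accounting over shared buffers counts the same bytes more than once.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical, simplified stand-in for a per-group accumulator.
/// State for all groups lives in one flat vector owned by the accumulator,
/// so nothing it holds is shared with anyone else.
pub struct SumGroups {
    sums: Vec<i64>,
}

impl SumGroups {
    pub fn new() -> Self {
        Self { sums: Vec::new() }
    }

    /// Add `values[i]` to the group identified by `group_indices[i]`.
    /// `total_num_groups` is the number of groups seen so far.
    pub fn update_batch(
        &mut self,
        values: &[i64],
        group_indices: &[usize],
        total_num_groups: usize,
    ) {
        assert_eq!(values.len(), group_indices.len());
        if self.sums.len() < total_num_groups {
            self.sums.resize(total_num_groups, 0);
        }
        for (&v, &g) in values.iter().zip(group_indices) {
            self.sums[g] += v;
        }
    }

    /// Bytes attributed to this accumulator: the struct itself plus the
    /// vector's allocated capacity. No shared buffers are involved.
    pub fn size(&self) -> usize {
        size_of::<Self>() + self.sums.capacity() * size_of::<i64>()
    }

    pub fn sums(&self) -> &[i64] {
        &self.sums
    }
}

/// Naive accounting over shared buffers: every holder reports the full
/// buffer length, so one allocation is counted once per holder.
pub fn naive_shared_size(holders: &[Arc<Vec<u8>>]) -> usize {
    holders.iter().map(|b| b.len()).sum()
}
```

With three `Arc` handles to a single 1024-byte buffer, `naive_shared_size` reports 3072 bytes even though only 1024 are allocated, and the memory is only released when the last handle is dropped. That is the double-counting and "when is it reclaimed" problem in miniature.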
