rluvaton opened a new issue, #16904:
URL: https://github.com/apache/datafusion/issues/16904

   ### Is your feature request related to a problem or challenge?
   
   Yes. Debugging memory problems is hard. When running DataFusion in 
production and the memory pool is not able to grow, it returns a 
`ResourcesExhausted` error or panics, depending on the memory pool 
implementation. However, we don't know what is taking all the memory.
   
   In my case, I had to manually patch the DataFusion `row_hash` file to 
print, on error, how much memory each accumulator takes, along with its 
internals.
   
   ### Describe the solution you'd like
   
   I want to add a new API with an `explain_memory` function, or something 
like it (similar to the `Debug` trait), that returns a string with a breakdown 
of the size of everything that takes memory. It could also have a `size` 
function; that is not strictly required, but I think it should be combined 
with `explain_memory`.
   
   So for `GroupedHashAggregateStream`, it would say how much memory the 
group values take, and it would call `explain_memory` on each accumulator.
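   
   A minimal sketch of what such an API could look like. Everything here is 
hypothetical and not an existing DataFusion API: the `ExplainMemory` trait 
name, its method signatures, and the `SumAccumulator` and `GroupedAggregate` 
stand-in types are assumptions for illustration only.

```rust
use std::fmt::Write;

/// Hypothetical trait (not part of DataFusion today): reports the total
/// size and a human-readable breakdown of what holds the memory.
trait ExplainMemory {
    /// Total bytes held by this value, including its children.
    fn size(&self) -> usize;
    /// Append an indented breakdown of the memory usage to `out`.
    fn explain_memory(&self, indent: usize, out: &mut String);
}

/// Stand-in for the state of one accumulator.
struct SumAccumulator {
    values: Vec<i64>,
}

impl ExplainMemory for SumAccumulator {
    fn size(&self) -> usize {
        self.values.capacity() * std::mem::size_of::<i64>()
    }

    fn explain_memory(&self, indent: usize, out: &mut String) {
        // Writing to a String cannot fail, so the result is ignored.
        let _ = writeln!(out, "{:indent$}SumAccumulator: {} bytes", "", self.size(), indent = indent);
    }
}

/// Stand-in for an operator such as a grouped hash aggregate stream.
struct GroupedAggregate {
    group_values: Vec<u8>,
    accumulators: Vec<Box<dyn ExplainMemory>>,
}

impl ExplainMemory for GroupedAggregate {
    fn size(&self) -> usize {
        self.group_values.capacity()
            + self.accumulators.iter().map(|a| a.size()).sum::<usize>()
    }

    fn explain_memory(&self, indent: usize, out: &mut String) {
        let _ = writeln!(out, "{:indent$}GroupedAggregate: {} bytes", "", self.size(), indent = indent);
        let _ = writeln!(out, "{:indent$}group_values: {} bytes", "", self.group_values.capacity(), indent = indent + 2);
        // Delegate to each accumulator, one indentation level deeper.
        for acc in &self.accumulators {
            acc.explain_memory(indent + 2, out);
        }
    }
}

/// Build a small example operator with two accumulators.
fn sample() -> GroupedAggregate {
    let accumulators: Vec<Box<dyn ExplainMemory>> = vec![
        Box::new(SumAccumulator { values: Vec::with_capacity(128) }),
        Box::new(SumAccumulator { values: Vec::with_capacity(64) }),
    ];
    GroupedAggregate {
        group_values: Vec::with_capacity(1024),
        accumulators,
    }
}

fn main() {
    let agg = sample();
    let mut report = String::new();
    agg.explain_memory(0, &mut report);
    print!("{report}");
}
```

   With something like this, the place that currently returns 
`ResourcesExhausted` could attach the report to the error instead of requiring 
a local patch to find out what is holding the memory.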
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

