fmonjalet commented on issue #16841:
URL: https://github.com/apache/datafusion/issues/16841#issuecomment-3107483475

   > I would argue that when accounting anything that is not owned (like 
ArrayRef, ScalarValue::List(...), ...), the memory reserved should be equal to 
the bytes occupied by the reference itself (size_of::<Arc>() or potentially 
even 0) and not the size of referenced object.
   
   The tricky part is that there is no single place that properly owns the 
data: it's a shared ownership (`Arc`). Meaning that even if you only use one 
row, keeping the pointer alive keeps the whole Array alive.
   A good example may be if you have 1000 record batches of 1MiB each and 
retain one value per record batch as an aggregation key:
   - In theory you are only retaining 1000 rows, so not a lot of data
   - But in practice you are keeping 1GiB of data alive, which breaks the 
assumptions of the streaming model
   So this data has to be accounted for. In this example, 1000 values indeed 
retain 1GiB.
   
   But the issue is when there are multiple reference to the same Array 
retained. If you have 1 record batch of 1MiB and retain 1000 rows in it, 
without copying the values, the current memory accounting will count the same 
Array 1000 times, effectively counting 1GiB of used memory instead of 1MiB. 
This is the main memory over accounting we face when aggregating on arrays (as 
in: a column in which each row is an `Array<String>` for example).
   
   I had written a longer analysis there: 
https://github.com/apache/datafusion/issues/16055, although this particular 
instance was solved thanks to the new `compact()` API.
   
   The solution of copying can be less efficient in terms of CPU but is the 
best in terms of retained memory (it keeps the goodies of the streaming model 
even in nasty corner cases). A good trade-off would probably be to avoid double 
counting Arrow array buffers, but I am not sure how this could be integrated.
   
   Anyway, strong +1 for this issue as it's been a recurring problem that makes 
the (otherwise super great) memory tracking fail queries that should easily 
pass. Thanks @gabotechs!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to