pepijnve commented on PR #19287: URL: https://github.com/apache/datafusion/pull/19287#issuecomment-3670284256
One remaining question I have is how `skip_aggregation_probe` and `group_values_soft_limit` are expected to interact with the memory reservation system.

In `SkipAggregationProbe::update_state` the total number of input rows is accumulated, but only the current number of group values is compared against that total. If disk spilling or early emission is in effect, the ratio can only decrease: the numerator is capped by whatever is currently held in memory, while the denominator keeps increasing. In other words, the probe is unlikely to ever kick in.

The same goes for `group_values_soft_limit`. It is compared against the number of group values held in memory by the aggregation stream itself. If we keep flushing values, either down the pipeline or to disk, that limit is also unlikely to ever trigger.
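To make the concern concrete, here is a minimal sketch of the effect. It is a simplified, hypothetical model and not the actual DataFusion code: `ProbeModel`, `simulate`, and the `flush_at` parameter are names invented for illustration, and the model assumes every input row forms a new group and that a flush keeps only the most recent batch in memory.

```rust
/// Hypothetical, simplified model of a skip-aggregation probe.
/// It accumulates total input rows but only remembers the number of
/// groups currently held in memory.
struct ProbeModel {
    input_rows: usize,
    num_groups: usize,
}

impl ProbeModel {
    fn update_state(&mut self, batch_rows: usize, groups_in_memory: usize) {
        self.input_rows += batch_rows; // cumulative, never reset
        self.num_groups = groups_in_memory; // snapshot of in-memory state
    }

    fn ratio(&self) -> f64 {
        if self.input_rows == 0 {
            0.0
        } else {
            self.num_groups as f64 / self.input_rows as f64
        }
    }
}

/// Feeds `batches` batches of `batch_rows` all-distinct rows.
/// If `flush_at` is set and the in-memory group count would exceed it,
/// earlier groups are emitted or spilled and only the latest batch is kept.
/// Returns `(groups_in_memory, ratio)` after each batch.
fn simulate(batches: usize, batch_rows: usize, flush_at: Option<usize>) -> Vec<(usize, f64)> {
    let mut probe = ProbeModel { input_rows: 0, num_groups: 0 };
    let mut in_memory = 0usize;
    let mut out = Vec::with_capacity(batches);
    for _ in 0..batches {
        in_memory += batch_rows; // every row is a new group in this model
        if let Some(limit) = flush_at {
            if in_memory > limit {
                in_memory = batch_rows; // simulate early emission or spilling
            }
        }
        probe.update_state(batch_rows, in_memory);
        out.push((in_memory, probe.ratio()));
    }
    out
}

fn main() {
    println!("no flush: {:?}", simulate(4, 100, None));
    println!("flush:    {:?}", simulate(4, 100, Some(150)));
}
```

In this model the input is 100% distinct, so without flushing the ratio stays at 1.0. With flushing, the same input yields ratios of 1.0, 0.5, 0.33 and 0.25, and the in-memory group count never exceeds 100, so both a ratio threshold and a soft limit above that count would go unnoticed.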
