GitHub user alamb added a comment to the discussion: Best practices for memory-efficient deduplication of pre-sorted Parquet files
Yes, please. I actually did some testing today:

- https://github.com/apache/datafusion/issues/16899
- https://github.com/apache/datafusion/pull/16900

What I would expect in this case is an `AggregateExec` in the plan annotated with `ordering_mode=PartiallySorted([0])` (note that this is different from the `mode=Partial` annotation):

```sql
AggregateExec: mode=Partial, gby=[a@0 as a, b@1 as b], aggr=[count(Int64(1))], ordering_mode=PartiallySorted([0])
```

Perhaps you can double check the plan with `EXPLAIN FORMAT INDENT ..`, which produces a more detailed version of the explain output.

Thanks for sticking with this.

GitHub link: https://github.com/apache/datafusion/discussions/16776#discussioncomment-13881971
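
One way to double check a plan for this annotation is to scan the explain output text programmatically. Here is a minimal Python sketch, assuming the plan has already been captured as a plain string. The helper name `aggregate_ordering_modes` and the second sample plan line are hypothetical and not part of DataFusion; only the first sample line comes from the comment above.

```python
import re
from typing import List, Optional

# Matches annotations such as "ordering_mode=PartiallySorted([0])"
# or a bare mode name such as "ordering_mode=Sorted".
_ORDERING_MODE = re.compile(r"ordering_mode=(\w+(?:\(\[[\d,\s]*\]\))?)")


def aggregate_ordering_modes(plan_text: str) -> List[Optional[str]]:
    """Return the ordering_mode of every AggregateExec line in a plan.

    A line with no ordering_mode annotation yields None.
    """
    modes: List[Optional[str]] = []
    for line in plan_text.splitlines():
        if "AggregateExec:" not in line:
            continue
        match = _ORDERING_MODE.search(line)
        modes.append(match.group(1) if match else None)
    return modes


if __name__ == "__main__":
    sample_plan = "\n".join(
        [
            # Hypothetical line with no ordering_mode annotation.
            "AggregateExec: mode=FinalPartitioned, gby=[a@0 as a, b@1 as b], aggr=[count(Int64(1))]",
            # Line quoted from the comment above.
            "  AggregateExec: mode=Partial, gby=[a@0 as a, b@1 as b], aggr=[count(Int64(1))], ordering_mode=PartiallySorted([0])",
        ]
    )
    for mode in aggregate_ordering_modes(sample_plan):
        print(mode)
```

A `None` entry only means that the plan line carried no `ordering_mode` annotation; whether that indicates the input ordering is not being used should be confirmed against the full explain output.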