GitHub user alamb added a comment to the discussion: Best practices for 
memory-efficient deduplication of pre-sorted Parquet files

Yes, please. I actually did some testing today:
- https://github.com/apache/datafusion/issues/16899
- https://github.com/apache/datafusion/pull/16900

What I would expect in this case is to see an `AggregateExec` in the plan 
with the annotation `ordering_mode=PartiallySorted([0])` (note that this is 
different from the `mode=Partial` annotation):



```text
AggregateExec: mode=Partial, gby=[a@0 as a, b@1 as b], aggr=[count(Int64(1))], ordering_mode=PartiallySorted([0])
```

Perhaps you can double-check the explain plan with `EXPLAIN FORMAT INDENT ..`, 
which produces a more detailed version of the plan.
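
If it helps when comparing plans, here is a minimal sketch of a check you could run over the captured explain text. The helper `aggregate_ordering_modes` is hypothetical (it is not part of DataFusion), and it assumes each operator is printed on a single line, as in the indent-style plan:

```python
import re
from typing import List, Optional

# Hypothetical helper, not a DataFusion API: it only scans plan text.
# Matches e.g. "ordering_mode=PartiallySorted([0])" or "ordering_mode=Sorted".
ORDERING_RE = re.compile(r"ordering_mode=(\w+(?:\(\[[^\]]*\]\))?)")


def aggregate_ordering_modes(plan_text: str) -> List[Optional[str]]:
    """Return the ordering_mode annotation of each AggregateExec line.

    An AggregateExec line without the annotation yields None.
    Assumes one operator per line in the plan text.
    """
    modes: List[Optional[str]] = []
    for line in plan_text.splitlines():
        if "AggregateExec:" not in line:
            continue  # skip every other operator
        match = ORDERING_RE.search(line)
        modes.append(match.group(1) if match else None)
    return modes


if __name__ == "__main__":
    plan = (
        "AggregateExec: mode=Partial, gby=[a@0 as a, b@1 as b], "
        "aggr=[count(Int64(1))], ordering_mode=PartiallySorted([0])"
    )
    print(aggregate_ordering_modes(plan))
```

A `None` entry in the result means that `AggregateExec` line carried no `ordering_mode` annotation, which is the case worth looking at more closely.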

Thanks for sticking with this.


GitHub link: 
https://github.com/apache/datafusion/discussions/16776#discussioncomment-13881971

----
This is an automatically sent email for github@datafusion.apache.org.
To unsubscribe, please send an email to: 
github-unsubscr...@datafusion.apache.org