GitHub user zheniasigayev added a comment to the discussion: Best practices for 
memory-efficient deduplication of pre-sorted Parquet files

I created a GitHub issue summarizing the relevant details: `Streaming 
Aggregate operator not being used in deduplication of pre-sorted Parquet files` 
(#16919). @alamb, let me know what else I can do to help from my end.

GitHub link: 
https://github.com/apache/datafusion/discussions/16776#discussioncomment-13893368

----
This is an automatically sent email for github@datafusion.apache.org.
To unsubscribe, please send an email to: 
github-unsubscr...@datafusion.apache.org
