erratic-pattern commented on issue #10426: URL: https://github.com/apache/datafusion/issues/10426#issuecomment-2105361124
Are there any potential issues with simply using the existing `Hash` implementation of `Expr` to create `HashSet`s?

Several other optimization passes use string names as keys for expressions in data structures. I am wondering if any of these could also be refactored to simply use `HashSet<Expr>` or `HashSet<&Expr>`:

- synthetic group by expressions for aggregates: https://github.com/apache/datafusion/blob/accce9732e26723cab2ffc521edbf5a3fe7460b3/datafusion/expr/src/logical_plan/builder.rs#L1246-L1270
- functional dependencies, which heavily uses `display_name` to represent group by exprs: https://github.com/apache/datafusion/blob/main/datafusion/common/src/functional_dependencies.rs
- decorrelate: https://github.com/apache/datafusion/blob/accce9732e26723cab2ffc521edbf5a3fe7460b3/datafusion/optimizer/src/decorrelate.rs#L65
- push down filter for aggregates: https://github.com/apache/datafusion/blob/accce9732e26723cab2ffc521edbf5a3fe7460b3/datafusion/optimizer/src/push_down_filter.rs#L788-L837
- single distinct to group by: https://github.com/apache/datafusion/blob/accce9732e26723cab2ffc521edbf5a3fe7460b3/datafusion/optimizer/src/single_distinct_to_groupby.rs#L69-L96 and https://github.com/apache/datafusion/blob/accce9732e26723cab2ffc521edbf5a3fe7460b3/datafusion/optimizer/src/single_distinct_to_groupby.rs#L185
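To make the question concrete, here is a rough sketch of the two keying strategies side by side. It is only an illustration: the `Expr` enum below is a hypothetical, heavily simplified stand-in (not DataFusion's real `Expr`), its `display_name` method only imitates the idea of a string name, and `distinct_by_name` / `distinct_by_hash` are made-up helper names.

```rust
use std::collections::HashSet;

// Hypothetical, simplified stand-in for DataFusion's `Expr`.
// The real enum has many more variants; this one only exists for illustration.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Expr {
    Column(String),
    Literal(i64),
    BinaryExpr {
        left: Box<Expr>,
        op: char,
        right: Box<Expr>,
    },
}

impl Expr {
    // Imitates a string name used as a key (assumption: not the real implementation).
    fn display_name(&self) -> String {
        match self {
            Expr::Column(name) => name.clone(),
            Expr::Literal(value) => value.to_string(),
            Expr::BinaryExpr { left, op, right } => {
                format!("{} {} {}", left.display_name(), op, right.display_name())
            }
        }
    }
}

// String-keyed dedup: builds one `String` per expression just to use it as a key.
fn distinct_by_name(exprs: &[Expr]) -> Vec<&Expr> {
    let mut seen: HashSet<String> = HashSet::new();
    exprs
        .iter()
        .filter(|e| seen.insert(e.display_name()))
        .collect()
}

// Hash-keyed dedup: relies on the derived `Hash` + `Eq` and stores only references.
fn distinct_by_hash(exprs: &[Expr]) -> Vec<&Expr> {
    let mut seen: HashSet<&Expr> = HashSet::new();
    exprs.iter().filter(|e| seen.insert(*e)).collect()
}
```

In this toy model the two helpers agree whenever distinct expressions have distinct names. They can diverge when names collide: a column literally named `a + 1` and the binary expression `a + 1` share a name here but are different values under `Eq`. Whether anything comparable matters for the real `Expr` is exactly the kind of issue I am asking about.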