andygrove opened a new issue, #2456: URL: https://github.com/apache/arrow-datafusion/issues/2456
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

We often need to create a `DFField` to represent the output of an `Expr` in a schema. Today we typically derive it from an input schema. Examples:

- `fn to_field(&self, input_schema: &DFSchema)` in `ExprSchemable`
- `pub fn exprlist_to_fields<'a>(expr: impl IntoIterator<Item = &'a Expr>, plan: &LogicalPlan) -> Result<Vec<DFField>>`

This approach is problematic because the input schema loses a lot of information compared to the plan it represents. For example, the schema might contain a column named `substr(c0, 1, 2)` but not the column `c0`, making it impossible to reference `c0` later on. This case comes up during aggregates and is at least partly the cause of bugs such as https://github.com/apache/arrow-datafusion/issues/2430

**Describe the solution you'd like**

We should change the signatures of the functions above to accept an input plan instead of an input schema. This gives us more control over resolving expressions.

**Describe alternatives you've considered**

None

**Additional context**

None
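To make the information loss concrete, here is a minimal, self-contained sketch. It does not use DataFusion: `Schema`, `Plan`, `resolve_in_schema`, `resolve_in_plan`, and `example_plan` are hypothetical stand-ins invented for illustration, and the real `DFSchema` and `LogicalPlan` types are considerably richer. It only shows why a lookup against a projection's output schema cannot find `c0`, while a lookup that can walk the plan's inputs can.

```rust
// Toy model only: these types are hypothetical stand-ins, not DataFusion's
// `DFSchema`, `DFField`, or `LogicalPlan`.

/// A schema reduced to its output column names.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<String>,
}

/// A minimal logical plan: a table scan, or a projection over an input plan.
#[derive(Debug)]
enum Plan {
    Scan { columns: Vec<String> },
    Projection { exprs: Vec<String>, input: Box<Plan> },
}

impl Plan {
    /// The output schema keeps only the names this node produces.
    fn schema(&self) -> Schema {
        let fields = match self {
            Plan::Scan { columns } => columns.clone(),
            Plan::Projection { exprs, .. } => exprs.clone(),
        };
        Schema { fields }
    }
}

/// Schema-based lookup: sees only the top-level output names.
fn resolve_in_schema(schema: &Schema, column: &str) -> bool {
    schema.fields.iter().any(|f| f == column)
}

/// Plan-based lookup: falls back to the node's input when the column is not
/// among the node's own outputs.
fn resolve_in_plan(plan: &Plan, column: &str) -> bool {
    if resolve_in_schema(&plan.schema(), column) {
        return true;
    }
    match plan {
        Plan::Scan { .. } => false,
        Plan::Projection { input, .. } => resolve_in_plan(input, column),
    }
}

/// Builds the plan `Projection[substr(c0, 1, 2)]` over `Scan[c0, c1]`.
fn example_plan() -> Plan {
    Plan::Projection {
        exprs: vec!["substr(c0, 1, 2)".to_string()],
        input: Box::new(Plan::Scan {
            columns: vec!["c0".to_string(), "c1".to_string()],
        }),
    }
}

fn main() {
    let plan = example_plan();
    let schema = plan.schema();
    println!("schema sees c0: {}", resolve_in_schema(&schema, "c0"));
    println!("plan sees c0:   {}", resolve_in_plan(&plan, "c0"));
}
```

In this sketch the schema-based lookup for `c0` fails because the projection's output only carries the name `substr(c0, 1, 2)`, while the plan-based lookup succeeds by descending into the scan. How an actual plan-based `to_field` should resolve such references is a design question for the change proposed above.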