andygrove opened a new issue, #2456:
URL: https://github.com/apache/arrow-datafusion/issues/2456

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   We often need to create a `DFField` to represent the output of an `Expr` in 
a schema. We typically attempt to do this today based on an input schema.
   
   Examples:
   
   - `fn to_field(&self, input_schema: &DFSchema)` in `ExprSchemable`
   - `pub fn exprlist_to_fields<'a>(expr: impl IntoIterator<Item = &'a Expr>, 
plan: &LogicalPlan) -> Result<Vec<DFField>>`
   
   This approach is problematic because the input schema loses a lot of 
information compared to the plan it represents. For example, the schema might 
contain a column named `substr(c0, 1, 2)` but not the column `c0`, making it 
impossible to reference `c0` later on. This use case comes up during aggregates 
and is at least partly the cause of bugs such as 
https://github.com/apache/arrow-datafusion/issues/2430
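
   To make the information loss concrete, here is a minimal, self-contained 
sketch. It does not use DataFusion's real types: `ToySchema`, `ToyPlan`, 
`resolve_in_schema`, `resolve_in_plan` and `example_plan` are hypothetical names 
invented for illustration, and columns are modelled as plain strings. It only 
shows that a lookup against the output schema of a projection cannot find `c0`, 
while a lookup that is handed the plan can still walk down to the input.

```rust
/// Toy stand-in for a schema: only the output column names survive.
/// (Hypothetical type, not DataFusion's `DFSchema`.)
#[derive(Debug, Clone)]
struct ToySchema {
    fields: Vec<String>,
}

/// Toy stand-in for a logical plan. (Hypothetical, not `LogicalPlan`.)
#[derive(Debug, Clone)]
enum ToyPlan {
    /// Leaf node that produces the named columns.
    Scan { columns: Vec<String> },
    /// Projection whose output columns are named after its expressions.
    Projection { exprs: Vec<String>, input: Box<ToyPlan> },
}

impl ToyPlan {
    /// Output schema of this node alone; the input's columns are dropped.
    fn schema(&self) -> ToySchema {
        match self {
            ToyPlan::Scan { columns } => ToySchema { fields: columns.clone() },
            ToyPlan::Projection { exprs, .. } => ToySchema { fields: exprs.clone() },
        }
    }
}

/// Schema-based lookup: succeeds only if the name is an output column.
fn resolve_in_schema(schema: &ToySchema, column: &str) -> bool {
    schema.fields.iter().any(|f| f == column)
}

/// Plan-based lookup: checks this node's output, then recurses into inputs.
fn resolve_in_plan(plan: &ToyPlan, column: &str) -> bool {
    if resolve_in_schema(&plan.schema(), column) {
        return true;
    }
    match plan {
        ToyPlan::Scan { .. } => false,
        ToyPlan::Projection { input, .. } => resolve_in_plan(input, column),
    }
}

/// Projection of `substr(c0, 1, 2)` over a scan that produces `c0`.
fn example_plan() -> ToyPlan {
    ToyPlan::Projection {
        exprs: vec!["substr(c0, 1, 2)".to_string()],
        input: Box::new(ToyPlan::Scan {
            columns: vec!["c0".to_string()],
        }),
    }
}
```

   With `let plan = example_plan();`, the call 
`resolve_in_schema(&plan.schema(), "c0")` returns `false`, whereas 
`resolve_in_plan(&plan, "c0")` returns `true`. How a real implementation should 
use the extra information is a design question this sketch does not answer.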
   
   **Describe the solution you'd like**
   We should change the signatures of the functions above to accept an input 
plan instead of an input schema. This gives us more control over resolving 
expressions.
   
   **Describe alternatives you've considered**
   None
   
   **Additional context**
   None
   

