linhr commented on PR #18921:
URL: https://github.com/apache/datafusion/pull/18921#issuecomment-3668603319

   This is an exciting initiative!
   
   I skimmed through the PR, and I have some thoughts regarding how this can 
fit into DataFusion's existing setup. I wonder if we can have 
`Expr::LambdaFunction(LambdaFunction)`, similar to 
`Expr::ScalarFunction(ScalarFunction)`? Here are my reasoning:
   1. `LambdaFunction` can represent a "resolved" lambda function call in the 
logical plan. In contrast, `Expr::Lambda` and `Expr::LambdaColumn` are only 
fragments whose data types etc. are not well-defined which may be challenging 
to work with in other parts of the library (e.g. `ExprSchemable`).
   2. I feel `ScalarUDF` may not fit cleanly for lambda functions, so we can 
then have a separate abstraction (e.g. `LambdaUDF`) inside 
`Expr::LambdaFunction`. The `ArrayTransform` example shows that we would have 
to use `ScalarFunctionArgs` which was not originally designed for the lambda 
use case. (IMO when the function is actually "invoked" with Arrow arrays, the 
lambda parameter should have been resolved and removed from the argument list.)
   3. Once we have a self-contained logical expression, the entire query 
analysis flow might become easier to reason about (from `SqlToRel` to 
`Expr::LambdaFunction` to a dedicated `PhysicalExpr` and the corresponding 
physical planner).
   
   This is just my rough thoughts. Happy to discuss!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to