Re: [PR] DRAFT: Resolve function calls by name during planning [arrow-datafusion]

via GitHub Tue, 12 Dec 2023 12:39:23 -0800


2010YOUY01 commented on PR #8447:
URL: 
https://github.com/apache/arrow-datafusion/pull/8447#issuecomment-1852777952


   Thank you for taking this issue! @edmondop 
   
   Looks like `Expr` API is exposed in multiple places
   1. `DataFrame` API
   2.  `Expr` which can be directly evaluated
   3. Constructing `LogicalPlan` with `Expr`
   
   I found there are several issues for this rewrite to resolve approach:
   1. This can only make `call_fn()` backward compatible for `DataFrame` API, 
because only `DataFrame` API is created using context/`SessionState`, other 
APIs are created statelessly and can't access context(FunctionRegistry). The 
inconsistency might be confusing.
   2. Even for `DataFrame` API, rewrite approach seem also not easy to do: 
inside `DataFrame` it will construct `LogicalPlan`s before analyzing, which 
will validate the schema by checking the function return type. We might have to 
resolve function names before analyzing. 
https://github.com/apache/arrow-datafusion/blob/7f312c8a1d82b19f03c640e4a5d7d6437b2ee835/datafusion/expr/src/logical_plan/plan.rs#L1783
   
   Is it possible to just do `ctx.call_fn()` instead? I think this API change 
can make function `Expr` creation consistent, also easier to implement. @alamb 
   
   Reference: current `Expr` for UDFs are created like, `ctx.call_fn()` is a 
bit simpler and also can support both builtins and UDF during transition period
   ```rust
       let pow = df.registry().udf("pow")?;
       let expr1 = pow.call(vec![col("a"), col("b")]);
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] DRAFT: Resolve function calls by name during planning [arrow-datafusion]

Reply via email to