2010YOUY01 commented on PR #8447: URL: https://github.com/apache/arrow-datafusion/pull/8447#issuecomment-1852777952
Thank you for taking this issue! @edmondop Looks like `Expr` API is exposed in multiple places 1. `DataFrame` API 2. `Expr` which can be directly evaluated 3. Constructing `LogicalPlan` with `Expr` I found there are several issues for this rewrite to resolve approach: 1. This can only make `call_fn()` backward compatible for `DataFrame` API, because only `DataFrame` API is created using context/`SessionState`, other APIs are created statelessly and can't access context(FunctionRegistry). The inconsistency might be confusing. 2. Even for `DataFrame` API, rewrite approach seem also not easy to do: inside `DataFrame` it will construct `LogicalPlan`s before analyzing, which will validate the schema by checking the function return type. We might have to resolve function names before analyzing. https://github.com/apache/arrow-datafusion/blob/7f312c8a1d82b19f03c640e4a5d7d6437b2ee835/datafusion/expr/src/logical_plan/plan.rs#L1783 Is it possible to just do `ctx.call_fn()` instead? I think this API change can make function `Expr` creation consistent, also easier to implement. @alamb Reference: current `Expr` for UDFs are created like, `ctx.call_fn()` is a bit simpler and also can support both builtins and UDF during transition period ```rust let pow = df.registry().udf("pow")?; let expr1 = pow.call(vec![col("a"), col("b")]); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
