Jefffrey commented on PR #17085: URL: https://github.com/apache/datafusion/pull/17085#issuecomment-3379521826
> Not at all 😄 > > * `acc_args.schema.field(i)` — returns the raw Arrow `Field` from the (physical) input schema at position `i` (name, type, nullability, metadata exactly as in that schema). > > * `acc_args.exprs[i].return_field(&acc_args.schema)?` — asks the expression for the effective `FieldRef` for argument `i` given the full schema. It incorporates expression semantics (casts, literals, computed types, extension metadata, nullability changes, etc.) and returns an owned/Arc `FieldRef` (and can fail), not just a borrowed `&Field`. Thank you for the explanations, I'm still trying to wrap my head around all the parts involved here 😅 So I think my main confusion lies around the difference between the physical input schema, and the effective `FieldRef` argument; is there a reason we provide the ability to access both? This fix only synthesizes a schema if the physical schema is missing as you mentioned, but would it be incorrect behaviour to instead _always_ synthesize the schema from the physical schema (whether present or not)? If we look at scalar & window functions, I don't see them having equivalent logic around providing direct access to the physical schema, instead they provide methods to directly access `Field`s: - https://docs.rs/datafusion/50.0.0/datafusion/logical_expr/function/struct.ExpressionArgs.html - https://docs.rs/datafusion/50.0.0/datafusion/logical_expr/function/struct.PartitionEvaluatorArgs.html - https://docs.rs/datafusion/50.0.0/datafusion/logical_expr/struct.ScalarFunctionArgs.html I'm trying to understand why `AccumulatorArgs` seems to be the odd one out here; I'm sure there's some historical reason but the limited existing documentation on `AccumulatorArgs` makes it hard for me to reason that this fix is the correct approach 🤔 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
