alamb commented on issue #6735: URL: https://github.com/apache/arrow-rs/issues/6735#issuecomment-3029094259
> > Often computing the transformation may be non-trivial (e.g. matching columns by name) so it would be nice to do the mapping calculation once per schema rather than once per batch / StructArrayschema. For example DF's SchemaAdapter computes the mapping once and can then apply that to multiple batches.
>
> I'm not sure how this would happen in practice: there's no state in `UDFs` in DataFusion. So if we e.g. wanted to implement `cast(...)` in terms of a SchemaAdapter we have nowhere to store the pre-computed value. I think we'd have to introduce some sort of build step that goes around the expression tree and optimizes expressions for the given input / output schemas.

I wonder if we could use the new snapshot machinery 🤔 https://docs.rs/datafusion/latest/datafusion/physical_expr/trait.PhysicalExpr.html#method.snapshot
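
To make the "compute the mapping once per schema, apply it to many batches" idea concrete, here is a minimal sketch using only the Rust standard library. It does not use arrow-rs or DataFusion APIs: `Batch`, `ColumnMapping`, `new`, and `apply` are hypothetical names invented for illustration, columns are plain `Vec<i64>`, and filling a missing column with zeros is an assumption standing in for whatever null handling a real adapter would do.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a record batch: column names plus column data.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    names: Vec<String>,
    columns: Vec<Vec<i64>>,
}

/// Hypothetical precomputed mapping: for each target column, the index of
/// the matching source column, or `None` if the source does not have it.
struct ColumnMapping {
    target_names: Vec<String>,
    source_index: Vec<Option<usize>>,
}

impl ColumnMapping {
    /// The by-name matching, done once per (source, target) schema pair.
    fn new(source: &[&str], target: &[&str]) -> Self {
        let by_name: HashMap<&str, usize> =
            source.iter().enumerate().map(|(i, n)| (*n, i)).collect();
        ColumnMapping {
            target_names: target.iter().map(|n| n.to_string()).collect(),
            source_index: target.iter().map(|n| by_name.get(n).copied()).collect(),
        }
    }

    /// The per-batch step: pick columns by precomputed index, with no name
    /// lookups. Missing columns are filled with zeros (an assumption).
    fn apply(&self, batch: &Batch) -> Batch {
        let rows = batch.columns.first().map_or(0, |c| c.len());
        let columns = self
            .source_index
            .iter()
            .map(|idx| match idx {
                Some(i) => batch.columns[*i].clone(),
                None => vec![0; rows],
            })
            .collect();
        Batch {
            names: self.target_names.clone(),
            columns,
        }
    }
}
```

A caller would build one `ColumnMapping` per schema pair and call `apply` on each incoming batch. The open question in the quoted reply is where such a precomputed value would live when the cast is expressed as a stateless UDF.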