alamb opened a new issue, #14247: URL: https://github.com/apache/datafusion/issues/14247
### Is your feature request related to a problem or challenge?

It seems the design of Arrow extension types is nearing consensus and will arrive soon - https://github.com/apache/arrow-rs/pull/5822

The extension type information is encoded in an Arrow [`Field` (docs link)](https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html) (which has both a `DataType` and the metadata information).

In this world, I think supporting a user function for a user defined type (e.g. a geometry type) would look like:

1. Creating a user defined function and declaring in the signature that it takes `DataType::Binary`
2. Implementing the `return_type_from_args` function, which would then try to get the user defined type information from the Binary column and verify it was correct

However, since the `ReturnTypeInfo` only provides `DataType`, the `Field` information will not be present and thus UDF writers will not be able to access extension type information.

https://github.com/apache/datafusion/blob/274e5356ceb4c559ab4105478e75817a302d2f13/datafusion/expr/src/udf.rs#L359

### Describe the solution you'd like

Since we have not released `return_type_from_args` yet (it will be released in DataFusion 45), I would like to try to change the API before release to support user defined types.

### Describe alternatives you've considered

Specifically, I would like to pass in `Field` instead of `DataType` in `ReturnTypeArgs`.

So instead of

```rust
pub struct ReturnTypeArgs<'a> {
    /// The data types of the arguments to the function
    pub arg_types: &'a [DataType],
    /// ...
    pub scalar_arguments: &'a [Option<&'a ScalarValue>],
    /// Can argument `i` (ever) null?
    pub nullables: &'a [bool],
}
```

I think it would be better for it to be

```rust
pub struct ReturnTypeArgs<'a> {
    /// The schema fields of the arguments. Fields include DataType, nullability and other information.
    pub arg_fields: &'a [Field],
    /// ...
    pub scalar_arguments: &'a [Option<&'a ScalarValue>],
}
```

### Additional context

This was inspired by a comment from @milenkovicm on the DataFusion sync up call yesterday
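
To illustrate why having the `Field` matters, here is a rough, self-contained sketch of what a return-type check could look like if `arg_fields` were available. It is not DataFusion code: `DataType`, `Field` and `ReturnTypeArgs` are simplified stand-ins so the example compiles without any crates, and `geo_area_return_type`, `binary_field` and the `geoarrow.wkb` extension name are hypothetical and chosen only for illustration. The `ARROW:extension:name` metadata key is the one the Arrow format uses to name an extension type.

```rust
use std::collections::HashMap;

// Simplified stand-ins for `arrow::datatypes::{DataType, Field}` (assumption:
// the real types carry much more; only what the sketch needs is modelled).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Binary,
    Float64,
}

#[derive(Debug, Clone)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    pub metadata: HashMap<String, String>,
}

// Metadata key under which the Arrow format stores an extension type's name.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

// Simplified form of the proposed struct (scalar arguments omitted).
pub struct ReturnTypeArgs<'a> {
    pub arg_fields: &'a [Field],
}

// Hypothetical helper: build a Binary field, optionally tagged as an extension type.
pub fn binary_field(name: &str, extension: Option<&str>) -> Field {
    let mut metadata = HashMap::new();
    if let Some(ext) = extension {
        metadata.insert(EXTENSION_NAME_KEY.to_string(), ext.to_string());
    }
    Field {
        name: name.to_string(),
        data_type: DataType::Binary,
        metadata,
    }
}

// Hypothetical return-type check for a function `geo_area(geometry) -> Float64`.
// The extension name is only reachable because the whole `Field` is passed in;
// with `&[DataType]` alone, every Binary column would look the same.
pub fn geo_area_return_type(args: &ReturnTypeArgs) -> Result<DataType, String> {
    let [field] = args.arg_fields else {
        return Err(format!("expected 1 argument, got {}", args.arg_fields.len()));
    };
    if field.data_type != DataType::Binary {
        return Err(format!("argument '{}' must be Binary", field.name));
    }
    match field.metadata.get(EXTENSION_NAME_KEY).map(String::as_str) {
        Some("geoarrow.wkb") => Ok(DataType::Float64),
        other => Err(format!(
            "argument '{}' is not a geometry column (extension name: {other:?})",
            field.name
        )),
    }
}
```

With this shape, a plain `Binary` column and a geometry column stored as `Binary` can be told apart during planning, which is not possible when only `DataType` is passed.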
