paleolimbot commented on issue #12644: URL: https://github.com/apache/datafusion/issues/12644#issuecomment-3110153170
I think it's a matter of what "support" entails. With metadata support, it's certainly possible for metadata-based types like Arrow extension types to exist and for people to write functions that consume and produce them. There are also opportunities to support extension types in built-in DataFusion operations like signatures, casting, UNION ALL, pretty printing in the CLI, CSV output, SQL parsing/unparsing, and maybe a few I'm forgetting. It's possible for an engine based on DataFusion to implement all of those today, but DataFusion could also take responsibility for some or all of them.

Other engines (I'm thinking of DuckDB and Arrow C++) do this by creating an "extension" or "user defined" type object that defines the behaviour the internals need. In DuckDB this is just registering casts; in PyArrow it is pretty much just defining type equality and type display. The [vctrs package in R](https://vctrs.r-lib.org/articles/s3-vector.html) allows defining a few more things, like how arithmetic is done and how to concatenate an array with some other arbitrary type.

I'm happy to help with any of that, but I also feel like I'm a bad guesser with respect to how the DataFusion/arrow-rs community would like to handle extensibility 🙂
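As a rough illustration of the "extension type object" idea described above, here is a minimal Rust sketch that uses only the standard library. It is an assumption-laden toy, not a proposal for the actual design: `ExtensionType`, `ExtensionRegistry`, `ExampleUuid`, and the `example.uuid` name are all hypothetical and are not DataFusion or arrow-rs APIs. The only parts taken from Arrow itself are the two field metadata keys that the Arrow format uses to mark a field as an extension type.

```rust
use std::collections::HashMap;

// Field metadata keys the Arrow format uses to mark extension types.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";
const EXTENSION_METADATA_KEY: &str = "ARROW:extension:metadata";

/// Hypothetical trait: behaviour that engine internals might ask of an
/// extension type (here only display and equality, similar in spirit to
/// what the comment says PyArrow lets users define).
trait ExtensionType {
    /// Name stored under `ARROW:extension:name`.
    fn name(&self) -> &str;

    /// How the type should be shown, e.g. in CLI output.
    fn display(&self, metadata: Option<&str>) -> String;

    /// Whether two serialized metadata values describe the same type.
    /// Default assumption: plain comparison of the serialized metadata.
    fn type_equals(&self, a: Option<&str>, b: Option<&str>) -> bool {
        a == b
    }
}

/// Hypothetical example type with no parameters.
struct ExampleUuid;

impl ExtensionType for ExampleUuid {
    fn name(&self) -> &str {
        "example.uuid"
    }

    fn display(&self, _metadata: Option<&str>) -> String {
        "uuid".to_string()
    }
}

/// Hypothetical registry that internals could consult by extension name.
#[derive(Default)]
struct ExtensionRegistry {
    types: HashMap<String, Box<dyn ExtensionType>>,
}

impl ExtensionRegistry {
    fn register(&mut self, ty: Box<dyn ExtensionType>) {
        self.types.insert(ty.name().to_string(), ty);
    }

    /// Show a field's type: use the registered extension type if the field
    /// metadata names one, otherwise fall back to the storage type's display.
    fn display_field(&self, storage: &str, metadata: &HashMap<String, String>) -> String {
        let ext = metadata
            .get(EXTENSION_NAME_KEY)
            .and_then(|name| self.types.get(name.as_str()));
        match ext {
            Some(ty) => ty.display(metadata.get(EXTENSION_METADATA_KEY).map(String::as_str)),
            None => storage.to_string(),
        }
    }
}
```

With `ExampleUuid` registered, a field whose metadata names `example.uuid` would display as `uuid`, while a field with no (or an unregistered) extension name falls back to its storage type. Casting, UNION ALL coercion, or CSV output could in principle be further methods on such a trait, but which of those belong there is exactly the open question in this thread.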