paleolimbot commented on issue #12644:
URL: https://github.com/apache/datafusion/issues/12644#issuecomment-3110153170

   I think it's a matter of what "support" entails. With metadata support it's 
certainly possible for metadata-based types like Arrow extension types to exist 
and for people to write functions that consume and produce them. There are also 
opportunities to support extension types in built-in DataFusion operations like 
signatures, casting, UNION ALL, pretty printing in the CLI, CSV output, SQL 
parsing/unparsing, and maybe a few I'm forgetting.
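
   As a rough illustration of the metadata-based approach: the Arrow columnar format records an extension type's name in field metadata under the key `ARROW:extension:name`. The sketch below is only a stand-in under stated assumptions. It uses a simplified `Field` struct (not the arrow-rs `Field` type), and the `example.uuid` name and the `accepts_uuid` helper are hypothetical.

```rust
use std::collections::HashMap;

/// Metadata key the Arrow columnar format uses for an extension type's name.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

/// Simplified stand-in for an Arrow field (assumption: NOT the arrow-rs type).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    storage_type: String,
    metadata: HashMap<String, String>,
}

impl Field {
    fn new(name: &str, storage_type: &str) -> Self {
        Field {
            name: name.to_string(),
            storage_type: storage_type.to_string(),
            metadata: HashMap::new(),
        }
    }

    /// Tag the field as carrying the given extension type.
    fn with_extension(mut self, extension_name: &str) -> Self {
        self.metadata
            .insert(EXTENSION_NAME_KEY.to_string(), extension_name.to_string());
        self
    }

    /// Read the extension name back out of the metadata, if any.
    fn extension_name(&self) -> Option<&str> {
        self.metadata.get(EXTENSION_NAME_KEY).map(String::as_str)
    }
}

/// Hypothetical signature check: a function that only consumes fields
/// tagged with the (made-up) `example.uuid` extension type.
fn accepts_uuid(field: &Field) -> bool {
    field.extension_name() == Some("example.uuid")
}
```

   With this, `Field::new("id", "FixedSizeBinary(16)").with_extension("example.uuid")` passes `accepts_uuid`, while the same field without the metadata does not, even though the storage type is identical.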
   
   It's possible for an engine based on DataFusion to implement all of those 
today, but DataFusion could also take responsibility for some or all of them. 
Other engines (I'm thinking of DuckDB and Arrow C++) do this by creating an 
"extension" or "user defined" type object that defines the behaviour that 
internals need to use. In DuckDB this is just registering casts; in PyArrow 
this is pretty much just defining type equality and type display. The [vctrs 
package in R](https://vctrs.r-lib.org/articles/s3-vector.html) allows defining 
a few more things, like how arithmetic is done and how to concatenate an array 
with some other arbitrary type.
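
   To make the "type object" idea a bit more concrete, here is a minimal sketch of what such a hook could look like. Everything in it is hypothetical: the `ExtensionType` trait, the `Registry`, and the `example.uuid` type are invented for illustration and are not DataFusion, arrow-rs, DuckDB, or PyArrow APIs. It only covers the three behaviours mentioned above (type equality, type display, and allowed casts).

```rust
use std::collections::HashMap;

/// Hypothetical hook object that engine internals could consult.
trait ExtensionType {
    /// Unique extension name, e.g. the value stored in field metadata.
    fn name(&self) -> &str;

    /// Type display, e.g. for pretty printing in a CLI.
    fn display(&self) -> String {
        self.name().to_string()
    }

    /// Type equality; by default two extension types match if names match.
    fn equals(&self, other: &dyn ExtensionType) -> bool {
        self.name() == other.name()
    }

    /// Whether a cast from the named storage type should be allowed.
    fn can_cast_from(&self, storage_type: &str) -> bool;
}

/// Made-up example type.
struct Uuid;

impl ExtensionType for Uuid {
    fn name(&self) -> &str {
        "example.uuid"
    }

    fn display(&self) -> String {
        "uuid".to_string()
    }

    fn can_cast_from(&self, storage_type: &str) -> bool {
        storage_type == "FixedSizeBinary(16)" || storage_type == "Utf8"
    }
}

/// Hypothetical registry keyed by extension name.
#[derive(Default)]
struct Registry {
    types: HashMap<String, Box<dyn ExtensionType>>,
}

impl Registry {
    fn register(&mut self, ty: Box<dyn ExtensionType>) {
        self.types.insert(ty.name().to_string(), ty);
    }

    fn get(&self, name: &str) -> Option<&dyn ExtensionType> {
        self.types.get(name).map(|ty| ty.as_ref())
    }
}
```

   An engine would look up the name found in a field's metadata via `Registry::get` and fall back to plain storage-type behaviour when nothing is registered. Whether that lookup belongs in DataFusion itself or in the engine built on it is exactly the open question here.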
   
   I'm happy to help with any of that but also I feel like I'm a bad guesser 
with respect to how the DataFusion/arrow-rs community would like to handle 
extensibility 🙂 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

