ion-elgreco commented on issue #1227: URL: https://github.com/apache/datafusion-python/issues/1227#issuecomment-3303320257
I agree with removing pyarrow as a dependency. We have to consider that there are users out there who want to build small containers but still do data operations in them; pulling in something as heavyweight as pyarrow complicates that a lot.

In delta-rs I already did this some time ago in https://github.com/delta-io/delta-rs/pull/3459, which was very easy thanks to Arro3. (Thanks @kylebarron :D)

Now that I think about it, @kylebarron, a pointer might make more sense, since the goal is mainly to be interoperable with other libs. If you want to analyze the data, that can be done directly in datafusion.
