jonkeane commented on issue #45438: URL: https://github.com/apache/arrow/issues/45438#issuecomment-2675030143
@thisisnic is spot on there. There are options for this, but they all come with tradeoffs. If it's in Acero, it's (for now? likely?) gotta be C++. You could pass the data without serialization costs using the C data interface[^1] and hand it to DataFusion, Polars, or something else. UDFs are probably not what you want, since they operate row by row and so are not particularly efficient.

[^1]: This is what [`to_duckdb()`](https://arrow.apache.org/docs/r/reference/to_duckdb.html) does to pass the data to DuckDB, and it can integrate with dplyr-style pipelines via dbplyr, so you get the same ergonomics there.
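For what it's worth, here is a rough sketch of the `to_duckdb()` route in R. It assumes the duckdb and dbplyr packages are installed alongside arrow; `mtcars`, the `mpg_rank` column, and the ranking step are placeholders for whatever data and operation you actually want to hand off.

```r
library(arrow)
library(dplyr)

# Placeholder data: any Arrow Table or Dataset would work here.
tbl <- arrow_table(mtcars)

result <- tbl |>
  to_duckdb() |>                             # hand the data to DuckDB without serializing it
  group_by(cyl) |>
  mutate(mpg_rank = min_rank(desc(mpg))) |>  # dbplyr translates this to a SQL window function
  ungroup() |>
  to_arrow() |>                              # bring the result back as an Arrow RecordBatchReader
  collect()                                  # materialize it as a data frame
```

The pipeline stays in dplyr syntax throughout; only the engine that evaluates the middle steps changes.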
