Hi,

There is also a new Arrow C library, nanoarrow (one .h and one .c file), which makes it easier to use Arrow from the PostgreSQL codebase.
https://arrow.apache.org/blog/2023/03/07/nanoarrow-0.1.0-release/
https://github.com/apache/arrow-nanoarrow/tree/main/dist

Best regards,
Adam Lippai

On Thu, Apr 13, 2023 at 2:35 PM Adam Lippai <a...@rigo.sk> wrote:

> Hi,
>
> There are two bigger developments in this topic:
>
> 1. Pandas 2.0 is released and it can use Apache Arrow as a backend
> 2. Apache Arrow ADBC is released which standardizes the client API.
>    Currently it uses the postgresql wire protocol underneath
>
> Best regards,
> Adam Lippai
>
> On Thu, Apr 21, 2022 at 10:41 AM Adam Lippai <a...@rigo.sk> wrote:
>
>> Hi,
>>
>> would it be possible to add Apache Arrow streaming format to the copy
>> backend + frontend?
>> The use case is fetching (or storing) tens or hundreds of millions of
>> rows for client side data science purposes (Pandas, Apache Arrow compute
>> kernels, Parquet conversion etc). It looks like the serialization overhead
>> when using the postgresql wire format can be significant.
>>
>> Best regards,
>> Adam Lippai
>>