Re: Apache Arrow file format

2023-10-17 Thread wish maple
The Arrow IPC file format is great: it focuses on in-memory representation and direct computation. It supports compression and dictionary encoding, and a file can be deserialized zero-copy into the in-memory Arrow format. Parquet provides some strong functionality, like Statistics, which could help

Re: Apache Arrow file format

2023-10-17 Thread Adam Lippai
Also there is https://github.com/lancedb/lance between the two formats. Depending on the use case it can be a great choice. Best regards Adam Lippai On Tue, Oct 17, 2023 at 22:44 Matt Topol wrote: > One benefit of the feather format (i.e. Arrow IPC file format) is the > ability to mmap the

Re: Apache Arrow file format

2023-10-17 Thread Matt Topol
One benefit of the feather format (i.e. Arrow IPC file format) is the ability to mmap the file to easily handle reading sections of a larger than memory file of data. Since, as Felipe mentioned, the format is focused on in-memory representation, you can easily and simply mmap the file and use the

Re: Apache Arrow file format

2023-10-17 Thread Felipe Oliveira Carvalho
It’s not the best since the format is really focused on in-memory representation and direct computation, but you can do it: https://arrow.apache.org/docs/python/feather.html -- Felipe On Tue, 17 Oct 2023 at 23:26 Nara wrote: > Hi, > > Is it a good idea to use Apache Arrow as a file format?

Apache Arrow file format

2023-10-17 Thread Nara
Hi, Is it a good idea to use Apache Arrow as a file format? Looks like projecting columns isn't available by default. One of the benefits of the Parquet file format is column projection, where the IO is limited to just the columns projected. Regards, Nara

Re: Language-specific discussion (with C# example)

2023-10-17 Thread Felipe Oliveira Carvalho
The Zulip is https://ursalabs.zulipchat.com/ On Tue, Oct 17, 2023 at 9:55 PM Will Jones wrote: > Hi Curt, > > I think the most visible place for now would be creating an issue for > discussion. > > In the future, if you and some others want to have a place to discuss C# > development, you

Re: Language-specific discussion (with C# example)

2023-10-17 Thread Will Jones
Hi Curt, I think the most visible place for now would be creating an issue for discussion. In the future, if you and some others want to have a place to discuss C# development, you could create a channel in a chat app. For example, Arrow Rust has both a Slack channel in the official ASF Slack as

Language-specific discussion (with C# example)

2023-10-17 Thread Curt Hagenlocher
I'm curious what other (sub-) communities do about implementation-specific considerations that aren't directly tied to the Arrow standard. I don't see much of that kind of discussion on the dev list; does that mean these happen largely in the context of specific pull requests -- or perhaps not at

Re: [ANNOUNCE] New Arrow committer: Curt Hagenlocher

2023-10-17 Thread Joris Van den Bossche
Welcome to the team, Curt! On Mon, 16 Oct 2023 at 23:17, Curt Hagenlocher wrote: > > Thanks, all! > > On Mon, Oct 16, 2023 at 9:19 AM Dane Pitkin > wrote: > > > Congrats Curt! > > > > On Mon, Oct 16, 2023 at 12:00 PM Kevin Gurney > > > > wrote: > > > > > Congratulations, Curt! > > >