Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Dewey Dunnington
> This begs the question of what happens if a consumer receives an unknown flag value
That's a great point... I might be the only person who has implemented a deep copy of an ArrowSchema in C, but it does blindly pass along a schema's flag value (which in the scenario I proposed could lead to a
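The concern about forwarding `flags` blindly can be illustrated with a small sketch. The three flag constants are the ones defined by the Arrow C data interface. `SKETCH_KNOWN_FLAGS` and both helper functions are hypothetical names introduced here for illustration; they are not taken from nanoarrow or any other implementation. The sketch assumes a copier that wants to forward only the flags it understands. Whether to mask, reject, or pass through unknown bits is exactly the open question in this message.

```c
#include <stdint.h>

/* Flags defined by the Arrow C data interface. */
#define ARROW_FLAG_DICTIONARY_ORDERED 1
#define ARROW_FLAG_NULLABLE 2
#define ARROW_FLAG_MAP_KEYS_SORTED 4

/* Hypothetical: the set of flag bits this copier understands. */
#define SKETCH_KNOWN_FLAGS                                      \
  ((int64_t)(ARROW_FLAG_DICTIONARY_ORDERED | ARROW_FLAG_NULLABLE | \
             ARROW_FLAG_MAP_KEYS_SORTED))

/* Returns nonzero if any bit outside the known set is set. */
int sketch_has_unknown_flags(int64_t flags) {
  return (flags & ~SKETCH_KNOWN_FLAGS) != 0;
}

/* Returns the flags with every unknown bit cleared, so that a copy
 * does not advertise a capability the copier cannot guarantee. */
int64_t sketch_known_flags_only(int64_t flags) {
  return flags & SKETCH_KNOWN_FLAGS;
}
```

A deep copy could call `sketch_known_flags_only(src->flags)` when filling in the destination schema, or use `sketch_has_unknown_flags` to fail early instead.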

Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Dewey Dunnington
I'm afraid I've derailed the discussion into solving a bigger problem than strictly necessary. I don't think this is the time to solve the general problem of the C data interface having no way to communicate buffer sizes, particularly since there's no immediate agreement on its utility or implement

Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Antoine Pitrou
On 26/10/2023 at 20:02, Benjamin Kietzman wrote: Is this buffer lengths buffer only present if the array type is Utf8View? IIUC, the proposal would add the buffer lengths buffer for all types if the schema's flags include ARROW_FLAG_BUFFER_LENGTHS. I do find it appealing to avoid the specia

Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Benjamin Kietzman
> Is this buffer lengths buffer only present if the array type is Utf8View? IIUC, the proposal would add the buffer lengths buffer for all types if the schema's flags include ARROW_FLAG_BUFFER_LENGTHS. I do find it appealing to avoid the special case and that `n_buffers` would continue to be consi
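As a rough illustration of the proposal as described here, a consumer might look for the extra lengths buffer as follows. The `ArrowArray` layout is the one from the Arrow C data interface. Everything else is an assumption: `ARROW_FLAG_BUFFER_LENGTHS` is only a proposed flag in this thread and its value of 8 is made up, `sketch_buffer_lengths` is a hypothetical helper, and the sketch assumes the lengths are stored as one `int64_t` byte count per regular buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout from the Arrow C data interface. */
struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* HYPOTHETICAL: proposed in this thread only; the value is an assumption. */
#define ARROW_FLAG_BUFFER_LENGTHS 8

/* Returns a pointer to n_buffers byte counts stored one slot past the
 * regular buffers, or NULL when the producer did not advertise them. */
const int64_t* sketch_buffer_lengths(int64_t schema_flags,
                                     const struct ArrowArray* array) {
  if ((schema_flags & ARROW_FLAG_BUFFER_LENGTHS) == 0) {
    return NULL; /* buffers[n_buffers] must not be read in this case */
  }
  return (const int64_t*)array->buffers[array->n_buffers];
}
```

Note that `n_buffers` itself is unchanged in this reading, which is what keeps it consistent with the existing layout for consumers that ignore the flag.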

Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Weston Pace
Is this buffer lengths buffer only present if the array type is Utf8View? Or are you suggesting that other types might want to adopt this as well? On Thu, Oct 26, 2023 at 10:00 AM Dewey Dunnington wrote: > > I expect C code to not be much longer than this :-) > > nanoarrow's buffer-length-calcul

Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Antoine Pitrou
On 26/10/2023 at 18:59, Dewey Dunnington wrote: That sounds a bit hackish to me. Including only *some* buffer sizes in array->buffers[array->n_buffers] special-cased for only two types (or altering the number of buffers required by the IPC format vs. the number of buffers required by the

Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Dewey Dunnington
> I expect C code to not be much longer than this :-)
nanoarrow's buffer-length-calculation and validation concepts are (perhaps inadvisably) intertwined... even with both it is not that much code (perhaps I was remembering how much time it took me to figure out which 35 lines to write :-))
> That

Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Antoine Pitrou
On 26/10/2023 at 17:45, Dewey Dunnington wrote: The lack of buffer sizes is something that has come up for me a few times working with nanoarrow (which dedicates a significant amount of code to calculating buffer sizes, which it uses to do validation and more efficient copying). By the wa
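To give a sense of the calculation being discussed, here is a minimal sketch for a single case, a Utf8 array with 32-bit offsets. It is not nanoarrow's code. `SketchUtf8Sizes` and `sketch_utf8_buffer_sizes` are hypothetical names, and the sketch assumes the offsets buffer is non-NULL and holds at least `offset + length + 1` entries. The sizes are the minimum bytes a consumer needs to read, not the producer's allocation size (which may include padding).

```c
#include <stdint.h>

/* Hypothetical result type: minimum readable bytes per buffer. */
struct SketchUtf8Sizes {
  int64_t validity_bytes; /* only meaningful if a validity buffer is present */
  int64_t offsets_bytes;
  int64_t data_bytes;
};

/* Derives buffer sizes for a Utf8 array from its length, its offset,
 * and the contents of its int32 offsets buffer. */
struct SketchUtf8Sizes sketch_utf8_buffer_sizes(int64_t length, int64_t offset,
                                                const int32_t* offsets) {
  struct SketchUtf8Sizes sizes;
  int64_t n = offset + length; /* slots spanned, including the skipped prefix */
  sizes.validity_bytes = (n + 7) / 8; /* one bit per slot, rounded up */
  sizes.offsets_bytes = (n + 1) * (int64_t)sizeof(int32_t);
  sizes.data_bytes = offsets[n]; /* last offset marks the end of the data */
  return sizes;
}
```

The data size can only be found by reading the offsets buffer. A view type's variadic data buffers have no equivalent offsets to read, so their sizes cannot be recovered this way.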

Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Antoine Pitrou
On 26/10/2023 at 17:45, Dewey Dunnington wrote: > A potential alternative might be to allow any ArrowArray to declare > its buffer sizes in array->buffers[array->n_buffers], perhaps with a > new flag in schema->flags to advertise that capability. That sounds a bit hackish to me. I'd rather l

Re: [DISCUSS][Format] C data interface for Utf8View

2023-10-26 Thread Dewey Dunnington
Ben kindly explained to me offline that the need for the buffer sizes is because when Arrow C++ imports an Array it creates Buffer class wrappers around the imported pointers. Arrow C++ does not have a notion of a buffer of unknown size to my knowledge, which leaves two undesirable alternatives: (1

Re: [ANNOUNCE] New Arrow committer: Xuwei Fu

2023-10-26 Thread Benjamin Kietzman
Congratulations! On Thu, Oct 26, 2023 at 10:35 AM Dane Pitkin wrote: > Congratulations, Xuwei! > > On Thu, Oct 26, 2023 at 9:34 AM Joris Van den Bossche < > jorisvandenboss...@gmail.com> wrote: > > > Congrats! > > > > On Wed, 25 Oct 2023 at 08:23, Ian Joiner wrote: > > > > > > Congrats! > > > >

Re: [ANNOUNCE] New Arrow committer: Xuwei Fu

2023-10-26 Thread Dane Pitkin
Congratulations, Xuwei! On Thu, Oct 26, 2023 at 9:34 AM Joris Van den Bossche < jorisvandenboss...@gmail.com> wrote: > Congrats! > > On Wed, 25 Oct 2023 at 08:23, Ian Joiner wrote: > > > > Congrats! > > > > On Mon, Oct 23, 2023 at 2:33 AM Sutou Kouhei wrote: > > > > > On behalf of the Arrow PMC

Re: [ANNOUNCE] New Arrow committer: Xuwei Fu

2023-10-26 Thread Joris Van den Bossche
Congrats! On Wed, 25 Oct 2023 at 08:23, Ian Joiner wrote: > > Congrats! > > On Mon, Oct 23, 2023 at 2:33 AM Sutou Kouhei wrote: > > > On behalf of the Arrow PMC, I'm happy to announce that Xuwei Fu > > has accepted an invitation to become a committer on Apache > > Arrow. Welcome, and thank you f