Sure, I've added you to the JIRA contributors role (so you can self-assign in the future) and assigned the JIRA to you. Thanks for taking this on.
-David On Mon, Jul 26, 2021, at 11:06, Nate Bauernfeind wrote: > David, do you mind assigning to me? I'll get a PR up within the next two > days. The JIRA username is 'nbauernfeind'. > > On Mon, Jul 26, 2021 at 8:41 AM David Li <lidav...@apache.org> wrote: > > > I filed ARROW-13449 for this, thanks for the clarification. > > > > -David > > > > On Mon, Jul 26, 2021, at 10:25, Wes McKinney wrote: > > > IIRC, we did intend to encapsulate the Schema in an IPC message. > > > Updating the docs / comments in the Flight.proto is the right thing to > > > do. > > > > > > On Sun, Jul 25, 2021 at 12:47 PM Micah Kornfield <emkornfi...@gmail.com> > > wrote: > > > > > > > > I think if all reference implementations are doing this the same way we > > > > should update the docs and I don't think a vote is necessary. > > > > > > > > On Sun, Jul 25, 2021 at 10:27 AM David Li <lidav...@apache.org> wrote: > > > > > > > > > Hey Nate, > > > > > > > > > > Good catch. I would say it was intentional to use an IPC message > > (while > > > > > the length is redundant, it also contains the metadata version and > > custom > > > > > metadata), and the comment is a little vague. I'm not sure if we > > need a > > > > > vote to update this since it is changing files in the format dir. > > > > > > > > > > I did check the Java implementation quickly and even in the initial > > > > > version, the schema is IPC-encapsulated[1]. > > > > > > > > > > -David > > > > > > > > > > [1]: > > > > > > > https://github.com/apache/arrow/commit/b3cd616f6ce39e9a820b658ba2f2e2fd3b153555#diff-80e8aef2d6b475949641689fe8d686db2cc1d6703cafe8f6f0dc120b04f964b7R106 > > > > > > > > > > On Fri, Jul 23, 2021, at 20:28, Nate Bauernfeind wrote: > > > > > > In flight.proto [1] it states that the encoded bytes are as > > described in > > > > > > the flatbuffer schema. > > > > > > > > > > > > ``` > > > > > > /* > > > > > > * Wrap the result of a getSchema call > > > > > > */ > > > > > > message SchemaResult { > > > > > > // schema of the dataset as described in Schema.fbs::Schema. > > > > > > bytes schema = 1; > > > > > > } > > > > > > ``` > > > > > > > > > > > > However, both this schema and the schema on the flight info are > > actually > > > > > > encoded like an IPC file stream message. > > > > > > > > > > > > They start with a 4-byte IPC_CONTINUATION_TOKEN, followed by a > > 4-byte > > > > > > message size, followed by a Message wrapping a Schema. See [2] and > > [3] > > > > > for > > > > > > evidence in the Java client. > > > > > > > > > > > > Is this an accidental bug that has propagated to all client > > > > > implementations? > > > > > > Or was it intentional and I should submit a PR to update the > > comments? > > > > > > > > > > > > Thanks, > > > > > > Nate > > > > > > [1] > > https://github.com/apache/arrow/blob/master/format/Flight.proto#L195 > > > > > > [2] > > > > > > > > > > > > > https://github.com/apache/arrow/blob/master/java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightInfo.java#L141 > > > > > > [3] > > > > > > > > > > > > > https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/ipc/message/MessageSerializer.java#L161 > > > > > > -- > > > > > > > > > > > > > > > > > > > -- >