I believe you have to extend the ipc::MessageReader interface. Have you
looked at the details in
https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/client.cc#L425 ?
(There is analogous code handling the Put side in server.cc.) The idea is
that you feed the stream of IPC messages and the dictionary
accounting/record batch reconstruction is handled internally.

On Thu, Feb 18, 2021 at 12:14 PM Dawson D'Almeida <[email protected]> wrote:

> Hi Wes,
>
> We have our own implementation of something like Flight for flexibility of
> use.
>
> The main thing that I am trying to figure out is how to get the dictionary
> record batches properly deserialized on the server side. On the client
> side, I can deserialize them properly using the DictionaryMemo directly
> from the record batch we create, but on the other side I do not have access
> to the same DictionaryMemo. How is this passed in Flight? I have been
> trying to find this in the source code but haven't yet.
>
> Thanks,
> Dawson
>
> On Fri, Feb 12, 2021 at 3:34 PM Wes McKinney <[email protected]> wrote:
>
>> hi Dawson, you need to follow the IPC stream protocol, e.g. what
>> RecordBatchStreamWriter or RecordBatchStreamReader are doing
>> internally. Is there a reason you cannot use these interfaces
>> (particularly their internal bits, which are also used to implement
>> Flight where messages are split across different elements of a gRPC
>> stream)?
>>
>> I'm not sure that I would advise you to deal with dictionary
>> disassembly and reconstruction on your own unless it's your only
>> option. That said, if you look in the unit test suite you should be
>> able to find examples of where DictionaryBatch IPC messages are
>> reconstructed manually, and then used to reconstitute a RecordBatch
>> IPC message using the arrow::ipc::ReadRecordBatch API. We can try to
>> help you look in the right place, let us know.
>>
>> Thanks,
>> Wes
>>
>> On Fri, Feb 12, 2021 at 2:58 PM Dawson D'Almeida
>> <[email protected]> wrote:
>> >
>> > I am trying to create a record batch containing any number of
>> > dictionary and/or normal Arrow arrays, serialize the record batch into
>> > bytes (a normal std::string), and send it via gRPC to another server
>> > process. On that end we receive the Arrow bytes and deserialize using
>> > the bytes and the schema.
>> >
>> > Is there a standard way to serialize/deserialize these dictionary
>> > arrays? It seems like all of the info is packaged correctly into the
>> > record batch.
>> >
>> > I've looked through a lot of the C++ Apache Arrow source and test code
>> > but I can't find how to approach our use case.
>> >
>> > The current failure is:
>> > Field with memory address 140283497044320 not found
>> > from the returned status from arrow::ipc::ReadRecordBatch
>> >
>> > Thanks,
>> > --
>> > Dawson d'Almeida
>> > Software Engineer
>> >
>> > MOBILE +1 360 499 1852
>> > EMAIL [email protected]
>> >
>> >
>> > Snowflake Inc.
>> > 227 Bellevue Way NE
>> > Bellevue, WA, 98004
>> >
>
> --
> Dawson d'Almeida
> Software Engineer
>
> MOBILE +1 360 499 1852
> EMAIL [email protected] <[email protected]>
>
>
> Snowflake Inc.
> 227 Bellevue Way NE
> Bellevue, WA, 98004
>
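
To make the "feed the stream of IPC messages" idea from the top reply concrete, here is a toy C++17 model of the receiving side. It is only a sketch and does not use Arrow at all: SchemaMsg, DictionaryMsg, BatchMsg and ToyStreamDecoder are hypothetical names invented for illustration, and real IPC messages carry far more than this. What it tries to show is the accounting itself: the receiver builds up its own dictionary memo from the dictionary messages that arrive before a record batch, so a batch decoded without them fails.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the three message kinds in a stream.
// These are NOT Arrow types.
struct SchemaMsg {
  std::vector<std::int64_t> dictionary_ids;  // one dictionary id per column
};
struct DictionaryMsg {
  std::int64_t id;
  std::vector<std::string> values;
};
struct BatchMsg {
  std::vector<std::vector<std::int32_t>> indices;  // indices[column][row]
};
using Message = std::variant<SchemaMsg, DictionaryMsg, BatchMsg>;
using DecodedBatch = std::vector<std::vector<std::string>>;

// Receiver-side state: the schema plus a memo of dictionaries seen so far.
class ToyStreamDecoder {
 public:
  // Feed one message. Returns decoded columns only for a BatchMsg.
  std::optional<DecodedBatch> Consume(const Message& msg) {
    if (const auto* schema = std::get_if<SchemaMsg>(&msg)) {
      schema_ = *schema;
      return std::nullopt;
    }
    if (const auto* dict = std::get_if<DictionaryMsg>(&msg)) {
      memo_[dict->id] = dict->values;  // remember (or replace) by id
      return std::nullopt;
    }
    const auto& batch = std::get<BatchMsg>(msg);
    if (!schema_) {
      throw std::runtime_error("record batch received before schema");
    }
    if (batch.indices.size() != schema_->dictionary_ids.size()) {
      throw std::runtime_error("column count does not match schema");
    }
    DecodedBatch out;
    for (std::size_t col = 0; col < batch.indices.size(); ++col) {
      auto it = memo_.find(schema_->dictionary_ids[col]);
      if (it == memo_.end()) {
        // The receiver never saw this dictionary, so it cannot decode.
        throw std::runtime_error("dictionary not found for column");
      }
      std::vector<std::string> column;
      for (std::int32_t idx : batch.indices[col]) {
        column.push_back(it->second.at(static_cast<std::size_t>(idx)));
      }
      out.push_back(std::move(column));
    }
    return out;
  }

 private:
  std::optional<SchemaMsg> schema_;
  std::map<std::int64_t, std::vector<std::string>> memo_;
};
```

In this model a sender would emit a SchemaMsg, then a DictionaryMsg for each dictionary id, then the BatchMsg, and the receiver calls Consume on each in that order. In Arrow itself this role is played by the stream interfaces named earlier in the thread (RecordBatchStreamWriter and RecordBatchStreamReader), which is why sending only the record batch bytes plus the schema leaves the server without the dictionaries.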
