Hi all,

OmniSci (formerly MapD) has been a long-time user of Arrow for IPC serialization and memory sharing of query results, primarily through our Python connector. We recently upgraded from Arrow 0.13 to Arrow 0.16, which required us to change our Arrow conversion routines to handle the new DictionaryMemo for serializing dictionaries.

For CPU, this was fairly easy: I was able to write the record batch stream using `arrow::ipc::WriteRecordBatchStream` and read it using `RecordBatchStreamReader` on the client. For GPU/CUDA, however, I did not see a way to serialize the dictionary alongside the CUDA data and wrap both in a single "object" (the semantics of which probably need to be broken down, which I will do below). So I came up with our own approach:

https://github.com/omnisci/omniscidb/blob/4ab6622bd0ee15e478bff4263f083ab761fc965c/QueryEngine/ArrowResultSetConverter.cpp#L219

Essentially, I assemble a RecordBatch containing the dictionaries I want to serialize and call WriteRecordBatchStream to serialize it into a CPU IPC stream, which I copy to CPU shared memory. I then serialize the GPU record batch into a CudaBuffer using SerializeRecordBatch. The CudaBuffer is exported for IPC sharing, and I send both memory handles (CPU and GPU) to the client. The client then has to read the RecordBatch containing the dictionaries and place them into a DictionaryMemo, which is used to read the record batches from the GPU. The process of building the DictionaryMemo on the client is here:

https://github.com/omnisci/omniscidb/blob/master/Tests/ArrowIpcIntegrationTest.cpp#L380

This seems to work OK, at least for C++, but I am interested in making it more compact and possibly contributing some or all of it to mainline Arrow. Therefore, I have two questions:

1) Does this look like a reasonable way to handle a serialized RecordBatch in CUDA (that is, separate the dictionaries and return two objects, or a single object holding two handles)?

2) Is this something that the Arrow community would be interested in seeing contributed, in whatever form we agree upon for (1)?

Thanks,
Alex
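
P.S. To make the "single object holding two handles" option in (1) a bit more concrete, here is a rough sketch of what such a descriptor might look like. It uses only the C++ standard library and deliberately makes no Arrow or CUDA calls. Every name in it (GpuResultHandles, EncodeHandles, DecodeHandles, the field names, and the byte layout) is hypothetical; it is not what our converter or Arrow does today, just one possible shape for discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical descriptor pairing the two handles; not an Arrow or OmniSciDB type.
struct GpuResultHandles {
  std::string dict_shm_key;                   // CPU shared memory segment holding the dictionary IPC stream
  std::int64_t dict_stream_size = 0;          // size of that stream in bytes
  std::vector<std::uint8_t> cuda_ipc_handle;  // opaque bytes of the exported CUDA IPC handle
  std::int32_t device_id = 0;                 // GPU that owns the serialized record batch
};

namespace detail {

// Appends a fixed-width value in native byte order (assumes both
// processes run on the same host, as with CUDA IPC).
template <typename T>
void PutFixed(std::vector<std::uint8_t>& out, T value) {
  std::uint8_t raw[sizeof(T)];
  std::memcpy(raw, &value, sizeof(T));
  out.insert(out.end(), raw, raw + sizeof(T));
}

// Reads a fixed-width value and advances pos; throws on truncated input.
template <typename T>
T GetFixed(const std::vector<std::uint8_t>& in, std::size_t& pos) {
  if (in.size() - pos < sizeof(T)) throw std::runtime_error("truncated descriptor");
  T value;
  std::memcpy(&value, in.data() + pos, sizeof(T));
  pos += sizeof(T);
  return value;
}

// Appends a 32-bit length prefix followed by the raw bytes.
inline void PutBytes(std::vector<std::uint8_t>& out, const std::uint8_t* data, std::size_t size) {
  if (size > std::numeric_limits<std::uint32_t>::max()) throw std::runtime_error("field too large");
  PutFixed<std::uint32_t>(out, static_cast<std::uint32_t>(size));
  out.insert(out.end(), data, data + size);
}

// Reads a length-prefixed byte field and advances pos.
inline std::vector<std::uint8_t> GetBytes(const std::vector<std::uint8_t>& in, std::size_t& pos) {
  const std::uint32_t size = GetFixed<std::uint32_t>(in, pos);
  if (in.size() - pos < size) throw std::runtime_error("truncated descriptor");
  std::vector<std::uint8_t> bytes(in.begin() + pos, in.begin() + pos + size);
  pos += size;
  return bytes;
}

}  // namespace detail

// Packs both handles into one flat buffer that the server can send to the client.
std::vector<std::uint8_t> EncodeHandles(const GpuResultHandles& h) {
  assert(h.dict_stream_size >= 0);
  std::vector<std::uint8_t> out;
  detail::PutBytes(out, reinterpret_cast<const std::uint8_t*>(h.dict_shm_key.data()),
                   h.dict_shm_key.size());
  detail::PutFixed<std::int64_t>(out, h.dict_stream_size);
  detail::PutBytes(out, h.cuda_ipc_handle.data(), h.cuda_ipc_handle.size());
  detail::PutFixed<std::int32_t>(out, h.device_id);
  return out;
}

// Unpacks the buffer on the client; rejects truncated or oversized input.
GpuResultHandles DecodeHandles(const std::vector<std::uint8_t>& in) {
  std::size_t pos = 0;
  GpuResultHandles h;
  const std::vector<std::uint8_t> key = detail::GetBytes(in, pos);
  h.dict_shm_key.assign(key.begin(), key.end());
  h.dict_stream_size = detail::GetFixed<std::int64_t>(in, pos);
  h.cuda_ipc_handle = detail::GetBytes(in, pos);
  h.device_id = detail::GetFixed<std::int32_t>(in, pos);
  if (pos != in.size()) throw std::runtime_error("trailing bytes in descriptor");
  return h;
}
```

With something like this, the client would decode the descriptor, read the dictionary stream from the named shared memory segment into a DictionaryMemo, and then open the CUDA IPC handle to read the GPU record batch. Whether the two handles should travel together like this, or stay as two separate objects, is exactly the open question in (1).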