The easiest approach is probably to write each batch to some chunk of shared memory using an IPC stream writer. This will introduce two copies: first, a copy from the C++ RecordBatch to the buffer, and second, a copy from the buffer to a C# RecordBatch.
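To make the buffer handoff a bit more concrete, here is a rough sketch of what the C++ side of the DLL boundary could look like. It is not taken from the linked example, and it deliberately leaves Arrow out: `SerializedBatch` is a stand-in for the bytes an IPC stream writer would produce for one RecordBatch, and `EnqueueSerialized`, `DequeueSerializedBatch` and `FreeSerializedBatch` are hypothetical names invented for illustration. The point is only the ownership pattern: the DLL allocates an unmanaged buffer, returns a pointer and a length through out-parameters, and exposes a matching free function.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <queue>
#include <vector>

// Stand-in for the bytes an Arrow IPC stream writer would produce for one
// RecordBatch (assumption: the real DLL would fill this from Arrow).
using SerializedBatch = std::vector<std::uint8_t>;

// Hypothetical queue of already-serialized batches.
static std::queue<SerializedBatch> g_serialized;

// Hypothetical helper: queue one serialized payload.
void EnqueueSerialized(const SerializedBatch& bytes) {
  g_serialized.push(bytes);
}

// Hypothetical export: copy the next payload into a malloc'd buffer and
// return it through out-parameters, so the caller actually sees the result.
// Returns 0 on success, 1 if the queue is empty, 2 if allocation fails.
extern "C" int DequeueSerializedBatch(std::uint8_t** out_data,
                                      std::int64_t* out_size) {
  assert(out_data != nullptr && out_size != nullptr);
  *out_data = nullptr;
  *out_size = 0;
  if (g_serialized.empty()) {
    return 1;
  }
  const SerializedBatch& front = g_serialized.front();
  const std::size_t n = front.size();
  // Allocate at least one byte so an empty payload still yields a valid pointer.
  auto* buf = static_cast<std::uint8_t*>(std::malloc(n == 0 ? 1 : n));
  if (buf == nullptr) {
    return 2;
  }
  if (n != 0) {
    std::memcpy(buf, front.data(), n);
  }
  *out_data = buf;
  *out_size = static_cast<std::int64_t>(n);
  g_serialized.pop();
  return 0;
}

// Hypothetical export: the caller must hand the buffer back to the DLL so it
// is released by the same allocator that created it.
extern "C" void FreeSerializedBatch(std::uint8_t* data) {
  std::free(data);
}
```

On the C# side, I would expect this to be called through P/Invoke, with the returned pointer and length wrapped in something like an UnmanagedMemoryStream and read back with the Arrow C# stream reader, followed by a call to the free function once the batch has been read.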
One way to do this is by allocating an unmanaged buffer and sharing the data through it. I have created an extremely barebones example of this here (https://github.com/westonpace/csharp-cpp-arrow-interop-example/tree/main). Alternatively, you could create a memory mapped file and share the data that way. I tested both approaches, and the unmanaged buffer and the memory mapped file introduce about the same amount of overhead.

If you really wanted to avoid the copies then it would theoretically be possible by basically "rewrapping" the buffers (similar to the way the Python lib works). However, to the best of my knowledge, the Arrow C# library does not have that capability today, so you would need to build it yourself. At the end of the day you'd end up with a ton of unmanaged objects, and I would guess the tedium of dealing with that is going to outweigh any performance benefit.

On Tue, Jun 1, 2021 at 12:51 AM Bjoern Bachmann <[email protected]> wrote:
>
> Hello Arrow users,
>
> I would need some help to understand how I can pass an RecordBatch or a table
> created in a C++ build DLL into a C# application? How could this be handled
> efficiently without much data copies?
>
> Kind Regards,
> Bjoern Bachmann.
>
> C++ sample code which is exported in the DLL:
>
> std::queue<std::shared_ptr<arrow::RecordBatch>> wfmQueue;
>
> ARROW_EXAM_API long EnqueueWaveformChunks()
> {
>     //En-Queue Loop
>     std::cout << "En-Queue Data ... \n";
>     for (int i = 0; i < chunk_size; i++) {
>         createWfmSrc(col0_data, col1_data);
>         WaveformChunkWriterHelper testWfm{ col0_data, col1_data };
>         //creates schema and adds the data
>         testWfm.createRecBatch();
>         wfmQueue.push(testWfm.getRecBatch());
>     }
>     return S_OK;
> }
>
> ARROW_EXAM_API long DequeueSingleWaveformChunk(arrow::RecordBatch* recBatch)
> {
>     std::shared_ptr<arrow::RecordBatch> queueItem;
>     if (!wfmQueue.empty()) {
>         queueItem = wfmQueue.front();
>         wfmQueue.pop();
>         std::cout << "Get RecBatch from Queue\n";
>
>         recBatch = queueItem.get();
>         //ExportRecordBatch(*queueItem, c_array);
>         //recBatch = reinterpret_cast<void*>(queueItem.get());
>     }
>     return S_OK;
> }
>
