The easiest approach is probably to write each batch to a chunk of
shared memory using an IPC stream writer.  This introduces two copies:
first from the C++ RecordBatch to the buffer, and second from the
buffer to a C# RecordBatch.

One way to do this is to allocate an unmanaged buffer and share the
serialized batch through it.  I have created an extremely barebones
example of this here
(https://github.com/westonpace/csharp-cpp-arrow-interop-example/tree/main).
Alternatively, you could create a memory-mapped file and share the
batch that way.  I tested both approaches, and the unmanaged buffer and
the memory-mapped file introduce about the same amount of overhead.
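
A rough sketch of what the C++ side of the unmanaged-buffer approach
could look like is below.  It is not the code from the linked example.
To keep it free of Arrow dependencies, the bytes an IPC stream writer
would produce are represented by a plain byte vector, and the names
EnqueueSerializedBatch, PeekNextBatchSize and DequeueBatchInto are
hypothetical.  The idea is that the caller asks for the size, allocates
an unmanaged buffer of that size, and then has the DLL copy the
serialized batch into it.

```cpp
#include <cstdint>
#include <cstring>
#include <queue>
#include <vector>

// Stand-in for the bytes an Arrow IPC stream writer would produce for one
// RecordBatch (assumption: serialization happens before enqueueing).
using SerializedBatch = std::vector<std::uint8_t>;

// Hypothetical queue of serialized batches owned by the DLL.
static std::queue<SerializedBatch> g_serialized;

extern "C" {

// Hypothetical helper so the sketch can be exercised without Arrow.
void EnqueueSerializedBatch(const std::uint8_t* data, std::int64_t size) {
  g_serialized.emplace(data, data + size);
}

// Returns the size in bytes of the next batch, or -1 if the queue is empty.
std::int64_t PeekNextBatchSize() {
  if (g_serialized.empty()) return -1;
  return static_cast<std::int64_t>(g_serialized.front().size());
}

// Copies the next batch into a caller-allocated buffer and removes it from
// the queue.  Returns the number of bytes written, or -1 if the queue is
// empty or the buffer is missing or too small (the batch then stays queued).
std::int64_t DequeueBatchInto(std::uint8_t* dest, std::int64_t capacity) {
  if (g_serialized.empty()) return -1;
  const SerializedBatch& front = g_serialized.front();
  const auto size = static_cast<std::int64_t>(front.size());
  if (dest == nullptr || capacity < size) return -1;
  if (size > 0) std::memcpy(dest, front.data(), front.size());
  g_serialized.pop();
  return size;
}

}  // extern "C"
```

On the C# side the caller would presumably allocate the buffer with
something like Marshal.AllocHGlobal, pass the pointer to
DequeueBatchInto, and read the bytes back through a stream handed to
ArrowStreamReader.  That second read is where the second copy happens.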

If you really wanted to avoid the copies then it would theoretically be
possible by "rewrapping" the buffers (similar to the way the Python lib
works).  However, to the best of my knowledge, the Arrow C# library
does not have this capability today, so you would need to build it
yourself.  At the end of the day you'd end up with a ton of unmanaged
objects, and I would guess the tedium of dealing with them is going to
outweigh any performance benefit.
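
To make the bookkeeping concrete, here is a minimal sketch of what
"rewrapping" a single buffer might involve on the C++ side.  It does
not use any Arrow API: NativeBuffer stands in for one Arrow buffer, and
BufferHandle, BufferView, ExportBuffer and ReleaseBuffer are
hypothetical names.  The DLL hands out a raw pointer plus an opaque
handle that keeps the owning shared_ptr alive, and the caller must
release that handle explicitly.  A real RecordBatch would need this for
every buffer of every column.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Stand-in for one Arrow buffer holding int32 values (assumption).
using NativeBuffer = std::vector<std::int32_t>;

// Opaque handle: its only job is to keep the native buffer alive.
struct BufferHandle {
  std::shared_ptr<NativeBuffer> owner;
};

// Plain struct that can cross the DLL boundary without copying the data.
struct BufferView {
  const std::int32_t* data;  // points into memory owned by the native side
  std::int64_t length;       // number of elements
  BufferHandle* handle;      // must be passed to ReleaseBuffer exactly once
};

// Wraps an existing buffer without copying it.  The new handle holds an
// extra reference, so the data outlives the caller's own shared_ptr.
BufferView ExportBuffer(const std::shared_ptr<NativeBuffer>& buffer) {
  auto* handle = new BufferHandle{buffer};
  return BufferView{buffer->data(),
                    static_cast<std::int64_t>(buffer->size()), handle};
}

// Drops the reference taken by ExportBuffer.  After this call the view's
// data pointer must no longer be used.
extern "C" void ReleaseBuffer(BufferHandle* handle) { delete handle; }
```

Handing out only the raw pointer, without something like the handle
above, would leave the caller with a dangling pointer as soon as the
last shared_ptr on the C++ side goes out of scope.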

On Tue, Jun 1, 2021 at 12:51 AM Bjoern Bachmann <[email protected]> wrote:
>
> Hello Arrow users,
>
> I would need some help to understand how I can pass a RecordBatch or a table
> created in a C++-built DLL into a C# application. How could this be handled
> efficiently, without many data copies?
>
>
>
> Kind Regards,
> Bjoern Bachmann.
>
>
> C++ sample code which is exported in the DLL:
>
> std::queue<std::shared_ptr<arrow::RecordBatch>> wfmQueue;
>
> ARROW_EXAM_API long EnqueueWaveformChunks()
>     {
>         //En-Queue Loop
>         std::cout << "En-Queue Data ... \n";
>         for (int i = 0; i < chunk_size; i++) {
>             createWfmSrc(col0_data, col1_data);
>             WaveformChunkWriterHelper testWfm{ col0_data, col1_data };  //creates schema and adds the data
>             testWfm.createRecBatch();
>             wfmQueue.push(testWfm.getRecBatch());
>         }
>         return S_OK;
>     }
>
>     ARROW_EXAM_API long DequeueSingleWaveformChunk(arrow::RecordBatch* recBatch)
>     {
>         std::shared_ptr<arrow::RecordBatch> queueItem;
>         if (!wfmQueue.empty()) {
>             queueItem = wfmQueue.front();
>             wfmQueue.pop();
>             std::cout << "Get RecBatch from Queue\n";
>
>             recBatch = queueItem.get();
>             //ExportRecordBatch(*queueItem, c_array);
>             //recBatch = reinterpret_cast<void*>(queueItem.get());
>         }
>         return S_OK;
>     }
>
