Wes McKinney created ARROW-5377:
-----------------------------------

             Summary: [C++] Develop interface for writing a RecordBatch IPC 
stream into pre-allocated space (e.g. memory map) that avoids unnecessary 
serialization
                 Key: ARROW-5377
                 URL: https://issues.apache.org/jira/browse/ARROW-5377
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Wes McKinney


As discussed in a recent mailing list thread:

https://lists.apache.org/thread.html/b756209052fecb8c28a5eb37db7aecb82a5f5351fa79a9d86f0dba3e@%3Cuser.arrow.apache.org%3E

At the moment, the only viable way to get an accurate report of the stream 
size is to write a simulated stream to a {{MockOutputStream}} and then write 
the real stream in a second pass. This is suboptimal for a couple of reasons:

* Flatbuffers metadata must be created twice (once per pass)
* Record batch disassembly into IpcPayload must be performed twice

It seems like an interface with a very constrained public API could be provided 
to deconstruct a sequence of RecordBatches once and report the size of the 
resulting IPC stream (based on metadata sizes and padding). The same 
deconstructed set of IPC payloads could then be written out to a stream (e.g. 
using {{FixedSizeBufferWriter}}) without being serialized a second time.
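
To illustrate the two-phase idea, here is a minimal, self-contained sketch. It 
does not use Arrow's API: {{Payload}}, {{StreamSize}} and {{WritePayloads}} are 
hypothetical names invented for this example, and the framing (a 4-byte length 
prefix plus 8-byte padding) is a simplifying assumption rather than the exact 
IPC format. The point is only that the payloads are built once, measured, and 
then copied into pre-allocated space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an already-disassembled record batch.
// This is NOT Arrow's IpcPayload; it only mimics its rough shape.
struct Payload {
  std::vector<uint8_t> metadata;              // serialized metadata bytes
  std::vector<std::vector<uint8_t>> buffers;  // body buffers
};

// Assumed alignment for this sketch.
constexpr std::size_t kAlignment = 8;

inline std::size_t PaddedSize(std::size_t n) {
  return (n + kAlignment - 1) / kAlignment * kAlignment;
}

// Size of one framed message: length prefix plus metadata (padded together),
// followed by each body buffer padded to the alignment.
std::size_t PayloadSize(const Payload& p) {
  std::size_t size = PaddedSize(sizeof(int32_t) + p.metadata.size());
  for (const auto& b : p.buffers) size += PaddedSize(b.size());
  return size;
}

// Phase 1: report the total stream size without serializing anything again.
std::size_t StreamSize(const std::vector<Payload>& payloads) {
  std::size_t total = 0;
  for (const auto& p : payloads) total += PayloadSize(p);
  return total;
}

// Phase 2: copy the same payloads into pre-allocated space (for example a
// memory map). Returns the number of bytes written.
std::size_t WritePayloads(const std::vector<Payload>& payloads, uint8_t* dest,
                          std::size_t capacity) {
  assert(StreamSize(payloads) <= capacity);
  (void)capacity;  // silence unused warning when asserts are disabled
  std::size_t pos = 0;
  // Copies n bytes, then zero-fills up to the padded length.
  auto write_padded = [&](const uint8_t* data, std::size_t n,
                          std::size_t padded) {
    if (n > 0) std::memcpy(dest + pos, data, n);
    std::memset(dest + pos + n, 0, padded - n);
    pos += padded;
  };
  for (const auto& p : payloads) {
    const std::size_t framed = PaddedSize(sizeof(int32_t) + p.metadata.size());
    // The prefix stores the padded metadata length, excluding the prefix.
    const int32_t meta_len = static_cast<int32_t>(framed - sizeof(int32_t));
    std::memcpy(dest + pos, &meta_len, sizeof(meta_len));
    pos += sizeof(meta_len);
    write_padded(p.metadata.data(), p.metadata.size(),
                 framed - sizeof(int32_t));
    for (const auto& b : p.buffers) {
      write_padded(b.data(), b.size(), PaddedSize(b.size()));
    }
  }
  return pos;
}
```

A caller would build the payloads once, call {{StreamSize}} to allocate or map 
a region of exactly that size, and then pass the same payloads to 
{{WritePayloads}}. In a real implementation the payloads would come from the 
existing record batch disassembly code, and the destination could be wrapped in 
a {{FixedSizeBufferWriter}}.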



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
