Two questions about Plasma; my use case is sharing Arrow data between a
C++ and Python application (eventually also R).
1. What's the typical memory allocation procedure when using Plasma and
Arrow? Do I first construct a builder, populate it, finish it, and
*then* copy it into an mmapped buffer? Or do I obtain an mmapped buffer from
Plasma first, in which the builder operates incrementally until it's
full? If I understand it correctly, a Plasma buffer has a fixed size,
so I wonder how you accommodate the fact that the Arrow builder
constructs record batches incrementally, while at the same time
avoiding extra copying of large memory chunks after finishing the
builder.
2. Do I need Plasma to exchange the mmapped buffers between the two
apps? Or could I mmap my Arrow data manually and tell pyarrow through
a different mechanism to obtain the shared buffer?
Matthias
- General questions about Arrow & Plasma Matthias Vallentin