I think you might be a bit confused about what zero copy means, if that’s what 
you’re concerned about. If you have a bigger-than-memory file, Plasma wasn’t 
going to help anyway, since its design always involved copying the Arrow 
buffers into memory.

If you have larger-than-memory Arrow files in the first place, just open them 
using mmap (this should happen automatically for non-compressed Arrow files).

--
-Dan Nugent
On Jan 26, 2021, 13:07 -0500, Thomas Browne <[email protected]>, wrote:
> don't I lose the benefit of mmapping huge files with a ramdisk? Cos the file 
> has to now fit on my ramdisk.
>
> Personally working with financial tick data which can be enormous.
> On 26/01/2021 18:00, Daniel Nugent wrote:
> > Is there a problem with just using a RAM disk as the method for sharing the 
> > arrow buffers? It just seems easier and less finicky than a separate API to 
> > program against.
> >
> > It also makes storing the data permanently a lot more straightforward, I 
> > think.
> >
> > --
> > -Dan Nugent
> > On Jan 26, 2021, 12:47 -0500, Thomas Browne <[email protected]>, wrote:
> > > So one of the big advantages of Arrow is the common format in memory, on
> > > the wire, across languages.
> > >
> > > I get that this makes it very easy and fast to transfer data between
> > > nodes, and between languages, which will all share the in-memory format
> > > and therefore the (often expensive) serialisation step is removed.
> > >
> > > However, is it true that one of the core objectives of the project is
> > > also to allow shared memory objects across different languages on the
> > > same node? For example, a fast C-based ingest system constantly
> > > populates a pyarrow buffer, which can be read directly by any other
> > > application on that node, through pointer sharing?
> > >
> > > If this is a core objective, what is the canonical way for brokering the
> > > "pointers" to this data between languages? Is it the Plasma store? And
> > > if so, are there plans for Plasma to be implemented in other client
> > > languages?
> > >
> > > In short. Is Plasma (or if not Plasma, the functionality it provides
> > > implemented some other way), a core objective of the project?
> > >
> > > Or instead is Flight supposed to be used between languages on the same
> > > node, and if so, does Flight provide true zero-copy (ie - the same
> > > buffer, not copying the buffer) if run between processes on the same node?
> > >
> > > Many thanks.
