> From: Wes McKinney [mailto:w...@cloudera.com]
> Sent: Thursday, March 17, 2016 6:03 AM
> To: dev@arrow.apache.org
> Subject: Re: Understanding "shared" memory implications
>
> On Wed, Mar 16, 2016 at 2:33 PM, Jacques Nadeau <jacq...@apache.org> wrote:
>
Sent: March 18, 2016 6:51 AM
To: dev@arrow.apache.org
Subject: Re: Understanding "shared" memory implications
hi Kai,
This sounds like it might merit a separate thread to discuss the growth of
Arrow as a modular ecosystem of libraries in different programming languages
and related to
I've been under the impression that exposing memory to be shared directly
and not copied WAS, in fact, the responsibility of Arrow. In fact, I read
this in [1], and this is what turned me on to Arrow in the first place.
[1]
It has always been the expectation that no system would be required to
use a particular piece of Arrow software to "use Arrow" (hence the
importance of having a well-defined specification for memory and
metadata). However, we should also not expect all systems to create
their own implementations
To: dev@arrow.apache.org
Subject: Re: Understanding "shared" memory implications
On Wed, Mar 16, 2016 at 2:33 PM, Jacques Nadeau <jacq...@apache.org> wrote:
>
> For Arrow, let's make sure that we do our best to accomplish both (1)
> and (2). They seem like entirely comp
I have similar concerns as Todd stated below. With an mmap-based approach,
we are treating shared memory objects like files. This brings in all
filesystem related considerations like ACL and lifecycle mgmt.
Stepping back a little, the shared-memory work isn't really specific to
Arrow. A few
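To make the "shared memory objects as files" concern concrete, here is a minimal sketch (not from the thread; the class and helper names are made up for illustration) of the mmap-based approach in Java: a producer maps a file, writes into it, and a consumer process would map the same path. On Linux the file would typically live under /dev/shm/; a temp file is used here so the sketch is portable.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ShmDemo {
    // Hypothetical helper: map a file (e.g. /dev/shm/<name> on Linux) so its
    // pages can be shared across processes. The mapping stays valid after the
    // channel is closed.
    static MappedByteBuffer mapShared(Path path, int size) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for /dev/shm/<name>; a real producer would pick a
        // well-known path that consumers can open too.
        Path p = Files.createTempFile("arrow-shm-", ".buf");
        try {
            MappedByteBuffer writer = mapShared(p, 64);
            writer.putInt(0, 42);      // producer writes into the mapping
            writer.force();            // flush to the backing file

            // A consumer (here, the same process) maps the same path and
            // sees the data without any copy over a socket.
            MappedByteBuffer reader = mapShared(p, 64);
            System.out.println(reader.getInt(0));  // prints 42
        } finally {
            // This is exactly the filesystem-flavored question raised above:
            // someone has to own deleting the file.
            Files.delete(p);
        }
    }
}
```

Note that nothing in the sketch addresses ACLs or what happens if the producer dies before the `finally` runs, which is the point being made in the message above.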
I always thought Arrow was just an in-memory format, and it is the
responsibility of whoever else wants to use it to carry those
responsibilities out, because depending on workloads, different frameworks
might pick very different applications. Otherwise it seems to be doing too
much and having
@Todd: agree entirely on prototyping design. My goal is to throw out some
ideas and some POC code and then we can explore from there.
My main thoughts have initially been around lifecycle management. I've done
some work previously where a consistently sized shared buffer using mmap
has improved
On Tue, Mar 15, 2016 at 5:54 PM, Jacques Nadeau wrote:
> How do others feel of my redefinition of IPC to mean the same memory space
> communication (either via shared memory or rdma) versus RPC as socket based
> communication?
>
IPC already has a strong definition which is
Having thought about this quite a bit in the past, I think the mechanics of
how to share memory are by far the easiest part. The much harder part is
the resource management and ownership. Questions like:
- if you are using an mmapped file in /dev/shm/, how do you make sure it
gets cleaned up if
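One partial answer to the cleanup question on Linux, sketched here as an assumption rather than anything the thread settled on (the class name is made up): the producer can unlink the /dev/shm file as soon as every consumer holds a mapping. The kernel keeps the pages alive until the last mapping is dropped, so a later crash cannot strand a stale file. Again a temp file stands in for /dev/shm so the sketch is portable.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ShmLifecycle {
    public static void main(String[] args) throws IOException {
        // Stand-in for a file the producer would create under /dev/shm/.
        Path name = Files.createTempFile("arrow-shm-", ".buf");
        FileChannel ch = FileChannel.open(name,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);

        // Unlink the name once all consumers have mapped it. On Linux the
        // pages survive until the last mapping goes away, so nothing is
        // leaked in /dev/shm even if a process dies after this point.
        Files.delete(name);

        buf.putLong(0, 7L);                   // the mapping is still usable
        System.out.println(buf.getLong(0));   // prints 7
        ch.close();
    }
}
```

The trade-off is that late-arriving consumers can no longer open the buffer by name, which is why this only covers part of the ownership problem raised above.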
@Corey
The POC Steven and Wes are working on is based on MappedBuffer but I'm
looking at using netty's fork of tcnative to use shared memory directly.
@Yiannis
We need to have both RPC and a shared memory mechanism (what I'm inclined
to call IPC, though it is really a specific kind of IPC). The idea is we
I was seeing Netty's unsafe classes being used here, not MappedByteBuffer.
Not sure if that statement is completely correct, but I'll have to dig
through the code again to figure that out.
The more I was looking at unsafe, it makes sense why that would be used.
Apparently it's also supposed to be
Hi Wes,
can you please clarify something I don't understand? Will the next versions
of Arrow include the shared memory control flow as well?
So then, what is needed for HBase (for instance) to be integrated is the
adapter to the arrow format?
If yes, then who will be responsible for keeping the