Re: Understanding "shared" memory implications

2016-03-19 Thread Wes McKinney
l Message- > From: Wes McKinney [mailto:w...@cloudera.com] > Sent: Thursday, March 17, 2016 6:03 AM > To: dev@arrow.apache.org > Subject: Re: Understanding "shared" memory implications > > On Wed, Mar 16, 2016 at 2:33 PM, Jacques Nadeau <jacq...@apache.org> wrote: >

RE: Understanding "shared" memory implications

2016-03-19 Thread Zheng, Kai
, March 18, 2016 6:51 AM To: dev@arrow.apache.org Subject: Re: Understanding "shared" memory implications hi Kai, This sounds like it might merit a separate thread to discuss the growth of Arrow as a modular ecosystem of libraries in different programming languages and related to

Re: Understanding "shared" memory implications

2016-03-19 Thread Corey Nolet
I've been under the impression that exposing memory to be shared directly and not copied WAS, in fact, the responsibility of Arrow. In fact, I read this in [1] and this is what turned me on to Arrow in the first place. [1]

Re: Understanding "shared" memory implications

2016-03-19 Thread Wes McKinney
It has always been the expectation that no system would be required to use a particular piece of Arrow software to "use Arrow" (hence the importance of having a well-defined specification for memory and metadata). However, we should also not expect all systems to create their own implementations

RE: Understanding "shared" memory implications

2016-03-19 Thread Zheng, Kai
To: dev@arrow.apache.org Subject: Re: Understanding "shared" memory implications On Wed, Mar 16, 2016 at 2:33 PM, Jacques Nadeau <jacq...@apache.org> wrote: > > For Arrow, let's make sure that we do our best to accomplish both (1) > and (2). They seem like entirely comp

Re: Understanding "shared" memory implications

2016-03-19 Thread Zhe Zhang
I have similar concerns as Todd stated below. With an mmap-based approach, we are treating shared memory objects like files. This brings in all filesystem related considerations like ACL and lifecycle mgmt. Stepping back a little, the shared-memory work isn't really specific to Arrow. A few

Re: Understanding "shared" memory implications

2016-03-19 Thread Reynold Xin
I always thought Arrow was just an in-memory format, and that it is the responsibility of whoever else wants to use it to carry those responsibilities out, because depending on workloads, different frameworks might pick very different applications. Otherwise it seems to be doing too much and having

Re: Understanding "shared" memory implications

2016-03-19 Thread Jacques Nadeau
@Todd: agree entirely on prototyping design. My goal is to throw out some ideas and some POC code and then we can explore from there. My main thoughts have initially been around lifecycle management. I've done some work previously where a consistently sized shared buffer using mmap has improved

Re: Understanding "shared" memory implications

2016-03-18 Thread Ted Dunning
On Tue, Mar 15, 2016 at 5:54 PM, Jacques Nadeau wrote: > How do others feel about my redefinition of IPC to mean the same memory space > communication (either via shared memory or rdma) versus RPC as socket based > communication? > IPC already has a strong definition which is

Re: Understanding "shared" memory implications

2016-03-15 Thread Todd Lipcon
Having thought about this quite a bit in the past, I think the mechanics of how to share memory are by far the easiest part. The much harder part is the resource management and ownership. Questions like: - if you are using an mmapped file in /dev/shm/, how do you make sure it gets cleaned up if

Re: Understanding "shared" memory implications

2016-03-15 Thread Jacques Nadeau
@Corey The POC Steven and Wes are working on is based on MappedBuffer but I'm looking at using netty's fork of tcnative to use shared memory directly. @Yiannis We need to have both RPC and a shared memory mechanism (what I'm inclined to call IPC but is a specific kind of IPC). The idea is we

Re: Understanding "shared" memory implications

2016-03-15 Thread Corey Nolet
I was seeing Netty's unsafe classes being used here, not mapped byte buffer. I'm not sure if that statement is completely correct, but I'll have to dig through the code again to figure that out. The more I was looking at unsafe, it makes sense why that would be used. Apparently it's also supposed to be

Re: Understanding "shared" memory implications

2016-03-15 Thread Yiannis Gkoufas
Hi Wes, can you please clarify something I don't understand? Will the next versions of Arrow include the shared memory control flow as well? So then, is what is needed for HBase (for instance) to be integrated the adapter to the Arrow format? If yes, then who will be responsible for keeping the