Philipp Moritz created ARROW-1163:
-------------------------------------

             Summary: [Plasma] Java client for Plasma
                 Key: ARROW-1163
                 URL: https://issues.apache.org/jira/browse/ARROW-1163
             Project: Apache Arrow
          Issue Type: New Feature
            Reporter: Philipp Moritz


We should start thinking about what a Java client for Plasma would look like.
Given Arrow's focus on supporting Python, C++ and Java really well, Java is the
next important target after Python and C++.

My preliminary thoughts are the following: we can either go with JNI and wrap
the C++ client or (preferable, in my opinion) write a pure Java client. The
latter would communicate with the Plasma store via Java flatbuffers over sockets.

It seems that the only thing blocking a pure Java client at the moment is the 
way we ship file descriptors for the memory mapped files between store and 
client (see the file fling.cc in the Plasma repo). We would need to get rid of
that because there is no pure Java API that allows transferring file
descriptors over a process boundary. The way to share memory mapped files
across process boundaries is then probably to go through the file system: keep
the memory mapped files there instead of unlinking them immediately (as we do
at the moment), so the client process can open them via their path.
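
To make the path-based idea concrete, here is a minimal sketch of what the
client side could look like in pure Java, using only java.nio. It is not an
existing Plasma API: the class name, the method names and the file name prefix
are made up for illustration, and the "store" and the "client" are simulated
in one process by mapping the same file twice.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not part of Plasma.
class MappedFileSketch {
    // Map an existing file by its path. The mapping stays valid after the
    // channel is closed, so no file descriptor has to be kept or passed around.
    static MappedByteBuffer mapByPath(Path path, long size) {
        try (FileChannel ch = FileChannel.open(
                path, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates store and client in one process: both map the same path,
    // the "store" writes a byte and the "client" reads it back.
    static int demo() {
        try {
            Path path = Files.createTempFile("plasma-sketch-", ".mmap");
            try {
                Files.write(path, new byte[16]); // give the file a size
                MappedByteBuffer storeView = mapByPath(path, 16);
                MappedByteBuffer clientView = mapByPath(path, 16);
                storeView.put(0, (byte) 42);
                return clientView.get(0);
            } finally {
                // Best-effort removal; may fail on platforms that refuse to
                // delete a file while it is still mapped.
                path.toFile().delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("client read: " + demo());
    }
}
```

In a real client the path and the size would have to come from the store (for
example in the flatbuffers reply) instead of being created locally. Note also
that the Java documentation leaves the visibility of writes between separate
mappings of the same file platform dependent, although shared mappings behave
coherently on the common operating systems.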

The challenge in this case is how to clean the files up and make sure they are
not left lying around if the Plasma store crashes. One option is to store the
Plasma store PID with the file (e.g. as part of the file name) and let the
Plasma store clean them up the next time it is started; maybe there is OS level
support for temporary files we can reuse.
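
The PID-in-file-name option could be sketched roughly as follows. This is only
an illustration under assumptions: the naming scheme "plasma-<pid>-<id>.mmap",
the class and all method names are hypothetical, and it relies on
ProcessHandle, which requires Java 9 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of Plasma.
class StaleFileSketch {
    // Assumed naming scheme: plasma-<pid>-<id>.mmap
    private static final Pattern NAME =
        Pattern.compile("plasma-(\\d{1,18})-\\d+\\.mmap");

    static String fileName(long pid, long id) {
        return "plasma-" + pid + "-" + id + ".mmap";
    }

    // Extracts the owning store's PID, or empty if the name does not match.
    static OptionalLong ownerPid(String fileName) {
        Matcher m = NAME.matcher(fileName);
        return m.matches()
            ? OptionalLong.of(Long.parseLong(m.group(1)))
            : OptionalLong.empty();
    }

    // A file is considered stale if it follows the naming scheme and no
    // process with the recorded PID is currently running.
    static boolean isStale(String fileName) {
        OptionalLong pid = ownerPid(fileName);
        return pid.isPresent() && !ProcessHandle.of(pid.getAsLong()).isPresent();
    }

    // Would be run by the store at startup; returns the number of files removed.
    static int removeStaleFiles(Path dir) {
        int removed = 0;
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(dir, "plasma-*.mmap")) {
            for (Path f : files) {
                if (isStale(f.getFileName().toString()) && Files.deleteIfExists(f)) {
                    removed++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return removed;
    }

    public static void main(String[] args) {
        String mine = fileName(ProcessHandle.current().pid(), 0);
        System.out.println(mine + " stale? " + isStale(mine));
    }
}
```

One caveat with this approach is PID reuse: if the operating system hands the
crashed store's PID to an unrelated process, the file is never recognised as
stale, so a real implementation would probably need an additional check.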

I probably won't get to this for a while, so if anybody needs this or has free 
cycles, they should feel free to chime in. Also opinions on the design are 
appreciated!

-- Philipp.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
