SparkNet may have some interesting ideas - 
Haven't had a deep look at it yet but it seems to have some functionality 
allowing caffe to read data from RDDs, though I'm not certain the memory is 


On Mon, Dec 7, 2015 at 9:55 PM, Robin East <> wrote:

> Hi Annabel
> I certainly did read your post. My point was that Spark can read from HDFS 
> but is in no way tied to that storage layer. A very interesting use case 
> that sounds very similar to Jia's (as mentioned by another poster) is 
> contained in The comments 
> section provides a specific example of processing very large images using a 
> pre-existing C++ library.
> Robin
>> On 7 Dec 2015, at 18:50, Annabel Melongo <> 
>> wrote:
>> Jia,
>> I'm so confused on this. The architecture of Spark is to run on top of HDFS. 
>> What you're requesting, reading and writing to a C++ process, is not part of 
>> that requirement.
>> On Monday, December 7, 2015 1:42 PM, Jia <> wrote:
>> Thanks, Annabel, but I may need to clarify that I have no intention to write 
>> and run Spark UDFs in C++; I'm just wondering whether Spark can read and 
>> write data to a C++ process with zero copy.
>> Best Regards,
>> Jia
>>> On Dec 7, 2015, at 12:26 PM, Annabel Melongo <> 
>>> wrote:
>>> My guess is that Jia wants to run C++ on top of Spark. If that's the case, 
>>> I'm afraid this is not possible. Spark has support for Java, Python, Scala 
>>> and R.
>>> The best way to achieve this is to run your application in C++ and use the 
>>> data created by said application to do manipulation within Spark.
>>> On Monday, December 7, 2015 1:15 PM, Jia <> wrote:
>>> Thanks, Dewful!
>>> My impression is that Tachyon is a very nice in-memory file system that can 
>>> connect to multiple storages.
>>> However, because our data is also held in memory, I suspect that connecting 
>>> to Spark directly may be more efficient in performance.
>>> But definitely I need to look at Tachyon more carefully, in case it has a 
>>> very efficient C++ binding mechanism.
>>> Best Regards,
>>> Jia
>>>> On Dec 7, 2015, at 11:46 AM, Dewful <> wrote:
>>>> Maybe looking into something like Tachyon would help, I see some sample 
>>>> c++ bindings, not sure how much of the current functionality they 
>>>> support...
>>>> Hi, Robin, 
>>>> Thanks for your reply and thanks for copying my question to user mailing 
>>>> list.
>>>> Yes, we have a distributed C++ application, that will store data on each 
>>>> node in the cluster, and we hope to leverage Spark to do more fancy 
>>>> analytics on those data. But we need high performance, that’s why we want 
>>>> shared memory.
>>>> Suggestions will be highly appreciated!
>>>> Best Regards,
>>>> Jia
>>>>> On Dec 7, 2015, at 10:54 AM, Robin East <> wrote:
>>>>> -dev, +user (this is not a question about development of Spark itself so 
>>>>> you’ll get more answers in the user mailing list)
>>>>> First up let me say that I don’t really know how this could be done - I’m 
>>>>> sure it would be possible with enough tinkering but it’s not clear what 
>>>>> you are trying to achieve. Spark is a distributed processing system, it 
>>>>> has multiple JVMs running on different machines that each run a small 
>>>>> part of the overall processing. Unless you have some sort of idea to have 
>>>>> multiple C++ processes collocated with the distributed JVMs, using named 
>>>>> memory-mapped files doesn’t make architectural sense. 
>>>>> -------------------------------------------------------------------------------
>>>>> Robin East
>>>>> Spark GraphX in Action Michael Malak and Robin East
>>>>> Manning Publications Co.
>>>>>> On 6 Dec 2015, at 20:43, Jia <> wrote:
>>>>>> Dears, for one project, I need to implement something so Spark can read 
>>>>>> data from a C++ process. 
>>>>>> To provide high performance, I really hope to implement this through 
>>>>>> shared memory between the C++ process and Java JVM process.
>>>>>> It seems it may be possible to use named memory mapped files and JNI to 
>>>>>> do this, but I wonder whether there are any existing efforts or a more 
>>>>>> efficient approach to do this?
>>>>>> Thank you very much!
>>>>>> Best Regards,
>>>>>> Jia
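
For readers of the archive, here is a rough sketch of the named memory-mapped file idea raised in the original question. It is written in Python only to keep it short and runnable; it is not something anyone in the thread posted. The record layout (a 4-byte count followed by 8-byte doubles), the file name shared_region.bin, and the functions write_values and read_values are all hypothetical. write_values stands in for the C++ producer, and read_values for a consumer running on the same node. On the JVM side the analogous call would be FileChannel.map, which returns a MappedByteBuffer.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout (an assumption, not from the thread):
# a 4-byte little-endian count, then that many 8-byte little-endian doubles.
HEADER = struct.Struct("<I")
VALUE = struct.Struct("<d")


def write_values(path, values):
    """Stand-in for the C++ producer: size the file, map it, write records."""
    size = HEADER.size + VALUE.size * len(values)
    with open(path, "wb") as f:
        f.truncate(size)  # extend the file so the whole region can be mapped
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), size) as shared:
        HEADER.pack_into(shared, 0, len(values))
        for i, v in enumerate(values):
            VALUE.pack_into(shared, HEADER.size + i * VALUE.size, v)
        shared.flush()


def read_values(path):
    """Consumer side: map the same file read-only and decode the records."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as shared:
        (count,) = HEADER.unpack_from(shared, 0)
        return [
            VALUE.unpack_from(shared, HEADER.size + i * VALUE.size)[0]
            for i in range(count)
        ]


# Both sides agree on the file name; here a temporary directory is used.
path = os.path.join(tempfile.mkdtemp(), "shared_region.bin")
write_values(path, [1.5, 2.5, 4.0])
result = read_values(path)
print(result)
```

Both sides see the same pages through the operating system, so no bytes are sent over a socket or pipe. Decoding into Python floats does copy each value, though, so this is shared memory rather than strictly zero copy. It also only works when producer and consumer run on the same machine, which is the collocation concern raised earlier in the thread.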
