Using pure Spark, you will have to write the RDD out to a Hadoop-compatible filesystem (one of the rdd.saveAs*** methods, e.g. saveAsTextFile or saveAsObjectFile) and read it back from a different process (the matching sparkContext.***File method, e.g. textFile or objectFile).
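A minimal sketch of that pattern, assuming a local HDFS path (the path and app names here are illustrative, not from the original thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// --- Process A: persist the RDD to a shared, Hadoop-compatible path ---
val scA = new SparkContext(
  new SparkConf().setAppName("rdd-writer").setMaster("local[*]"))
val rdd = scA.parallelize(Seq(1, 2, 3))
rdd.saveAsTextFile("hdfs:///tmp/shared-rdd")  // or saveAsObjectFile for binary
scA.stop()

// --- Process B (a separate JVM / SparkContext): read it back ---
val scB = new SparkContext(
  new SparkConf().setAppName("rdd-reader").setMaster("local[*]"))
val restored = scB.textFile("hdfs:///tmp/shared-rdd").map(_.toInt)
println(restored.collect().sorted.mkString(","))
scB.stop()
```

Note that this is serialization through the filesystem, not true sharing: each process materializes its own copy of the data, which is exactly the cost the in-memory layer mentioned below avoids.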
You can also take a look at the Tachyon project (https://github.com/amplab/tachyon/wiki), which makes this super fast by using an in-memory caching layer (outside Spark).

TD

On Fri, Jan 24, 2014 at 6:53 PM, Binh Nguyen <ngb...@gmail.com> wrote:
> RDD is immutable so you should be able to.
>
> On Fri, Jan 24, 2014 at 6:06 PM, D.Y Feng <yyfeng88...@gmail.com> wrote:
>> How can I share the RDD between multiprocess?
>>
>> --
>> DY.Feng (叶毅锋)
>> yyfeng88625@twitter
>> Department of Applied Mathematics
>> Guangzhou University, China
>> dyf...@stu.gzhu.edu.cn
>
> --
> Binh Nguyen