Using pure Spark, you will have to write the RDD out to a Hadoop-compatible filesystem (one of the rdd.saveAs*** methods, e.g. saveAsTextFile or saveAsObjectFile) and read it back from a different process (the matching sparkContext.***File method, e.g. textFile or objectFile).
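A minimal sketch of that pattern, assuming a local HDFS path (the path and app names here are illustrative, not from the original thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// --- Process A: persist the RDD to a shared, Hadoop-compatible path ---
val scA = new SparkContext(
  new SparkConf().setAppName("rdd-writer").setMaster("local[*]"))
val rdd = scA.parallelize(Seq(1, 2, 3))
rdd.saveAsTextFile("hdfs:///tmp/shared-rdd")  // or saveAsObjectFile for binary
scA.stop()

// --- Process B (a separate JVM / SparkContext): read it back ---
val scB = new SparkContext(
  new SparkConf().setAppName("rdd-reader").setMaster("local[*]"))
val restored = scB.textFile("hdfs:///tmp/shared-rdd").map(_.toInt)
println(restored.collect().sorted.mkString(","))
scB.stop()
```

Note that this is serialization through the filesystem, not true sharing: each process materializes its own copy of the data, which is exactly the cost the in-memory layer mentioned below avoids.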
You can also take a look at the Tachyon project (https://github.com/amplab/tachyon/wiki), which makes this super fast by using an in-memory caching layer (outside Spark).

TD

On Fri, Jan 24, 2014 at 6:53 PM, Binh Nguyen <ngb...@gmail.com> wrote:
> RDD is immutable so you should be able to.
>
> On Fri, Jan 24, 2014 at 6:06 PM, D.Y Feng <yyfeng88...@gmail.com> wrote:
>> How can I share the RDD between multiprocess?
>>
>> --
>> DY.Feng (叶毅锋)
>> yyfeng88625@twitter
>> Department of Applied Mathematics
>> Guangzhou University, China
>> dyf...@stu.gzhu.edu.cn
>
> --
> Binh Nguyen