Reviving this discussion... I'm interested in using Spark as the engine for a web service.
The SparkContext and its RDDs exist only in the JVM that created them. RDDs are resilient, but the context that owns them is not: I may be able to serve requests out of a single "service" JVM, but I'll lose all of its RDDs if the service dies.

It's possible to share RDDs by writing them into Tachyon (sketched below), but then I end up with at least two copies of the same data in memory, and even more if I access the data from multiple contexts. Is there a way around this?
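For concreteness, here is a minimal sketch of what I mean by the Tachyon approach. The tachyon:// master address and path are made up, and I'm assuming a Spark 1.x setup where Tachyon is reachable as a Hadoop-compatible filesystem:

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical Tachyon path; adjust the master address for your cluster.
val tachyonPath = "tachyon://tachyon-master:19998/shared/mydata"

// Context A: materialize an RDD and write it out to Tachyon.
val scA = new SparkContext(new SparkConf().setAppName("writer"))
scA.parallelize(1 to 1000).map(_.toString).saveAsTextFile(tachyonPath)
scA.stop()

// Context B: a separate SparkContext (sequential here, but it could be
// another JVM) reads the same files back. This re-loads the bytes, so
// every context that caches the result holds its own in-memory copy,
// which is exactly the duplication I'd like to avoid.
val scB = new SparkContext(new SparkConf().setAppName("reader"))
val shared = scB.textFile(tachyonPath).map(_.toInt)
println(shared.count())
scB.stop()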