Tachyon is another option: it backs the OFF_HEAP StorageLevel that you can
specify when persisting RDDs:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.storage.StorageLevel
Or just use HDFS. That requires subsequent applications/SparkContexts to
reload the data from disk, of course.
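A minimal sketch of both options (assuming Spark 1.x with a Tachyon deployment configured; the input and output paths are placeholders):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(new SparkConf().setAppName("writer"))

// Persist off-heap; on Spark 1.x this stores blocks in Tachyon,
// so the cached data can outlive this application's executors.
val rdd = sc.textFile("hdfs:///data/input")  // placeholder path
rdd.persist(StorageLevel.OFF_HEAP)
rdd.count()  // run an action to materialize the cached RDD

// Alternatively, write to HDFS and have the next application reload it:
rdd.saveAsTextFile("hdfs:///data/shared-rdd")
```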
On Tue, Jun 3, 2014 at 6:58 AM, Gerard Maas gerard.m...@gmail.com wrote:
I don't think that's supported by default: when the standalone context
closes, its RDDs will be GC'ed.
You should explore Spark Job Server, which allows you to cache RDDs by name
and reuse them within a context.
https://github.com/ooyala/spark-jobserver
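As a sketch of what that looks like: spark-jobserver jobs can mix in its NamedRddSupport trait to cache an RDD under a name and fetch it from a later job submitted to the same context (check the project README for the exact API; the path and RDD name here are placeholders):

```scala
import com.typesafe.config.Config
import org.apache.spark.SparkContext
import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

// First job: compute an RDD and cache it under a name.
object CacheJob extends SparkJob with NamedRddSupport {
  override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid
  override def runJob(sc: SparkContext, config: Config): Any = {
    val rdd = sc.textFile("hdfs:///data/input")  // placeholder path
    this.namedRdds.update("shared", rdd)         // cache the RDD by name
    rdd.count()
  }
}

// Later job, submitted to the SAME long-running context: reuse the cached RDD.
object ReuseJob extends SparkJob with NamedRddSupport {
  override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid
  override def runJob(sc: SparkContext, config: Config): Any = {
    val rdd = this.namedRdds.get[String]("shared").get
    rdd.count()
  }
}
```

The key point is that both jobs run inside one persistent SparkContext managed by the job server, which is what keeps the named RDD alive between job submissions.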
-kr, Gerard.
On Tue, Jun 3, 2014 at 3:45 PM, Oleg Proudnikov oleg.proudni...@gmail.com
wrote:
Hi All,
Is it possible to run a standalone app that would compute and
persist/cache an RDD and then run other standalone apps that would gain
access to that RDD?
--
Thank you,
Oleg