You can persist off-heap, for example with Tachyon (now called Alluxio). Take a look at off-heap persistence.
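A minimal sketch of what this could look like from PySpark, under some assumptions. The helper names and the `alluxio://` path in the usage note are hypothetical and are not from this thread. The first helper requests off-heap caching inside the running application; what `StorageLevel.OFF_HEAP` is backed by depends on the Spark version and configuration. The other two helpers write the data out as files and read it back, which is one way a new shell session could get at the data, assuming both sessions can reach the same storage.

```python
def persist_off_heap(rdd):
    """Mark an RDD for off-heap caching within the current application."""
    # Imported lazily so the other helpers can be used without this import.
    from pyspark import StorageLevel
    # What OFF_HEAP is backed by depends on the Spark version and configuration.
    return rdd.persist(StorageLevel.OFF_HEAP)


def save_for_other_sessions(rdd, path):
    """Write the RDD to shared storage so a different application can read it.

    `path` is a hypothetical location (for example an alluxio:// or hdfs:// URI)
    that both sessions are assumed to be able to reach.
    """
    rdd.saveAsPickleFile(path)
    return path


def load_in_new_session(sc, path):
    """Rebuild an RDD from the files written by save_for_other_sessions()."""
    return sc.pickleFile(path)
```

In the first session you would call `save_for_other_sessions(rdd, "alluxio://<host>:<port>/shared/my_rdd")`, and in the later session `load_in_new_session(sc, "alluxio://<host>:<port>/shared/my_rdd")`, where `sc` is the SparkContext of the new shell and the host, port and directory are placeholders.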
Regards

On Thursday, 24 March 2016, Holden Karau <hol...@pigscanfly.ca> wrote:

> Even checkpoint() is maybe not exactly what you want, since if reference
> tracking is turned on it will get cleaned up once the original RDD is out
> of scope and GC is triggered.
> If you want to share persisted RDDs right now, one way to do this is
> sharing the same Spark context (using something like the Spark Job Server
> or IBM Spark Kernel).
>
> On Thu, Mar 24, 2016 at 11:28 AM, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> Isn’t persist() only for reusing an RDD within an active application?
>> Maybe checkpoint() is what you’re looking for instead?
>>
>> On Thu, Mar 24, 2016 at 2:02 PM Afshartous, Nick <nafshart...@turbine.com>
>> wrote:
>>
>>> Hi,
>>>
>>> After calling RDD.persist(), is it then possible to come back later and
>>> access the persisted RDD?
>>>
>>> Let's say for instance coming back and starting a new Spark shell
>>> session. How would one access the persisted RDD in the new shell session?
>>>
>>> Thanks,
>>>
>>> --
>>>
>>> Nick
>>>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau

--
Ing. Marco Colombo