Even checkpoint() may not be exactly what you want: if reference tracking is turned on, the checkpoint gets cleaned up once the original RDD is out of scope and GC is triggered. If you want to share persisted RDDs right now, one way to do it is to share the same Spark context (using something like the Spark Job Server or the IBM Spark Kernel).
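
To make the difference concrete, here is a minimal sketch of persist() next to checkpoint() within a single application. The app name, the local master, and the /tmp/spark-checkpoints directory are placeholders I made up for illustration; on a real cluster the checkpoint directory would normally live on shared storage such as HDFS.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PersistVsCheckpoint {
  def main(args: Array[String]): Unit = {
    // Hypothetical app name and local master, for illustration only.
    val conf = new SparkConf()
      .setAppName("persist-vs-checkpoint")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical path; must be set before calling checkpoint().
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    val squares = sc.parallelize(1 to 1000).map(n => n.toLong * n)

    // persist(): cached blocks are tied to this SparkContext and
    // are gone once the application ends.
    squares.persist(StorageLevel.MEMORY_AND_DISK)

    // checkpoint(): marks the RDD to be written to the checkpoint
    // directory; it has to be called before the first action on the RDD.
    squares.checkpoint()

    // The first action fills the cache and writes the checkpoint files.
    println(squares.count())
    println(squares.isCheckpointed)
    println(squares.getCheckpointFile)

    sc.stop()
  }
}
```

Neither call hands the RDD to a new shell session by itself: the cache disappears with the application, and the checkpoint files may be removed by the cleaner as described above. That is why sharing one long-running context is the suggestion here.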

On Thu, Mar 24, 2016 at 11:28 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

> Isn’t persist() only for reusing an RDD within an active application?
> Maybe checkpoint() is what you’re looking for instead?
>
> On Thu, Mar 24, 2016 at 2:02 PM Afshartous, Nick <nafshart...@turbine.com> wrote:
>
>> Hi,
>>
>> After calling RDD.persist(), is it then possible to come back later and
>> access the persisted RDD?
>>
>> Let's say, for instance, coming back and starting a new Spark shell
>> session. How would one access the persisted RDD in the new shell session?
>>
>> Thanks,
>> --
>> Nick

--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau