I've never had this need and I've never done it, but there are options that allow this. For example, I know there are web apps out there that work like the Spark REPL; one of these, I think, is called Zeppelin. I've never used them, but I've seen them demoed. There is also Tachyon, which Spark supports. Hopefully that gives you a place to start.

On Thu, Nov 5, 2015 at 9:21 PM Deepak Sharma <deepakmc...@gmail.com> wrote:
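As a sketch of the Tachyon route mentioned above: in Spark 1.x, persisting with `StorageLevel.OFF_HEAP` stored cached blocks in Tachyon rather than in executor heap memory. The configuration key and Tachyon URL below are placeholders for a specific cluster; whether cached blocks outlive the application depends on the Spark version and Tachyon configuration, so check the docs for your release.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Placeholder address for a running Tachyon master on your cluster.
val conf = new SparkConf()
  .setAppName("off-heap-cache-sketch")
  .set("spark.externalBlockStore.url", "tachyon://tachyon-master:19998")
val sc = new SparkContext(conf)

val rdd = sc.textFile("hdfs:///data/input")

// OFF_HEAP stores the cached blocks in Tachyon's memory, outside the
// executor JVMs, so they survive executor failures.
rdd.persist(StorageLevel.OFF_HEAP)
rdd.count() // materialize the cache
```

A more version-independent way to share data through Tachyon is to write it out as files (e.g. `saveAsTextFile("tachyon://...")`) and read it back from another application.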
> Thanks Christian.
> So is there any inbuilt mechanism in Spark, or API integration with other
> in-memory cache products such as Redis, to load the RDD into those systems
> upon program exit?
> What's the best approach to a long-lived RDD cache?
> Thanks
> Deepak
>
> On 6 Nov 2015 8:34 am, "Christian" <engr...@gmail.com> wrote:
>
>> The cache gets cleared out when the job finishes. I am not aware of a way
>> to keep the cache around between jobs. You could save it as an object file
>> to disk and load it as an object file in your next job for speed.
>>
>> On Thu, Nov 5, 2015 at 6:17 PM Deepak Sharma <deepakmc...@gmail.com>
>> wrote:
>>
>>> Hi All
>>> I am confused about RDD persistence in cache.
>>> If I cache an RDD, is it going to stay in memory even after the Spark
>>> program that created it completes execution?
>>> If not, how can I guarantee that the RDD is persisted in cache even after
>>> the program finishes execution?
>>>
>>> Thanks
>>> Deepak
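Christian's save-and-reload suggestion can be sketched as follows, using the standard RDD API calls `saveAsObjectFile` and `objectFile`. The paths and the computation are placeholders; the point is that the first application pays the computation cost once, and later applications deserialize the result instead of recomputing it.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("object-file-sketch"))

// Application 1: compute once, then persist the RDD's contents to
// durable storage as Java-serialized objects.
val expensive = sc.textFile("hdfs:///data/input").map(_.length)
expensive.saveAsObjectFile("hdfs:///cache/lengths")

// Application 2 (run later): reload without recomputing the pipeline.
val reloaded = sc.objectFile[Int]("hdfs:///cache/lengths")
reloaded.cache() // this cache lives only for this application's lifetime
```

Note that `cache()` / `persist()` only ever keep data for the lifetime of the SparkContext that created the RDD; the object file (or an external store like Tachyon or Redis, written to explicitly) is what survives between jobs.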