Re: Re: Spark RDD cache persistence

2015-12-09 Thread Calvin Jia
>> /hadoop-hdfs/ArchivalStorage.html#Archival_Storage_SSD__Memory
>> Hive temporary tables use this feature to speed up jobs.
>> https://issues.apache.org/jira/browse/HIVE-7313
>>
>> ------
>> r7raul1...@163.com
>>
>> *From:* Christian

Re: Spark RDD cache persistence

2015-11-05 Thread Christian
I've never had this need and I've never done it, but there are options that allow this. For example, I know there are web apps out there that work like the Spark REPL. One of these, I think, is called Zeppelin. I've never used them, but I've seen them demoed. There is also Tachyon, which Spark supports.

Re: Re: Spark RDD cache persistence

2015-11-05 Thread r7raul1...@163.com
> To: Deepak Sharma
> CC: user
> Subject: Re: Spark RDD cache persistence
> I've never had this need and I've never done it, but there are options that allow this. For example, I know there are web apps out there that work like the Spark REPL. One of these, I think, is called Zeppelin. I've never used them

Re: Re: Spark RDD cache persistence

2015-11-05 Thread Deenar Toraskar
le use this function to speed job.
> https://issues.apache.org/jira/browse/HIVE-7313
>
> --
> r7raul1...@163.com
>
> *From:* Christian <engr...@gmail.com>
> *Date:* 2015-11-06 13:50
> *To:* Deepak Sharma <deepakmc...@gmail.com>
> *CC:* us

Re: Spark RDD cache persistence

2015-11-05 Thread Christian
The cache gets cleared out when the job finishes. I am not aware of a way to keep the cache around between jobs. You could save it as an object file to disk and load it as an object file on your next job for speed.
On Thu, Nov 5, 2015 at 6:17 PM Deepak Sharma wrote:
> Hi
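The save-and-reload approach suggested in this reply could look roughly like the sketch below. It is only an illustration under stated assumptions: the object name, the temporary snapshot directory, the sample data, and the local master are all hypothetical, and the two "jobs" are shown inside one application purely so the example is self-contained. In practice the second half would run in a later application. The calls used are `RDD.saveAsObjectFile` and `SparkContext.objectFile`.

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical example object; not taken from the thread.
object ObjectFileRoundTrip {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master, used here only to keep the sketch runnable.
    val conf = new SparkConf()
      .setAppName("object-file-round-trip")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Assumption: a fresh directory under a temp dir. saveAsObjectFile
    // refuses to write into a path that already exists.
    val snapshotPath =
      Files.createTempDirectory("rdd-snapshot").resolve("squares").toString

    try {
      // "Job 1": compute something and write it out as an object file.
      val computed: RDD[(Int, Int)] =
        sc.parallelize(1 to 1000).map(n => (n, n * n))
      computed.saveAsObjectFile(snapshotPath)

      // "Job 2" (normally a separate run): read the object file back
      // instead of recomputing, then cache it for the rest of that job.
      val restored: RDD[(Int, Int)] = sc.objectFile[(Int, Int)](snapshotPath)
      restored.cache()

      assert(restored.count() == 1000L)
      assert(restored.lookup(12) == Seq(144))
    } finally {
      sc.stop()
    }
  }
}
```

Object files are written with Java serialization, so the element type has to be serializable, and the job that reads the file needs the same classes on its classpath. Note that this persists the data, not the in-memory cache itself: the next job still pays the cost of reading from disk once before `cache()` takes effect.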

Re: Spark RDD cache persistence

2015-11-05 Thread Deepak Sharma
Thanks, Christian. So is there any built-in mechanism in Spark, or API integration with other in-memory cache products such as Redis, to load the RDD into these systems upon program exit? What's the best approach to have a long-lived RDD cache?
Thanks,
Deepak
On 6 Nov 2015 8:34 am, "Christian"