I've never had this need and I've never done it, but there are options that allow this. For example, I know there are web apps out there that work like the Spark REPL; one of these, I think, is called Zeppelin. I've never used them, but I've seen them demoed. There is also Tachyon, which Spark supports. Hopefully that gives you a place to start.

On Thu, Nov 5, 2015 at 9:21 PM Deepak Sharma <deepakmc...@gmail.com> wrote:
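As a sketch of the Tachyon route mentioned above: in Spark 1.x, persisting with `StorageLevel.OFF_HEAP` stored cached blocks in Tachyon rather than in executor heap memory. The configuration key and Tachyon URL below are placeholders for a specific cluster; whether cached blocks outlive the application depends on the Spark version and Tachyon configuration, so check the docs for your release.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Placeholder address for a running Tachyon master on your cluster.
val conf = new SparkConf()
  .setAppName("off-heap-cache-sketch")
  .set("spark.externalBlockStore.url", "tachyon://tachyon-master:19998")
val sc = new SparkContext(conf)

val rdd = sc.textFile("hdfs:///data/input")

// OFF_HEAP stores the cached blocks in Tachyon's memory, outside the
// executor JVMs, so they survive executor failures.
rdd.persist(StorageLevel.OFF_HEAP)
rdd.count() // materialize the cache
```

A more version-independent way to share data through Tachyon is to write it out as files (e.g. `saveAsTextFile("tachyon://...")`) and read it back from another application.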
> Thanks Christian.
> So is there any inbuilt mechanism in Spark, or API integration with other
> in-memory cache products such as Redis, to load the RDD into those systems
> upon program exit?
> What's the best approach to a long-lived RDD cache?
> Thanks
> Deepak
>
> On 6 Nov 2015 8:34 am, "Christian" <engr...@gmail.com> wrote:
>
>> The cache gets cleared out when the job finishes. I am not aware of a way
>> to keep the cache around between jobs. You could save it as an object file
>> to disk and load it as an object file in your next job for speed.
>>
>> On Thu, Nov 5, 2015 at 6:17 PM Deepak Sharma <deepakmc...@gmail.com>
>> wrote:
>>
>>> Hi All
>>> I am confused about RDD persistence in cache.
>>> If I cache an RDD, is it going to stay in memory even after the Spark
>>> program that created it completes execution?
>>> If not, how can I guarantee that the RDD is persisted in cache even after
>>> the program finishes execution?
>>>
>>> Thanks
>>> Deepak
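Christian's save-and-reload suggestion can be sketched as follows, using the standard RDD API calls `saveAsObjectFile` and `objectFile`. The paths and the computation are placeholders; the point is that the first application pays the computation cost once, and later applications deserialize the result instead of recomputing it.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("object-file-sketch"))

// Application 1: compute once, then persist the RDD's contents to
// durable storage as Java-serialized objects.
val expensive = sc.textFile("hdfs:///data/input").map(_.length)
expensive.saveAsObjectFile("hdfs:///cache/lengths")

// Application 2 (run later): reload without recomputing the pipeline.
val reloaded = sc.objectFile[Int]("hdfs:///cache/lengths")
reloaded.cache() // this cache lives only for this application's lifetime
```

Note that `cache()` / `persist()` only ever keep data for the lifetime of the SparkContext that created the RDD; the object file (or an external store like Tachyon or Redis, written to explicitly) is what survives between jobs.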