Re: Reading Back a Cached RDD

Marco Colombo Thu, 24 Mar 2016 14:56:57 -0700

You can persist off-heap, for example with Tachyon (now called Alluxio).
Take a look at off-heap persistence.
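
As a rough illustration of the off-heap suggestion, here is a minimal sketch of persisting an RDD with the OFF_HEAP storage level. Several things in it are assumptions and not from this thread: the local master, the app name, the sample data, and the two spark.memory.offHeap.* settings, which apply to Spark 2.x and later. On Spark 1.x the OFF_HEAP level is instead backed by an external block store such as Tachyon and is configured differently. The sketch only shows the persist call itself; it does not show that a different shell session can see the cached blocks.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Assumption: standalone local run with Spark 2.x+ off-heap memory settings.
// Inside spark-shell, skip this setup and use the provided `sc` instead.
val conf = new SparkConf()
  .setAppName("offheap-sketch")                  // hypothetical app name
  .setMaster("local[2]")
  .set("spark.memory.offHeap.enabled", "true")
  .set("spark.memory.offHeap.size", "64m")
val sc = new SparkContext(conf)

// Hypothetical sample data standing in for the real RDD.
val rdd = sc.parallelize(1 to 1000).map(_ * 2)

// Mark the RDD for off-heap storage; nothing is cached until an action runs.
val cached = rdd.persist(StorageLevel.OFF_HEAP)
cached.count()

// Report whether the assigned storage level uses off-heap memory.
println(cached.getStorageLevel.useOffHeap)

sc.stop()
```

The storage level has to be set before the first action on the RDD, since Spark does not allow changing the level of an RDD that already has one assigned.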


Regards

On Thursday, 24 March 2016, Holden Karau <hol...@pigscanfly.ca> wrote:

> Even checkpoint() may not be exactly what you want: if reference
> tracking is turned on, the checkpoint gets cleaned up once the original RDD
> is out of scope and GC is triggered.
> If you want to share persisted RDDs right now, one way to do it is to
> share the same Spark context (using something like the Spark Job Server
> or IBM Spark Kernel).
>
> On Thu, Mar 24, 2016 at 11:28 AM, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> Isn’t persist() only for reusing an RDD within an active application?
>> Maybe checkpoint() is what you’re looking for instead?
>> 
>>
>> On Thu, Mar 24, 2016 at 2:02 PM Afshartous, Nick <nafshart...@turbine.com>
>> wrote:
>>
>>>
>>> Hi,
>>>
>>>
>>> After calling RDD.persist(), is it then possible to come back later and
>>> access the persisted RDD?
>>>
>>> Let's say, for instance, coming back and starting a new Spark shell
>>> session. How would one access the persisted RDD in the new shell session?
>>>
>>>
>>> Thanks,
>>>
>>> --
>>>
>>>    Nick
>>>
>>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>


-- 
Ing. Marco Colombo
