Yes, when the cache fills up, Spark automatically evicts old cached RDDs to make 
room for new ones. Calling unpersist forces it to remove them right away. In both 
cases, though, note that the JVM doesn’t garbage-collect the freed objects until later.
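
For example, here is a minimal sketch of both cases (assuming a SparkContext 
named sc; the input path and the parsing step are just placeholders):

    val lines  = sc.textFile("hdfs://...")            // placeholder input path
    val parsed = lines.map(_.split(",")).cache()      // mark for in-memory caching

    parsed.count()    // first action materializes the cached blocks
    parsed.take(10)   // later actions reuse them

    // Case 1: do nothing -- when executors need cache space for newer RDDs,
    // Spark evicts old cached blocks to make room.

    // Case 2: drop the blocks immediately once this RDD is no longer needed.
    parsed.unpersist()   // blocks are removed now; the JVM reclaims the heap later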

Matei

On Mar 19, 2014, at 7:22 PM, Nicholas Chammas <nicholas.cham...@gmail.com> 
wrote:

> Related question: 
> 
> If I keep creating new RDDs and cache()-ing them, does Spark automatically 
> unpersist the least recently used RDD when it runs out of memory? Or is an 
> explicit unpersist the only way to get rid of an RDD (barring the PR 
> Tathagata mentioned)?
> 
> Also, does unpersist()-ing an RDD immediately free up space, or just allow 
> that space to be reclaimed when needed?
> 
> 
> On Wed, Mar 19, 2014 at 7:01 PM, Tathagata Das <tathagata.das1...@gmail.com> 
> wrote:
> Just a heads up, there is an active pull request that will automatically 
> unpersist RDDs that are no longer referenced or in scope in the application. 
> 
> TD
> 
> 
> On Wed, Mar 19, 2014 at 6:58 PM, hequn cheng <chenghe...@gmail.com> wrote:
> persist and unpersist.
> unpersist: Mark the RDD as non-persistent, and remove all blocks for it from 
> memory and disk.
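> 
> For instance, a minimal sketch (the input data and variable names are made up):
> 
>     import org.apache.spark.storage.StorageLevel
> 
>     val rdd = sc.parallelize(1 to 1000000).persist(StorageLevel.MEMORY_AND_DISK)
>     rdd.count()       // materializes the blocks (in memory, spilling to disk if needed)
>     rdd.unpersist()   // marks it non-persistent and drops its blocks from memory and disk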
> 
> 
> 2014-03-19 16:40 GMT+08:00 林武康 <vboylin1...@gmail.com>:
> 
> Hi, can anyone tell me about the lifecycle of an RDD? I searched through the 
> official website and still can't figure it out. Can I use an RDD in some stages 
> and then destroy it to release memory, since no later stages will use this RDD 
> any more? Is that possible?
> 
> Thanks!
> 
> Sincerely 
> Lin wukang
> 
> 
> 
