Thanks Sean, that makes sense. 

Regards,
Nasrulla

-----Original Message-----
From: Sean Owen <sro...@gmail.com> 
Sent: Tuesday, May 21, 2019 6:24 PM
To: Nasrulla Khan Haris <nasrulla.k...@microsoft.com>
Cc: dev@spark.apache.org
Subject: Re: RDD object Out of scope.

I'm not clear what you're asking. An RDD itself is just an object in the JVM. 
It will be garbage collected if there are no references. What else would there 
be to clean up in your case? ContextCleaner handles cleanup of persisted 
RDDs, etc.
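
As a rough illustration of the mechanism discussed in this thread (not Spark's 
actual code), here is a minimal Java sketch of how a weak reference plus a 
ReferenceQueue lets a cleaner learn that an object has been garbage collected. 
FakeRdd, CleanupTask, and awaitCleanup are hypothetical names made up for this 
example, and System.gc() is only a hint, so the outcome depends on the JVM.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class WeakRefCleanupSketch {

    // Hypothetical stand-in for an RDD: just an ordinary JVM object.
    static final class FakeRdd {
        final int id;
        FakeRdd(int id) { this.id = id; }
    }

    // Weak reference that remembers which id to clean up once the referent is gone.
    static final class CleanupTask extends WeakReference<FakeRdd> {
        final int rddId;
        CleanupTask(FakeRdd referent, ReferenceQueue<FakeRdd> queue) {
            super(referent, queue);
            this.rddId = referent.id;
        }
    }

    // Returns the id of the collected object, or -1 if it was not collected in time.
    static int awaitCleanup() throws InterruptedException {
        ReferenceQueue<FakeRdd> queue = new ReferenceQueue<>();
        FakeRdd rdd = new FakeRdd(42);
        CleanupTask task = new CleanupTask(rdd, queue);

        rdd = null; // drop the only strong reference; the object is now collectable

        for (int attempt = 0; attempt < 50; attempt++) {
            System.gc(); // a request, not a guarantee
            Reference<? extends FakeRdd> ref = queue.remove(100); // wait up to 100 ms
            if (ref == task) {
                return task.rddId; // this is where extra cleanup work would be triggered
            }
        }
        return -1;
    }

    public static void main(String[] args) throws InterruptedException {
        int id = awaitCleanup();
        System.out.println(id >= 0 ? "cleanup would run for id " + id : "not collected yet");
    }
}
```

Note that the object itself needs no destructor: the JVM reclaims it, and the 
queue only matters when there is additional state (such as cached blocks) to 
release afterwards.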

On Tue, May 21, 2019 at 7:39 PM Nasrulla Khan Haris 
<nasrulla.k...@microsoft.com.invalid> wrote:
>
> I am trying to find the code that cleans up uncached RDDs.
>
>
>
> Thanks,
>
> Nasrulla
>
>
>
> From: Charoes <char...@gmail.com>
> Sent: Tuesday, May 21, 2019 5:10 PM
> To: Nasrulla Khan Haris <nasrulla.k...@microsoft.com.invalid>
> Cc: Wenchen Fan <cloud0...@gmail.com>; dev@spark.apache.org
> Subject: Re: RDD object Out of scope.
>
>
>
> If you cached an RDD and hold a reference to that RDD in your code, then your 
> RDD will NOT be cleaned up.
>
> There is a ReferenceQueue in ContextCleaner, which is used to track 
> references to RDDs, Broadcasts, Accumulators, etc.
>
>
>
> On Wed, May 22, 2019 at 1:07 AM Nasrulla Khan Haris 
> <nasrulla.k...@microsoft.com.invalid> wrote:
>
> Thanks for the reply, Wenchen. I am curious as to what happens when an RDD 
> goes out of scope when it is not cached.
>
>
>
> Nasrulla
>
>
>
> From: Wenchen Fan <cloud0...@gmail.com>
> Sent: Tuesday, May 21, 2019 6:28 AM
> To: Nasrulla Khan Haris <nasrulla.k...@microsoft.com.invalid>
> Cc: dev@spark.apache.org
> Subject: Re: RDD object Out of scope.
>
>
>
> An RDD is kind of a pointer to the actual data. Unless it's cached, we don't 
> need to clean up the RDD.
>
>
>
> On Tue, May 21, 2019 at 1:48 PM Nasrulla Khan Haris 
> <nasrulla.k...@microsoft.com.invalid> wrote:
>
> Hi Spark developers,
>
>
>
> Can someone point out the code where RDD objects go out of scope? I found 
> the ContextCleaner code, in which only persisted RDDs are cleaned up at 
> regular intervals if the RDD is registered for cleanup. I have not found where 
> the destructor for the RDD object is invoked. I am trying to understand when 
> RDD cleanup happens when the RDD is not persisted.
>
>
>
> Thanks in advance, appreciate your help.
>
> Nasrulla
>
>
