An RDD is essentially a pointer to the actual data, not the data itself. Unless it's cached, there's nothing to clean up: when the RDD goes out of scope, the JVM garbage collector simply reclaims the object like any other.
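To sketch the mechanism: ContextCleaner doesn't rely on a destructor; it holds weak references to registered RDDs and runs a cleanup task when the JVM garbage-collects the RDD object. Here's a minimal illustration of that weak-reference pattern in Python (this is not Spark's actual code; `FakeRDD` and `register_for_cleanup` are made-up names for illustration):

```python
import gc
import weakref

class FakeRDD:
    """Stand-in for an RDD: just an id, like RDD.id in Spark."""
    def __init__(self, rdd_id):
        self.id = rdd_id

cleaned = []

def register_for_cleanup(rdd):
    # Capture the id, not the RDD itself, so the callback doesn't
    # keep the object alive (Spark's cleaner likewise stores only a
    # CleanRDD(rddId) task, never a strong reference to the RDD).
    rdd_id = rdd.id
    # The finalizer fires when the object is garbage-collected,
    # analogous to ContextCleaner draining its reference queue.
    weakref.finalize(rdd, lambda: cleaned.append(rdd_id))

rdd = FakeRDD(42)
register_for_cleanup(rdd)
del rdd       # the RDD goes out of scope...
gc.collect()  # ...and collection triggers the cleanup callback
print(cleaned)  # → [42]
```

The key point is that cleanup is driven by the garbage collector, which is why you won't find an explicit destructor call anywhere in the RDD code path.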
On Tue, May 21, 2019 at 1:48 PM Nasrulla Khan Haris <nasrulla.k...@microsoft.com.invalid> wrote:

> Hi Spark developers,
>
> Can someone point out the code where RDD objects go out of scope? I found the contextcleaner <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ContextCleaner.scala#L178> code, in which only persisted RDDs are cleaned up at regular intervals if the RDD is registered for cleanup. I have not found where the destructor for the RDD object is invoked. I am trying to understand when RDD cleanup happens when the RDD is not persisted.
>
> Thanks in advance, appreciate your help.
>
> Nasrulla