a question about RDD.checkpoint()

dachuan Fri, 08 Nov 2013 19:02:34 -0800

Hello,

I have a quick question about RDD.checkpoint().


If the user calls RDD.checkpoint(), then after the job finishes Spark
calls RDD.doCheckpoint() to do the actual physical checkpointing, that
is to say, it writes this RDD's partitions to HDFS.

Does this mean that all of its parent RDDs' Scala objects, and their data
(which is managed by the BlockManager), will be garbage collected?
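
To make the scenario concrete, here is a minimal sketch of the situation I
am asking about. It assumes a local master and uses a temporary local
directory as a stand-in for an HDFS path; the object name CheckpointSketch
and the particular transformations are only illustrative, not taken from
any real job.

```scala
import java.nio.file.Files
import org.apache.spark.SparkContext

object CheckpointSketch {
  // Runs one job on a checkpointed RDD and reports whether the
  // checkpoint was materialized once the job finished.
  def run(): Boolean = {
    // Assumption: local mode with 2 threads; a real job would use a cluster master.
    val sc = new SparkContext("local[2]", "checkpoint-sketch")
    try {
      // Assumption: a temp directory stands in for an HDFS checkpoint path.
      sc.setCheckpointDir(Files.createTempDirectory("rdd-checkpoint").toString)

      val parent = sc.parallelize(1 to 100, 4)
      val child  = parent.map(_ * 2).filter(_ % 3 == 0)

      // Only marks the RDD; nothing is written until a job runs on it.
      child.checkpoint()
      println(child.toDebugString) // lineage before the job

      // First action: after this job finishes, the checkpoint is written.
      child.count()
      println(child.toDebugString) // lineage after the job

      child.isCheckpointed
    } finally {
      sc.stop()
    }
  }
}
```

In terms of this sketch, the question is whether `parent` and the
intermediate mapped RDD, together with any of their blocks, become
collectable once child.count() has returned.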

And could you please point me to the relevant source code region, if
possible?

thanks,
dachuan.

-- 
Dachuan Huang
Cellphone: 614-390-7234
2015 Neil Avenue
Ohio State University
Columbus, Ohio
U.S.A.
43210
