Let's say I have an RDD which gets cached and has two children which do
something with it:

val rdd1 = .......cache()

rdd1.saveAsSequenceFile()

rdd1.groupBy()......saveAsSequenceFile()

If I were to submit both calls to saveAsSequenceFile() in separate threads to
take advantage of concurrency (where possible), what's the best way to
determine when rdd1 is no longer being used by anything?

I'm hoping the best way is not to do reference counting in the futures that
are running the saveAsSequenceFile().
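
To make the setup concrete, here is a minimal sketch of the scenario, not a recommended answer. It assumes Spark 1.3 or later (so that saveAsSequenceFile() is available on pair RDDs without extra imports), a local master, and plain Scala futures on the global execution context. The sample data, the output paths and the mapValues step are placeholders I made up, since the real pipeline is elided above. It simply waits for both jobs before calling unpersist(), which is the naive alternative to reference counting.

```scala
import java.nio.file.Files
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

import org.apache.spark.{SparkConf, SparkContext}

object ConcurrentSaves {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("concurrent-saves").setMaster("local[2]"))
    // Hypothetical output location; the real paths are not given in the post.
    val out = Files.createTempDirectory("rdd1-demo").toString

    try {
      // Placeholder data standing in for the elided definition of rdd1.
      val rdd1 = sc.parallelize(1 to 1000).map(i => (i % 10, i.toString)).cache()

      // Each action is submitted from its own thread via a Future.
      val saveRaw = Future { rdd1.saveAsSequenceFile(s"$out/raw") }
      val saveGrouped = Future {
        rdd1.groupBy(_._1)
          .mapValues(_.size) // placeholder for the elided transformations
          .saveAsSequenceFile(s"$out/grouped")
      }

      val both = Seq(saveRaw, saveGrouped)
      // Wait for each job to complete, successfully or not.
      both.foreach(f => Await.ready(f, Duration.Inf))
      // Nothing in this program uses rdd1 any more, so release the cache.
      rdd1.unpersist()
      // Rethrow the failure of either job, if there was one.
      both.foreach(f => Await.result(f, Duration.Zero))
    } finally {
      sc.stop()
    }
  }
}
```

This only works because both consumers of rdd1 are known in one place; it does not generalize to children that are created elsewhere, which is the case I am asking about.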
