Let's say I have an RDD which gets cached and has two children which do something with it:
    val rdd1 = .......cache()
    rdd1.saveAsSequenceFile()
    rdd1.groupBy()......saveAsSequenceFile()

If I were to submit both calls to saveAsSequenceFile() in separate threads to take advantage of concurrency (where possible), what's the best way to determine when rdd1 is no longer being used by anything? I'm hoping the best way is not to do reference counting in the futures that are running the saveAsSequenceFile() calls.
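
To make the setup concrete, here is a minimal sketch of what I have in mind. It assumes Scala futures on the default global execution context; the sample data, the local master, the output paths, and the object name ConcurrentSaves are placeholders for illustration, not my real job. It simply waits for both saves to finish before releasing the cache, which may or may not be the best approach.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ConcurrentSaves {
  def main(args: Array[String]): Unit = {
    // Assumption: local master and app name are placeholders.
    val conf = new SparkConf().setAppName("concurrent-saves").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Assumption: stand-in data for the real rdd1 (Int keys, String values).
    val rdd1 = sc.parallelize(1 to 1000).map(i => (i % 10, i.toString)).cache()

    // Each action is submitted from its own thread so the two jobs can overlap.
    val saveRaw = Future {
      rdd1.saveAsSequenceFile("/tmp/out/raw") // hypothetical output path
    }
    val saveGrouped = Future {
      rdd1.groupBy(_._1)
        .map { case (k, vs) => (k, vs.map(_._2).mkString(",")) }
        .saveAsSequenceFile("/tmp/out/grouped") // hypothetical output path
    }

    // Combine the two futures; on the success path, completion of `done`
    // means neither job is still reading rdd1.
    val done = Future.sequence(Seq(saveRaw, saveGrouped))
    try Await.result(done, Duration.Inf)
    finally {
      rdd1.unpersist() // release the cached blocks
      sc.stop()
    }
  }
}
```

Note that if the first save fails, the combined future can fail before the second save has finished, so the unpersist() in the finally block could run while a job is still reading rdd1. As far as I understand, that only costs recomputation rather than correctness.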