Anand Nalya Tue, 07 Jul 2015 04:36:00 -0700

Hi,

Suppose I have an RDD that is loaded from some file, and I also have a
DStream that receives data from some stream. I want to keep unioning some
of the tuples from the DStream into my RDD. For this I can use something
like this:


  var myRDD: RDD[(String, Long)] = sc.fromText...
  dstream.foreachRDD{ rdd =>
    myRDD = myRDD.union(rdd.filter(myfilter))
  }
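
For concreteness, a fuller version of the snippet above might look like the
following sketch. The socket source, the seed file path, the "key,count" line
format, and the body of myfilter are all assumptions made for illustration
only; they are not part of my actual job.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object UnionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("UnionSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val ssc = new StreamingContext(sc, Seconds(10))

    // Assumed input format: one "key,count" pair per line.
    def parse(line: String): (String, Long) = {
      val Array(k, v) = line.split(",", 2)
      (k, v.trim.toLong)
    }

    // Hypothetical predicate standing in for myfilter.
    val myfilter: ((String, Long)) => Boolean = { case (_, n) => n > 0 }

    // Hypothetical seed file; the path is a placeholder.
    var myRDD: RDD[(String, Long)] = sc.textFile("/path/to/seed.txt").map(parse)

    // Hypothetical stream source: text lines from a local socket.
    val dstream = ssc.socketTextStream("localhost", 9999).map(parse)

    // The function passed to foreachRDD runs on the driver once per batch,
    // so reassigning the driver-side variable here is possible.
    dstream.foreachRDD { rdd =>
      myRDD = myRDD.union(rdd.filter(myfilter))
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that in this form myRDD holds a reference to every batch RDD it has
been unioned with, which is why the lifetime of those RDDs matters to me.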

My question is: for how long will Spark keep the RDDs underlying the
DStream around? Is there some configuration knob that controls that?

Regards,
Anand
