The pseudocode:

object myApp {
  // accumulated state; a pair RDD so it can be joined
  var myStaticRDD: RDD[(Int, Int)] = _

  def main(args: Array[String]) {
    ...  // init the StreamingContext, and get two DStreams (streamA and
         // streamB) from two HDFS paths

    // complex transformation using the two DStreams
    val new_stream = streamA.transformWith(streamB, (a, b, t) => {
      a.join(b).map(...)
    })

    // join each batch RDD of new_stream with myStaticRDD
    new_stream.foreachRDD { rdd =>
      myStaticRDD = myStaticRDD.join(rdd)
    }

    // do complex model training every two hours
    if (hour is 0, 2, 4, 6...) {    // pseudocode: the part I don't know how to write
      model_training(myStaticRDD)   // will take about 1 hour
    }
  }
}


I don't know how to write the code that trains the model every two hours
using the value of myStaticRDD at that moment, and while the model training
is running, the streaming job should keep running normally at the same
time...
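One way to sketch this (untested against a real cluster; the PeriodicTrainer object and maybeTrain helper below are illustrative names of my own, not a Spark API): do the "every two hours" check inside foreachRDD, which fires on every batch, and launch the long-running model_training on a separate thread pool via a Future so the streaming driver keeps scheduling batches while training runs:

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{ExecutionContext, Future}

object PeriodicTrainer {
  // a dedicated single-thread pool, so the long training job never
  // blocks the thread that drives the streaming batches
  private val trainingPool =
    ExecutionContext.fromExecutorService(Executors.newSingleThreadExecutor())

  private val inFlight = new AtomicBoolean(false)
  @volatile private var lastTrained = 0L

  /** Run `train` asynchronously if `intervalMs` has elapsed since the last
    * run and no training run is currently in flight.
    * Returns true if a run was started. */
  def maybeTrain(intervalMs: Long,
                 now: Long = System.currentTimeMillis())(train: => Unit): Boolean = {
    if (now - lastTrained >= intervalMs && inFlight.compareAndSet(false, true)) {
      lastTrained = now
      Future {
        try train finally inFlight.set(false)
      }(trainingPool)
      true
    } else false
  }
}
```

Inside the foreachRDD above it would be called like
`PeriodicTrainer.maybeTrain(2 * 60 * 60 * 1000L) { model_training(snapshot) }`,
where `snapshot` is a local `val snapshot = myStaticRDD` taken just before the
call. That local reference is what gives you "that moment's myStaticRDD":
the var can be reassigned by later batches while training is still running.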




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/union-eatch-streaming-window-into-a-static-rdd-and-use-the-static-rdd-periodicity-tp22783.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
