This sounds like something you can solve with a stateful operator. Check out mapWithState. If both messages can be keyed by a common key, you can define keyed state with a field for the first message. When you see the first message for a key, fill that field with its timestamp, etc. When the second message for the same key arrives, Spark Streaming calls your state update function with the old state (i.e. with the first message's fields filled in), and you can compute the time difference.
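Roughly, the idea above could look like the sketch below. This is only an illustration, not tested against your setup: the Event case class, the PairLatency object, the step helper, the socket source on localhost:9999, the "key,timestampMs" line format, the checkpoint path and the 5 second batch interval are all assumptions of mine. Only StateSpec, State and mapWithState are the actual Spark Streaming API (Spark 1.6+).

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, State, StateSpec, StreamingContext}

// Hypothetical event shape: a shared key plus an event timestamp in milliseconds.
case class Event(key: String, timestampMs: Long)

object PairLatency {
  // Pure pairing logic: given the stored first timestamp (if any) and a new
  // timestamp, return (next state, time difference to emit).
  def step(first: Option[Long], ts: Long): (Option[Long], Option[Long]) =
    first match {
      case None     => (Some(ts), None)      // first message: remember its timestamp
      case Some(t0) => (None, Some(ts - t0)) // second message: emit diff, clear state
    }

  // Mapping function that mapWithState calls once per record for each key.
  // The state holds the timestamp of the first message seen for that key.
  val trackPair = (key: String, event: Option[Event], state: State[Long]) =>
    event.flatMap { e =>
      val (next, diff) = step(state.getOption(), e.timestampMs)
      next match {
        case Some(ts) => state.update(ts)
        case None     => state.remove()
      }
      diff.map(d => (key, d))
    }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("PairLatency").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))
    ssc.checkpoint("/tmp/pair-latency-checkpoint") // mapWithState needs checkpointing

    // Assumed source: well-formed lines of "key,timestampMs" on a local socket.
    val events = ssc.socketTextStream("localhost", 9999).map { line =>
      val Array(k, ts) = line.split(",")
      (k, Event(k, ts.trim.toLong))
    }

    // Keep only the keys for which a pair was completed in this batch.
    val diffs = events.mapWithState(StateSpec.function(trackPair)).flatMap(_.toList)
    diffs.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the second message may never arrive, you would probably also want to set a timeout on the StateSpec so that unmatched keys do not stay in state forever.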
Check out my blog post: https://databricks.com/blog/2016/02/01/faster-stateful-stream-processing-in-apache-spark-streaming.html

On Tue, Dec 6, 2016 at 5:50 PM, sancheng <sanchuanch...@gmail.com> wrote:
> any valuable feedback is appreciated!