Sending out the message again.. Hopefully someone can clarify :)

I would like some clarification on the execution model for Spark Streaming.

Broadly, I am trying to understand whether output operations in a DAG are only
processed after all intermediate operations have finished for all branches of
the DAG.

Let me give an example: 

I have a DStream A. I apply map operations to it and create two
different DStreams, B and C, such that:

A ---> B ---> (some operations) ---> kafka output 1
 \---> C ---> (some operations) ---> kafka output 2
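To make the shape of the DAG concrete, here is a minimal sketch of the branching above in the Spark Streaming DStream API. The socket source, topic names, and the `writeToKafka` helper are hypothetical placeholders standing in for a real Kafka producer, not actual Spark/Kafka API calls:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream

object BranchingExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("dag-branching").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // DStream A: the common source (placeholder input)
    val a: DStream[String] = ssc.socketTextStream("localhost", 9999)

    // Branches B and C are each derived from A by map operations
    val b = a.map(_.toUpperCase)   // (some operations)
    val c = a.map(_.reverse)       // (some operations)

    // Each branch ends in its own registered output operation
    b.foreachRDD { rdd => writeToKafka(rdd.collect(), "topic-1") } // kafka output 1
    c.foreachRDD { rdd => writeToKafka(rdd.collect(), "topic-2") } // kafka output 2

    ssc.start()
    ssc.awaitTermination()
  }

  // Hypothetical sink; stands in for a real Kafka producer call
  def writeToKafka(records: Array[String], topic: String): Unit =
    records.foreach(r => println(s"[$topic] $r"))
}
```

In this sketch, the two `foreachRDD` calls register two separate output operations on the same batch, which is exactly the situation my question is about: are the jobs for those two outputs scheduled one after the other, independently, or with some barrier between them?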

I want to understand whether kafka output 1 and kafka output 2 wait for all
operations to finish on both B and C before sending output, or whether each
simply sends its output as soon as the ops on its own branch (B or C) are done.

What kind of synchronization guarantees are there?



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-DAG-Output-Processing-mechanism-tp28713p28715.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
