I'm looking for ways to reduce the runtime of my Spark job. My code is a
single Scala file, written in this order:

(1) val lines = Import full dataset using sc.textFile
(2) val ABonly = Filter out all rows that are not of type A or B
(3) val processA = Process only the A rows from ABonly
(4) val processB = Process only the B rows from ABonly
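
For concreteness, here is a minimal sketch of that structure. The input
path, the comma-separated layout with the row type in the first field, the
rowType helper, and the "processing" (just a count) are placeholders made up
for illustration; they are not my real code.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ABJob {
  // Hypothetical helper: assumes the row type is the text before the first comma.
  def rowType(line: String): String = line.takeWhile(_ != ',')

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("ABJob")
    val sc = new SparkContext(conf)

    // (1) Import the full dataset (placeholder path).
    val lines = sc.textFile("hdfs:///path/to/input")

    // (2) Keep only the rows of type A or B.
    val ABonly = lines.filter { line =>
      val t = rowType(line)
      t == "A" || t == "B"
    }

    // (3) Process only the A rows (placeholder processing: count them).
    val processA = ABonly.filter(line => rowType(line) == "A").count()

    // (4) Process only the B rows (placeholder processing: count them).
    val processB = ABonly.filter(line => rowType(line) == "B").count()

    println(s"A rows: $processA, B rows: $processB")
    sc.stop()
  }
}
```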

Does Spark run (1), then (2), then (3), then (4) strictly in sequence, or
does it by default run (1) and (2) and then branch into (3) and (4), running
both in parallel? If it does not, how can I make that happen?

Jon




