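You could try changing spark.scheduler.mode to FAIR (the default is FIFO). Note that jobs within one application only run concurrently if they are submitted from separate threads, so the setting alone will not parallelize (3) and (4). Take a look at this page: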
https://spark.apache.org/docs/1.1.0/job-scheduling.html#scheduling-within-an-application

On Sat, Jan 10, 2015 at 10:24 AM, YaoPau <jonrgr...@gmail.com> wrote:
> I'm looking for ways to reduce the runtime of my Spark job. My code is a
> single file of scala code and is written in this order:
>
> (1) val lines = Import full dataset using sc.textFile
> (2) val ABonly = Parse out all rows that are not of type A or B
> (3) val processA = Process only the A rows from ABonly
> (4) val processB = Process only the B rows from ABonly
>
> Is Spark doing (1) then (2) then (3) then (4) ... or is it by default doing
> (1) then (2) then branching to both (3) and (4) simultaneously and running
> both in parallel? If not, how can I make that happen?
>
> Jon
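For the pipeline in the quoted question, here is a minimal sketch of one way to get (3) and (4) running at the same time. It rests on several assumptions that are not in the original message: the row type is taken to be the first character of each line, "processing" is reduced to a count() so that there is an action to trigger each job, and the names ParallelBranches, isA, isB and processBoth are made up for illustration. Transformations are lazy and each action blocks the calling thread, so the sketch wraps the two actions in Futures to submit them from different threads.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ParallelBranches {
  // Assumption: the row type is the first character of each line.
  def isA(line: String): Boolean = line.startsWith("A")
  def isB(line: String): Boolean = line.startsWith("B")

  // Runs the A and B branches as two separate Spark jobs submitted from
  // different threads, and returns (number of A rows, number of B rows).
  def processBoth(lines: RDD[String]): (Long, Long) = {
    // Step (2): keep only A and B rows; cache so both branches can reuse them.
    val abOnly = lines.filter(l => isA(l) || isB(l)).cache()

    // Steps (3) and (4): count() stands in for the real processing.
    // Each Future submits its action from a thread of the global pool.
    val processA = Future { abOnly.filter(l => isA(l)).count() }
    val processB = Future { abOnly.filter(l => isB(l)).count() }

    // Block until both jobs have finished.
    val result = (Await.result(processA, Duration.Inf),
                  Await.result(processB, Duration.Inf))
    abOnly.unpersist()
    result
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("parallel-branches")
      .set("spark.scheduler.mode", "FAIR") // default is FIFO
    val sc = new SparkContext(conf)
    try {
      // Step (1): args(0) is the input path, supplied by the caller.
      val lines = sc.textFile(args(0))
      val (countA, countB) = processBoth(lines)
      println(s"A rows: $countA, B rows: $countB")
    } finally {
      sc.stop()
    }
  }
}
```

Whether the two jobs actually overlap depends on free executor cores and on how many threads the driver's thread pool has; with FIFO the first job submitted gets priority on resources, while FAIR shares them between the two jobs.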