Can you run N jobs depending on the same RDD in parallel on the driver? Certainly. The SparkContext and its scheduler are thread-safe, and RDDs are immutable, so submitting actions concurrently is safe. I've done this, for example, to build and evaluate a bunch of models simultaneously on a big cluster.
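The pattern described above — submitting independent jobs on the same RDD from multiple driver threads — can be sketched as follows. This is a minimal illustration of the threading pattern in plain Python so it runs anywhere; in an actual Spark driver each worker function would instead invoke an RDD action such as `rdd.count()` or `rdd.sum()` on a shared, cached RDD (those Spark calls are the assumption here, not part of this runnable sketch).

```python
from concurrent.futures import ThreadPoolExecutor

# Shared, immutable dataset standing in for an RDD; in Spark this
# would be something like rdd = sc.parallelize(range(1000)).cache()
data = tuple(range(1000))

# Two independent "actions" over the same data. In Spark, each action
# (count, sum, collect, ...) launches its own job when called.
def count_action(d):
    return len(d)

def sum_action(d):
    return sum(d)

# Submit both jobs from separate driver threads; a thread-safe
# scheduler can then run them concurrently rather than back to back.
with ThreadPoolExecutor(max_workers=2) as pool:
    count_future = pool.submit(count_action, data)
    sum_future = pool.submit(sum_action, data)
    results = (count_future.result(), sum_future.result())

print(results)  # (1000, 499500)
```

The same shape applies with PySpark or Scala Futures: the futures complete independently, and the driver collects both results.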
On Fri, Jan 15, 2016 at 7:10 PM, Jakob Odersky <joder...@gmail.com> wrote:
> I don't think RDDs are thread-safe.
> More fundamentally, however, why would you want to run RDD actions in
> parallel? The idea behind RDDs is to provide you with an abstraction for
> computing parallel operations on distributed data. Even if you were to call
> actions from several threads at once, the individual executors in your Spark
> environment would still have to perform operations sequentially.
>
> As an alternative, I would suggest restructuring your RDD transformations
> to compute the required results in one single operation.
>
> On 15 January 2016 at 06:18, Jonathan Coveney <jcove...@gmail.com> wrote:
>>
>> Threads
>>
>>
>> On Friday, January 15, 2016, Kira <mennou...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> Can we run *simultaneous* actions on the *same RDD*? If yes, how can
>>> this be done?
>>>
>>> Thank you,
>>> Regards
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/simultaneous-actions-tp25977.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
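Jakob's alternative — restructuring the computation so a single pass produces all the required results — can be sketched like this. Again in plain Python for illustration; in Spark the analogous approach would be a single `rdd.aggregate(...)` call that computes several statistics in one job instead of one job per statistic (the mapping to Spark is an assumption of this sketch).

```python
from functools import reduce

data = range(1000)

# Fold count, sum, and max into one traversal instead of three
# separate passes (in Spark: one job instead of three).
def seq_op(acc, x):
    count, total, maximum = acc
    new_max = x if maximum is None or x > maximum else maximum
    return (count + 1, total + x, new_max)

count, total, maximum = reduce(seq_op, data, (0, 0, None))
print(count, total, maximum)  # 1000 499500 999
```

Whether this beats concurrent actions depends on the workload: one combined pass avoids re-reading the data, while concurrent actions can keep an under-utilized cluster busy.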