You don't need to worry about locks, since each thread/worker is exclusively responsible for one partition of the RDD. For shared state updates, you can use the Accumulator variables that Spark provides.
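To make the idea concrete, here is a minimal pure-Python sketch of that pattern (no Spark dependency, so the `Accumulator` class and function names below are illustrative, not Spark's actual API): each worker processes its own partition without locks and returns a local increment, and the driver merges those increments associatively, which is how Spark accumulators behave semantically.

```python
class Accumulator:
    """Add-only shared variable: workers add, only the driver reads."""
    def __init__(self, initial=0):
        self.value = initial

    def merge(self, increment):
        # Driver-side merge of one partition's local increment.
        self.value += increment

def process_partition(partition):
    """Runs on a worker: exclusive access to its partition; returns a
    local increment instead of touching shared state directly."""
    local_count = 0
    for record in partition:
        if record % 2 == 0:  # some per-record stat, e.g. count of even values
            local_count += 1
    return local_count

# Driver side: split the "RDD" into partitions, process each one
# independently (no locks needed), then merge the local results.
rdd = [list(range(0, 10)), list(range(10, 20)), list(range(20, 30))]
even_count = Accumulator()
for local in map(process_partition, rdd):
    even_count.merge(local)

print(even_count.value)  # 15 even numbers in 0..29
```

In real Spark code the update would happen inside a transformation or `foreachRDD`, and the driver would read `accumulator.value` after the action completes; the key point is the same as above: workers only add, and the merge is associative, so no distributed locking is required.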
On Mon Dec 08 2014 at 8:14:28 PM aditya.athalye <adbrihadarany...@gmail.com> wrote:
> I am relatively new to Spark. I am planning to use Spark Streaming for my
> OLAP use case, but I would like to know how RDDs are shared between
> multiple workers.
> If I need to constantly compute some stats on the streaming data,
> presumably shared state would have to be updated serially by different
> Spark workers. Is this managed by Spark automatically or does the
> application need to ensure distributed locks are acquired?
>
> Thanks
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Locking-for-shared-RDDs-tp20578.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.