Suppose I have an object to broadcast and then use it in a mapper function, something like the following (Python code):
    obj2share = sc.broadcast("Some object here")
    someRdd.map(createMapper(obj2share)).collect()

The createMapper function will create a mapper function using the shared object's value. Another way to do this is:

    someRdd.map(createMapper(obj2share.value)).collect()

Here the createMapper function directly uses the shared object's value to create the mapper function. Is there a difference on the Spark side between the two methods? If there is no difference at all, I'd prefer the second, because it hides Spark from the createMapper function. Thanks.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/correct-way-to-broadcast-a-variable-tp21631.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
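The difference between the two call patterns comes down to what the mapper closure captures. A minimal sketch, runnable without Spark: FakeBroadcast is a hypothetical stand-in for pyspark's Broadcast handle, and the two create_mapper_* functions are illustrative versions of the createMapper from the question, one capturing the handle and one capturing the raw value.

```python
class FakeBroadcast:
    """Hypothetical stand-in for pyspark.Broadcast: exposes the value via .value."""
    def __init__(self, value):
        self.value = value

def create_mapper_from_broadcast(bcast):
    # Spark-aware variant: the closure captures the broadcast handle
    # and dereferences .value only when the mapper runs.
    return lambda x: (x, bcast.value)

def create_mapper_from_value(value):
    # Spark-agnostic variant: the closure captures the raw value itself.
    return lambda x: (x, value)

obj2share = FakeBroadcast("Some object here")
mapper1 = create_mapper_from_broadcast(obj2share)
mapper2 = create_mapper_from_value(obj2share.value)

print(mapper1(1))  # (1, 'Some object here')
print(mapper2(1))  # (1, 'Some object here')
```

Both mappers produce the same results, but in real Spark the two forms are not equivalent in how data moves: calling obj2share.value on the driver embeds the raw value inside the serialized task closure, so it is shipped with every task, whereas capturing the Broadcast handle lets Spark distribute the value once per executor via the broadcast mechanism.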