Suppose I have an object that I broadcast and then use in a mapper function,
something like the following (Python code):

obj2share = sc.broadcast("Some object here")

someRdd.map(createMapper(obj2share)).collect()

The createMapper function creates a mapper function that uses the shared
object's value. Another way to do this is:

someRdd.map(createMapper(obj2share.value)).collect()

Here the createMapper function is given the shared object's value directly
and uses it to create the mapper function. Is there a difference on the Spark
side between the two approaches? If there is no difference at all, I'd prefer
the second, because it hides Spark from the createMapper function.
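
To make the comparison concrete, here is a minimal sketch of the two shapes
createMapper could take. The names StandInBroadcast, create_mapper_from_handle
and create_mapper_from_value are made up for illustration. StandInBroadcast
only mimics the .value attribute of a real broadcast variable so the snippet
runs without a cluster, and Python's built-in map stands in for someRdd.map,
so this shows what each closure captures, not how Spark ships it to workers.

```python
class StandInBroadcast:
    """Hypothetical stand-in for a Spark broadcast variable (only .value)."""

    def __init__(self, value):
        self.value = value


def create_mapper_from_handle(shared):
    # Variant 1: the closure captures the broadcast handle and reads
    # .value each time the mapper is called on a record.
    def mapper(record):
        return (record, shared.value)
    return mapper


def create_mapper_from_value(value):
    # Variant 2: the closure captures the plain value, so this function
    # never sees a Spark type.
    def mapper(record):
        return (record, value)
    return mapper


records = [1, 2, 3]  # assumed sample data in place of someRdd
shared = StandInBroadcast("Some object here")

via_handle = list(map(create_mapper_from_handle(shared), records))
via_value = list(map(create_mapper_from_value(shared.value), records))
```

Run locally, both variants produce the same pairs. The only difference is
what the returned mapper holds on to: the handle in the first case, the raw
value in the second. Whether that matters to Spark is what I am asking.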

Thanks.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/correct-way-to-broadcast-a-variable-tp21631.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
