I was surprised to find that if you declare the broadcast wrapper as a var
and overwrite it in the driver program, the change consistently affects every
transformation/action that was defined BEFORE the overwrite but is executed
lazily AFTER it:

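    val a = sc.parallelize(1 to 10)

    // Declared as a var so that it can be reassigned below.
    var broadcasted = sc.broadcast("broad")

    // map is lazy; nothing is computed until collect() is called.
    val b = a.map(_ + broadcasted.value)
    // b.persist()
    for (line <- b.collect()) { print(line) }

    println("\n=======================================")
    // Overwrite the var with a new broadcast; b itself is not redefined.
    broadcasted = sc.broadcast("cast")

    for (line <- b.collect()) { print(line) }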

The result is:

1broad2broad3broad4broad5broad6broad7broad8broad9broad10broad
=======================================
1cast2cast3cast4cast5cast6cast7cast8cast9cast10cast
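
For comparison, the same pattern can be reproduced in plain Scala without
Spark, using a lazy collection view in place of the RDD. This is only an
analogy (the names LazyCaptureDemo, render and suffix are made up for
illustration), and I am assuming, not asserting, that Spark's behaviour has a
similar cause: the closure refers to the var itself rather than to the value
it held when the closure was defined.

```scala
object LazyCaptureDemo {
  // Returns the rendered output before and after reassigning the var.
  def render(): (String, String) = {
    var suffix = "broad"

    // A view is lazy and is re-evaluated every time it is forced,
    // so the closure reads whatever `suffix` holds at that moment.
    val b = (1 to 3).view.map(x => s"$x$suffix")

    val before = b.mkString
    suffix = "cast" // overwrite AFTER b was defined
    val after = b.mkString

    (before, after)
  }

  def main(args: Array[String]): Unit = {
    val (before, after) = render()
    println(before)
    println("=======================================")
    println(after)
  }
}
```

Here the second evaluation also picks up the overwritten value, even though b
was defined before the overwrite.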

Of course, if you persist b before the overwrite, you still get the
unsurprising result (both runs print 1broad...10broad, because the second
collect() reads the persisted data instead of recomputing it). This behaviour
can be useful sometimes but confusing at other times: people can no longer
add persist() freely as a safeguard, because doing so may change the result.
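
If the goal is for b to keep using the original broadcast without relying on
persist(), one option is to copy the current handle into a val before defining
the transformation. The following is a minimal sketch rather than a
recommended pattern; the names PinnedBroadcast, run and pinned are
hypothetical, and it assumes a local master just so that it is self-contained.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PinnedBroadcast {
  // Returns the collected output before and after the var is overwritten.
  def run(sc: SparkContext): (String, String) = {
    val a = sc.parallelize(1 to 3)

    var broadcasted = sc.broadcast("broad")

    // Immutable snapshot of the current handle: the closure below refers
    // to `pinned`, not to the reassignable `broadcasted`.
    val pinned = broadcasted
    val b = a.map(x => s"$x${pinned.value}")

    val before = b.collect().mkString
    broadcasted = sc.broadcast("cast") // no longer visible to b
    val after = b.collect().mkString

    (before, after)
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions made for this sketch.
    val conf = new SparkConf().setAppName("pinned-broadcast").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val (before, after) = run(sc)
      println(before)
      println("=======================================")
      println(after)
    } finally {
      sc.stop()
    }
  }
}
```

With the handle pinned in a val, both collect() calls should print the
"broad" variant whether or not b is persisted.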

So far I've found no documentation describing this behaviour. Can someone
confirm whether it is a deliberately designed feature?

Yours Peng 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Bug-or-feature-Overwrite-broadcasted-variables-tp12315.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
