GitHub user viirya commented on the pull request:

    https://github.com/apache/spark/pull/2217#issuecomment-54278826

@rxin, I need a way to modify broadcast variables locally and keep the modified values for later use. The locally modified variables store values calculated at an earlier stage of a machine learning algorithm, and those values are used at later stages.

Specifically, in its first stage the algorithm uses mapPartitionsWithIndex to calculate a different parameter P for each data partition. In a later stage, the algorithm uses P to perform learning. With the current broadcast variables, I have to collect the values calculated in the earlier stage and re-broadcast them to the later stages. Because broadcast variables are currently immutable, the earlier stage cannot modify them locally for individual partitions.

So I am wondering whether we can provide a mechanism that lets tasks keep locally mutable values for different partitions. That is why I modified the broadcast interface to provide this functionality. However, it may be better to separate it from the broadcast module.
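
For reference, the collect-and-re-broadcast workaround described in this comment can be sketched roughly as follows. This is plain Python with no Spark dependency: partitions are modeled as lists, and `map_partitions_with_index` is only a local stand-in for the RDD operation. The functions `compute_p` (a per-partition mean) and `learn` (centering each value on P) are hypothetical placeholders, since the comment does not say what P is or how the learning stage uses it.

```python
from typing import Callable, Dict, Iterator, List, Tuple


def map_partitions_with_index(
    partitions: List[List[float]],
    f: Callable[[int, Iterator[float]], Iterator],
) -> List[list]:
    """Local stand-in for RDD.mapPartitionsWithIndex: apply f to each partition."""
    return [list(f(i, iter(part))) for i, part in enumerate(partitions)]


def compute_p(index: int, rows: Iterator[float]) -> Iterator[Tuple[int, float]]:
    """Stage 1 (hypothetical): P is assumed to be the mean of the partition."""
    values = list(rows)
    if values:  # skip empty partitions
        yield (index, sum(values) / len(values))


def learn(p_by_partition: Dict[int, float]) -> Callable[[int, Iterator[float]], Iterator[float]]:
    """Stage 2 (hypothetical): use the partition's own P to center its values."""
    def stage(index: int, rows: Iterator[float]) -> Iterator[float]:
        p = p_by_partition[index]  # read-only lookup, like reading a broadcast value
        for x in rows:
            yield x - p
    return stage


partitions = [[1.0, 2.0, 3.0], [10.0, 20.0]]

# Stage 1: compute one P per partition.
stage1 = map_partitions_with_index(partitions, compute_p)

# "Collect" the per-partition results into one map (on the driver, in Spark terms).
p_by_partition = dict(pair for part in stage1 for pair in part)

# "Re-broadcast": every later task receives the whole map but needs only its own entry.
stage2 = map_partitions_with_index(partitions, learn(p_by_partition))
```

The extra round trip through the collected map is the cost the comment is pointing at: each partition's P is shipped out and back even though only the task for that partition ever reads it.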