[ https://issues.apache.org/jira/browse/SPARK-18735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-18735.
-------------------------------
    Resolution: Invalid

(Please ask questions on the mailing list)

Two reasons:

1) if the computation that uses the broadcast isn't triggered within the loop iteration, then you will be removing the broadcast before it's used
2) destroy()-ing even after the broadcast has definitely been used to compute, say, a cached RDD, is a problem because the cached RDD might still be recomputed and a destroyed broadcast can't be used. unpersist() is more correct in this case.

> Why don't we destroy the broadcast variable after each iteration?
> -----------------------------------------------------------------
>
>                 Key: SPARK-18735
>                 URL: https://issues.apache.org/jira/browse/SPARK-18735
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 2.0.2
>            Reporter: Jianfei Wang
>
> I think we should destroy the broadcast variable bcWeights explicitly in
> spark.mllib.GradientDescent.runMiniBatchSGD
> see the code below:
>     while (!converged && i <= numIterations) {
>       val start = System.nanoTime()
>       val bcWeights = data.context.broadcast(weights)
>       // some other code
>     }
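
The two reasons given in the resolution can be illustrated with a minimal sketch. This is not the MLlib code from runMiniBatchSGD: the object name, the toy data, the step size 0.1 and the update rule are all assumptions made for illustration. It only uses SparkContext.broadcast, Broadcast.value and Broadcast.unpersist(). The action (sum()) runs inside the iteration, so the broadcast is read before it is released, and unpersist() is used instead of destroy() so that a later recomputation could still have the value re-sent from the driver.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; not part of Spark or MLlib.
object BroadcastLifecycleSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("broadcast-lifecycle-sketch")
      .setMaster("local[2]") // assumption: local mode, for illustration only
    val sc = new SparkContext(conf)
    try {
      // Toy data set standing in for the training data.
      val data = sc.parallelize(1 to 1000).map(_.toDouble).cache()
      val n = data.count()

      var weight = 0.0
      var i = 1
      while (i <= 5) {
        val bcWeight = sc.broadcast(weight)

        // sum() is an action, so the job that reads bcWeight.value
        // runs here, inside the same iteration (reason 1).
        val residualSum = data.map(x => x - bcWeight.value).sum()

        // Toy update that moves the weight toward the mean of the data.
        weight += 0.1 * residualSum / n

        // Release executor copies only. The driver keeps the value, so a
        // task that is re-run later can still fetch it (reason 2).
        // bcWeight.destroy() would make any later use fail.
        bcWeight.unpersist()
        i += 1
      }
      println(s"final weight: $weight")
    } finally {
      sc.stop()
    }
  }
}
```

If the map over bcWeight.value were instead stored as a cached RDD and reused after the loop, destroy() would be unsafe for the reason stated in point 2: a lost partition would have to be recomputed from a broadcast that no longer exists.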