Github user ho3rexqj commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20183#discussion_r161132057

    --- Diff: core/src/main/scala/org/apache/spark/broadcast/BroadcastManager.scala ---
    @@ -52,6 +54,10 @@ private[spark] class BroadcastManager(

       private val nextBroadcastId = new AtomicLong(0)

    +  private[broadcast] val cachedValues = {
    +    new ReferenceMap(AbstractReferenceMap.HARD, AbstractReferenceMap.WEAK)
    --- End diff --

    Suppose the first thread to request the broadcast variable's value destroyed its instance of the broadcast variable (which, I believe, is what will happen when that thread finishes processing its partition) - if the key were a weak reference in the above cache, it would become eligible for GC at that point. I'm reasonably certain that the associated key/value pair would then be removed from the cache; in other words, if the key were a weak reference, the key/value pair would be removed as soon as the key **or** the value was garbage collected.

    Note that I haven't used ReferenceMap extensively, so I could be wrong about the above - feel free to correct me if that's the case.
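
The scenario above can be sketched roughly as follows. This is an illustration only, not code from the pull request: it assumes commons-collections 3.x is on the classpath (the `org.apache.commons.collections.map` package), `BlockId` is a hypothetical stand-in for the real cache key, and `System.gc()` is only a hint to the JVM, so the sizes printed at the end are not guaranteed.

```scala
import org.apache.commons.collections.map.{AbstractReferenceMap, ReferenceMap}

object WeakKeySketch {
  // Hypothetical stand-in for the cache key; not Spark's actual class.
  final case class BlockId(id: Long)

  def main(args: Array[String]): Unit = {
    // Weak keys and weak values: the entry can go once either side is collected.
    val weakKeys = new ReferenceMap(AbstractReferenceMap.WEAK, AbstractReferenceMap.WEAK)
    // Hard keys and weak values, as in the diff: only the value's reachability matters.
    val hardKeys = new ReferenceMap(AbstractReferenceMap.HARD, AbstractReferenceMap.WEAK)

    // The cached value stays strongly reachable for the whole run.
    val value = new Array[Byte](1024)

    // The "first thread" holds the only strong reference to its key instance.
    var firstThreadKey = BlockId(0L)
    weakKeys.put(firstThreadKey, value)
    hardKeys.put(BlockId(0L), value)

    // The first thread finishes and drops its key; the value is still in use elsewhere.
    firstThreadKey = null
    System.gc()        // a hint only; collection is not guaranteed
    Thread.sleep(100)

    // Cleared entries are purged when the map is accessed, e.g. by size().
    println(s"weak-key map size: ${weakKeys.size()}")
    println(s"hard-key map size: ${hardKeys.size()}")
    println(s"value still reachable, length = ${value.length}")
  }
}
```

If the collector does run, the weak-key map would be expected to lose its entry even though the value is still in use, while the hard-key map keeps its entry for as long as the value is reachable. That difference is the argument for a hard key here.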