Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22995#discussion_r236721318 --- Diff: core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala --- @@ -92,8 +95,15 @@ private[spark] class TorrentBroadcast[T: ClassTag](obj: T, id: Long) /** The checksum for all the blocks. */ private var checksums: Array[Int] = _ - override protected def getValue() = { - _value + override protected def getValue() = synchronized { --- End diff -- Hm, that `TorrentBroadcast.synchronized` is from a very old version of the code in 2013, when more things used that lock. I'm pretty certain it's obsolete. However this code accesses broadcastCache and that needs synchronization. (It's kind of unfortunate where this object is). I think we could actually improve this by locking on broadcastCache for basically the whole block. I think that's a safe improvement as nothing else uses broadcastCache, nothing else is synchronized here, and should behave just the same.
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org