Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22995#discussion_r236721318
  
    --- Diff: 
core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala ---
    @@ -92,8 +95,15 @@ private[spark] class TorrentBroadcast[T: ClassTag](obj: 
T, id: Long)
       /** The checksum for all the blocks. */
       private var checksums: Array[Int] = _
     
    -  override protected def getValue() = {
    -    _value
    +  override protected def getValue() = synchronized {
    --- End diff --
    
    Hm, that `TorrentBroadcast.synchronized` is from a very old version of the 
code in 2013, when more things used that lock. I'm pretty certain it's 
obsolete. However this code accesses broadcastCache and that needs 
synchronization. (It's kind of unfortunate where this object is). I think we 
could actually improve this by locking on broadcastCache for basically the 
whole block. I think that's a safe improvement as nothing else uses 
broadcastCache, nothing else is synchronized here, and should behave just the 
same.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to