[ https://issues.apache.org/jira/browse/SPARK-26851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-26851.
---------------------------------
    Resolution: Fixed
 Fix Version/s: 3.0.0

Issue resolved by pull request 23768
[https://github.com/apache/spark/pull/23768]

> CachedRDDBuilder only partially implements double-checked locking
> -----------------------------------------------------------------
>
>                 Key: SPARK-26851
>                 URL: https://issues.apache.org/jira/browse/SPARK-26851
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0, 3.0.0
>            Reporter: Bruce Robbins
>            Assignee: Bruce Robbins
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> In CachedRDDBuilder, {{cachedColumnBuffers}} uses double-checked locking to
> lazily initialize {{_cachedColumnBuffers}}. Likewise, {{clearCache}} uses
> double-checked locking, presumably to avoid synchronization when
> {{_cachedColumnBuffers}} is still null.
> However, the resource (in this case, {{_cachedColumnBuffers}}) is not
> declared volatile, which can cause visibility problems, particularly in
> {{clearCache}}, which may see a null reference when there is actually an RDD.
> From Java Concurrency in Practice by Brian Goetz et al:
> {quote}Subsequent changes in the JMM (Java 5.0 and later) have enabled DCL to
> work if resource is made volatile, and the performance impact of this is
> small since volatile reads are usually only slightly more expensive than
> nonvolatile reads.
> {quote}
> Other documentation notes that volatile is not needed if the resource is
> immutable. While an RDD is immutable from a Spark user's point of view, it
> may not be from the JVM's point of view, since not all of its internal
> fields are final.
> I've marked this as minor since the race conditions are highly unlikely.


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
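For readers unfamiliar with the pattern the issue describes, the following is a minimal sketch of double-checked locking with a volatile resource, as Java Concurrency in Practice prescribes. The class and field names (`LazyCache`, `cached`, `clear`) are illustrative only and are not Spark's actual `CachedRDDBuilder` code; the shape of `get` and `clear` mirrors what the report says `cachedColumnBuffers` and `clearCache` do.

```java
// Sketch of double-checked locking done correctly under the post-Java-5 JMM.
// Names here are hypothetical; they only mirror the pattern in the report.
public class LazyCache {
    // Without 'volatile' a reader can observe a stale null (the bug described
    // for clearCache) or a partially constructed object. Marking the field
    // volatile restores the happens-before edge between writer and readers.
    private volatile Object cached = null;

    public Object get() {
        Object local = cached;          // one volatile read for the fast path
        if (local == null) {
            synchronized (this) {
                local = cached;         // re-check under the lock
                if (local == null) {
                    local = new Object();
                    cached = local;     // volatile write publishes safely
                }
            }
        }
        return local;
    }

    public void clear() {
        if (cached != null) {           // volatile read skips locking when empty
            synchronized (this) {
                cached = null;
            }
        }
    }
}
```

The local variable in `get` is a common refinement from the same book: it limits the fast path to a single volatile read instead of two.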