cloud-fan commented on a change in pull request #23644: [SPARK-26708][SQL] Incorrect result caused by inconsistency between a SQL cache's cached RDD and its physical plan
URL: https://github.com/apache/spark/pull/23644#discussion_r251281049
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala
 ##########
 @@ -180,7 +180,26 @@ class CacheManager extends Logging {
       val it = cachedData.iterator()
       while (it.hasNext) {
         val cd = it.next()
-        if (condition(cd.plan)) {
+        // If `clearCache` is false (which means the recache request comes from a non-cascading
+        // cache invalidation) and the cache buffer has already been loaded, we do not need to
+        // re-compile a physical plan because the old plan will not be used any more by the
+        // CacheManager although it still lives in compiled `Dataset`s and it could still work.
 
 Review comment:
   > the old plan will not be used any more by the CacheManager
   
   How do we guarantee it?
