cloud-fan commented on a change in pull request #23644: [SPARK-26708][SQL] Incorrect result caused by inconsistency between a SQL cache's cached RDD and its physical plan
URL: https://github.com/apache/spark/pull/23644#discussion_r251281049

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala
##########
@@ -180,7 +180,26 @@ class CacheManager extends Logging {
     val it = cachedData.iterator()
     while (it.hasNext) {
       val cd = it.next()
-      if (condition(cd.plan)) {
+      // If `clearCache` is false (which means the recache request comes from a non-cascading
+      // cache invalidation) and the cache buffer has already been loaded, we do not need to
+      // re-compile a physical plan because the old plan will not be used any more by the
+      // CacheManager although it still lives in compiled `Dataset`s and it could still work.

Review comment:
> the old plan will not be used any more by the CacheManager

How do we guarantee it?