GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/21594
[SPARK-24596][SQL] Non-cascading Cache Invalidation

## What changes were proposed in this pull request?

1. Add a 'cascade' parameter to CacheManager.uncacheQuery(). In 'cascade=false' mode, only the current cache is invalidated; for other dependent caches, the execution plan is rebuilt and the cached buffer is reused.
2. Pass true/false from callers in the different uncache scenarios:
   - Drop tables and regular (persistent) views: regular (cascading) mode
   - Drop temporary views: non-cascading mode
   - Modify table contents (INSERT/UPDATE/MERGE/DELETE): regular (cascading) mode
   - Call Dataset.unpersist(): non-cascading mode

Note that a regular (persistent) view is a database object just like a table, so after a regular view is dropped (whether cached or not), any query referring to that view should no longer be valid. Hence, if a cached persistent view is dropped, we need to invalidate all dependent caches so that exceptions are thrown for any later reference. A temporary view, on the other hand, is in effect equivalent to an unnamed Dataset, and dropping a temporary view should have no impact on queries referencing that view. Thus we should do non-cascading uncaching for temporary views, which also guarantees consistent uncaching behavior between temporary views and unnamed Datasets.

## How was this patch tested?

New tests in CachedTableSuite and DatasetCacheSuite.
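
To make the difference between the two modes concrete, here is a minimal sketch in Python. It is a toy model only, not Spark's CacheManager: the names `ToyCacheManager`, `CacheEntry`, `depends_on` and `plan_version` are hypothetical and exist purely for illustration. It assumes that cache dependencies form a DAG, that "rebuilding the execution plan" can be represented by bumping a version counter, and that the cached buffer is an opaque value that survives a non-cascading uncache.

```python
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    """A cached query in this toy model (hypothetical, not a Spark class)."""
    name: str
    depends_on: set = field(default_factory=set)  # names of caches this plan reads from
    buffer: str = ""                               # stands in for the cached data
    plan_version: int = 1                          # bumped whenever the plan is rebuilt


class ToyCacheManager:
    def __init__(self):
        self.entries = {}

    def cache(self, name, depends_on=()):
        self.entries[name] = CacheEntry(name, set(depends_on), buffer=f"data({name})")

    def _dependents(self, name):
        """Names of all cached entries that directly or transitively read from `name`."""
        found, frontier = set(), {name}
        while frontier:
            frontier = {e.name for e in self.entries.values()
                        if e.depends_on & frontier and e.name not in found}
            found |= frontier
        return found

    def uncache(self, name, cascade):
        dependents = self._dependents(name)
        self.entries.pop(name, None)  # the targeted cache is always invalidated
        for dep in dependents:
            if cascade:
                # Cascading mode: dependent caches are invalidated as well.
                del self.entries[dep]
            else:
                # Non-cascading mode: keep the buffer, only rebuild the plan.
                self.entries[dep].plan_version += 1


manager = ToyCacheManager()
manager.cache("base_view")
manager.cache("derived", depends_on=["base_view"])
manager.uncache("base_view", cascade=False)
print(sorted(manager.entries))  # ['derived']
```

In terms of the scenarios listed above, dropping a temporary view or calling unpersist() would correspond to `cascade=False` in this model, while dropping a table or persistent view, or modifying table contents, would correspond to `cascade=True`.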
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maryannxue/spark noncascading-cache

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21594.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21594

----
commit 27e484b97ec5f9fdbfdaa5c8c1d9f45233cbbdbe
Author: Maryann Xue <maryannxue@...>
Date:   2018-06-19T04:32:11Z

    noncascading cache

commit 483008c577c0ec7335b0a9a1c567f60311bb83a6
Author: Maryann Xue <maryannxue@...>
Date:   2018-06-19T18:18:06Z

    code refine

commit a782aacd5d4943b8bbfadde27a9c9e9d30c223fe
Author: Maryann Xue <maryannxue@...>
Date:   2018-06-19T18:24:57Z

    Merge remote-tracking branch 'origin/master' into noncascading-cache

commit 0cd8dc10eb85b6df1704e13084f53f9cefe410b3
Author: Maryann Xue <maryannxue@...>
Date:   2018-06-19T21:36:29Z

    refine test cases

commit 71b93ed598833d760955e972894685c089af297b
Author: Maryann Xue <maryannxue@...>
Date:   2018-06-19T22:19:05Z

    refine test cases

----