[ https://issues.apache.org/jira/browse/SPARK-24596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-24596:
------------------------------------

    Assignee: Apache Spark

> Non-cascading Cache Invalidation
> --------------------------------
>
>                 Key: SPARK-24596
>                 URL: https://issues.apache.org/jira/browse/SPARK-24596
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Maryann Xue
>            Assignee: Apache Spark
>            Priority: Major
>             Fix For: 2.4.0
>
>
> When invalidating a cache, we also invalidate other caches that depend on it
> to ensure cached data is up to date. For example, when the underlying table
> has been modified or the table itself has been dropped, all caches that use
> this table should be invalidated or refreshed.
> However, in other cases, such as when a user simply wants to drop a cache to
> free up memory, we do not need to invalidate dependent caches, since no
> underlying data has changed. For this reason, we would like to introduce a
> new cache invalidation mode: non-cascading cache invalidation. We choose
> between the existing mode and the new mode for different cache invalidation
> scenarios:
> # Drop tables and regular (persistent) views: regular mode
> # Drop temporary views: non-cascading mode
> # Modify table contents (INSERT/UPDATE/MERGE/DELETE): regular mode
> # Call Dataset.unpersist(): non-cascading mode
> Note that a regular (persistent) view is a database object just like a table,
> so after dropping a regular view (whether cached or not), any query
> referring to that view should no longer be valid. Hence, if a cached
> persistent view is dropped, we need to invalidate all dependent caches so
> that exceptions will be thrown for any later reference. On the other hand, a
> temporary view is in fact equivalent to an unnamed Dataset, and dropping a
> temporary view should have no impact on queries referencing that view.
> Thus we should do non-cascading uncaching for temporary views, which also
> guarantees consistent uncaching behavior between temporary views and
> unnamed Datasets.
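
The difference between the two modes described in the issue can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's cache manager: `ToyCacheManager`, `uncache`, `build_example`, and the keys of `CASCADE_BY_SCENARIO` are hypothetical names introduced for illustration. The scenario-to-mode mapping itself is taken from the four cases listed in the description.

```python
from dataclasses import dataclass, field

# Scenario -> whether invalidation cascades to dependent caches.
# The key names are hypothetical; the mapping follows the issue description.
CASCADE_BY_SCENARIO = {
    "drop_table_or_persistent_view": True,   # regular mode
    "drop_temporary_view": False,            # non-cascading mode
    "modify_table_contents": True,           # regular mode
    "dataset_unpersist": False,              # non-cascading mode
}


@dataclass
class ToyCacheManager:
    """Toy stand-in for a cache registry (an assumption, not Spark's API)."""

    # cache entry name -> names of the objects it was built from
    deps: dict = field(default_factory=dict)

    def cache(self, name, depends_on=()):
        self.deps[name] = set(depends_on)

    def uncache(self, name, cascade):
        """Remove `name`; if `cascade`, also remove every transitive dependent."""
        to_remove = {name}
        if cascade:
            changed = True
            while changed:
                changed = False
                for entry, parents in self.deps.items():
                    if entry not in to_remove and parents & to_remove:
                        to_remove.add(entry)
                        changed = True
        removed = {entry for entry in to_remove if entry in self.deps}
        for entry in removed:
            del self.deps[entry]
        return removed


def build_example():
    """One cached view plus one cache derived from it."""
    manager = ToyCacheManager()
    manager.cache("base_view")
    manager.cache("derived", depends_on=["base_view"])
    return manager


if __name__ == "__main__":
    m = build_example()
    mode = CASCADE_BY_SCENARIO["drop_temporary_view"]
    print(sorted(m.uncache("base_view", cascade=mode)))  # ['base_view']

    m = build_example()
    mode = CASCADE_BY_SCENARIO["drop_table_or_persistent_view"]
    print(sorted(m.uncache("base_view", cascade=mode)))  # ['base_view', 'derived']
```

In the non-cascading case the derived entry survives, which matches the motivation in the description: freeing memory for one cache does not change any underlying data, so dependents can stay valid.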