[ https://issues.apache.org/jira/browse/SPARK-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust resolved SPARK-3212.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.2.0

Issue resolved by pull request 2501
[https://github.com/apache/spark/pull/2501]

> Improve the clarity of caching semantics
> ----------------------------------------
>
>                 Key: SPARK-3212
>                 URL: https://issues.apache.org/jira/browse/SPARK-3212
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Michael Armbrust
>            Assignee: Michael Armbrust
>            Priority: Blocker
>             Fix For: 1.2.0
>
>
> Right now there are a bunch of different ways to cache tables in Spark SQL.
> For example:
>  - tweets.cache()
>  - sql("SELECT * FROM tweets").cache()
>  - table("tweets").cache()
>  - tweets.cache().registerTempTable("tweets")
>  - sql("CACHE TABLE tweets")
>  - cacheTable("tweets")
> Each of the above commands has subtly different semantics, leading to a very
> confusing user experience. Ideally, we would stop doing caching based on
> simple table names and instead have a phase of optimization that does
> intelligent matching of query plans with available cached data.
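
To illustrate the idea in the last sentence of the description (matching query plans against cached data rather than table names), here is a minimal sketch in plain Python. It is a toy model, not Spark's implementation and not the code from the pull request: the names Table, Project, Cached, normalize, and CacheManager are all hypothetical and were chosen only for this example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical, heavily simplified logical plan nodes (not Spark classes).
@dataclass(frozen=True)
class Table:
    name: str


@dataclass(frozen=True)
class Project:
    columns: Tuple[str, ...]  # ("*",) stands for "all columns"
    child: "Plan"


@dataclass(frozen=True)
class Cached:
    plan: "Plan"  # marks a subtree that would be served from cached data


Plan = Union[Table, Project, Cached]


def normalize(plan):
    """Drop no-op SELECT * projections so equivalent plans compare equal."""
    if isinstance(plan, Project):
        child = normalize(plan.child)
        if plan.columns == ("*",):
            return child
        return Project(plan.columns, child)
    return plan


class CacheManager:
    """Tracks cached plans and rewrites queries to reuse them."""

    def __init__(self):
        self._cached = set()

    def cache(self, plan):
        # Cache by normalized plan, not by whatever name the user used.
        self._cached.add(normalize(plan))

    def use_cached_data(self, plan):
        # Replace every subtree that matches a cached plan.
        return self._substitute(normalize(plan))

    def _substitute(self, plan):
        if plan in self._cached:
            return Cached(plan)
        if isinstance(plan, Project):
            return Project(plan.columns, self._substitute(plan.child))
        return plan
```

In this sketch, caching the plan for SELECT * FROM tweets, i.e. Project(("*",), Table("tweets")), and later querying Table("tweets") directly both resolve to the same cached entry, because the lookup compares normalized plans rather than the name or call used to create the cache. A real optimizer would need a much richer notion of plan equivalence than this single normalization rule.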