Lantao Jin created SPARK-32063:
----------------------------------

             Summary: Spark native temporary table
                 Key: SPARK-32063
                 URL: https://issues.apache.org/jira/browse/SPARK-32063
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.1.0
            Reporter: Lantao Jin


Many databases and data warehouse SQL engines support temporary tables. A 
temporary table, as its name implies, is a short-lived table whose lifetime 
is limited to the current session.

Spark has no temporary tables. The DDL “CREATE TEMPORARY TABLE AS 
SELECT” will create a temporary view. A temporary view is very different 
from a temporary table. 

A temporary view is just a view. It doesn’t materialize data in storage, so 
it has the following shortcomings:
 # A view does not improve performance. Materializing intermediate data in 
temporary tables for a complex query will accelerate queries, especially in an 
ETL pipeline.
 # A view that calls other views can cause severe performance issues. 
Executing a very complex view may even fail in Spark. 
 # A temporary view has no database namespace. In some complex ETL pipelines or 
data warehouse applications, the lack of a database prefix is inconvenient. 
Such applications need tables that are used only in the current session.
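
The materialization difference described above can be illustrated with a 
small sketch. It is not Spark code: it assumes SQLite (through Python's 
standard sqlite3 module) as a stand-in for an engine that already supports 
both temporary views and temporary tables, and the table and column names 
(orders, v_total, t_total) are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for an engine with real temporary tables.
# This is not Spark and does not reflect the proposed Spark implementation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 20)])

# A temporary view stores only the query text; nothing is materialized.
conn.execute(
    "CREATE TEMP VIEW v_total AS SELECT SUM(amount) AS total FROM orders"
)
# A temporary table stores the query result as of creation time.
conn.execute(
    "CREATE TEMP TABLE t_total AS SELECT SUM(amount) AS total FROM orders"
)

# Change the base table after both objects exist.
conn.execute("INSERT INTO orders VALUES (3, 30)")

view_total = conn.execute("SELECT total FROM v_total").fetchone()[0]
table_total = conn.execute("SELECT total FROM t_total").fetchone()[0]

print(view_total)   # the view is recomputed against the current base table
print(table_total)  # the table still holds the materialized snapshot
```

The view re-runs its query on every read, while the temporary table is 
computed once and then read from storage, which is the behavior an ETL 
pipeline relies on when it stages intermediate results.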

 

More details are described in the [design 
doc|https://docs.google.com/document/d/1RS4Q3VbxlZ_Yy0fdWgTJ-k0QxFd1dToCqpLAYvIJ34U/edit?usp=sharing].



