[ https://issues.apache.org/jira/browse/SPARK-32063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143233#comment-17143233 ]
L. C. Hsieh commented on SPARK-32063:
-------------------------------------

For 1 and 2, these both seem to be performance concerns. In Spark, we have a caching mechanism that materializes a complex query; I think it can compensate for the shortcomings of temporary views. For 3, I'm not sure about this point. Can you elaborate on it more?

> Spark native temporary table
> ----------------------------
>
>                 Key: SPARK-32063
>                 URL: https://issues.apache.org/jira/browse/SPARK-32063
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Lantao Jin
>            Priority: Major
>
> Many databases and data warehouse SQL engines support temporary tables. A
> temporary table, as its name implies, is a short-lived table whose lifetime
> is limited to the current session.
> In Spark, there is no temporary table: the DDL "CREATE TEMPORARY TABLE AS
> SELECT" creates a temporary view instead. A temporary view is totally
> different from a temporary table.
> A temporary view is just a VIEW. It doesn't materialize data in storage, so
> it has the following shortcomings:
> # A view does not improve performance. Materializing the intermediate data of
> a complex query in temporary tables would accelerate queries, especially in
> an ETL pipeline.
> # A view that calls other views can cause severe performance issues; executing
> a very complex view may even fail in Spark.
> # A temporary view has no database namespace. In some complex ETL pipelines or
> data warehouse applications, working without a database prefix is
> inconvenient; such applications need tables that are only used in the current
> session.
>
> More details are described in the [Design
> Docs|https://docs.google.com/document/d/1RS4Q3VbxlZ_Yy0fdWgTJ-k0QxFd1dToCqpLAYvIJ34U/edit?usp=sharing]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
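The caching workaround mentioned in the comment can be sketched in Spark SQL. This is a minimal illustration (the table and view names are hypothetical): `CREATE TEMPORARY VIEW` only registers a query definition, while `CACHE TABLE` materializes its result for the current session, approximating what a native temporary table would give in points 1 and 2:

```sql
-- A temporary view registers only the query plan; nothing is materialized,
-- so every downstream query re-executes the aggregation.
CREATE OR REPLACE TEMPORARY VIEW stage1 AS
SELECT customer_id, SUM(amount) AS total
FROM orders
GROUP BY customer_id;

-- CACHE TABLE materializes the view's result for this session, so later
-- queries in the pipeline reuse the computed data instead of the full plan.
CACHE TABLE stage1;

-- Subsequent pipeline steps read the cached data.
SELECT * FROM stage1 WHERE total > 1000;

-- Release the cached data when the pipeline step is done.
UNCACHE TABLE stage1;
```

Note that, unlike a true temporary table, the cached view still has no database namespace (point 3 in the description), and the cache lives in Spark's storage memory rather than in a table in the catalog.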