[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232414#comment-15232414 ]
Cheng Lian edited comment on SPARK-14488 at 4/8/16 4:17 PM:
------------------------------------------------------------

Discussed with [~yhuai] offline; here's the summary:

{{CreateTempTableUsingAsSelect}} has existed since 1.3 (I'm surprised that I never noticed it!). Its semantics are:

# Execute the {{SELECT}} query.
# Store the query result at a user-specified location in the filesystem. Note that this means the {{PATH}} data source option should always be set when using this DDL command.
# Create a temporary table over the written files.

Basically, it can be used to dump query results to the filesystem without creating persisted tables. It's indeed a confusing command and is roughly equivalent to the following DDL sequence:

- {{INSERT OVERWRITE DIRECTORY ... STORE AS ... SELECT ...}}
- {{CREATE TEMPORARY TABLE ... USING ... OPTION (PATH ...)}}

However, Spark hasn't implemented {{INSERT OVERWRITE DIRECTORY}} yet. In the long run, we should implement it and deprecate this confusing DDL command.

Ticket title and description were updated accordingly.
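The three steps above can be sketched with the public DataFrame API (a rough sketch only; the table names and the {{/tmp/y}} path are illustrative, and this assumes the Spark 1.x {{SQLContext}} API available in a Spark shell):

{code}
// Roughly what CREATE TEMPORARY TABLE y USING PARQUET OPTIONS (PATH '/tmp/y')
// AS SELECT * FROM x should do, expressed step by step.
// The path "/tmp/y" is an illustrative placeholder.

// 1. Execute the SELECT query.
val result = sqlContext.sql("SELECT * FROM x")

// 2. Store the query result at the user-specified PATH.
result.write.format("parquet").save("/tmp/y")

// 3. Create a temporary table over the written files.
sqlContext.read.format("parquet").load("/tmp/y").registerTempTable("y")
{code}

Note that only step 3 should register anything in the session catalog, and only as a temporary table; nothing should touch the metastore.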
> "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-14488
>                 URL: https://issues.apache.org/jira/browse/SPARK-14488
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Cheng Lian
>            Assignee: Cheng Lian
>
> The following Spark shell snippet reproduces this bug:
> {code}
> sqlContext.range(10).registerTempTable("x")
> // The problematic DDL statement:
> sqlContext.sql("CREATE TEMPORARY TABLE y USING PARQUET AS SELECT * FROM x")
> sqlContext.tables().show()
> {code}
> It shows the following result:
> {noformat}
> +---------+-----------+
> |tableName|isTemporary|
> +---------+-----------+
> |        y|      false|
> |        x|       true|
> +---------+-----------+
> {noformat}
> Note that {{y}} is NOT temporary although it's created using {{CREATE TEMPORARY TABLE ...}}.
> Explain shows that the physical plan node is {{CreateTableUsingAsSelect}} rather than {{CreateTempTableUsingAsSelect}}.
> {noformat}
> == Parsed Logical Plan ==
> 'CreateTableUsingAsSelect `y`, PARQUET, true, [Ljava.lang.String;@4d001a14, None, Overwrite, Map()
> +- 'Project [*]
>    +- 'UnresolvedRelation `x`, None
>
> == Analyzed Logical Plan ==
> CreateTableUsingAsSelect `y`, PARQUET, true, [Ljava.lang.String;@4d001a14, None, Overwrite, Map()
> +- Project [id#0L]
>    +- SubqueryAlias x
>       +- Range 0, 10, 1, 1, [id#0L]
>
> == Optimized Logical Plan ==
> CreateTableUsingAsSelect `y`, PARQUET, true, [Ljava.lang.String;@4d001a14, None, Overwrite, Map()
> +- Range 0, 10, 1, 1, [id#0L]
>
> == Physical Plan ==
> ExecutedCommand CreateMetastoreDataSourceAsSelect `y`, PARQUET, [Ljava.lang.String;@4d001a14, None, Overwrite, Map(), Range 0, 10, 1, 1, [id#0L]
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org