[jira] [Commented] (SPARK-14488) "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235593#comment-15235593 ]

Apache Spark commented on SPARK-14488:

User 'liancheng' has created a pull request for this issue:
https://github.com/apache/spark/pull/12303

> "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
>
>                 Key: SPARK-14488
>                 URL: https://issues.apache.org/jira/browse/SPARK-14488
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Cheng Lian
>            Assignee: Cheng Lian
>            Priority: Critical
>
> The following Spark shell snippet reproduces this bug:
> {code}
> sqlContext range 10 registerTempTable "x"
> // The problematic DDL statement:
> sqlContext sql "CREATE TEMPORARY TABLE y USING PARQUET AS SELECT * FROM x"
> sqlContext.tables().show()
> {code}
> It shows the following result:
> {noformat}
> +---------+-----------+
> |tableName|isTemporary|
> +---------+-----------+
> |        y|      false|
> |        x|       true|
> +---------+-----------+
> {noformat}
> Note that {{y}} is NOT temporary although it's created using {{CREATE TEMPORARY TABLE ...}}.
> Explain shows that the physical plan node is {{CreateTableUsingAsSelect}} rather than {{CreateTempTableUsingAsSelect}}.
> {noformat}
> == Parsed Logical Plan ==
> 'CreateTableUsingAsSelect `y`, PARQUET, true, [Ljava.lang.String;@4d001a14, None, Overwrite, Map()
> +- 'Project [*]
>    +- 'UnresolvedRelation `x`, None
> == Analyzed Logical Plan ==
> CreateTableUsingAsSelect `y`, PARQUET, true, [Ljava.lang.String;@4d001a14, None, Overwrite, Map()
> +- Project [id#0L]
>    +- SubqueryAlias x
>       +- Range 0, 10, 1, 1, [id#0L]
> == Optimized Logical Plan ==
> CreateTableUsingAsSelect `y`, PARQUET, true, [Ljava.lang.String;@4d001a14, None, Overwrite, Map()
> +- Range 0, 10, 1, 1, [id#0L]
> == Physical Plan ==
> ExecutedCommand CreateMetastoreDataSourceAsSelect `y`, PARQUET, [Ljava.lang.String;@4d001a14, None, Overwrite, Map(), Range 0, 10, 1, 1, [id#0L]|
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-14488) "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233739#comment-15233739 ]

Yin Huai commented on SPARK-14488:

btw, a useful language feature to add is {{CREATE TEMPORARY TABLE t AS SELECT}}, which is the SQL equivalent of {{sql("...").registerTempTable()}}.
[jira] [Commented] (SPARK-14488) "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232422#comment-15232422 ]

Cheng Lian commented on SPARK-14488:

Yea, that's why I came to this DDL command, because this command seems to be the only way to trigger {{CreateTempTableUsingAsSelect}}. However, the physical plan doesn't use it. Will look into this. Thanks!
[jira] [Commented] (SPARK-14488) "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232414#comment-15232414 ]

Cheng Lian commented on SPARK-14488:

Discussed with [~yhuai] offline, and here's the summary:

{{CreateTempTableUsingAsSelect}} has existed since 1.3 (I'm surprised that I never noticed it!). Its semantics are:
# Execute the {{SELECT}} query.
# Store the query result at a user-specified location in the filesystem. Note that this means the {{PATH}} data source option should always be set when using this DDL command.
# Create a temporary table using the written files.

Basically, it can be used to dump query results to the filesystem without creating persisted tables. It's indeed confusing, and is roughly equivalent to the following DDL sequence:
- {{INSERT OVERWRITE DIRECTORY ... STORED AS ... SELECT ...}}
- {{CREATE TEMPORARY TABLE ... USING ... OPTIONS (PATH ...)}}

However, Spark hasn't implemented {{INSERT OVERWRITE DIRECTORY}} yet. In the long run, we should implement it and deprecate this confusing DDL command.

Ticket title and description were updated accordingly.
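
For readers following the thread, the three steps listed for {{CreateTempTableUsingAsSelect}} can be pictured with a small Spark-free toy model. Everything here is an assumption made for illustration: {{Session}}, its three maps, and {{createTempTableAsSelect}} are hypothetical names, and the maps only stand in for the filesystem, the temporary-table registry, and the metastore. It is a sketch of the described semantics, not Spark's implementation.

```scala
import scala.collection.mutable

// Toy model only: none of these names exist in Spark.
object TempCtasSketch {
  type Row = Seq[Any]

  final class Session {
    // path -> rows written there (stands in for the filesystem)
    val files = mutable.Map.empty[String, Seq[Row]]
    // temporary table name -> path of the files backing it
    val tempTables = mutable.Map.empty[String, String]
    // persisted table name -> path (stands in for the metastore)
    val persistedTables = mutable.Map.empty[String, String]

    def createTempTableAsSelect(name: String, path: String)(query: => Seq[Row]): Unit = {
      val result = query      // 1. execute the SELECT query
      files(path) = result    // 2. store the result at the user-specified path
      tempTables(name) = path // 3. create a temporary table over the written files
    }

    // Reading the temporary table goes through the files written in step 2.
    def read(name: String): Seq[Row] = files(tempTables(name))
  }
}
```

In this model, calling {{createTempTableAsSelect("y", "/tmp/y")(...)}} leaves {{persistedTables}} untouched, which is the behaviour the summary above attributes to the command; the path {{/tmp/y}} is likewise just an example value.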
[jira] [Commented] (SPARK-14488) "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232402#comment-15232402 ]

Herman van Hovell commented on SPARK-14488:

{{CreateTempTableUsingAsSelect}} should be planned by {{SparkStrategies DDLStrategy}}, see: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L428-L431
[jira] [Commented] (SPARK-14488) "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232401#comment-15232401 ]

Cheng Lian commented on SPARK-14488:

Ah, sorry, the logical plan class {{CreateTableUsingAsSelect}} uses a boolean flag to indicate whether the table is temporary or not, while the physical plan uses two different classes, {{CreateTempTableUsingAsSelect}} and {{CreateTableUsingAsSelect}}. So something is probably wrong in the planner.
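
To make the flag-versus-two-classes distinction from the comment above concrete, here is a minimal Spark-free sketch. All names ({{LogicalCtas}}, {{TempTableCommand}}, {{PersistedTableCommand}}, {{plan}}, {{planIgnoringFlag}}) are hypothetical stand-ins, not Spark classes. {{planIgnoringFlag}} shows one way a planner could produce the reported symptom; it is an assumption for illustration, not the diagnosed cause of this bug.

```scala
// Toy model only: illustrative stand-ins, not Spark's real plan classes.
object PlannerSketch {
  // Logical node: one class, with a boolean flag marking temporary tables.
  final case class LogicalCtas(table: String, provider: String, temporary: Boolean)

  // Physical commands: two separate classes, one per kind of table.
  sealed trait PhysicalCommand
  final case class TempTableCommand(table: String, provider: String) extends PhysicalCommand
  final case class PersistedTableCommand(table: String, provider: String) extends PhysicalCommand

  // Hypothetical faulty planner: never looks at the flag, so every
  // statement is planned as a persisted table.
  def planIgnoringFlag(node: LogicalCtas): PhysicalCommand =
    PersistedTableCommand(node.table, node.provider)

  // Planner that dispatches on the flag to pick the physical command.
  def plan(node: LogicalCtas): PhysicalCommand =
    if (node.temporary) TempTableCommand(node.table, node.provider)
    else PersistedTableCommand(node.table, node.provider)
}
```

With {{LogicalCtas("y", "PARQUET", temporary = true)}}, {{plan}} yields a {{TempTableCommand}} while {{planIgnoringFlag}} yields a {{PersistedTableCommand}}, mirroring the {{isTemporary = false}} result in the issue description.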