[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232092#comment-15232092 ]
Cheng Lian commented on SPARK-14488:
------------------------------------

Tried the same snippet using Spark 1.6 and got the following exception, which makes sense:

{noformat}
scala> sqlContext sql "CREATE TEMPORARY TABLE y USING PARQUET AS SELECT * FROM x"
java.util.NoSuchElementException: key not found: path
  at scala.collection.MapLike$class.default(MapLike.scala:228)
  at org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.default(ddl.scala:150)
  at scala.collection.MapLike$class.apply(MapLike.scala:141)
  at org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.apply(ddl.scala:150)
  at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:230)
  at org.apache.spark.sql.execution.datasources.CreateTempTableUsingAsSelect.run(ddl.scala:112)
  at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
  at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
  at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
  at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
  at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
  at $iwC$$iwC$$iwC$$iwC.<init>(<console>:37)
  at $iwC$$iwC$$iwC.<init>(<console>:39)
  at $iwC$$iwC.<init>(<console>:41)
  at $iwC.<init>(<console>:43)
  at <init>(<console>:45)
  at .<init>(<console>:49)
  at .<clinit>(<console>)
  at .<init>(<console>:7)
  at .<clinit>(<console>)
  at $print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
  at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
  at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
  at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
  at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
  at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
  at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
  at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
  at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
  at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
  at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
  at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
  at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
  at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
  at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
  at org.apache.spark.repl.Main$.main(Main.scala:31)
  at org.apache.spark.repl.Main.main(Main.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{noformat}

I tend to believe that the combination described in the ticket is invalid and should be rejected by either the parser or the analyzer.

> Weird behavior of DDL "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..."
> --------------------------------------------------------------------------
>
>                 Key: SPARK-14488
>                 URL: https://issues.apache.org/jira/browse/SPARK-14488
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Cheng Lian
>            Assignee: Cheng Lian
>
> Currently, Spark 2.0 master allows DDL statements like {{CREATE TEMPORARY
> TABLE ... USING ... AS SELECT ...}}, which exhibits weird behavior and weird
> semantics.
> Let's try the following Spark shell snippet:
> {code}
> sqlContext range 10 registerTempTable "x"
> // The problematic DDL statement:
> sqlContext sql "CREATE TEMPORARY TABLE y USING PARQUET AS SELECT * FROM x"
> sqlContext.tables().show()
> {code}
> It shows the following result:
> {noformat}
> +---------+-----------+
> |tableName|isTemporary|
> +---------+-----------+
> |        y|      false|
> |        x|       true|
> +---------+-----------+
> {noformat}
> *Weird behavior*
> Note that {{y}} is NOT temporary although it's created using {{CREATE
> TEMPORARY TABLE ...}}, and the query result is written in Parquet format
> under the default Hive warehouse location, which is {{/user/hive/warehouse/y}}
> on my local machine.
> *Weird semantics*
> Secondly, even if this DDL statement did create a temporary table, the
> semantics would still be somewhat weird:
> # It has an {{AS SELECT ...}} clause, which is supposed to run a given query
> instead of loading data from existing files.
> # It has a {{USING <format>}} clause, which is supposed to, I guess, convert
> the result of the above query into the given format. And by "converting", we
> have to write the data out to the file system.
> # It has a {{TEMPORARY}} keyword, which is supposed to, I guess, create an
> in-memory temporary table backed by the files written above?
> The main questions:
> # Is the above combination ({{TEMPORARY}} + {{USING}} + {{AS SELECT}}) a
> valid one?
> # If it is, what is the expected semantics?

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
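For comparison, each of the three intents mixed together by the problematic DDL has a separately supported spelling. A rough Spark-shell sketch of those alternatives is below; it assumes a 1.6/2.0-era {{SQLContext}} with {{registerTempTable}}, and the {{/tmp/y}} path is made up for illustration:

{code}
// 1. A genuinely temporary table over the query result: no files are
//    written, and the entry lives only in this SQLContext's catalog.
sqlContext.sql("SELECT * FROM x").registerTempTable("y")

// 2. If materializing the result as Parquet is the goal, make the write
//    explicit, then point a temporary table at the written files
//    ("/tmp/y" is a hypothetical path):
sqlContext.sql("SELECT * FROM x").write.parquet("/tmp/y")
sqlContext.sql("CREATE TEMPORARY TABLE y2 USING parquet OPTIONS (path '/tmp/y')")
{code}

The second form is also the shape Spark 1.6 evidently expects from {{CREATE TEMPORARY TABLE ... USING}}, which is consistent with the {{key not found: path}} exception in the comment above. This is only a sketch of workarounds, not a proposal for how the ticket should be resolved.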