[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232092#comment-15232092 ]
Cheng Lian commented on SPARK-14488:
------------------------------------

Tried the same snippet using Spark 1.6 and got the following exception, which makes sense:

{noformat}
scala> sqlContext sql "CREATE TEMPORARY TABLE y USING PARQUET AS SELECT * FROM x"
java.util.NoSuchElementException: key not found: path
  at scala.collection.MapLike$class.default(MapLike.scala:228)
  at org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.default(ddl.scala:150)
  at scala.collection.MapLike$class.apply(MapLike.scala:141)
  at org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.apply(ddl.scala:150)
  at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:230)
  at org.apache.spark.sql.execution.datasources.CreateTempTableUsingAsSelect.run(ddl.scala:112)
  at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
  at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
  at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
  at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
  at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
  at $iwC$$iwC$$iwC$$iwC.<init>(<console>:37)
  at $iwC$$iwC$$iwC.<init>(<console>:39)
  at $iwC$$iwC.<init>(<console>:41)
  at $iwC.<init>(<console>:43)
  at <init>(<console>:45)
  at .<init>(<console>:49)
  at .<clinit>(<console>)
  at .<init>(<console>:7)
  at .<clinit>(<console>)
  at $print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
  at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
  at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
  at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
  at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
  at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
  at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
  at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
  at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
  at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
  at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
  at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
  at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
  at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
  at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
  at org.apache.spark.repl.Main$.main(Main.scala:31)
  at org.apache.spark.repl.Main.main(Main.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{noformat}

I tend to believe that the combination described in the ticket is invalid and should be rejected by either the parser or the analyzer.

> Weird behavior of DDL "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..."
> --------------------------------------------------------------------------
>
>                 Key: SPARK-14488
>                 URL: https://issues.apache.org/jira/browse/SPARK-14488
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Cheng Lian
>            Assignee: Cheng Lian
>
> Currently, Spark 2.0 master allows DDL statements like {{CREATE TEMPORARY
> TABLE ... USING ... AS SELECT ...}}, which exhibits weird behavior and weird
> semantics.
> Let's try the following Spark shell snippet:
> {code}
> sqlContext range 10 registerTempTable "x"
> // The problematic DDL statement:
> sqlContext sql "CREATE TEMPORARY TABLE y USING PARQUET AS SELECT * FROM x"
> sqlContext.tables().show()
> {code}
> It shows the following result:
> {noformat}
> +---------+-----------+
> |tableName|isTemporary|
> +---------+-----------+
> |        y|      false|
> |        x|       true|
> +---------+-----------+
> {noformat}
> *Weird behavior*
> Note that {{y}} is NOT temporary although it's created using {{CREATE
> TEMPORARY TABLE ...}}, and the query result is written in Parquet format
> under the default Hive warehouse location, which is {{/user/hive/warehouse/y}}
> on my local machine.
> *Weird semantics*
> Secondly, even if this DDL statement did create a temporary table, the
> semantics would still be somewhat weird:
> # It has an {{AS SELECT ...}} clause, which is supposed to run a given query
> instead of loading data from existing files.
> # It has a {{USING <format>}} clause, which is supposed to, I guess, convert
> the result of the above query into the given format. And by "converting", we
> have to write the data out to the file system.
> # It has a {{TEMPORARY}} keyword, which is supposed to, I guess, create an
> in-memory temporary table backed by the files written above?
> The main questions:
> # Is the above combination ({{TEMPORARY}} + {{USING}} + {{AS SELECT}}) a
> valid one?
> # If it is, what is the expected semantics?

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
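For comparison, each of the three intents mixed together by the problematic DDL has a separately supported spelling. A rough Spark-shell sketch of those alternatives is below; it assumes a 1.6/2.0-era {{SQLContext}} with {{registerTempTable}}, and the {{/tmp/y}} path is made up for illustration:

{code}
// 1. A genuinely temporary table over the query result: no files are
//    written, and the entry lives only in this SQLContext's catalog.
sqlContext.sql("SELECT * FROM x").registerTempTable("y")

// 2. If materializing the result as Parquet is the goal, make the write
//    explicit, then point a temporary table at the written files
//    ("/tmp/y" is a hypothetical path):
sqlContext.sql("SELECT * FROM x").write.parquet("/tmp/y")
sqlContext.sql("CREATE TEMPORARY TABLE y2 USING parquet OPTIONS (path '/tmp/y')")
{code}

The second form is also the shape Spark 1.6 evidently expects from {{CREATE TEMPORARY TABLE ... USING}}, which is consistent with the {{key not found: path}} exception in the comment above. This is only a sketch of workarounds, not a proposal for how the ticket should be resolved.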