[ https://issues.apache.org/jira/browse/SPARK-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527738#comment-14527738 ]
Frederick Reiss commented on SPARK-6649:
----------------------------------------

I was able to reproduce this problem on Spark 1.3.1. The Spark SQL lexical analyzer treats anything enclosed in double quotes as a string, and the Spark SQL parser only allows strings to be treated as literals, not identifiers. On the spark-sql command line:
{noformat}
spark-sql> select "hello" as hello, 'world' as world;
hello	world
Time taken: 0.125 seconds, Fetched 1 row(s)
spark-sql> select "hello" as "hello", 'world' as "world";
15/05/04 18:03:05 ERROR SparkSQLDriver: Failed in [select "hello" as "hello", 'world' as "world"]
org.apache.spark.sql.AnalysisException: cannot recognize input near 'as' '"hello"' ',' in selection target; line 1 pos 18
	at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:254)
	[many lines of stack trace]
{noformat}
The same thing happens from the Spark Scala shell:
{noformat}
scala> val df = sqlContext.sql("select \"hello\" as hello, 'world' as world")
df: org.apache.spark.sql.DataFrame = [hello: string, world: string]

scala> val df2 = sqlContext.sql("select \"hello\" as \"hello\", 'world' as \"world\"")
org.apache.spark.sql.AnalysisException: cannot recognize input near 'as' '"hello"' ',' in selection target; line 1 pos 18
	at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:254)
	at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
	[many lines of stack trace]
{noformat}
This behavior is not consistent with the SQL standard, though I suppose it is somewhat consistent with MySQL's default behavior. According to the grammar in the SQL-92 document ([http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt]), strings should be delimited by single quotes:
{noformat}
<character string literal> ::=
    [ <introducer><character set specification> ]
    <quote> [ <character representation>... ] <quote>
    [ { <separator>... <quote> [ <character representation>... ] <quote> }... ]
...
<national character string literal> ::=
    N <quote> [ <character representation>... ] <quote>
    [ { <separator>... <quote> [ <character representation>... ] <quote> }... ]

<bit string literal> ::=
    B <quote> [ <bit>... ] <quote>
    [ { <separator>... <quote> [ <bit>... ] <quote> }... ]

<hex string literal> ::=
    X <quote> [ <hexit>... ] <quote>
    [ { <separator>... <quote> [ <hexit>... ] <quote> }... ]
...
<quote> ::= '
{noformat}
and identifiers *may* be delimited with double quotes:
{noformat}
<delimited identifier> ::=
    <double quote> <delimited identifier body> <double quote>
...
<double quote> ::= "
{noformat}
Thoughts? Are there any pull requests in flight that fix this problem already?

> DataFrame created through SQLContext.jdbc() failed if table columns must be quoted
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-6649
>                 URL: https://issues.apache.org/jira/browse/SPARK-6649
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>            Reporter: Frédéric Blanc
>            Priority: Minor
>
> If I want to import the contents of a table from Oracle that contains a column named COMMENT (a reserved keyword), I cannot use a DataFrame that maps all the columns of this table.
> {code:title=ddl.sql|borderStyle=solid}
> CREATE TABLE TEST_TABLE (
>     "COMMENT" VARCHAR2(10)
> );
> {code}
> {code:title=test.java|borderStyle=solid}
> SQLContext sqlContext = ...
> DataFrame df = sqlContext.jdbc(databaseURL, "TEST_TABLE");
> df.rdd();   // => fails if the table contains a column with a reserved keyword
> {code}
> The same problem can be encountered if a reserved keyword is used in the table name.
> The JDBCRDD Scala class could be improved: the columnList initializer could append double quotes around each column name.
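Since the quoted issue suggests that JDBCRDD's columnList should double-quote each column, here is a minimal sketch of the SQL-92 quoting rules from the grammar above, in plain Scala (this is not Spark's actual code; the object and method names are hypothetical). Per SQL-92, literals take single quotes with embedded single quotes doubled, and delimited identifiers take double quotes with embedded double quotes doubled:

```scala
// Sketch only: SQL-92-style quoting helpers, not Spark's implementation.
object Sql92Quoting {

  // SQL-92 <character string literal>: single quotes,
  // with any embedded single quote doubled ('').
  def quoteLiteral(s: String): String =
    "'" + s.replace("'", "''") + "'"

  // SQL-92 <delimited identifier>: double quotes,
  // with any embedded double quote doubled ("").
  def quoteIdentifier(name: String): String =
    "\"" + name.replace("\"", "\"\"") + "\""

  // How a JDBC column list could be assembled so that reserved
  // words such as COMMENT survive the round trip to the database.
  def columnList(columns: Seq[String]): String =
    columns.map(quoteIdentifier).mkString(", ")

  def main(args: Array[String]): Unit = {
    // Builds the projection for the table in the issue description.
    println("SELECT " + columnList(Seq("ID", "COMMENT")) + " FROM TEST_TABLE")
  }
}
```

With a column list built this way, the generated statement would read {{SELECT "ID", "COMMENT" FROM TEST_TABLE}}, which Oracle accepts even though COMMENT is a reserved word. Whether quoting should be applied unconditionally or only to reserved words is a design choice this sketch does not settle.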
-- This message was sent by Atlassian JIRA (v6.3.4#6332)