[ 
https://issues.apache.org/jira/browse/SPARK-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540967#comment-14540967
 ] 

Frederick Reiss commented on SPARK-6649:
----------------------------------------

Did some more digging through the code and edit history.

It looks like the current version of the SQL parser uses the backtick character 
(`) as a delimiter. This change first appeared in SPARK-3483, with additional 
fixes done as part of SPARK-6898. I don't see any explanation for the use of a 
nonstandard quote character, and there doesn't seem to be any relevant 
discussion in the mailing list archives.
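For illustration (table and column names here are hypothetical), the difference is that the current parser treats backticks as the identifier delimiter, while the SQL standard reserves double quotes for identifiers and single quotes for string literals:

{noformat}
-- Current Spark SQL parser: backticks delimit identifiers
SELECT `comment` FROM test_table;

-- SQL standard: double quotes delimit identifiers,
-- single quotes delimit string literals
SELECT "comment" FROM test_table WHERE name = 'Alice';
{noformat}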

I've found only two test cases under org.apache.spark.sql and its child 
packages that use double quotes to delimit a string literal.

In SQLQuerySuite.scala:
{noformat}
test("date row") {
    checkAnswer(sql(
      """select cast("2015-01-28" as date) from testData limit 1"""),
...
{noformat}

And in TableScanSuite.scala:
{noformat}
  test("SPARK-5196 schema field with comment") {
    sql(
      """
       |CREATE TEMPORARY TABLE student(name string comment "SN", age int comment "SA", grade int)
       |USING org.apache.spark.sql.sources.AllDataTypesScanSource
       |OPTIONS (
       |  from '1',
       |  to '10'
       |)
...
{noformat}

All the other test cases use single quotes per the SQL standard.

I'm going to assume that the use of double quotes to delimit string literals 
was an oversight and make changes to correct it.
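Separately, the JDBCRDD improvement suggested in the original report below (quoting each column name when building the projection list) could look roughly like the following sketch. The object and method names here are illustrative, not the actual Spark internals:

```scala
// Hypothetical sketch of identifier quoting for a JDBC column list.
// Standard SQL escapes an embedded double quote by doubling it.
object ColumnQuoting {
  def quoteIdentifier(name: String): String =
    "\"" + name.replace("\"", "\"\"") + "\""

  // Build the SELECT projection list, quoting every column so that
  // reserved words like COMMENT are safe to use as column names.
  def columnList(columns: Seq[String]): String =
    columns.map(quoteIdentifier).mkString(", ")
}
```

With this, `ColumnQuoting.columnList(Seq("COMMENT", "age"))` produces `"COMMENT", "age"`, which Oracle and other standard-compliant databases accept even though COMMENT is a reserved word.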

> DataFrame created through SQLContext.jdbc() failed if columns table must be 
> quoted
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-6649
>                 URL: https://issues.apache.org/jira/browse/SPARK-6649
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>            Reporter: Frédéric Blanc
>            Priority: Minor
>
> If I want to import the contents of a table from Oracle that contains a 
> column named COMMENT (a reserved keyword), I cannot use a DataFrame that 
> maps all the columns of this table.
> {code:title=ddl.sql|borderStyle=solid}
> CREATE TABLE TEST_TABLE (
>     "COMMENT" VARCHAR2(10)
> );
> {code}
> {code:title=test.java|borderStyle=solid}
> SQLContext sqlContext = ...
> DataFrame df = sqlContext.jdbc(databaseURL, "TEST_TABLE");
> df.rdd();   // => fails if the table contains a column with a reserved 
> keyword
> {code}
> The same problem can be encountered if reserved keywords are used in the 
> table name.
> The JDBCRDD Scala class could be improved if the columnList initializer 
> appended double quotes around each column name (line 225).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
