[
https://issues.apache.org/jira/browse/SPARK-20837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
bing huang updated SPARK-20837:
-------------------------------
Description:
1. If we run the code below against 1.6.x, we get: Exception in thread
"main" java.lang.RuntimeException: [1.44] failure: ``)'' expected but "york"
found.
2. If we run the same code against 2.x.x, it runs successfully and the result
is (1,2,3,4,5). However, per the SQL specification, doubling up a single quote
inside a single-quoted string literal merely escapes the quote, so the result
should be (6,7,8,9,10). This can be verified by running the same SQL in
MySQL Workbench or SQL Server.
The code snippet I used to demonstrate the issue is below:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{DataTypes, StructField, StructType}

val conf = new SparkConf().setAppName("bhuang").setMaster("local[3]")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
// create the test dataset: rows 1-5 contain the embedded quoted word,
// rows 6-10 do not
val data = (1 to 10).map {
  case t if t <= 5 =>
    Row("New 'york' city", t.toString, "2015-01-01 13:59:59.123",
      2147483647.0, Double.PositiveInfinity)
  case t =>
    Row("New york city", t.toString, "2015-01-02 23:59:59.456",
      1.0, Double.PositiveInfinity)
}
// create schema of the test dataset
val schema = StructType(Array(
  StructField("A1", DataTypes.StringType),
  StructField("A2", DataTypes.StringType),
  StructField("A3", DataTypes.StringType),
  StructField("A4", DataTypes.DoubleType),
  StructField("A5", DataTypes.DoubleType)
))
val rdd = sc.parallelize(data)
val df = sqlContext.createDataFrame(rdd, schema)
df.registerTempTable("test")
// per the SQL standard, '' inside a single-quoted literal escapes to a single '
val sqlString = "select A2 from test where A1 not in ('New ''york'' city')"
sqlContext.sql(sqlString).show(false)
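As an independent check of the standard-conforming behavior outside Spark, the same query can be run against SQLite (which follows the SQL standard's doubled-quote escaping rule) via Python's built-in sqlite3 module. This is a minimal sketch; the table and column names simply mirror the snippet above:

```python
import sqlite3

# In-memory database; rows 1-5 carry the embedded 'york' quotes, rows 6-10 do not,
# mirroring the Spark test dataset above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (A1 TEXT, A2 TEXT)")
rows = [("New 'york' city" if t <= 5 else "New york city", str(t))
        for t in range(1, 11)]
conn.executemany("INSERT INTO test VALUES (?, ?)", rows)

# '' inside the single-quoted literal denotes one literal ', so the
# NOT IN predicate should exclude rows 1-5 and keep rows 6-10.
result = [r[0] for r in conn.execute(
    "SELECT A2 FROM test WHERE A1 NOT IN ('New ''york'' city')")]
print(result)
```

SQLite returns ['6', '7', '8', '9', '10'], matching the (6,7,8,9,10) result described above, whereas Spark 2.x returns (1,2,3,4,5).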
was:
> Spark SQL doesn't support escape of single/double quote as SQL standard.
> ------------------------------------------------------------------------
>
> Key: SPARK-20837
> URL: https://issues.apache.org/jira/browse/SPARK-20837
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1
> Reporter: bing huang
>
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]