[
https://issues.apache.org/jira/browse/SPARK-20837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
bing huang updated SPARK-20837:
-------------------------------
Description:
1. If we run the code below against 1.6.x, we get: Exception in thread
"main" java.lang.RuntimeException: [1.44] failure: ``)'' expected but "york"
found.
2. If we run the same code against 2.x.x, it runs successfully and the result
is (1,2,3,4,5). However, per the SQL specification, doubling up a single quote
inside a single-quoted string literal merely escapes the quote, so the result
should be (6,7,8,9,10). This can be verified by running the same SQL in
MySQL Workbench or SQL Server.
The code snippet I used to demonstrate the issue is below:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{DataTypes, StructField, StructType}

val conf = new SparkConf().setAppName("bhuang").setMaster("local[3]")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
// create the test dataset: rows 1-5 contain the embedded quoted word,
// rows 6-10 do not
val data = (1 to 10).map {
  case t if t <= 5 =>
    Row("New 'york' city", t.toString, "2015-01-01 13:59:59.123",
      2147483647.0, Double.PositiveInfinity)
  case t =>
    Row("New york city", t.toString, "2015-01-02 23:59:59.456",
      1.0, Double.PositiveInfinity)
}
// create schema of the test dataset
val schema = StructType(Array(
  StructField("A1", DataTypes.StringType),
  StructField("A2", DataTypes.StringType),
  StructField("A3", DataTypes.StringType),
  StructField("A4", DataTypes.DoubleType),
  StructField("A5", DataTypes.DoubleType)
))
val rdd = sc.parallelize(data)
val df = sqlContext.createDataFrame(rdd, schema)
df.registerTempTable("test")
// per the SQL standard, '' inside a single-quoted literal escapes to a single '
val sqlString = "select A2 from test where A1 not in ('New ''york'' city')"
sqlContext.sql(sqlString).show(false)
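As an independent check of the standard-conforming behavior outside Spark, the same query can be run against SQLite (which follows the SQL standard's doubled-quote escaping rule) via Python's built-in sqlite3 module. This is a minimal sketch; the table and column names simply mirror the snippet above:

```python
import sqlite3

# In-memory database; rows 1-5 carry the embedded 'york' quotes, rows 6-10 do not,
# mirroring the Spark test dataset above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (A1 TEXT, A2 TEXT)")
rows = [("New 'york' city" if t <= 5 else "New york city", str(t))
        for t in range(1, 11)]
conn.executemany("INSERT INTO test VALUES (?, ?)", rows)

# '' inside the single-quoted literal denotes one literal ', so the
# NOT IN predicate should exclude rows 1-5 and keep rows 6-10.
result = [r[0] for r in conn.execute(
    "SELECT A2 FROM test WHERE A1 NOT IN ('New ''york'' city')")]
print(result)
```

SQLite returns ['6', '7', '8', '9', '10'], matching the (6,7,8,9,10) result described above, whereas Spark 2.x returns (1,2,3,4,5).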
was:
> Spark SQL doesn't support escape of single/double quote as SQL standard.
> ------------------------------------------------------------------------
>
> Key: SPARK-20837
> URL: https://issues.apache.org/jira/browse/SPARK-20837
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1
> Reporter: bing huang
>
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]