[GitHub] spark pull request #17266: [SPARK-19912][SQL] String literals should be esca...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17266 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17266: [SPARK-19912][SQL] String literals should be esca...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17266#discussion_r105583152 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -566,13 +566,24 @@ private[client] class Shim_v0_13 extends Shim_v0_12 { s"$v ${op.symbol} ${a.name}" case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType)) if !varcharKeys.contains(a.name) => -s"""${a.name} ${op.symbol} "$v +s"""${a.name} ${op.symbol} ${quoteStringLiteral(v.toString)}""" case op @ BinaryComparison(Literal(v, _: StringType), a: Attribute) if !varcharKeys.contains(a.name) => -s$v" ${op.symbol} ${a.name}""" +s"""${quoteStringLiteral(v.toString)} ${op.symbol} ${a.name}""" }.mkString(" and ") } + private def quoteStringLiteral(str: String): String = { +if (!str.contains("\"")) { + s$str +} else if (!str.contains("'")) { + s"""'$str'""" +} else { + throw new UnsupportedOperationException( +"""Partition filter cannot have both `"` and `'` characters""") --- End diff -- We got the exception from Hive, because they are not escaped. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17266: [SPARK-19912][SQL] String literals should be esca...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17266#discussion_r105572873 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -566,13 +566,24 @@ private[client] class Shim_v0_13 extends Shim_v0_12 { s"$v ${op.symbol} ${a.name}" case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType)) if !varcharKeys.contains(a.name) => -s"""${a.name} ${op.symbol} "$v +s"""${a.name} ${op.symbol} ${quoteStringLiteral(v.toString)}""" case op @ BinaryComparison(Literal(v, _: StringType), a: Attribute) if !varcharKeys.contains(a.name) => -s$v" ${op.symbol} ${a.name}""" +s"""${quoteStringLiteral(v.toString)} ${op.symbol} ${a.name}""" }.mkString(" and ") } + private def quoteStringLiteral(str: String): String = { +if (!str.contains("\"")) { + s$str +} else if (!str.contains("'")) { + s"""'$str'""" +} else { + throw new UnsupportedOperationException( +"""Partition filter cannot have both `"` and `'` characters""") --- End diff -- From the current master branch, Hive does not accept that. ```scala scala> Seq((1, "a\"'b")).toDF("a", "p").write.partitionBy("p").saveAsTable("t1") scala> spark.table("t1").filter($"p" === """a"'b""").select($"a").show java.lang.RuntimeException: ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17266: [SPARK-19912][SQL] String literals should be esca...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17266#discussion_r105572627 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -566,13 +566,24 @@ private[client] class Shim_v0_13 extends Shim_v0_12 { s"$v ${op.symbol} ${a.name}" case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType)) if !varcharKeys.contains(a.name) => -s"""${a.name} ${op.symbol} "$v +s"""${a.name} ${op.symbol} ${quoteStringLiteral(v.toString)}""" case op @ BinaryComparison(Literal(v, _: StringType), a: Attribute) if !varcharKeys.contains(a.name) => -s$v" ${op.symbol} ${a.name}""" +s"""${quoteStringLiteral(v.toString)} ${op.symbol} ${a.name}""" }.mkString(" and ") } + private def quoteStringLiteral(str: String): String = { +if (!str.contains("\"")) { + s$str +} else if (!str.contains("'")) { + s"""'$str'""" +} else { + throw new UnsupportedOperationException( +"""Partition filter cannot have both `"` and `'` characters""") --- End diff -- Does that return the correct result? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17266: [SPARK-19912][SQL] String literals should be esca...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17266#discussion_r105572354 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -566,13 +566,24 @@ private[client] class Shim_v0_13 extends Shim_v0_12 { s"$v ${op.symbol} ${a.name}" case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType)) if !varcharKeys.contains(a.name) => -s"""${a.name} ${op.symbol} "$v +s"""${a.name} ${op.symbol} ${quoteStringLiteral(v.toString)}""" case op @ BinaryComparison(Literal(v, _: StringType), a: Attribute) if !varcharKeys.contains(a.name) => -s$v" ${op.symbol} ${a.name}""" +s"""${quoteStringLiteral(v.toString)} ${op.symbol} ${a.name}""" }.mkString(" and ") } + private def quoteStringLiteral(str: String): String = { +if (!str.contains("\"")) { + s$str +} else if (!str.contains("'")) { + s"""'$str'""" +} else { + throw new UnsupportedOperationException( +"""Partition filter cannot have both `"` and `'` characters""") --- End diff -- ```Scala table.filter($"p" === """a"'b""").select($"a") ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17266: [SPARK-19912][SQL] String literals should be esca...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17266#discussion_r105572081 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -566,13 +566,24 @@ private[client] class Shim_v0_13 extends Shim_v0_12 { s"$v ${op.symbol} ${a.name}" case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType)) if !varcharKeys.contains(a.name) => -s"""${a.name} ${op.symbol} "$v +s"""${a.name} ${op.symbol} ${quoteStringLiteral(v.toString)}""" case op @ BinaryComparison(Literal(v, _: StringType), a: Attribute) if !varcharKeys.contains(a.name) => -s$v" ${op.symbol} ${a.name}""" +s"""${quoteStringLiteral(v.toString)} ${op.symbol} ${a.name}""" }.mkString(" and ") } + private def quoteStringLiteral(str: String): String = { +if (!str.contains("\"")) { + s$str +} else if (!str.contains("'")) { + s"""'$str'""" +} else { + throw new UnsupportedOperationException( +"""Partition filter cannot have both `"` and `'` characters""") --- End diff -- Test case? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17266: [SPARK-19912][SQL] String literals should be esca...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/17266#discussion_r105553664 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -566,13 +566,24 @@ private[client] class Shim_v0_13 extends Shim_v0_12 { s"$v ${op.symbol} ${a.name}" case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType)) if !varcharKeys.contains(a.name) => -s"""${a.name} ${op.symbol} "$v +s"""${a.name} ${op.symbol} ${quoteStringLiteral(v.toString)}""" case op @ BinaryComparison(Literal(v, _: StringType), a: Attribute) if !varcharKeys.contains(a.name) => -s$v" ${op.symbol} ${a.name}""" +s"""${quoteStringLiteral(v.toString)} ${op.symbol} ${a.name}""" }.mkString(" and ") } + private def quoteStringLiteral(str: String): String = { +if (!str.contains("\"")) { + s$str +} else if (!str.contains("'")) { + s"""'$str'""" +} else { + throw new UnsupportedOperationException( +"""Partition filter cannot have both `"` and `'` characters""") --- End diff -- The current master also raise exception with this mixed case. ```scala scala> spark.table("t1").filter($"p" === "'\"").select($"a").show java.lang.RuntimeException: Caught Hive MetaException attempting to get partition metadata by filter from ... ... Caused by: java.lang.reflect.InvocationTargetException: org.apache.hadoop.hive.metastore.api.MetaException: Error parsing partition filter : line 1:8 mismatched character '' expecting '"' ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17266: [SPARK-19912][SQL] String literals should be esca...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/17266 [SPARK-19912][SQL] String literals should be escaped for Hive metastore partition pruning ## What changes were proposed in this pull request? Currently, HiveShim's `convertFilters` does not escape the string literals. This PR fixes the following cases. **BEFORE** ``` scala> Seq((1, "p1", "q1"), (2, "p1\" and q=\"q1", "q2")).toDF("a", "p", "q").write.partitionBy("p", "q").saveAsTable("t1") scala> spark.table("t1").filter($"p" === "p1\" and q=\"q1").select($"a").show +---+ | a| +---+ +---+ ``` **AFTER** ``` scala> Seq((1, "p1", "q1"), (2, "p1\" and q=\"q1", "q2")).toDF("a", "p", "q").write.partitionBy("p", "q").saveAsTable("t1") scala> spark.table("t1").filter($"p" === "p1\" and q=\"q1").select($"a").show +---+ | a| +---+ | 2| +---+ ``` ## How was this patch tested? Pass the Jenkins test with new test cases. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/spark SPARK-19912 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17266.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17266 commit 37bff5dfbe7dd8c2b3f650915107bdb84a4b7fac Author: Dongjoon Hyun Date: 2017-03-12T09:48:35Z [SPARK-19912][SQL] String literals should be escaped for Hive metastore partition pruning --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org