Repository: spark Updated Branches: refs/heads/master d2436a852 -> 13a67b070
[SPARK-24870][SQL] Cache can't work normally if there are case letters in SQL ## What changes were proposed in this pull request? Modified the canonicalized to not case-insensitive. Before the PR, cache can't work normally if there are case letters in SQL, for example: sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING) USING hive") sql("select key, sum(case when Key > 0 then 1 else 0 end) as positiveNum " + "from src group by key").cache().createOrReplaceTempView("src_cache") sql( s"""select a.key from (select key from src_cache where positiveNum = 1)a left join (select key from src_cache )b on a.key=b.key """).explain The physical plan of the sql is: ![image](https://user-images.githubusercontent.com/26834091/42979518-3decf0fa-8c05-11e8-9837-d5e4c334cb1f.png) The subquery "select key from src_cache where positiveNum = 1" on the left of join can use the cache data, but the subquery "select key from src_cache" on the right of join cannot use the cache data. ## How was this patch tested? new added test Author: 10129659 <chen.yans...@zte.com.cn> Closes #21823 from eatoncys/canonicalized. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/13a67b07 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/13a67b07 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/13a67b07 Branch: refs/heads/master Commit: 13a67b070d335bb257d13dacadea3450885c3d81 Parents: d2436a8 Author: 10129659 <chen.yans...@zte.com.cn> Authored: Mon Jul 23 23:05:08 2018 -0700 Committer: Xiao Li <gatorsm...@gmail.com> Committed: Mon Jul 23 23:05:08 2018 -0700 ---------------------------------------------------------------------- .../apache/spark/sql/catalyst/plans/QueryPlan.scala | 2 +- .../apache/spark/sql/execution/SameResultSuite.scala | 15 +++++++++++++++ 2 files changed, 16 insertions(+), 1 deletion(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/spark/blob/13a67b07/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala ---------------------------------------------------------------------- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala index 4b4722b..b1ffdca 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala @@ -284,7 +284,7 @@ object QueryPlan extends PredicateHelper { if (ordinal == -1) { ar } else { - ar.withExprId(ExprId(ordinal)) + ar.withExprId(ExprId(ordinal)).canonicalized } }.canonicalized.asInstanceOf[T] } http://git-wip-us.apache.org/repos/asf/spark/blob/13a67b07/sql/core/src/test/scala/org/apache/spark/sql/execution/SameResultSuite.scala ---------------------------------------------------------------------- diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/SameResultSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/SameResultSuite.scala index aaf51b5..d088e24 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/SameResultSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/SameResultSuite.scala @@ -18,8 +18,11 @@ package org.apache.spark.sql.execution import org.apache.spark.sql.{DataFrame, QueryTest} +import org.apache.spark.sql.catalyst.expressions.AttributeReference +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, Project} import org.apache.spark.sql.functions._ import org.apache.spark.sql.test.SharedSQLContext +import org.apache.spark.sql.types.IntegerType /** * Tests for the sameResult function for [[SparkPlan]]s. @@ -58,4 +61,16 @@ class SameResultSuite extends QueryTest with SharedSQLContext { val df4 = spark.range(10).agg(sumDistinct($"id")) assert(df3.queryExecution.executedPlan.sameResult(df4.queryExecution.executedPlan)) } + + test("Canonicalized result is case-insensitive") { + val a = AttributeReference("A", IntegerType)() + val b = AttributeReference("B", IntegerType)() + val planUppercase = Project(Seq(a), LocalRelation(a, b)) + + val c = AttributeReference("a", IntegerType)() + val d = AttributeReference("b", IntegerType)() + val planLowercase = Project(Seq(c), LocalRelation(c, d)) + + assert(planUppercase.sameResult(planLowercase)) + } } --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org