GitHub user eatoncys opened a pull request: https://github.com/apache/spark/pull/21823
[SPARK-24870][SQL]Cache can't work normally if there are case letters in SQL

## What changes were proposed in this pull request?

Modified `canonicalized` so that differences in letter case no longer prevent a cache match.

Before this PR, the cache does not work correctly when the SQL refers to the same column with different letter case (`key` and `Key` below), for example:

    sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING) USING hive")

    sql("select key, sum(case when Key > 0 then 1 else 0 end) as positiveNum " +
      "from src group by key").cache().createOrReplaceTempView("src_cache")

    sql(
      s"""select a.key
         from
         (select key from src_cache where positiveNum = 1)a
         left join
         (select key from src_cache )b
         on a.key=b.key
      """).explain

The physical plan of the SQL is:

![image](https://user-images.githubusercontent.com/26834091/42979518-3decf0fa-8c05-11e8-9837-d5e4c334cb1f.png)

The subquery "select key from src_cache where positiveNum = 1" on the left side of the join can use the cached data, but the subquery "select key from src_cache" on the right side of the join cannot.

## How was this patch tested?

Added a new test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/eatoncys/spark canonicalized

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21823.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21823

----
commit 2b2a5a33ed58ce07fd2515eb01e80acbedeb8b2a
Author: 10129659 <chen.yanshan@...>
Date:   2018-07-20T01:43:53Z

    Cache can't work normally if there are case letters in SQL

----