GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/23045
[SPARK-26071][SQL] disallow map as map key

## What changes were proposed in this pull request?

Due to an implementation limitation, Spark currently can't compare map types or check them for equality. As a result, map values can't appear in EQUAL or comparison expressions, can't be used as grouping keys, etc.

More importantly, a map lookup needs to do an equality check on the map key, so Spark can't support a map as the map key when looking up values from a map. It is therefore not useful to have a map as a map key.

This PR proposes to stop users from creating maps that use a map type as the key. The expressions that are updated: `CreateMap`, `MapFromArrays`, `MapFromEntries`, `MapConcat`, `TransformKeys`. I manually checked all the places that create `MapType` and came up with this list.

Note that maps with a map type key can still exist, e.g. when reading from Parquet files or converting from Scala/Java maps. This PR does not completely forbid map as map key; it only stops Spark itself from creating such maps.

Motivation: when I was trying to fix the duplicate key problem, I found it's impossible to do so when the map key is itself a map type. I think it's reasonable to avoid map type map keys for builtin functions.

## How was this patch tested?

Updated tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark map-key

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23045.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23045

----
commit 3ff0cd592c52839d0aac739b44cee0cf02e951bc
Author: Wenchen Fan <wenchen@...>
Date:   2018-11-15T10:23:58Z

    disallow map as map key

----
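
For readers who want a concrete picture of the restriction in the PR description, here is a minimal, self-contained sketch of a key-type check. It is not Spark's code: the classes `DataType`, `StringType`, `IntegerType`, `ArrayType`, `MapType` and the helpers `contains_map` and `create_map_type` are hypothetical stand-ins written for illustration only. The sketch also assumes the check should reject a map nested inside the key type (for example an array of maps), which the description does not state explicitly.

```python
from dataclasses import dataclass


class DataType:
    """Hypothetical stand-in for a SQL data type (not Spark's class)."""


@dataclass(frozen=True)
class StringType(DataType):
    pass


@dataclass(frozen=True)
class IntegerType(DataType):
    pass


@dataclass(frozen=True)
class ArrayType(DataType):
    element_type: DataType


@dataclass(frozen=True)
class MapType(DataType):
    key_type: DataType
    value_type: DataType


def contains_map(dt: DataType) -> bool:
    """Return True if dt is a map type or has one nested inside it."""
    if isinstance(dt, MapType):
        return True
    if isinstance(dt, ArrayType):
        return contains_map(dt.element_type)
    return False


def create_map_type(key_type: DataType, value_type: DataType) -> MapType:
    """Build a map type, rejecting keys that are (or contain) a map.

    Assumption: nested maps in the key are rejected as well, because
    they would make the key just as hard to compare for equality.
    """
    if contains_map(key_type):
        raise TypeError("map key type must not be or contain a map")
    return MapType(key_type, value_type)
```

In this model, `create_map_type(StringType(), MapType(StringType(), IntegerType()))` succeeds because only the value is a map, while `create_map_type(MapType(StringType(), IntegerType()), StringType())` raises `TypeError`. A map type built directly with `MapType(...)` bypasses the check, loosely mirroring the note that maps with map keys can still arrive from outside the builtin expressions.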