GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/22198
[SPARK-25121][SQL] Supports multi-part table names for broadcast hint resolution

## What changes were proposed in this pull request?

This PR fixes the code to respect the database name when resolving broadcast table hints. Currently, Spark ignores the database name in multi-part names. In the example below, the hinted table `testDb.t` is broadcast (`BuildRight`) only with this patch:

```
scala> sql("CREATE DATABASE testDb")
scala> spark.range(10).write.saveAsTable("testDb.t")

// without this patch
scala> spark.range(10).join(spark.table("testDb.t"), "id").hint("broadcast", "testDb.t").explain
== Physical Plan ==
*(2) Project [id#24L]
+- *(2) BroadcastHashJoin [id#24L], [id#26L], Inner, BuildLeft
   :- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]))
   :  +- *(1) Range (0, 10, step=1, splits=4)
   +- *(2) Project [id#26L]
      +- *(2) Filter isnotnull(id#26L)
         +- *(2) FileScan parquet testdb.t[id#26L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:/Users/maropu/Repositories/spark/spark-2.3.1-bin-hadoop2.7/spark-warehouse..., PartitionFilters: [], PushedFilters: [IsNotNull(id)], ReadSchema: struct<id:bigint>

// with this patch
scala> spark.range(10).join(spark.table("testDb.t"), "id").hint("broadcast", "testDb.t").explain
== Physical Plan ==
*(2) Project [id#3L]
+- *(2) BroadcastHashJoin [id#3L], [id#5L], Inner, BuildRight
   :- *(2) Range (0, 10, step=1, splits=4)
   +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true]))
      +- *(1) Project [id#5L]
         +- *(1) Filter isnotnull(id#5L)
            +- *(1) FileScan parquet testdb.t[id#5L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:/Users/maropu/Repositories/spark/spark-master/spark-warehouse/testdb.db/t], PartitionFilters: [], PushedFilters: [IsNotNull(id)], ReadSchema: struct<id:bigint>
```

## How was this patch tested?

Added tests in `DataFrameJoinSuite`.
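For readers who want the matching idea in isolation, here is a minimal, Spark-free sketch of comparing a multi-part hint name against a table reference. It is not the code from this patch: `HintNameMatching`, `TableRef`, `parseHintName`, and `matches` are hypothetical names, and the rules it encodes (case-insensitive comparison, at most two name parts, an unqualified hint matching a table in any database) are assumptions made for illustration.

```scala
// Hypothetical sketch only: none of these names come from the Spark code base.
object HintNameMatching {
  // A table reference: an optional database plus a table name.
  final case class TableRef(database: Option[String], table: String)

  // Split a hint argument such as "testDb.t" into its parts.
  // Assumption: at most two parts, and no quoted identifiers containing dots.
  def parseHintName(name: String): TableRef = name.split('.') match {
    case Array(t)     => TableRef(None, t)
    case Array(db, t) => TableRef(Some(db), t)
    case _            => throw new IllegalArgumentException(s"Unsupported name: $name")
  }

  // Assumption: names are compared case-insensitively.
  // A hint without a database part matches on the table name alone;
  // a hint with a database part must also match the target's database.
  def matches(hint: TableRef, target: TableRef): Boolean = {
    val tableOk = hint.table.equalsIgnoreCase(target.table)
    val dbOk = hint.database.forall { h =>
      target.database.exists(_.equalsIgnoreCase(h))
    }
    tableOk && dbOk
  }
}
```

Under these assumptions, `matches(parseHintName("testDb.t"), TableRef(Some("testdb"), "t"))` is true, while a hint naming a different database for the same table name is rejected instead of being silently accepted.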
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SPARK-25121

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22198.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22198

----
commit d2be6920ba1cc052e9d5d8364cf48375cea8ba44
Author: Takeshi Yamamuro <yamamuro@...>
Date:   2018-08-23T07:20:51Z

    Fix

----