[ https://issues.apache.org/jira/browse/SPARK-52160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
John Smith updated SPARK-52160:
-------------------------------

Description was updated to add a link to the reproduction code (see the Code link in the issue body below).

> Dynamic partition pruning not being used with broadcast hash join
> ------------------------------------------------------------------
>
>                 Key: SPARK-52160
>                 URL: https://issues.apache.org/jira/browse/SPARK-52160
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.5.5
>            Reporter: John Smith
>            Priority: Major
>
> We have a Hive-partitioned dataset, partitioned on year, month, and day.
> When doing a broadcast hash join with that table, using the partition keys
> as the join keys, dynamic partition pruning (DPP) is not applied: Spark
> reads all partitions. The query would be considerably faster with DPP.
>
> When filtering directly on year, month, and day (no join), Spark does read
> only the partitions that match the filter.
>
> Code:
> https://community.palantir.com/t/dynamic-partition-pruning-of-hive-partitioned-datasets-with-a-broadcast-join/3545
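For reference, a minimal sketch of the scenario described above. The table names (facts, dates_to_read) and the literal filter values are hypothetical stand-ins; the actual reproduction is at the linked community post. It assumes a Hive table partitioned on (year, month, day) and a small table listing the dates to read.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object DppBroadcastJoinRepro {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("dpp-broadcast-join-repro")
          .enableHiveSupport()
          // DPP is enabled by default since Spark 3.0; set explicitly for clarity.
          .config("spark.sql.optimizer.dynamicPartitionPruning.enabled", "true")
          .getOrCreate()

        // Hypothetical Hive table, partitioned on (year, month, day).
        val facts = spark.table("facts")

        // Hypothetical small table holding the dates we want to read.
        val dates = spark.table("dates_to_read")

        // Broadcast hash join on the partition keys. The expectation is that
        // the broadcast side prunes the partitions scanned from `facts`, but
        // per this issue the scan reads all partitions instead.
        val joined = facts.join(broadcast(dates), Seq("year", "month", "day"))

        // Inspect the physical plan: with DPP applied, the scan of `facts`
        // would show dynamicpruningexpression(...) among its PartitionFilters.
        joined.explain(true)

        // In contrast, a static filter on the partition columns does prune,
        // as noted in the issue description.
        facts.filter("year = 2024 AND month = 1 AND day = 15").explain(true)

        spark.stop()
      }
    }

Comparing the two explain outputs makes the reported behavior visible: the static-filter scan lists only the matching partitions, while the join-side scan enumerates all of them.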