wangyum opened a new pull request, #38464: URL: https://github.com/apache/spark/pull/38464
### What changes were proposed in this pull request?

This PR enhances DPP to fall back to a bloom filter when `spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly` is disabled, the build-side plan is too large to broadcast, and an existing shuffle exchange can be reused.

### Why are the changes needed?

To avoid job failures when `spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly` is disabled:

```sql
select catalog_sales.*
from catalog_sales join catalog_returns
where cr_order_number = cs_sold_date_sk and cr_returned_time_sk < 40000;
```

```
20/08/16 06:44:42 ERROR TaskSetManager: Total size of serialized results of 494 tasks (1225.3 MiB) is bigger than spark.driver.maxResultSize (1024.0 MiB)
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit test.
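To illustrate the idea behind the fallback (this is a minimal sketch in Python, not Spark's actual implementation; all names below are hypothetical), a bloom filter built from the build-side join keys lets the probe side prune partitions without ever collecting the full key set on the driver, which is what blows past `spark.driver.maxResultSize` in the error above:

```python
# Hypothetical sketch: prune probe-side partitions with a bloom filter
# instead of materializing every build-side join key on the driver.
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)  # fixed size, independent of key count

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def put(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # May return a false positive (keeps an extra partition, which is
        # safe) but never a false negative (never drops a needed one).
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


# Build side: join keys surviving the filter (stand-in for catalog_returns).
build_keys = list(range(0, 40000, 7))
bf = BloomFilter()
for k in build_keys:
    bf.put(k)

# Probe side: keep only partitions whose key might exist on the build side.
candidate_partitions = list(range(0, 80000, 1000))
pruned = [p for p in candidate_partitions if bf.might_contain(p)]
```

The key property is that the filter's size is fixed up front, so the driver-side payload stays bounded no matter how many keys the build side produces.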