Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/10942#issuecomment-176920834

@cloud-fan I still have a question about the low-level test for verifying whether pruning works. For example, below is the physical plan of the query `hiveContext.table("bucketed_table").filter($"j" === 0).queryExecution.toRdd`, where `j` is a bucketing key.

```
Filter (j#21 = 0)
+- Scan ParquetRelation[j#21,k#22,i#23] InputPaths: file:/private/var/folders/4b/sgmfldk15js406vk7lw5llzw0000gn/T/warehouse--3a744e2c-d2a6-4a1a-ba61-1dc4197744d6/bucketed_table, PushedFilters: [EqualTo(j,0)]
```

When we run this statement, the filter still removes all the ineligible rows in each partition even if bucket pruning does not work. Is my understanding right? If so, how can we verify that the pruning works? Thank you!
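
To make the concern concrete, here is a minimal sketch in plain Scala (no Spark involved). It is a toy model under stated assumptions, not how Spark implements bucketing: `Row`, `numBuckets`, `bucketOf`, `scanAll`, and `scanPruned` are all hypothetical names, and the modulo-based bucket assignment only stands in for the real hash function. It suggests that the filtered result is identical with or without pruning, so a test would have to look at how much was read rather than at what was returned.

```scala
object BucketPruningSketch {
  // Toy row with the same column names as the table in the plan above.
  final case class Row(i: Int, j: Int, k: Int)

  // Assumption: 8 buckets; the real table's bucket count is not stated.
  val numBuckets = 8

  // Hypothetical bucket assignment; Spark's real hash function differs.
  def bucketOf(j: Int): Int = Math.floorMod(j, numBuckets)

  // "Write" the table: group 100 rows into buckets by the bucketing key j.
  val buckets: Map[Int, Seq[Row]] =
    (0 until 100).map(n => Row(n, n % 20, n * 2)).groupBy(r => bucketOf(r.j))

  // Scan without pruning: read every bucket, then apply the filter.
  // Returns the matching rows and the number of buckets read.
  def scanAll(key: Int): (Seq[Row], Int) = {
    val read = buckets.keys.toSeq
    (read.flatMap(b => buckets(b)).filter(_.j == key), read.size)
  }

  // Scan with pruning: read only the bucket the key maps to, then filter.
  def scanPruned(key: Int): (Seq[Row], Int) = {
    val read = buckets.keys.filter(_ == bucketOf(key)).toSeq
    (read.flatMap(b => buckets(b)).filter(_.j == key), read.size)
  }

  def main(args: Array[String]): Unit = {
    val (rowsAll, readAll) = scanAll(0)
    val (rowsPruned, readPruned) = scanPruned(0)
    // Same answer either way, so comparing results cannot detect pruning.
    assert(rowsAll.toSet == rowsPruned.toSet)
    // Only the number of buckets read tells the two scans apart.
    println(s"buckets read without pruning: $readAll, with pruning: $readPruned")
  }
}
```

In this toy model the two scans differ only in the second element of the returned pair, which is the kind of signal (buckets, files, or partitions actually read) that a pruning test would presumably need to assert on.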