Anoop Johnson created SPARK-33760:
-------------------------------------

             Summary: Extend Dynamic Partition Pruning Support to DataSources
                 Key: SPARK-33760
                 URL: https://issues.apache.org/jira/browse/SPARK-33760
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.1
            Reporter: Anoop Johnson

The implementation of Dynamic Partition Pruning (DPP) in Spark is [specific|https://github.com/apache/spark/blob/fb2e3af4b5d92398d57e61b766466cc7efd9d7cb/sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala#L59-L64] to HadoopFsRelation. As a result, DPP is not triggered for queries that read from other data sources.

DataSource V2 readers can expose partition metadata. Can we use this metadata to extend DPP to these data sources as well?

I would appreciate thoughts on this, including any corner cases we need to handle.
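To make the request concrete, here is a minimal plain-Python sketch of what pruning with source-reported partition metadata amounts to. It does not use any Spark API: `Partition`, `prune_partitions`, and the sample data are hypothetical names made up for illustration, and the real change would live in the optimizer rule linked above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Hypothetical partition descriptor that a data source might report."""
    value: str  # value of the partition column, e.g. a date
    path: str   # where this partition's data lives


def prune_partitions(partitions, join_keys):
    """Keep only partitions whose value appears among the join keys
    produced at run time by the filtered side of the join."""
    keys = set(join_keys)
    return [p for p in partitions if p.value in keys]


# Fact table partitioned by date, as reported by the (hypothetical) source.
fact_partitions = [
    Partition("2020-12-01", "/fact/date=2020-12-01"),
    Partition("2020-12-02", "/fact/date=2020-12-02"),
    Partition("2020-12-03", "/fact/date=2020-12-03"),
]

# Dimension rows that survive the query's filter; only their join keys matter.
dim_rows = [
    {"date": "2020-12-02", "is_holiday": True},
    {"date": "2020-12-25", "is_holiday": True},
]
join_keys = [row["date"] for row in dim_rows]

# Only partitions matching a surviving join key need to be scanned.
to_scan = prune_partitions(fact_partitions, join_keys)
print([p.path for p in to_scan])  # prints ['/fact/date=2020-12-02']
```

The point of the sketch is that nothing in the pruning step depends on the file layout, only on the source being able to list its partitions and their values.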