Anoop Johnson created SPARK-33760:
-------------------------------------

             Summary: Extend Dynamic Partition Pruning Support to DataSources
                 Key: SPARK-33760
                 URL: https://issues.apache.org/jira/browse/SPARK-33760
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.1
            Reporter: Anoop Johnson


The implementation of Dynamic Partition Pruning (DPP) in Spark is 
[specific|https://github.com/apache/spark/blob/fb2e3af4b5d92398d57e61b766466cc7efd9d7cb/sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala#L59-L64]
 to HadoopFsRelation. As a result, DPP is not triggered for queries that read 
from other data sources, such as DataSource V2 tables.

DataSource V2 readers can expose partition metadata. Can we use this 
metadata to extend DPP so that it works on these data sources as well?
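
To make the idea concrete, here is a minimal sketch in plain Python (not Spark 
code) of what pruning against source-provided partition metadata could look 
like. All names in it (PartitionedSource, partition_values, scan, 
join_with_dynamic_pruning) are hypothetical and do not correspond to any 
existing Spark or DataSource V2 API; it only assumes that a source can list its 
partition values and scan a chosen subset of them.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set, Tuple

Row = Tuple[str, int]  # (partition value, amount)


@dataclass(frozen=True)
class Partition:
    """One partition of a hypothetical data source."""
    value: str
    rows: Tuple[Row, ...]


class PartitionedSource:
    """Hypothetical stand-in for a reader that exposes partition metadata."""

    def __init__(self, partitions: Iterable[Partition]) -> None:
        self._partitions = list(partitions)
        self.scanned: List[str] = []  # records which partitions were read

    def partition_values(self) -> List[str]:
        # Metadata only: no partition data is read here.
        return [p.value for p in self._partitions]

    def scan(self, keep: Set[str]) -> List[Row]:
        # Read only the partitions whose value is in `keep`.
        out: List[Row] = []
        for p in self._partitions:
            if p.value in keep:
                self.scanned.append(p.value)
                out.extend(p.rows)
        return out


def join_with_dynamic_pruning(
    fact: PartitionedSource,
    dim_rows: Iterable[Tuple[str, str]],
    dim_predicate: Callable[[str], bool],
) -> List[Row]:
    # 1. Evaluate the selective filter on the small (dimension) side first.
    join_keys = {key for key, attr in dim_rows if dim_predicate(attr)}
    # 2. Use the source's partition metadata to decide what to read.
    keep = join_keys & set(fact.partition_values())
    # 3. Scan only the surviving partitions, then apply the join condition.
    return [(k, amount) for k, amount in fact.scan(keep) if k in join_keys]


fact = PartitionedSource([
    Partition("2020-12-01", (("2020-12-01", 10), ("2020-12-01", 5))),
    Partition("2020-12-02", (("2020-12-02", 7),)),
    Partition("2020-12-03", (("2020-12-03", 3),)),
])
dim_rows = [
    ("2020-12-01", "holiday"),
    ("2020-12-02", "weekday"),
    ("2020-12-03", "holiday"),
]
result = join_with_dynamic_pruning(fact, dim_rows, lambda attr: attr == "holiday")
```

In this toy run, fact.scanned ends up holding only the two partitions that 
match the filtered dimension keys, so the "2020-12-02" partition is never read. 
How the set of keys would actually be passed to a real reader is the open 
question of this ticket.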

I would appreciate thoughts on this, including any corner cases we need to handle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
