[ 
https://issues.apache.org/jira/browse/SPARK-25548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-25548:
------------------------------------

    Assignee: Apache Spark

> In the PruneFileSourcePartitions optimizer, replace the nonPartitionOps field 
> with true in And(partitionOps, nonPartitionOps) so that the partitions can 
> be pruned
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-25548
>                 URL: https://issues.apache.org/jira/browse/SPARK-25548
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.2
>            Reporter: eaton
>            Assignee: Apache Spark
>            Priority: Critical
>
> In the PruneFileSourcePartitions optimizer, partition files are not pruned 
> when a partition filter and a non-partition filter are used together. For 
> example:
> sql("CREATE TABLE IF NOT EXISTS src_par (key INT, value STRING) partitioned by(p_d int) stored as parquet ")
> sql("insert overwrite table src_par partition(p_d=2) select 2 as key, '4' as value")
> sql("insert overwrite table src_par partition(p_d=3) select 3 as key, '4' as value")
> sql("insert overwrite table src_par partition(p_d=4) select 4 as key, '4' as value")
> The query below scans all the partition files, although the partition 
> **p_d=4** should be pruned:
> **sql("select * from src_par where (p_d=2 and key=2) or (p_d=3 and key=3)").show**
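
The idea in the issue title can be sketched outside Spark. The block below is a minimal, self-contained illustration and not Spark's actual PruneFileSourcePartitions code: the tiny expression types (Expr, Eq, And, Or, True) and the helpers partitionOnly and keeps are hypothetical names invented for this sketch. It assumes the filter contains only equality comparisons combined with AND and OR. With no negation, replacing a non-partition predicate by true can only widen the filter, so no needed partition is dropped.

```scala
object PruneSketch {
  // Hypothetical mini expression tree; these are not Catalyst classes.
  sealed trait Expr
  case class Eq(column: String, value: Int) extends Expr
  case class And(left: Expr, right: Expr) extends Expr
  case class Or(left: Expr, right: Expr) extends Expr
  case object True extends Expr

  // Replace every comparison on a non-partition column with True,
  // simplifying And/Or on the way back up.
  def partitionOnly(e: Expr, partitionCols: Set[String]): Expr = e match {
    case Eq(c, _) if partitionCols(c) => e
    case Eq(_, _) => True
    case And(l, r) =>
      (partitionOnly(l, partitionCols), partitionOnly(r, partitionCols)) match {
        case (True, x) => x
        case (x, True) => x
        case (x, y)    => And(x, y)
      }
    case Or(l, r) =>
      (partitionOnly(l, partitionCols), partitionOnly(r, partitionCols)) match {
        case (True, _) | (_, True) => True
        case (x, y)                => Or(x, y)
      }
    case True => True
  }

  // Evaluate a partition-only filter against one partition's column values.
  def keeps(e: Expr, partition: Map[String, Int]): Boolean = e match {
    case Eq(c, v)  => partition.get(c).contains(v)
    case And(l, r) => keeps(l, partition) && keeps(r, partition)
    case Or(l, r)  => keeps(l, partition) || keeps(r, partition)
    case True      => true
  }

  // (p_d=2 and key=2) or (p_d=3 and key=3), as in the query above.
  val filter: Expr =
    Or(And(Eq("p_d", 2), Eq("key", 2)), And(Eq("p_d", 3), Eq("key", 3)))

  def main(args: Array[String]): Unit = {
    val derived = partitionOnly(filter, Set("p_d"))
    val kept = Seq(2, 3, 4).filter(p => keeps(derived, Map("p_d" -> p)))
    println(derived) // the derived partition-only filter
    println(kept)    // partition values that still need scanning
  }
}
```

For the query above, the derived filter is p_d=2 or p_d=3, so of the three partitions only p_d=2 and p_d=3 are kept and p_d=4 is pruned. The original filter must still be applied to the rows that are read, because the derived filter is only a superset condition used for choosing partitions.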



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
