Yuming Wang created SPARK-32351:
-----------------------------------

             Summary: Partially pushed partition filters are not explained
                 Key: SPARK-32351
                 URL: https://issues.apache.org/jira/browse/SPARK-32351
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.1.0
            Reporter: Yuming Wang


How to reproduce this issue:
{code:scala}
spark.sql(
  s"""
     |CREATE TABLE t(i INT, p STRING)
     |USING parquet
     |PARTITIONED BY (p)""".stripMargin)

spark.range(0, 1000).selectExpr("id as col").createOrReplaceTempView("temp")
for (part <- Seq(1, 2, 3, 4)) {
  sql(s"""
         |INSERT OVERWRITE TABLE t PARTITION (p='$part')
         |SELECT col FROM temp""".stripMargin)
}

spark.sql("SELECT * FROM t WHERE  WHERE (p = '1' AND i = 1) OR (p = '2' and i = 
2)").explain
{code}

The partition predicate {{p = '1' OR p = '2'}} has been pushed down since SPARK-28169, but this partially pushed partition filter does not appear in the explain output:

{noformat}
== Physical Plan ==
*(1) Filter (((p#21 = 1) AND (i#20 = 1)) OR ((p#21 = 2) AND (i#20 = 2)))
+- *(1) ColumnarToRow
   +- FileScan parquet default.t[i#20,p#21] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/spark/SPARK-32289/sql/core/spark-warehouse/org.apache.spark..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<i:int>
{noformat}
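For context, the following is a minimal, self-contained sketch of the idea behind SPARK-28169, not Spark's actual implementation (the toy AST and the {{partitionOnly}} helper are invented for illustration): when every branch of a disjunction constrains the partition column, a partition-only predicate such as {{p = '1' OR p = '2'}} can be extracted and used for pruning, which is why it should also be surfaced in {{PartitionFilters}}.

{code:scala}
// A toy expression AST; these names are invented for illustration only.
sealed trait Expr
case class Attr(name: String)        extends Expr
case class Lit(value: String)        extends Expr
case class EqualTo(l: Expr, r: Expr) extends Expr
case class And(l: Expr, r: Expr)     extends Expr
case class Or(l: Expr, r: Expr)      extends Expr

// Extract the part of a predicate that references only partition columns.
// An OR is usable for pruning only if every branch yields such a predicate;
// for an AND it is enough that either side does.
def partitionOnly(e: Expr, partCols: Set[String]): Option[Expr] = e match {
  case Or(l, r) =>
    for {
      lp <- partitionOnly(l, partCols)
      rp <- partitionOnly(r, partCols)
    } yield Or(lp, rp)
  case And(l, r) =>
    (partitionOnly(l, partCols), partitionOnly(r, partCols)) match {
      case (Some(lp), Some(rp))   => Some(And(lp, rp))
      case (some @ Some(_), None) => some
      case (None, some @ Some(_)) => some
      case _                      => None
    }
  case EqualTo(Attr(a), _: Lit) if partCols.contains(a) => Some(e)
  case _ => None
}

// (p = '1' AND i = 1) OR (p = '2' AND i = 2)  ==>  (p = '1') OR (p = '2')
val pred = Or(
  And(EqualTo(Attr("p"), Lit("1")), EqualTo(Attr("i"), Lit("1"))),
  And(EqualTo(Attr("p"), Lit("2")), EqualTo(Attr("i"), Lit("2"))))

println(partitionOnly(pred, Set("p")))
// Some(Or(EqualTo(Attr(p),Lit(1)),EqualTo(Attr(p),Lit(2))))
{code}

Since this extracted predicate is used for partition pruning, it would be helpful for the physical plan to report it instead of showing empty {{PartitionFilters}}.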
