[ https://issues.apache.org/jira/browse/SPARK-30872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-30872:
--------------------------------
    Description: 
{code:scala}
scala> spark.range(20).selectExpr("id as a", "id as b", "id as c").write.saveAsTable("t1")

scala> spark.sql("select count(*) from t1 where a = b and b = c and (c = 3 or c = 13)").explain(false)
== Physical Plan ==
*(2) HashAggregate(keys=[], functions=[count(1)])
+- Exchange SinglePartition, true, [id=#76]
   +- *(1) HashAggregate(keys=[], functions=[partial_count(1)])
      +- *(1) Project
         +- *(1) Filter (((((((isnotnull(c#36L) AND ((b#35L = 3) OR (b#35L = 13))) AND isnotnull(b#35L)) AND (a#34L = c#36L)) AND isnotnull(a#34L)) AND (a#34L = b#35L)) AND (b#35L = c#36L)) AND ((c#36L = 3) OR (c#36L = 13)))
            +- *(1) ColumnarToRow
               +- FileScan parquet default.t1[a#34L,b#35L,c#36L] Batched: true, DataFilters: [isnotnull(c#36L), ((b#35L = 3) OR (b#35L = 13)), isnotnull(b#35L), (a#34L = c#36L), isnotnull(a#..., Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/Downloads/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehous..., PartitionFilters: [], PushedFilters: [IsNotNull(c), Or(EqualTo(b,3),EqualTo(b,13)), IsNotNull(b), IsNotNull(a), Or(EqualTo(c,3),EqualT..., ReadSchema: struct<a:bigint,b:bigint,c:bigint>
{code}

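Since a = b and b = c, one more constraint can be inferred: {{(a#34L = 3) OR (a#34L = 13)}}. It is missing from the Filter above because it can only be derived from a constraint that was itself inferred.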
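
The idea can be sketched without Spark. The following is a toy model only, not Catalyst's implementation: the names {{Pred}}, {{EqAttr}}, {{InSet}}, {{inferOnce}} and {{inferToFixpoint}} are hypothetical, and it propagates only value predicates across attribute equalities. It assumes that one inference pass sees only the constraints it started with, so the constraint on {{a}} appears only when the pass is repeated until nothing new is derived.

```scala
import scala.annotation.tailrec

// Toy predicates (hypothetical, not Spark classes).
sealed trait Pred
case class EqAttr(l: String, r: String) extends Pred            // l = r
case class InSet(attr: String, values: Set[Long]) extends Pred  // attr = v1 OR attr = v2 ...

object InferDemo {
  // One pass: for every equality l = r, copy each value predicate
  // found on one side to the other side. Only the input set is consulted.
  def inferOnce(cs: Set[Pred]): Set[Pred] = {
    val eqs = cs.collect { case e: EqAttr => e }
    val ins = cs.collect { case i: InSet => i }
    val derived: Set[Pred] = eqs.flatMap { eq =>
      ins.flatMap { in =>
        if (in.attr == eq.l) Set[Pred](InSet(eq.r, in.values))
        else if (in.attr == eq.r) Set[Pred](InSet(eq.l, in.values))
        else Set.empty[Pred]
      }
    }
    cs ++ derived
  }

  // Repeat the pass until no new constraint is produced.
  @tailrec
  def inferToFixpoint(cs: Set[Pred]): Set[Pred] = {
    val next = inferOnce(cs)
    if (next == cs) cs else inferToFixpoint(next)
  }

  def main(args: Array[String]): Unit = {
    // a = b AND b = c AND (c = 3 OR c = 13)
    val start: Set[Pred] =
      Set(EqAttr("a", "b"), EqAttr("b", "c"), InSet("c", Set(3L, 13L)))
    val wanted: Pred = InSet("a", Set(3L, 13L))
    println(s"after one pass: ${inferOnce(start).contains(wanted)}")
    println(s"at fixpoint:    ${inferToFixpoint(start).contains(wanted)}")
  }
}
```

In this model a single pass derives the predicate on {{b}} from {{b = c}}, and the predicate on {{a}} needs a second pass over the enlarged set. Whether Spark should iterate like this, or derive the constraint some other way, is left open here.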


> Constraints inferred from inferred attributes
> ---------------------------------------------
>
>                 Key: SPARK-30872
>                 URL: https://issues.apache.org/jira/browse/SPARK-30872
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Yuming Wang
>            Priority: Major


