[jira] [Updated] (SPARK-30872) Constraints inferred from inferred attributes

[ https://issues.apache.org/jira/browse/SPARK-30872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-30872:
--------------------------------
    Affects Version/s:     (was: 3.0.0)
                       3.1.0

> Constraints inferred from inferred attributes
> ---------------------------------------------
>
>                 Key: SPARK-30872
>                 URL: https://issues.apache.org/jira/browse/SPARK-30872
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> {code:scala}
> scala> spark.range(20).selectExpr("id as a", "id as b", "id as c").write.saveAsTable("t1")
> scala> spark.sql("select count(*) from t1 where a = b and b = c and (c = 3 or c = 13)").explain(false)
> == Physical Plan ==
> *(2) HashAggregate(keys=[], functions=[count(1)])
> +- Exchange SinglePartition, true, [id=#76]
>    +- *(1) HashAggregate(keys=[], functions=[partial_count(1)])
>       +- *(1) Project
>          +- *(1) Filter (((((((isnotnull(c#36L) AND ((b#35L = 3) OR (b#35L = 13))) AND isnotnull(b#35L)) AND (a#34L = c#36L)) AND isnotnull(a#34L)) AND (a#34L = b#35L)) AND (b#35L = c#36L)) AND ((c#36L = 3) OR (c#36L = 13)))
>             +- *(1) ColumnarToRow
>                +- FileScan parquet default.t1[a#34L,b#35L,c#36L] Batched: true, DataFilters: [isnotnull(c#36L), ((b#35L = 3) OR (b#35L = 13)), isnotnull(b#35L), (a#34L = c#36L), isnotnull(a#..., Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/Downloads/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehous..., PartitionFilters: [], PushedFilters: [IsNotNull(c), Or(EqualTo(b,3),EqualTo(b,13)), IsNotNull(b), IsNotNull(a), Or(EqualTo(c,3),EqualT..., ReadSchema: struct
> {code}
> We can infer more constraints: {{(a#34L = 3) OR (a#34L = 13)}}.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
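The inference the ticket asks for can be sketched outside Spark: treat the attribute-equality predicates (a = b, b = c) as defining equivalence classes, then replicate any single-attribute predicate such as (c = 3 OR c = 13) onto every other attribute in the class. The following is a minimal, self-contained Scala sketch of that idea; it is not Spark's actual `InferFiltersFromConstraints` rule, and all names and types in it are invented for illustration.

```scala
// Minimal sketch of constraint propagation over attribute equivalence
// classes. NOT Spark's InferFiltersFromConstraints implementation;
// ConstraintSketch, AttrEq, and In are hypothetical names.
object ConstraintSketch {
  sealed trait Constraint
  // An equality between two attributes, e.g. a = b.
  final case class AttrEq(a: String, b: String) extends Constraint
  // attr = v1 OR attr = v2 OR ... (the "(c = 3 or c = 13)" shape).
  final case class In(attr: String, values: Set[Int]) extends Constraint

  // Merge the equality pairs into equivalence classes by repeatedly
  // unioning any classes that share an attribute (transitive closure).
  def eqClasses(cs: Seq[Constraint]): Seq[Set[String]] =
    cs.collect { case AttrEq(a, b) => Set(a, b) }
      .foldLeft(Seq.empty[Set[String]]) { (classes, pair) =>
        val (touching, rest) = classes.partition(_.intersect(pair).nonEmpty)
        touching.fold(pair)(_ ++ _) +: rest
      }

  // Replicate each IN predicate onto every other attribute in its class:
  // from a = b, b = c and c IN (3, 13) we derive a IN (3, 13), b IN (3, 13).
  def infer(cs: Seq[Constraint]): Set[Constraint] = {
    val classes = eqClasses(cs)
    cs.collect { case In(attr, vs) =>
        val cls = classes.find(_.contains(attr)).getOrElse(Set(attr))
        (cls - attr).map(other => In(other, vs): Constraint)
      }
      .flatten.toSet
      .diff(cs.toSet) // keep only constraints not already present
  }

  def main(args: Array[String]): Unit = {
    val given = Seq(AttrEq("a", "b"), AttrEq("b", "c"), In("c", Set(3, 13)))
    // Prints the extra constraints the ticket wants Spark to derive,
    // including the one on a that current Spark misses.
    infer(given).foreach(println)
  }
}
```

With the ticket's predicates, `infer` yields both `In("b", Set(3, 13))` (which Spark already derives, visible as `Or(EqualTo(b,3),EqualTo(b,13))` in the pushed filters above) and `In("a", Set(3, 13))`, the missing `(a#34L = 3) OR (a#34L = 13)` constraint.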
[jira] [Updated] (SPARK-30872) Constraints inferred from inferred attributes

[ https://issues.apache.org/jira/browse/SPARK-30872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-30872:
--------------------------------
    Description:
{code:scala}
scala> spark.range(20).selectExpr("id as a", "id as b", "id as c").write.saveAsTable("t1")
scala> spark.sql("select count(*) from t1 where a = b and b = c and (c = 3 or c = 13)").explain(false)
== Physical Plan ==
*(2) HashAggregate(keys=[], functions=[count(1)])
+- Exchange SinglePartition, true, [id=#76]
   +- *(1) HashAggregate(keys=[], functions=[partial_count(1)])
      +- *(1) Project
         +- *(1) Filter (((((((isnotnull(c#36L) AND ((b#35L = 3) OR (b#35L = 13))) AND isnotnull(b#35L)) AND (a#34L = c#36L)) AND isnotnull(a#34L)) AND (a#34L = b#35L)) AND (b#35L = c#36L)) AND ((c#36L = 3) OR (c#36L = 13)))
            +- *(1) ColumnarToRow
               +- FileScan parquet default.t1[a#34L,b#35L,c#36L] Batched: true, DataFilters: [isnotnull(c#36L), ((b#35L = 3) OR (b#35L = 13)), isnotnull(b#35L), (a#34L = c#36L), isnotnull(a#..., Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/Downloads/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehous..., PartitionFilters: [], PushedFilters: [IsNotNull(c), Or(EqualTo(b,3),EqualTo(b,13)), IsNotNull(b), IsNotNull(a), Or(EqualTo(c,3),EqualT..., ReadSchema: struct
{code}
We can infer more constraints: {{(a#34L = 3) OR (a#34L = 13)}}.
[jira] [Updated] (SPARK-30872) Constraints inferred from inferred attributes

[ https://issues.apache.org/jira/browse/SPARK-30872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-30872:
--------------------------------
    Description:
{code:scala}
scala> spark.range(20).selectExpr("id as a", "id as b", "id as c").write.saveAsTable("t1")
scala> spark.sql("select count(*) from t1 where a = b and b = c and (c = 3 or c = 13)").explain(false)
== Physical Plan ==
*(2) HashAggregate(keys=[], functions=[count(1)])
+- Exchange SinglePartition, true, [id=#76]
   +- *(1) HashAggregate(keys=[], functions=[partial_count(1)])
      +- *(1) Project
         +- *(1) Filter (((((((isnotnull(c#36L) AND ((b#35L = 3) OR (b#35L = 13))) AND isnotnull(b#35L)) AND (a#34L = c#36L)) AND isnotnull(a#34L)) AND (a#34L = b#35L)) AND (b#35L = c#36L)) AND ((c#36L = 3) OR (c#36L = 13)))
            +- *(1) ColumnarToRow
               +- FileScan parquet default.t1[a#34L,b#35L,c#36L] Batched: true, DataFilters: [isnotnull(c#36L), ((b#35L = 3) OR (b#35L = 13)), isnotnull(b#35L), (a#34L = c#36L), isnotnull(a#..., Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/Downloads/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehous..., PartitionFilters: [], PushedFilters: [IsNotNull(c), Or(EqualTo(b,3),EqualTo(b,13)), IsNotNull(b), IsNotNull(a), Or(EqualTo(c,3),EqualT..., ReadSchema: struct
{code}