[jira] [Updated] (SPARK-30872) Constraints inferred from inferred attributes

[ https://issues.apache.org/jira/browse/SPARK-30872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-30872:
--------------------------------
    Affects Version/s:     (was: 3.0.0)
                       3.1.0

> Constraints inferred from inferred attributes
> ---------------------------------------------
>
>                 Key: SPARK-30872
>                 URL: https://issues.apache.org/jira/browse/SPARK-30872
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> {code:scala}
> scala> spark.range(20).selectExpr("id as a", "id as b", "id as c").write.saveAsTable("t1")
> scala> spark.sql("select count(*) from t1 where a = b and b = c and (c = 3 or c = 13)").explain(false)
> == Physical Plan ==
> *(2) HashAggregate(keys=[], functions=[count(1)])
> +- Exchange SinglePartition, true, [id=#76]
>    +- *(1) HashAggregate(keys=[], functions=[partial_count(1)])
>       +- *(1) Project
>          +- *(1) Filter (((((((isnotnull(c#36L) AND ((b#35L = 3) OR (b#35L = 13))) AND isnotnull(b#35L)) AND (a#34L = c#36L)) AND isnotnull(a#34L)) AND (a#34L = b#35L)) AND (b#35L = c#36L)) AND ((c#36L = 3) OR (c#36L = 13)))
>             +- *(1) ColumnarToRow
>                +- FileScan parquet default.t1[a#34L,b#35L,c#36L] Batched: true, DataFilters: [isnotnull(c#36L), ((b#35L = 3) OR (b#35L = 13)), isnotnull(b#35L), (a#34L = c#36L), isnotnull(a#..., Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/Downloads/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehous..., PartitionFilters: [], PushedFilters: [IsNotNull(c), Or(EqualTo(b,3),EqualTo(b,13)), IsNotNull(b), IsNotNull(a), Or(EqualTo(c,3),EqualT..., ReadSchema: struct
> {code}
> We can infer more constraints: {{(a#34L = 3) OR (a#34L = 13)}}.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
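The inference the ticket asks for can be sketched outside Spark: treat the attribute-equality predicates (a = b, b = c) as defining equivalence classes, then replicate any single-attribute predicate such as (c = 3 OR c = 13) onto every other attribute in the class. The following is a minimal, self-contained Scala sketch of that idea; it is not Spark's actual `InferFiltersFromConstraints` rule, and all names and types in it are invented for illustration.

```scala
// Minimal sketch of constraint propagation over attribute equivalence
// classes. NOT Spark's InferFiltersFromConstraints implementation;
// ConstraintSketch, AttrEq, and In are hypothetical names.
object ConstraintSketch {
  sealed trait Constraint
  // An equality between two attributes, e.g. a = b.
  final case class AttrEq(a: String, b: String) extends Constraint
  // attr = v1 OR attr = v2 OR ... (the "(c = 3 or c = 13)" shape).
  final case class In(attr: String, values: Set[Int]) extends Constraint

  // Merge the equality pairs into equivalence classes by repeatedly
  // unioning any classes that share an attribute (transitive closure).
  def eqClasses(cs: Seq[Constraint]): Seq[Set[String]] =
    cs.collect { case AttrEq(a, b) => Set(a, b) }
      .foldLeft(Seq.empty[Set[String]]) { (classes, pair) =>
        val (touching, rest) = classes.partition(_.intersect(pair).nonEmpty)
        touching.fold(pair)(_ ++ _) +: rest
      }

  // Replicate each IN predicate onto every other attribute in its class:
  // from a = b, b = c and c IN (3, 13) we derive a IN (3, 13), b IN (3, 13).
  def infer(cs: Seq[Constraint]): Set[Constraint] = {
    val classes = eqClasses(cs)
    cs.collect { case In(attr, vs) =>
        val cls = classes.find(_.contains(attr)).getOrElse(Set(attr))
        (cls - attr).map(other => In(other, vs): Constraint)
      }
      .flatten.toSet
      .diff(cs.toSet) // keep only constraints not already present
  }

  def main(args: Array[String]): Unit = {
    val given = Seq(AttrEq("a", "b"), AttrEq("b", "c"), In("c", Set(3, 13)))
    // Prints the extra constraints the ticket wants Spark to derive,
    // including the one on a that current Spark misses.
    infer(given).foreach(println)
  }
}
```

With the ticket's predicates, `infer` yields both `In("b", Set(3, 13))` (which Spark already derives, visible as `Or(EqualTo(b,3),EqualTo(b,13))` in the pushed filters above) and `In("a", Set(3, 13))`, the missing `(a#34L = 3) OR (a#34L = 13)` constraint.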
[jira] [Updated] (SPARK-30872) Constraints inferred from inferred attributes

[ https://issues.apache.org/jira/browse/SPARK-30872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-30872:
--------------------------------
    Description:
{code:scala}
scala> spark.range(20).selectExpr("id as a", "id as b", "id as c").write.saveAsTable("t1")
scala> spark.sql("select count(*) from t1 where a = b and b = c and (c = 3 or c = 13)").explain(false)
== Physical Plan ==
*(2) HashAggregate(keys=[], functions=[count(1)])
+- Exchange SinglePartition, true, [id=#76]
   +- *(1) HashAggregate(keys=[], functions=[partial_count(1)])
      +- *(1) Project
         +- *(1) Filter (((((((isnotnull(c#36L) AND ((b#35L = 3) OR (b#35L = 13))) AND isnotnull(b#35L)) AND (a#34L = c#36L)) AND isnotnull(a#34L)) AND (a#34L = b#35L)) AND (b#35L = c#36L)) AND ((c#36L = 3) OR (c#36L = 13)))
            +- *(1) ColumnarToRow
               +- FileScan parquet default.t1[a#34L,b#35L,c#36L] Batched: true, DataFilters: [isnotnull(c#36L), ((b#35L = 3) OR (b#35L = 13)), isnotnull(b#35L), (a#34L = c#36L), isnotnull(a#..., Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/Downloads/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehous..., PartitionFilters: [], PushedFilters: [IsNotNull(c), Or(EqualTo(b,3),EqualTo(b,13)), IsNotNull(b), IsNotNull(a), Or(EqualTo(c,3),EqualT..., ReadSchema: struct
{code}
We can infer more constraints: {{(a#34L = 3) OR (a#34L = 13)}}.
[jira] [Updated] (SPARK-30872) Constraints inferred from inferred attributes

[ https://issues.apache.org/jira/browse/SPARK-30872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-30872:
--------------------------------
    Description:
{code:scala}
scala> spark.range(20).selectExpr("id as a", "id as b", "id as c").write.saveAsTable("t1")
scala> spark.sql("select count(*) from t1 where a = b and b = c and (c = 3 or c = 13)").explain(false)
== Physical Plan ==
*(2) HashAggregate(keys=[], functions=[count(1)])
+- Exchange SinglePartition, true, [id=#76]
   +- *(1) HashAggregate(keys=[], functions=[partial_count(1)])
      +- *(1) Project
         +- *(1) Filter (((((((isnotnull(c#36L) AND ((b#35L = 3) OR (b#35L = 13))) AND isnotnull(b#35L)) AND (a#34L = c#36L)) AND isnotnull(a#34L)) AND (a#34L = b#35L)) AND (b#35L = c#36L)) AND ((c#36L = 3) OR (c#36L = 13)))
            +- *(1) ColumnarToRow
               +- FileScan parquet default.t1[a#34L,b#35L,c#36L] Batched: true, DataFilters: [isnotnull(c#36L), ((b#35L = 3) OR (b#35L = 13)), isnotnull(b#35L), (a#34L = c#36L), isnotnull(a#..., Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/Downloads/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehous..., PartitionFilters: [], PushedFilters: [IsNotNull(c), Or(EqualTo(b,3),EqualTo(b,13)), IsNotNull(b), IsNotNull(a), Or(EqualTo(c,3),EqualT..., ReadSchema: struct
{code}