[jira] [Assigned] (SPARK-36656) CollapseProject should not collapse correlated scalar subqueries
[ https://issues.apache.org/jira/browse/SPARK-36656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-36656: --- Assignee: Allison Wang > CollapseProject should not collapse correlated scalar subqueries > > > Key: SPARK-36656 > URL: https://issues.apache.org/jira/browse/SPARK-36656 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > > Currently, the optimizer rule `CollapseProject` inlines expressions generated > from correlated scalar subqueries, which can create unnecessary left outer > joins. > {code:sql} > select c1, s, s * 10 from ( > select c1, (select first(c2) from t2 where t1.c1 = t2.c1) s from t1) > {code} > {code:scala} > // Before > Project [c1, s, (s * 10)] > +- Project [c1, scalar-subquery [c1] AS s] >: +- Aggregate [c1], [first(c2), c1] >: +- LocalRelation [c1, c2] >+- LocalRelation [c1, c2] > // After (scalar subqueries are inlined) > Project [c1, scalar-subquery [c1], (scalar-subquery [c1] * 10)] > : +- Aggregate [c1], [first(c2), c1] > : +- LocalRelation [c1, c2] > : +- Aggregate [c1], [first(c2), c1] > : +- LocalRelation [c1, c2] > +- LocalRelation [c1, c2] > {code} > Then this query will have two LeftOuter joins created. We should only > collapse projects after correlated subqueries are rewritten as joins. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36656) CollapseProject should not collapse correlated scalar subqueries
[ https://issues.apache.org/jira/browse/SPARK-36656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36656: Assignee: Apache Spark > CollapseProject should not collapse correlated scalar subqueries > > > Key: SPARK-36656 > URL: https://issues.apache.org/jira/browse/SPARK-36656 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Allison Wang >Assignee: Apache Spark >Priority: Major > > Currently, the optimizer rule `CollapseProject` inlines expressions generated > from correlated scalar subqueries, which can create unnecessary left outer > joins. > {code:sql} > select c1, s, s * 10 from ( > select c1, (select first(c2) from t2 where t1.c1 = t2.c1) s from t1) > {code} > {code:scala} > // Before > Project [c1, s, (s * 10)] > +- Project [c1, scalar-subquery [c1] AS s] >: +- Aggregate [c1], [first(c2), c1] >: +- LocalRelation [c1, c2] >+- LocalRelation [c1, c2] > // After (scalar subqueries are inlined) > Project [c1, scalar-subquery [c1], (scalar-subquery [c1] * 10)] > : +- Aggregate [c1], [first(c2), c1] > : +- LocalRelation [c1, c2] > : +- Aggregate [c1], [first(c2), c1] > : +- LocalRelation [c1, c2] > +- LocalRelation [c1, c2] > {code} > Then this query will have two LeftOuter joins created. We should only > collapse projects after correlated subqueries are rewritten as joins. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36656) CollapseProject should not collapse correlated scalar subqueries
[ https://issues.apache.org/jira/browse/SPARK-36656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36656: Assignee: (was: Apache Spark) > CollapseProject should not collapse correlated scalar subqueries > > > Key: SPARK-36656 > URL: https://issues.apache.org/jira/browse/SPARK-36656 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Allison Wang >Priority: Major > > Currently, the optimizer rule `CollapseProject` inlines expressions generated > from correlated scalar subqueries, which can create unnecessary left outer > joins. > {code:sql} > select c1, s, s * 10 from ( > select c1, (select first(c2) from t2 where t1.c1 = t2.c1) s from t1) > {code} > {code:scala} > // Before > Project [c1, s, (s * 10)] > +- Project [c1, scalar-subquery [c1] AS s] >: +- Aggregate [c1], [first(c2), c1] >: +- LocalRelation [c1, c2] >+- LocalRelation [c1, c2] > // After (scalar subqueries are inlined) > Project [c1, scalar-subquery [c1], (scalar-subquery [c1] * 10)] > : +- Aggregate [c1], [first(c2), c1] > : +- LocalRelation [c1, c2] > : +- Aggregate [c1], [first(c2), c1] > : +- LocalRelation [c1, c2] > +- LocalRelation [c1, c2] > {code} > Then this query will have two LeftOuter joins created. We should only > collapse projects after correlated subqueries are rewritten as joins. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org