[ https://issues.apache.org/jira/browse/SPARK-34807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuming Wang resolved SPARK-34807.
---------------------------------
    Fix Version/s: 3.2.0
       Resolution: Fixed

Issue resolved by pull request 31980
[https://github.com/apache/spark/pull/31980]

> Push down filter through window after TransposeWindow
> -----------------------------------------------------
>
>                 Key: SPARK-34807
>                 URL: https://issues.apache.org/jira/browse/SPARK-34807
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yuming Wang
>            Assignee: Tanel Kiis
>            Priority: Major
>             Fix For: 3.2.0
>
>
> {code:scala}
> spark.range(10).selectExpr("id AS a", "id AS b", "id AS c", "id AS d").createTempView("t1")
> val df = spark.sql(
>   """
>     |SELECT *
>     |  FROM (
>     |    SELECT b,
>     |      sum(d) OVER (PARTITION BY a, b),
>     |      rank() OVER (PARTITION BY a ORDER BY c)
>     |    FROM t1
>     |  ) v1
>     |WHERE b = 2
>     |""".stripMargin)
> {code}
> Current optimized plan:
> {noformat}
> == Optimized Logical Plan ==
> Project [b#221L, sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L, RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232]
> +- Filter (b#221L = 2)
>    +- Window [rank(c#222L) windowspecdefinition(a#220L, c#222L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232], [a#220L], [c#222L ASC NULLS FIRST]
>       +- Project [b#221L, a#220L, c#222L, sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L]
>          +- Window [sum(d#223L) windowspecdefinition(a#220L, b#221L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L], [a#220L, b#221L]
>             +- Project [id#218L AS b#221L, id#218L AS d#223L, id#218L AS a#220L, id#218L AS c#222L]
>                +- Range (0, 10, step=1, splits=Some(2))
> {noformat}
> Expected optimized plan:
> {noformat}
> == Optimized Logical Plan ==
> Project [b#221L, sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L, RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232]
> +- Window [sum(d#223L) windowspecdefinition(a#220L, b#221L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS sum(d) OVER (PARTITION BY a, b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)#231L], [a#220L, b#221L]
>    +- Project [b#221L, d#223L, a#220L, RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232]
>       +- Filter (b#221L = 2)
>          +- Window [rank(c#222L) windowspecdefinition(a#220L, c#222L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS RANK() OVER (PARTITION BY a ORDER BY c ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#232], [a#220L], [c#222L ASC NULLS FIRST]
>             +- Project [id#218L AS b#221L, id#218L AS d#223L, id#218L AS a#220L, id#218L AS c#222L]
>                +- Range (0, 10, step=1, splits=Some(2))
> {noformat}
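
The difference between the two plans above can be illustrated with a small toy model. The sketch below is not Spark's Catalyst code: `ToyPlan`, `ToyRelation`, `ToyWindow`, `ToyFilter` and `PushDownSketch` are hypothetical names introduced only for illustration, and the rule is simplified to single-column equality filters. It assumes a filter may move below a window only when the filtered column is one of that window's partition columns, because such a filter keeps or drops whole partitions and so leaves the window results of the surviving rows unchanged.

```scala
// Toy logical plan nodes. These are illustrative stand-ins, NOT Spark classes.
sealed trait ToyPlan
case class ToyRelation(columns: Set[String]) extends ToyPlan
case class ToyWindow(func: String, partitionBy: Set[String], child: ToyPlan) extends ToyPlan
case class ToyFilter(column: String, value: Long, child: ToyPlan) extends ToyPlan

object PushDownSketch {
  // Push an equality filter below a window only if the filtered column
  // is among that window's PARTITION BY columns (simplifying assumption).
  def pushDown(plan: ToyPlan): ToyPlan = plan match {
    case ToyFilter(col, v, ToyWindow(f, parts, child)) if parts.contains(col) =>
      ToyWindow(f, parts, pushDown(ToyFilter(col, v, child)))
    case ToyFilter(col, v, child) => ToyFilter(col, v, pushDown(child))
    case ToyWindow(f, parts, child) => ToyWindow(f, parts, pushDown(child))
    case r: ToyRelation => r
  }

  val t1: ToyPlan = ToyRelation(Set("a", "b", "c", "d"))

  // Window order as in the current plan: rank (PARTITION BY a) sits on top,
  // so the filter on b is blocked immediately.
  val before: ToyPlan =
    ToyFilter("b", 2L,
      ToyWindow("rank", Set("a"),
        ToyWindow("sum", Set("a", "b"), t1)))

  // Window order assumed after the two windows have been transposed:
  // sum (PARTITION BY a, b) sits on top, so the filter can pass through it.
  val after: ToyPlan =
    ToyFilter("b", 2L,
      ToyWindow("sum", Set("a", "b"),
        ToyWindow("rank", Set("a"), t1)))

  def main(args: Array[String]): Unit = {
    println(pushDown(before)) // filter stays above both windows
    println(pushDown(after))  // filter ends up between sum and rank
  }
}
```

In this model the filter on `b` stays on top of the `before` plan, while in the `after` plan it moves below the `sum` window and stops above the `rank` window, which mirrors the shape of the expected optimized plan in the issue.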