squito commented on a change in pull request #28885: URL: https://github.com/apache/spark/pull/28885#discussion_r465997277
########## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/ExchangeSuite.scala ########## @@ -156,4 +158,46 @@ class ExchangeSuite extends SparkPlanTest with SharedSparkSession { val projection2 = cached.select("_1", "_3").queryExecution.executedPlan assert(!projection1.sameResult(projection2)) } + + test("Exchange reuse across the whole plan") { + val df = sql( + """ + |SELECT + | (SELECT max(a.key) FROM testData AS a JOIN testData AS b ON b.key = a.key), + | a.key + |FROM testData AS a + |JOIN testData AS b ON b.key = a.key + """.stripMargin) + + val plan = df.queryExecution.executedPlan + + val exchangeIds = plan.collectWithSubqueries { case e: Exchange => e.id } + val reusedExchangeIds = plan.collectWithSubqueries { + case re: ReusedExchangeExec => re.child.id + } + + assert(exchangeIds.size == 2, "Whole plan exchange reusing not working correctly") + assert(reusedExchangeIds.size == 3, "Whole plan exchange reusing not working correctly") + assert(reusedExchangeIds.forall(exchangeIds.contains(_)), + "ReusedExchangeExec should reuse an existing exchange") + + val df2 = sql( Review comment: Sorry I am getting back to this so late. So, I see what you are saying, but this sounds super brittle. Again, I'm not an expert here, but I wouldn't have expected rules to be so specific to their ordering. What if someone else adds another rule which they decide also has to be last, for some other reason? The abstraction suggests the rules should operate relatively independently. It would seem better to have a mechanism in `transform()` etc. functions to deal with this kind of thing. Eg. there already is a call to `copyTagsFrom()`, couldn't there be something similar there to keep the references between "reused" nodes intact? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org