squito commented on a change in pull request #28885:
URL: https://github.com/apache/spark/pull/28885#discussion_r465997277



##########
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/ExchangeSuite.scala
##########
@@ -156,4 +158,46 @@ class ExchangeSuite extends SparkPlanTest with 
SharedSparkSession {
     val projection2 = cached.select("_1", "_3").queryExecution.executedPlan
     assert(!projection1.sameResult(projection2))
   }
+
+  test("Exchange reuse across the whole plan") {
+    val df = sql(
+      """
+        |SELECT
+        |  (SELECT max(a.key) FROM testData AS a JOIN testData AS b ON b.key = 
a.key),
+        |  a.key
+        |FROM testData AS a
+        |JOIN testData AS b ON b.key = a.key
+      """.stripMargin)
+
+    val plan = df.queryExecution.executedPlan
+
+    val exchangeIds = plan.collectWithSubqueries { case e: Exchange => e.id }
+    val reusedExchangeIds = plan.collectWithSubqueries {
+      case re: ReusedExchangeExec => re.child.id
+    }
+
+    assert(exchangeIds.size == 2, "Whole plan exchange reusing not working 
correctly")
+    assert(reusedExchangeIds.size == 3, "Whole plan exchange reusing not 
working correctly")
+    assert(reusedExchangeIds.forall(exchangeIds.contains(_)),
+      "ReusedExchangeExec should reuse an existing exchange")
+
+    val df2 = sql(

Review comment:
       Sorry I am getting back to this so late.
   
   So, I see what you are saying, but this sounds super brittle.  Again, I'm 
not an expert here, but I wouldn't have expected rules to be so specific to 
their ordering.  What if someone else adds another rule which they decide also 
has to be last, for some other reason?  The abstraction suggests the rules 
should operate relatively independently.
   
   It would seem better to have a mechanism in `transform()` etc. functions to 
deal with this kind of thing.  Eg. there already is a call to `copyTagsFrom()`, 
couldn't there be something similar there to keep the references between 
"reused" nodes intact?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to