sigmod commented on a change in pull request #32742:
URL: https://github.com/apache/spark/pull/32742#discussion_r647227099



##########
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
##########
@@ -603,6 +603,19 @@ case class AdaptiveSparkPlanExec(
     (newPlan, optimized)
   }
 
+  /**
+   * Clean up the logical plan stats before re-optimization.
+   */
+  private def cleanupStats(logicalPlan: LogicalPlan): Unit = {
+    logicalPlan.invalidateStatsCache()
+    // We must invalidate ineffective rules before re-optimization, since the AQE Optimizer
+    // may introduce a LocalRelation that can affect the result.

Review comment:
       >> some AQE Optimizer rules only match the materialized logical plan
   
   IIUC, the key issue is that LogicalQueryStage.sparkPlan is not immutable. In general, Catalyst plans are immutable and copy-on-modification, but LogicalQueryStage looks like a violation of that convention. There are a few write-once var fields elsewhere, but those are essentially lazy values or caches.
   
   Therefore, I think we should not use ruleId pruning for AQE at this point, so as not to cause hard-to-debug issues. We can still do pattern-bits pruning by passing the pattern lambda, but not the ruleId, to the transform functions.
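   For example, a rule written along these lines (a hypothetical rule, not code from this PR) still gets pattern-bits pruning from the condition lambda while leaving the ruleId argument at its default, so the per-node ineffective-rule cache is never consulted:

```scala
import org.apache.spark.sql.catalyst.expressions.Literal.TrueLiteral
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan}
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.catalyst.trees.TreePattern.FILTER

// Hypothetical rule for illustration: removes filters whose condition is literally true.
// The pattern condition lets transformDownWithPruning skip subtrees without FILTER nodes,
// but no ruleId is passed, so nothing is memoized on the (possibly mutable) plan nodes.
object RemoveTrueFilter extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan =
    plan.transformDownWithPruning(_.containsPattern(FILTER)) {
      case Filter(TrueLiteral, child) => child
    }
}
```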
   



