peter-toth commented on code in PR #43614: URL: https://github.com/apache/spark/pull/43614#discussion_r1390209748
########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ########## @@ -156,7 +171,15 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB // If a CTE relation is never used, it will disappear after inline. Here we explicitly check // analysis for it, to make sure the entire query plan is valid. try { - if (refCount == 0) checkAnalysis0(relation.child) Review Comment: Sorry for the late reply, I was travelling this week. The reason why I would suggest the `cleanCTEMap()` approach is because you might be checking a CTE 2 times now. Not in this new code, as `visited` map makes sure to check a CTE only once here, but later when we check the inlined plan it might contain a CTE that was referenced from an other CTE with ref count = 0 (so the CTE is checked in the new code) and also from a third CTE with valid references (so the CTE is checked in the inlined plan again). As for checking the leafs first, checking the CTEs in id order guarantees that. This is because how we do CTE resolution (assign ids to defs) in `CTESubstitution`. `cleanCTEMap()` also uses this property to quickly clear useless transitive reference counts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org