peter-toth commented on code in PR #43614:
URL: https://github.com/apache/spark/pull/43614#discussion_r1390209748


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -156,7 +171,15 @@ trait CheckAnalysis extends PredicateHelper with 
LookupCatalog with QueryErrorsB
       // If a CTE relation is never used, it will disappear after inline. Here 
we explicitly check
       // analysis for it, to make sure the entire query plan is valid.
       try {
-        if (refCount == 0) checkAnalysis0(relation.child)

Review Comment:
   Sorry for the late reply, I was travelling this week.
   
   The reason why I would suggest the `cleanCTEMap()` approach is because you 
might be checking a CTE 2 times now. Not in this new code, as `visited` map 
makes sure to check a CTE only once here, but later when we check the inlined 
plan it might contain a CTE that was referenced from an other CTE with ref 
count = 0 (so the CTE is checked in the new code) and also from a third CTE 
with valid references (so the CTE is checked in the inlined plan again).
   
   As for checking the leafs first, checking the CTEs in id order guarantees 
that. This is because how we do CTE resolution (assign ids to defs) in 
`CTESubstitution`. `cleanCTEMap()` also uses this property to quickly clear 
useless transitive reference counts.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to