Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14397

Hi, @hvanhovell .

This does not seem to be clearly documented, so I did some comparisons.

First of all, I found that I had overlooked the behavior of Hive queries. Like other DBMSs, Hive resolves CTE names first in consecutive CTE queries, which is natural.

```
with t as (select 10), s as (select * from t) select * from s;
```

For self-recursion, the approaches differ: Hive and Oracle raise exceptions, while PostgreSQL uses the base table.

```
with t as (select 10 from t), s as (select * from t) select * from s;
```

The cross-referencing case raises exceptions.

```
with t as (select 10 from s), s as (select * from t) select * from s;
```

To sum up, the general approach seems to be:

- use a **forward-declared (not self)** base table or CTE name
- prevent the execution of cyclic queries by raising exceptions.

The root cause of the previous Spark behavior is that it uses `self` or a `backward-declared CTE` for table name resolution.

I will try to revise this PR according to the above rules. Please comment on the above rules.
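To make the proposed rules concrete, here is a minimal Python sketch of the name resolution. It is only an illustration, not Spark's analyzer code. It assumes that "forward-declared" means "declared earlier in the same WITH clause", and that a name which matches neither an earlier CTE nor a base table is an error. `resolve_ctes`, `CteResolutionError`, and the `(name, references)` input shape are all hypothetical names introduced for this example.

```python
class CteResolutionError(Exception):
    """Raised when a reference cannot be resolved under the sketched rule."""


def resolve_ctes(ctes, base_tables=frozenset()):
    """Resolve the table references of each CTE in declaration order.

    ctes: list of (cte_name, [referenced table names]) in WITH-clause order.
    base_tables: names of tables assumed to exist in the catalog.
    Returns {cte_name: {reference: "cte" or "base"}}.
    """
    declared = set()  # CTE names declared before the one being resolved
    resolved = {}
    for name, refs in ctes:
        bindings = {}
        for ref in refs:
            if ref in declared:
                # An earlier CTE wins over a base table of the same name.
                bindings[ref] = "cte"
            elif ref in base_tables:
                # Self or later-declared names fall back to a base table.
                bindings[ref] = "base"
            else:
                # Self reference or cross reference with no base table.
                raise CteResolutionError(
                    f"cannot resolve {ref!r} in CTE {name!r}"
                )
        resolved[name] = bindings
        # The CTE becomes visible only after its own body is resolved,
        # so it can never refer to itself.
        declared.add(name)
    return resolved
```

With this sketch, `resolve_ctes([("t", []), ("s", ["t"])])` binds `t` inside `s` to the CTE. The self-referencing and cross-referencing queries above raise `CteResolutionError` when no base table `t` or `s` exists. If a base table `t` is supplied, the self reference binds to it, which corresponds to the PostgreSQL behavior described above.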