allisonwang-db commented on a change in pull request #32303: URL: https://github.com/apache/spark/pull/32303#discussion_r620553370
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala
##########
@@ -156,6 +156,16 @@ object DeduplicateRelations extends Rule[LogicalPlan] {
           if findAliases(projectList).intersect(conflictingAttributes).nonEmpty =>
         Seq((oldVersion, oldVersion.copy(projectList = newAliases(projectList))))
 
+      // Handle projects that create conflicting outer references.
+      case oldVersion @ Project(projectList, _)
+          if findOuterReferences(projectList).intersect(conflictingAttributes).nonEmpty =>
+        // Add alias to conflicting outer references.
+        val aliasedProjectList = projectList.map {
+          case o @ OuterReference(a) if conflictingAttributes.contains(a) => Alias(o, a.name)()
+          case other => other
+        }

Review comment:

Outer references can also appear inside an expression tree, for example `outer(a) AS a`. That form is already handled by the existing case that deduplicates aliases. The new case covers a different situation, where the outer reference sits directly in the project list:

```
Join LateralJoin(Inner)
:- Project [c1#0, c2#0]
:- ...
+- Project [outer(c1#0)]
+- ...
```

Since an outer reference is also a named expression, it can appear in the project list without an alias. In that case `c1#0` and `outer(c1#0)` conflict with each other, because `outer(c1#0).toAttribute => c1#0`.
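
To make the conflict concrete, here is a minimal sketch in plain Scala. It is a toy model, not Spark's Catalyst classes: `ExprId`, `Attr`, `OuterRef`, `Alias`, `output`, and `aliasConflictingOuterRefs` are hypothetical stand-ins, and the only behavior assumed is the one stated above, that an outer reference's `toAttribute` is the wrapped attribute while an alias exposes an attribute with a new ID.

```scala
// Toy model for illustration only; these are NOT Spark's Catalyst classes.
object OuterRefConflictSketch {
  final case class ExprId(id: Long)

  sealed trait NamedExpr {
    def name: String
    def exprId: ExprId
    // The attribute this expression contributes to the plan's output.
    def toAttribute: Attr = Attr(name, exprId)
  }

  final case class Attr(name: String, exprId: ExprId) extends NamedExpr {
    override def toAttribute: Attr = this
  }

  // Assumption from the comment: outer(a).toAttribute is a itself (same ID).
  final case class OuterRef(a: Attr) extends NamedExpr {
    def name: String = a.name
    def exprId: ExprId = a.exprId
  }

  // An alias carries its own ID, so its output differs from its child's.
  final case class Alias(child: NamedExpr, name: String, exprId: ExprId) extends NamedExpr

  private var nextId = 100L
  private def freshId(): ExprId = { nextId += 1; ExprId(nextId) }

  // Output attributes of a project list.
  def output(projectList: Seq[NamedExpr]): Set[Attr] =
    projectList.map(_.toAttribute).toSet

  // Mirrors the shape of the new case in the diff: wrap each conflicting
  // top-level outer reference in an alias that keeps the name.
  def aliasConflictingOuterRefs(
      projectList: Seq[NamedExpr],
      conflicting: Set[Attr]): Seq[NamedExpr] =
    projectList.map {
      case o @ OuterRef(a) if conflicting.contains(a) => Alias(o, a.name, freshId())
      case other => other
    }

  def main(args: Array[String]): Unit = {
    val c1 = Attr("c1", ExprId(0))
    val c2 = Attr("c2", ExprId(1))
    val left: Seq[NamedExpr] = Seq(c1, c2)
    val right: Seq[NamedExpr] = Seq(OuterRef(c1))

    val conflicting = output(left).intersect(output(right))
    println(s"conflicting before: $conflicting")

    val fixedRight = aliasConflictingOuterRefs(right, conflicting)
    println(s"conflicting after: ${output(left).intersect(output(fixedRight))}")
  }
}
```

In this model the first intersection contains `c1` and the second is empty, which is the effect the added alias is meant to have on the right side of the lateral join.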