allisonwang-db commented on a change in pull request #32303:
URL: https://github.com/apache/spark/pull/32303#discussion_r620553370



##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala
##########
@@ -156,6 +156,16 @@ object DeduplicateRelations extends Rule[LogicalPlan] {
           if findAliases(projectList).intersect(conflictingAttributes).nonEmpty =>
         Seq((oldVersion, oldVersion.copy(projectList = newAliases(projectList))))
 
+      // Handle projects that create conflicting outer references.
+      case oldVersion @ Project(projectList, _)
+          if findOuterReferences(projectList).intersect(conflictingAttributes).nonEmpty =>
+        // Add alias to conflicting outer references.
+        val aliasedProjectList = projectList.map {
+          case o @ OuterReference(a) if conflictingAttributes.contains(a) => Alias(o, a.name)()
+          case other => other
+        }

Review comment:
       Outer references can also be nested inside an expression tree, for example 
`outer(a) AS a`; those are already handled by the alias-deduplication case above. 
The dedup logic here handles a different case:
   ```
   Join LateralJoin(Inner)
   :- Project [c1#0, c2#0]
   :-  ...
   +- Project [outer(c1#0)]
      +- ...
   ```
   Since an outer reference is also a named expression, it can appear in the 
project list without an alias. In that case `c1#0` and `outer(c1#0)` conflict 
with each other, because `outer(c1#0).toAttribute` returns `c1#0`.
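
To make the conflict concrete, here is a minimal sketch using a toy model rather than Catalyst's real classes. `Attr`, `OuterRef`, `Alias`, `freshId` and `aliasConflictingOuterRefs` are hypothetical stand-ins that only mirror the names in this diff; the one assumption carried over from the comment above is that an outer reference's `toAttribute` returns the wrapped attribute unchanged.

```scala
// Toy model only: these are NOT Spark's classes, just hypothetical stand-ins.
object OuterRefConflictSketch {
  final case class ExprId(id: Long)

  sealed trait NamedExpr { def name: String; def toAttribute: Attr }

  final case class Attr(name: String, exprId: ExprId) extends NamedExpr {
    def toAttribute: Attr = this
  }

  // Assumption from the review comment: an outer reference exposes the
  // attribute it wraps, so its output keeps the same expression id.
  final case class OuterRef(e: Attr) extends NamedExpr {
    def name: String = e.name
    def toAttribute: Attr = e
  }

  // An alias produces a new output attribute with its own id.
  final case class Alias(child: NamedExpr, name: String, exprId: ExprId) extends NamedExpr {
    def toAttribute: Attr = Attr(name, exprId)
  }

  private var nextId = 100L
  def freshId(): ExprId = { nextId += 1; ExprId(nextId) }

  // Output attributes of a project list.
  def output(projectList: Seq[NamedExpr]): Set[Attr] =
    projectList.map(_.toAttribute).toSet

  // Same shape as the new case in the diff: alias bare outer references
  // whose wrapped attribute is in the conflicting set.
  def aliasConflictingOuterRefs(
      projectList: Seq[NamedExpr],
      conflicting: Set[Attr]): Seq[NamedExpr] =
    projectList.map {
      case o @ OuterRef(a) if conflicting.contains(a) => Alias(o, a.name, freshId())
      case other => other
    }

  def main(args: Array[String]): Unit = {
    val c1 = Attr("c1", ExprId(0))
    val c2 = Attr("c2", ExprId(1))
    val left = Seq[NamedExpr](c1, c2)          // Project [c1, c2]
    val right = Seq[NamedExpr](OuterRef(c1))   // Project [outer(c1)]

    val conflicting = output(left).intersect(output(right))
    println(s"conflicting before aliasing: $conflicting")

    val fixed = aliasConflictingOuterRefs(right, conflicting)
    println(s"conflicting after aliasing: ${output(left).intersect(output(fixed))}")
  }
}
```

In this sketch the two sides share `c1` before the rewrite and share nothing afterwards, since the alias carries a fresh id while keeping the name `c1`.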



