cloud-fan commented on code in PR #54976:
URL: https://github.com/apache/spark/pull/54976#discussion_r3298984672


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala:
##########
@@ -829,6 +829,29 @@ case class Join(
     newLeft: LogicalPlan, newRight: LogicalPlan): Join = copy(left = newLeft, right = newRight)
 }
 
+/**
+ * A logical plan that combines the columns of two DataFrames that derive from the same
+ * base plan through chains of Project nodes. This node is always unresolved and must be
+ * rewritten by [[ResolveZip]] into a single Project over the shared base plan during
+ * analysis. If the two children do not share the same base plan (after stripping Project
+ * nodes), analysis will fail with an error.
+ */

Review Comment:
   The "single Project" wording was accurate for the prior implementation but no longer matches the redesign: `ResolveZip` now emits a chain of `Project`s (one per dependency depth), so each user-written alias stays in its own `Alias` and `CollapseProject`'s safety guards apply. The doc should describe the new shape, and the failure description should also mention the second `ZIP_PLANS_NOT_MERGEABLE` trigger (non-scalar Python UDFs); it currently names only mismatched bases.
   
   ```suggestion
   /**
    * A logical plan that combines the columns of two DataFrames that derive from the same
    * base plan through chains of Project nodes. This node is always unresolved and must be
    * rewritten by [[ResolveZip]] into a chain of Project nodes over the shared base plan
    * during analysis. If the two children do not share the same base plan (after stripping
    * outer Projects), or if either side contains a non-scalar Python UDF, analysis will fail
    * with ZIP_PLANS_NOT_MERGEABLE.
    */
   ```
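
For readers unfamiliar with the "one `Project` per dependency depth" shape, here is a rough toy sketch of how I read it. It does not use Catalyst: `Base`, `Project`, `Alias`, and `ZipSketch` are hypothetical stand-ins, and the depth rule (an alias sits one level above its deepest input) is my assumption, not necessarily what `ResolveZip` does. Pass-through of earlier columns is omitted for brevity.

```scala
// Toy model only: hypothetical stand-ins, NOT Catalyst's LogicalPlan classes.
sealed trait Plan
case class Base(columns: Seq[String]) extends Plan
case class Project(aliases: Seq[Alias], child: Plan) extends Plan

// An alias names an expression and records the column names it reads.
case class Alias(name: String, refs: Set[String])

object ZipSketch {
  // Assumed rule: base columns have depth 0; an alias has depth
  // 1 + the deepest column it reads (1 if it reads nothing).
  // Assumes references are acyclic and resolve to the base or another alias.
  def depths(base: Base, aliases: Seq[Alias]): Map[String, Int] = {
    val byName = aliases.map(a => a.name -> a).toMap
    def depth(col: String): Int =
      if (base.columns.contains(col)) 0
      else 1 + (byName(col).refs.map(depth) + 0).max
    aliases.map(a => a.name -> depth(a.name)).toMap
  }

  // Stack one Project per depth level, shallowest level closest to the base.
  def chain(base: Base, aliases: Seq[Alias]): Plan = {
    val d = depths(base, aliases)
    val levels = aliases.groupBy(a => d(a.name)).toSeq.sortBy(_._1)
    levels.foldLeft(base: Plan) { case (child, (_, group)) => Project(group, child) }
  }

  def main(args: Array[String]): Unit = {
    val base = Base(Seq("a", "b"))
    // x reads a, y reads x, z reads b: x and z share a level, y sits above them.
    val aliases = Seq(Alias("x", Set("a")), Alias("y", Set("x")), Alias("z", Set("b")))
    println(chain(base, aliases))
  }
}
```

In this toy model each alias stays a separate `Alias` in its own level rather than being inlined into one combined expression, which is the property the comment above relies on.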



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

