cloud-fan commented on code in PR #54976:
URL: https://github.com/apache/spark/pull/54976#discussion_r3298984672
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala:
##########
@@ -829,6 +829,29 @@ case class Join(
newLeft: LogicalPlan, newRight: LogicalPlan): Join = copy(left = newLeft, right = newRight)
}
+/**
+ * A logical plan that combines the columns of two DataFrames that derive from the same
+ * base plan through chains of Project nodes. This node is always unresolved and must be
+ * rewritten by [[ResolveZip]] into a single Project over the shared base plan during
+ * analysis. If the two children do not share the same base plan (after stripping Project
+ * nodes), analysis will fail with an error.
+ */
Review Comment:
The "single Project" wording was accurate for the prior implementation but
doesn't match the redesign: `ResolveZip` now emits a chain of `Project`s
(one per dependency depth) so that each user-written alias stays in its own
`Alias` and `CollapseProject`'s safety guards apply. The doc should describe
the new shape, and the failure list should also mention the second
`ZIP_PLANS_NOT_MERGEABLE` trigger (non-scalar Python UDFs); it currently names
only mismatched bases.
```suggestion
/**
 * A logical plan that combines the columns of two DataFrames that derive from the same
 * base plan through chains of Project nodes. This node is always unresolved and must be
 * rewritten by [[ResolveZip]] into a chain of Project nodes over the shared base plan
 * during analysis. If the two children do not share the same base plan (after stripping
 * outer Projects), or if either side contains a non-scalar Python UDF, analysis will fail
 * with ZIP_PLANS_NOT_MERGEABLE.
 */
```
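To make "one `Project` per dependency depth" concrete, here is a rough toy
model of the layering idea. It is not the code in this PR: `Col`, `depths`,
and `layers` are hypothetical names, the types are plain stand-ins rather than
Catalyst classes, and the real rule may organize the rewrite differently. It
only shows how aliases that read base columns land in the innermost layer and
aliases that read other aliases land one layer further out.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
object ProjectChainSketch {
  // A user-written alias plus the names of the columns it reads.
  final case class Col(name: String, refs: Set[String])

  // Depth 0: reads only base columns. Depth n: reads some alias at depth n - 1.
  // Assumes every referenced name is either a base column or a known alias.
  def depths(base: Set[String], cols: Seq[Col]): Map[String, Int] = {
    val byName = cols.map(c => c.name -> c).toMap
    def depthOf(name: String): Int =
      if (base.contains(name)) -1
      else byName(name).refs.map(depthOf).foldLeft(-1)((a, b) => math.max(a, b)) + 1
    cols.map(c => c.name -> depthOf(c.name)).toMap
  }

  // Group alias names into layers, innermost (depth 0) first.
  // A real Project at each layer would also pass earlier columns through.
  def layers(base: Set[String], cols: Seq[Col]): Seq[Seq[String]] = {
    val d = depths(base, cols)
    cols.groupBy(c => d(c.name)).toSeq.sortBy(_._1).map(_._2.map(_.name))
  }

  def main(args: Array[String]): Unit = {
    val base = Set("id")
    val cols = Seq(
      Col("a", Set("id")), // e.g. left side:  a = id + 1
      Col("b", Set("a")),  // e.g. left side:  b = a * 2
      Col("c", Set("id"))  // e.g. right side: c = id * 10
    )
    // One line per layer, innermost first.
    layers(base, cols).foreach(l => println(l.mkString("Project(", ", ", ")")))
  }
}
```

In this sketch `a` and `c` share the innermost layer and `b` sits in a second
layer above them, so `b`'s definition is never inlined into the same
projection as `a`.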
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.