sigmod commented on code in PR #55629:
URL: https://github.com/apache/spark/pull/55629#discussion_r3184148132
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##########
@@ -2541,14 +2542,36 @@ object CheckCartesianProducts extends Rule[LogicalPlan] with PredicateHelper {
}
}
- def apply(plan: LogicalPlan): LogicalPlan =
+ def apply(plan: LogicalPlan): LogicalPlan = {
if (conf.crossJoinEnabled) {
- plan
- } else plan.transformWithPruning(_.containsAnyPattern(INNER_LIKE_JOIN, OUTER_JOIN)) {
+ return plan
+ }
+
+ // Joins synthesized by `RewriteNearestByJoin` are an intentional, bounded cross-product
+ // wrapped by a `MaxMinByK` aggregate. Identify them by their unambiguous post-rewrite
+ // signature -- `Aggregate(_, exprs, Join(_, _, LeftOuter, None, _))` where `exprs`
+ // contains a `MaxMinByK` -- and skip them so user queries written as `NEAREST BY` are not
+ // rejected when `spark.sql.crossJoin.enabled = false`. We use structural detection rather
+ // than a `TreeNodeTag` because a tag set on the `Join` would be silently dropped by any
+ // intervening optimizer rule that constructs a fresh `Join` via the case-class
+ // constructor without calling `copyTagsFrom`.
+ val nearestByJoins: java.util.IdentityHashMap[Join, Unit] = {
+ val acc = new java.util.IdentityHashMap[Join, Unit]()
+ plan.foreach {
Review Comment:
Do we have to check it?
`spark.sql.crossJoin.enabled` is already on by default.
If a user explicitly disables it, do we need to make an exception for vector
index creation, or can we just let the error be raised?
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L2294-L2300
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]