Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20670 Good catch! This is a real problem, but the fix looks hacky. By definition, I think `plan.contraints` should only include constraints that refer to `plan.output`, as that's the promise a plan can make to its parent. However, join is special as `Join.condition` can refer to both of the join sides, and we add the constraints to `Join.condition`, which is kind of we are making a promise to Join's children, not parent. My proposal: ``` lazy val constraints: ExpressionSet = { if (conf.constraintPropagationEnabled) { allConstraints.filter { c => c.references.nonEmpty && c.references.subsetOf(outputSet) && c.deterministic } } else { ExpressionSet(Set.empty) } } lazy val allConstraints = ExpressionSet(validConstraints .union(inferAdditionalConstraints(validConstraints)) .union(constructIsNotNullConstraints(validConstraints))) ``` Then we can call `plan.allConstraints` when inferring contraints for join.
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org