Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20670
  
    Good catch! This is a real problem, but the fix looks hacky.
    
    By definition, I think `plan.constraints` should only include constraints
    that refer to `plan.output`, as that's the promise a plan can make to its
    parent. However, join is special: `Join.condition` can refer to both of the
    join sides, and we add the constraints to `Join.condition`, which means we
    are effectively making a promise to Join's children, not its parent. My proposal:
    ```
      lazy val constraints: ExpressionSet = {
        if (conf.constraintPropagationEnabled) {
          allConstraints.filter { c =>
            c.references.nonEmpty && c.references.subsetOf(outputSet) && c.deterministic
          }
        } else {
          ExpressionSet(Set.empty)
        }
      }
    
      lazy val allConstraints = ExpressionSet(validConstraints
              .union(inferAdditionalConstraints(validConstraints))
              .union(constructIsNotNullConstraints(validConstraints)))
    ```
    Then we can call `plan.allConstraints` when inferring constraints for join.
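
    To illustrate the split, here is a minimal, self-contained sketch. It is a toy
    model, not Catalyst code: `Constraint`, `ToyPlan`, and the attribute names
    `l.k` / `r.k` are hypothetical stand-ins, and it skips
    `inferAdditionalConstraints` and `constructIsNotNullConstraints` entirely. It
    assumes a join-like node that outputs only left-side attributes while its
    condition also references the right side.

    ```scala
    // Toy stand-in for a Catalyst expression: only the pieces the filter looks at.
    final case class Constraint(
        sql: String,
        references: Set[String],
        deterministic: Boolean = true)

    // Toy stand-in for a plan node (hypothetical, not Spark's QueryPlan).
    final case class ToyPlan(
        output: Set[String],
        validConstraints: Set[Constraint],
        constraintPropagationEnabled: Boolean = true) {

      // Everything the node knows, including constraints that mention
      // attributes outside its own output.
      lazy val allConstraints: Set[Constraint] = validConstraints

      // Only what the node can promise to its parent: deterministic constraints
      // that reference at least one attribute, all of them in the output.
      lazy val constraints: Set[Constraint] =
        if (constraintPropagationEnabled) {
          allConstraints.filter { c =>
            c.references.nonEmpty && c.references.subsetOf(output) && c.deterministic
          }
        } else {
          Set.empty
        }
    }

    object ConstraintSketch {
      val leftNotNull: Constraint = Constraint("isnotnull(l.k)", Set("l.k"))
      val joinCond: Constraint = Constraint("l.k = r.k", Set("l.k", "r.k"))

      // Join-like node whose output is the left side only.
      val joinLike: ToyPlan = ToyPlan(
        output = Set("l.k"),
        validConstraints = Set(leftNotNull, joinCond))

      def main(args: Array[String]): Unit = {
        // Visible to the parent: only constraints over the output.
        println(joinLike.constraints.map(_.sql))
        // Available when inferring constraints for the join itself.
        println(joinLike.allConstraints.map(_.sql))
      }
    }
    ```

    In this toy, `constraints` drops `l.k = r.k` because `r.k` is not in the
    output, while `allConstraints` still carries it for the join-side inference.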


