Github user maryannxue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20816#discussion_r175330576

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -669,11 +672,42 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
     val newConditionOpt = conditionOpt match {
       case Some(condition) =>
         val newFilters = additionalConstraints -- splitConjunctivePredicates(condition)
-        if (newFilters.nonEmpty) Option(And(newFilters.reduce(And), condition)) else None
+        if (newFilters.nonEmpty) Option(And(newFilters.reduce(And), condition)) else conditionOpt
       case None => additionalConstraints.reduceOption(And)
     }
-    if (newConditionOpt.isDefined) Join(left, right, joinType, newConditionOpt) else join
+    // Infer filter for left/right outer joins
+    val newLeftOpt = joinType match {
+      case RightOuter if newConditionOpt.isDefined =>
+        val rightConstraints = right.constraints.union(
+          splitConjunctivePredicates(newConditionOpt.get).toSet)
+        val inferredConstraints = ExpressionSet(
+          QueryPlanConstraints.inferAdditionalConstraints(rightConstraints))
+        val leftConditions = inferredConstraints
--- End diff --

I think the `constructIsNotNullConstraints` logic does not deal with the "transitive" constraints, so we do not need to include it here. Instead, the "isNotNull" deduction for inferred filters on the null-supplying side is guaranteed by two things: 1) when getting constraints from the preserved side, `constructIsNotNullConstraints` has already been called, and its results will be carried over by `inferAdditionalConstraints` to the null-supplying side; 2) the Filter-matching part of `InferFiltersFromConstraints`. That said, I'm good with the name `getRelevantConstraints` too.
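For readers outside this thread: the "transitive" inference being discussed takes an equality constraint such as `a = b` and rewrites other known constraints on `a` in terms of `b` (and vice versa), so predicates from the preserved side of an outer join can be pushed to the null-supplying side. Below is a deliberately simplified, hypothetical sketch of that idea; the `Expr`/`Attr`/`inferAdditional` names and types are illustrative stand-ins, not Spark's actual `Expression`/`ExpressionSet`/`inferAdditionalConstraints` internals.

```scala
// Toy expression tree -- stand-ins for Catalyst expressions.
sealed trait Expr
case class Attr(name: String) extends Expr
case class Gt(attr: Attr, bound: Int) extends Expr
case class Eq(left: Attr, right: Attr) extends Expr
case class IsNotNull(attr: Attr) extends Expr

object TransitiveInference {
  // Rewrite references to `from` inside a constraint as `to`.
  def substitute(e: Expr, from: Attr, to: Attr): Expr = e match {
    case a: Attr if a == from   => to
    case Gt(a, n) if a == from  => Gt(to, n)
    case IsNotNull(a) if a == from => IsNotNull(to)
    case other                  => other
  }

  // For every equality a = b, project each other constraint across the
  // equality in both directions, and keep only genuinely new constraints.
  def inferAdditional(constraints: Set[Expr]): Set[Expr] = {
    val inferred = for {
      eq @ Eq(a, b) <- constraints
      c <- constraints if c != eq
      sub <- Seq(substitute(c, a, b), substitute(c, b, a))
    } yield sub
    inferred -- constraints
  }
}
```

For example, from `{a = b, a > 10}` this sketch infers `{b > 10}`, which mirrors how a filter established on the preserved side (including `IsNotNull` constraints already produced by `constructIsNotNullConstraints`) gets carried to the null-supplying side's attributes.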