Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18968#discussion_r134126635

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -138,46 +138,80 @@ case class Not(child: Expression)
     case class In(value: Expression, list: Seq[Expression]) extends Predicate {
       require(list != null, "list should not be null")
    +
    +  lazy val valExprs = value match {
    +    case cns: CreateNamedStruct => cns.valExprs
    +    case expr => Seq(expr)
    +  }
    +
    +  override lazy val resolved: Boolean = {
    +    lazy val checkForInSubquery = list match {
    +      case (l @ ListQuery(sub, children, _)) :: Nil =>
    +        // SPARK-21759:
    +        // It is possible that the subquery plan has more output than value expressions, because
    +        // the condition expressions in `ListQuery` might use part of the subquery plan's output.
    --- End diff --
    
    So we are adding another criterion for considering an in-subquery expression resolved. The new criterion is:
    1) Any additional output attributes that the optimizer may have added to the subquery plan must have a reference in the originating in-subquery expression's children. (The children reflect the pulled-up correlated predicates.)
    
    Just for my understanding: there is no way to trigger this condition from our regular code path, right? This is just to guard against any potentially incorrect rewrites by the optimizer in the future?
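The criterion discussed above can be sketched as a small standalone check. This is a hypothetical simplification, not Catalyst's actual implementation: `Attribute`, `resolvedArity`, and the parameter names are stand-ins invented for illustration. The idea is that any subquery output attributes beyond the IN value expressions must be referenced by the pulled-up correlated predicates, otherwise the expression is not considered resolved.

```scala
// Hypothetical sketch of the resolution criterion described in the review
// comment; types and names are simplified stand-ins, not Catalyst's API.
object InResolutionSketch {
  // Stand-in for Catalyst's Attribute.
  case class Attribute(name: String)

  // valueExprs: left-hand side of IN; subOutput: subquery plan output;
  // conditionRefs: attributes referenced by the pulled-up correlated
  // predicates (the in-subquery expression's children).
  def resolvedArity(valueExprs: Seq[Attribute],
                    subOutput: Seq[Attribute],
                    conditionRefs: Set[Attribute]): Boolean = {
    // Extra outputs are those beyond the value expressions.
    val extra = subOutput.drop(valueExprs.length)
    subOutput.length >= valueExprs.length && extra.forall(conditionRefs.contains)
  }

  def main(args: Array[String]): Unit = {
    val a = Attribute("a")
    val b = Attribute("b")
    // Extra output `b` is referenced by a correlated predicate: resolved.
    assert(resolvedArity(Seq(a), Seq(a, b), Set(b)))
    // Extra output `b` is dangling: not resolved.
    assert(!resolvedArity(Seq(a), Seq(a, b), Set.empty))
    println("ok")
  }
}
```

Under this reading, the guard would only fire if an optimizer rewrite added subquery output that nothing references, which matches the question of whether the regular code path can ever reach it.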