Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/17491#discussion_r109211310

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
@@ -90,11 +90,12 @@ trait PredicateHelper {
    * Returns true iff `expr` could be evaluated as a condition within join.
    */
   protected def canEvaluateWithinJoin(expr: Expression): Boolean = expr match {
-    case l: ListQuery =>
+    case _: ListQuery | _: Exists =>
       // A ListQuery defines the query which we want to search in an IN subquery expression.
       // Currently the only way to evaluate an IN subquery is to convert it to a
       // LeftSemi/LeftAnti/ExistenceJoin by `RewritePredicateSubquery` rule.
       // It cannot be evaluated as part of a Join operator.
+      // An Exists shouldn't be push into a Join operator too.
--- End diff --

@nsyca Looking at this further, there is a SubqueryExec operator that can execute a ScalarSubquery and an InSubquery (PlanSubqueries). As part of my change, I removed the case for PredicateSubquery, since we removed PredicateSubquery altogether.

I quickly tried the following and got the query to work. I haven't verified the semantics; this was only a quick experiment. Basically, if we were to keep the Exists expression as it is, push it down as a join condition, and execute it as an InSubquery (possibly with an additional limit clause), there seems to be infrastructure for it already. Or perhaps we may want to introduce an ExistSubquery exec operator that can work more efficiently.

```scala
// Quick experiment (semantics not verified): plan the Exists subquery and
// wrap the resulting physical plan in an InSubquery that tests for TRUE.
case subquery: expressions.Exists =>
  val executedPlan = new QueryExecution(sparkSession, subquery.plan).executedPlan
  InSubquery(Literal.TrueLiteral,
    SubqueryExec(s"subquery${subquery.exprId.id}", executedPlan), subquery.exprId)
```

What do you think, Natt?
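
For reference, here is a toy sketch in plain Scala of why the "additional limit clause" idea could work. It has no Spark dependency, and every name in it (`ExistsAsInSketch`, `exists`, `in`, `projectTrueLimitOne`) is hypothetical rather than a Spark API. It assumes an uncorrelated subquery and ignores SQL NULL semantics. The point it illustrates is that `EXISTS (subquery)` behaves like `TRUE IN (SELECT TRUE FROM subquery LIMIT 1)`, whereas testing `TRUE IN (subquery)` directly depends on whatever the subquery's first column happens to be.

```scala
// Toy model only: plain Scala collections stand in for subquery results.
// None of these names are Spark APIs.
object ExistsAsInSketch {
  type Row = Seq[Any]

  // EXISTS (subquery): true iff the subquery returns at least one row.
  def exists(subquery: Seq[Row]): Boolean = subquery.nonEmpty

  // value IN (subquery): true iff value equals the first column of some row.
  // SQL NULL handling is deliberately ignored in this sketch.
  def in(value: Any, subquery: Seq[Row]): Boolean =
    subquery.exists(row => row.head == value)

  // Models "SELECT TRUE FROM (subquery) LIMIT 1".
  def projectTrueLimitOne(subquery: Seq[Row]): Seq[Row] =
    subquery.take(1).map(_ => Seq(true))

  def main(args: Array[String]): Unit = {
    val empty: Seq[Row] = Seq.empty
    val nonEmpty: Seq[Row] = Seq(Seq(1, "a"), Seq(2, "b"))

    // With the projection and limit, IN agrees with EXISTS in both cases.
    for (sub <- Seq(empty, nonEmpty)) {
      assert(exists(sub) == in(true, projectTrueLimitOne(sub)))
    }

    // Without the projection, the answer depends on the subquery's own
    // first column, so it can disagree with EXISTS.
    assert(exists(nonEmpty) && !in(true, nonEmpty))
  }
}
```

If this reading is right, the quick experiment above would need the subquery plan wrapped in a TRUE projection (and ideally a limit of one row) before handing it to InSubquery; a dedicated exec operator would avoid that detour.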