Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22326#discussion_r219675105

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -995,7 +995,8 @@ class Dataset[T] private[sql](
     // After the cloning, left and right side will have distinct expression ids.
     val plan = withPlan(
       Join(logicalPlan, right.logicalPlan, JoinType(joinType), Some(joinExprs.expr)))
-      .queryExecution.analyzed.asInstanceOf[Join]
+      .queryExecution.analyzed
+    val joinPlan = plan.collectFirst { case j: Join => j }.get
--- End diff --

Note for reviewers: we need this change because the rule `HandlePythonUDFInJoinCondition` breaks the assumption that analyzing the join plan always returns a `Join`. Once the rule for handling Python UDFs is added, a `Filter` or `Project` node can sit on top of the `Join`, so the root of the analyzed plan is no longer guaranteed to be the `Join` itself.
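
To make the difference concrete, here is a minimal sketch that does not use Spark at all. `Plan`, `Relation`, `Join`, `Filter`, and `JoinLookupSketch` are toy stand-ins invented for illustration, not the Catalyst classes, and the `collectFirst` here is a simplified pre-order search that only mirrors the idea of the method used in the diff. It assumes the rule has wrapped the `Join` in a `Filter`.

```scala
import scala.util.Try

// Toy stand-ins for logical plan nodes. These are NOT Spark's Catalyst classes.
sealed trait Plan {
  def children: Seq[Plan]

  // Pre-order search: return the first node the partial function accepts.
  def collectFirst[B](pf: PartialFunction[Plan, B]): Option[B] =
    pf.lift(this).orElse(
      children.iterator.map(_.collectFirst(pf)).collectFirst { case Some(b) => b })
}
final case class Relation(name: String) extends Plan {
  def children: Seq[Plan] = Nil
}
final case class Join(left: Plan, right: Plan, condition: String) extends Plan {
  def children: Seq[Plan] = Seq(left, right)
}
final case class Filter(condition: String, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
}

object JoinLookupSketch {
  // Hypothetical "analyzed" plan: a rule has moved the UDF condition
  // into a Filter that now sits above the Join.
  val analyzed: Plan =
    Filter("pyUdf(a, b)", Join(Relation("left"), Relation("right"), "true"))

  // Old approach: assumes the root is a Join, so the cast fails for this plan.
  def castRoot(plan: Plan): Try[Join] = Try(plan.asInstanceOf[Join])

  // New approach: search the tree for the first Join instead of trusting the root.
  def findJoin(plan: Plan): Option[Join] = plan.collectFirst { case j: Join => j }

  def main(args: Array[String]): Unit = {
    println(s"cast failed: ${castRoot(analyzed).isFailure}")
    println(s"join found:  ${findJoin(analyzed).isDefined}")
  }
}
```

Note that the search returns an `Option`, so calling `.get` as in the diff still relies on a `Join` being somewhere in the plan; it only drops the assumption that the `Join` is the root.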