Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22326#discussion_r219675105
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -995,7 +995,8 @@ class Dataset[T] private[sql](
         // After the cloning, left and right side will have distinct expression ids.
         val plan = withPlan(
           Join(logicalPlan, right.logicalPlan, JoinType(joinType), Some(joinExprs.expr)))
    -      .queryExecution.analyzed.asInstanceOf[Join]
    +      .queryExecution.analyzed
    +    val joinPlan = plan.collectFirst { case j: Join => j }.get
    --- End diff --
    
    For reviewers: we need this change because the rule
    `HandlePythonUDFInJoinCondition` breaks the assumption that the analyzed
    join plan always has a `Join` node at its root. Once the rule for handling
    Python UDFs is added, a Filter or Project node may be placed on top of the Join.
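
    To illustrate why the cast is replaced with a search, here is a minimal sketch using a toy plan tree. The `Plan`, `Relation`, `Join`, and `Filter` classes below are hypothetical stand-ins, not Spark's Catalyst classes, and the filter condition string is made up; only the shape of the problem (a node wrapped around the `Join`) is taken from the comment above.

```scala
// Toy stand-ins for plan nodes. These are assumptions for illustration,
// NOT Spark's real Catalyst classes.
sealed trait Plan {
  def children: Seq[Plan]

  // Pre-order search: return the first node (this one, then descendants)
  // on which the partial function is defined.
  def collectFirst[B](pf: PartialFunction[Plan, B]): Option[B] =
    pf.lift(this).orElse(
      children.foldLeft(Option.empty[B])((acc, c) => acc.orElse(c.collectFirst(pf))))
}

case class Relation(name: String) extends Plan {
  def children: Seq[Plan] = Nil
}

case class Join(left: Plan, right: Plan) extends Plan {
  def children: Seq[Plan] = Seq(left, right)
}

case class Filter(condition: String, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
}

object CollectFirstJoinSketch {
  def main(args: Array[String]): Unit = {
    val join = Join(Relation("a"), Relation("b"))

    // Hypothetical result of analysis after a rule wraps the Join in a Filter.
    val analyzed: Plan = Filter("pythonUdf(a.x, b.y)", join)

    // The root is no longer a Join, so casting the root to Join would fail.
    assert(!analyzed.isInstanceOf[Join])

    // Searching the tree still finds the Join wherever it sits.
    val joinPlan = analyzed.collectFirst { case j: Join => j }.get
    assert(joinPlan == join)
  }
}
```

    The search also works unchanged when the `Join` is still the root, so it covers both the old and the new plan shapes. Note that `.get` still throws if no `Join` exists anywhere in the tree.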

