GitHub user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/22326

```
move this rule to optimizer, as the last batch (but before the UpdateAttributeReferences batch). Since we apply this rule after filter pushdown, we can simply pull out any python udf in join condition. Also add this rule to Optimizer.nonExcludableRules, since this is a special optimizer rule that can't be turned off.
```

Makes sense. Implementing it this way also avoids breaking the assumption in Dataset.join that analyzing a Join plan returns only a Join. I'll reimplement it following this proposal soon.
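To illustrate the idea behind the proposal, here is a minimal Python sketch on a toy plan model. It is not Spark's Catalyst code: `Pred`, `Join`, `Filter`, and `pull_out_python_udf` are hypothetical names made up for this example, and the sketch assumes an inner join whose condition is a conjunction, so a conjunct can be moved into a filter above the join without changing the result.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Pred:
    """One conjunct of a join condition (toy stand-in for an expression)."""
    sql: str
    has_python_udf: bool = False


@dataclass(frozen=True)
class Join:
    """Toy inner join; `condition` holds conjuncts that are ANDed together."""
    left: str
    right: str
    condition: Tuple[Pred, ...]  # an empty tuple means no join condition


@dataclass(frozen=True)
class Filter:
    """Toy filter node placed above a join."""
    condition: Tuple[Pred, ...]
    child: Join


def pull_out_python_udf(plan):
    """Move Python-UDF conjuncts out of a join condition into a Filter above it.

    Hypothetical illustration only: plans without a Python UDF in the join
    condition are returned unchanged.
    """
    if not isinstance(plan, Join):
        return plan
    udf_preds = tuple(p for p in plan.condition if p.has_python_udf)
    if not udf_preds:
        return plan
    remaining = tuple(p for p in plan.condition if not p.has_python_udf)
    # The join keeps only the conjuncts it can evaluate itself.
    return Filter(udf_preds, Join(plan.left, plan.right, remaining))


plan = Join(
    "a",
    "b",
    (Pred("a.id = b.id"), Pred("my_udf(a.x, b.y)", has_python_udf=True)),
)
rewritten = pull_out_python_udf(plan)
```

Note that the root of `rewritten` is a `Filter`, not a `Join`. A rewrite of this shape applied during analysis is what would break the Dataset.join assumption mentioned above, which is why doing it in the optimizer sidesteps the problem.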