GitHub user viirya commented on the issue:

    https://github.com/apache/spark/pull/19642

Another concern: because Python UDFs are currently treated as normal expressions with no dedicated operator, existing optimizations such as CollapseProject still apply to the projections that contain them. If we extract Python UDFs earlier, in the logical plan, projections that would previously have been collapsed into one may end up as multiple Python runners. To deal with that, we might need to add a dedicated optimization rule for collapsing Python runners. This adds more complexity, IMHO. We should consider whether it's worth it.
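To make the ordering concern concrete, here is a minimal toy sketch in plain Python. It is not Spark code: `Scan`, `Project`, `PythonEval`, `collapse_project`, `extract_python_udfs`, and `count_runners` are all hypothetical names invented for illustration, and the toy ignores the conditions the real CollapseProject rule checks as well as dependencies between UDFs. It only shows that, in this simplified model, extracting UDFs before collapsing leaves two runner nodes where collapsing first leaves one.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Project:
    udfs: Tuple[str, ...]  # names of the Python UDFs this projection calls
    child: "Plan"


@dataclass(frozen=True)
class PythonEval:  # hypothetical stand-in for a Python runner operator
    udfs: Tuple[str, ...]
    child: "Plan"


Plan = Union[Scan, Project, PythonEval]


def collapse_project(plan):
    """Toy collapse rule: merge a Project that sits directly on a Project."""
    if isinstance(plan, Scan):
        return plan
    child = collapse_project(plan.child)
    if isinstance(plan, Project) and isinstance(child, Project):
        return Project(child.udfs + plan.udfs, child.child)
    return type(plan)(plan.udfs, child)


def extract_python_udfs(plan):
    """Toy extraction: move each Project's UDFs into a PythonEval below it."""
    if isinstance(plan, Scan):
        return plan
    child = extract_python_udfs(plan.child)
    if isinstance(plan, Project) and plan.udfs:
        return Project((), PythonEval(plan.udfs, child))
    return type(plan)(plan.udfs, child)


def count_runners(plan):
    """Count PythonEval nodes in a linear plan."""
    if isinstance(plan, Scan):
        return 0
    return int(isinstance(plan, PythonEval)) + count_runners(plan.child)


# Two stacked projections, each calling one (independent) Python UDF.
plan = Project(("g",), Project(("f",), Scan("t")))

# Collapse first, extract afterwards: the projections merge, one runner.
late = extract_python_udfs(collapse_project(plan))

# Extract first, collapse afterwards: the runner nodes separate the
# projections, so the toy collapse rule no longer fires.
early = collapse_project(extract_python_udfs(plan))

print(count_runners(late))   # 1
print(count_runners(early))  # 2
```

In this model, getting back to a single runner in the early-extraction case would need an extra rule that merges the `PythonEval` nodes themselves, which is the added complexity mentioned above.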