Github user mridulm commented on the issue:

    https://github.com/apache/spark/pull/21698

Taking a step back and analyzing the solution for the problem at hand.
There are three main issues with the proposal:

* It does not solve the problem in a general manner.
  * I gave the examples of zip and sample; it applies to any order-sensitive closure.
* It does not fix the issue when a child stage has one or more completed tasks.
  * Even if we assume it is a specific fix for repartition/coalesce, it does not solve the problem even there, and can cause data loss.
* It causes a performance regression for the existing workarounds.
  * The common workaround for this issue is to checkpoint + action, or to do a local/global sort (I believe sql does the latter now?).
  * The proposal causes a performance regression for these existing workarounds.

The corner case where the proposal works is when:
a) the order-sensitive stage has finished, and
b) no task in the child stage has finished fetching its shuffle input.

This is a fairly narrow subset, which is why I don't believe the current approach helps.

Having said that, if it is possible to enhance the approach, that would be great!
This is a fairly nasty issue which hurts users, and people who are aware of the problem typically always pay a performance cost to avoid the corner case.
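To make the second point concrete, here is a toy sketch in plain Python. It is not Spark code: `round_robin` and `child_outputs` are hypothetical helpers invented for illustration, and position-based round-robin assignment is only an assumed stand-in for an order-sensitive repartition. It shows what can go wrong when one child task has already consumed output from the first attempt of an upstream task and another child task reads from a retry that saw its input in a different order. It also shows why a sort before the order-sensitive step avoids this.

```python
def round_robin(records, num_partitions):
    """Assign records to partitions by position (order-sensitive on purpose)."""
    parts = [[] for _ in range(num_partitions)]
    for i, rec in enumerate(records):
        parts[i % num_partitions].append(rec)
    return parts


def child_outputs(first_attempt, retry_attempt, num_partitions, presort=False):
    """Child task 0 reads the first attempt; child task 1 reads the retry.

    With presort=True the input is sorted before partitioning, which models
    the "local sort" workaround: the assignment no longer depends on the
    order in which records happened to arrive.
    """
    if presort:
        first_attempt = sorted(first_attempt)
        retry_attempt = sorted(retry_attempt)
    first = round_robin(first_attempt, num_partitions)
    retry = round_robin(retry_attempt, num_partitions)
    return first[0] + retry[1]


# Same records, but the retried upstream task sees them in a different order.
first_attempt = ["a", "b", "c", "d"]
retry_attempt = ["b", "a", "c", "d"]

# Without the sort: "b" is lost and "a" appears twice.
combined = child_outputs(first_attempt, retry_attempt, 2)
print(combined)

# With the sort: every record appears exactly once.
fixed = child_outputs(first_attempt, retry_attempt, 2, presort=True)
print(fixed)
```

The sort is what costs performance in the workaround, since it is paid on every run and not only when a retry happens.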