Github user squito commented on the issue:

https://github.com/apache/spark/pull/21698

Sorry, I got bogged down in some other things. Thanks for the responses:

> > on a fetch-failure in repartition, fail the entire job
>
> Currently I can't figure out a case where a customer would vote for this behavior change, especially since FetchFailure tends to occur more often in long-running jobs on big datasets than in interactive queries.

Yeah, maybe you're right. I was thinking that there may come a point where, if you have one failure, you expect more failures on retries as well (in my experience, large shuffles often fail the first time when everything is getting fetched, but on subsequent retries they manage to succeed because the load is smaller). It might be better to just not bother retrying. But then again, there are situations where retry is fine, and I guess users won't know which one to choose.

> > since we only need to do this sort on RDDs post shuffle
>
> IIUC this is not the case in `RDD.repartition()`, see https://github.com/apache/spark/blob/94c67a76ec1fda908a671a47a2a1fa63b3ab1b06/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L453~L461 -- it requires the input rows to be ordered before performing a round-robin style data transformation, so I don't see what we can do if the input data type is not sortable.

My point is that if you serialize the input (the `Iterator[T]` there), then there is a well-defined ordering based on the serialized bytes. (I'm assuming serialization is deterministic; I can't think of a case where that isn't true.) In general, you don't know that `T` is serializable, but after a shuffle you know it must be. So that gives you a way to always deterministically order the input after a shuffle, though at a pretty serious performance penalty. You could avoid the re-serialization overhead by pushing the sort down into `ShuffleBlockFetcherIterator` etc.
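To make the order-sensitivity concrete: the round-robin assignment in `repartition` picks an output partition for each row based purely on the row's *position* in the input iterator, so permuting the input moves rows to different partitions. The sketch below is a hypothetical simplification (`roundRobinKeys` is not the real API; the actual code in RDD.scala also starts from a per-partition random position):

```scala
// Hypothetical simplification of the round-robin key assignment inside
// RDD.repartition: each row's target partition depends only on its
// position in the input, so input order determines the output placement.
def roundRobinKeys[T](rows: Seq[T], numPartitions: Int): Seq[(Int, T)] = {
  var position = 0
  rows.map { row =>
    position = (position + 1) % numPartitions
    (position, row)
  }
}

// Same rows, different input order => "x" lands in a different partition.
val a = roundRobinKeys(Seq("x", "y", "z"), 2)  // Seq((1,"x"), (0,"y"), (1,"z"))
val b = roundRobinKeys(Seq("y", "x", "z"), 2)  // Seq((1,"y"), (0,"x"), (1,"z"))
```

This is why a retried map task whose input arrives in a different order can silently reshuffle rows between partitions, and why a deterministic sort of the input is needed first.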
Maybe you could skip this if you detect checkpointing or something equivalent that eliminates the ordering dependency ... or maybe that's just not possible with the current APIs.

Thanks for the description of the problem with deterministic shuffle ordering. The "Shuffle Merge With Spills" problem seems particularly hard to solve.
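The serialize-then-sort idea above can be sketched as follows. This is only an illustration under stated assumptions: it uses plain Java serialization (Spark would go through its configured `Serializer`), assumes serialization is deterministic, and `sortBySerializedBytes` is a hypothetical helper, not anything in the codebase:

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Serialize a value with plain Java serialization. Illustrative only:
// Spark would use its configured Serializer instead.
def toBytes(v: AnyRef): Array[Byte] = {
  val bos = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(bos)
  oos.writeObject(v)
  oos.close()
  bos.toByteArray
}

// Lexicographic unsigned-byte comparison: a total order that exists for
// any serializable T, even when T itself has no Ordering.
val bytesOrdering: Ordering[Array[Byte]] = new Ordering[Array[Byte]] {
  def compare(a: Array[Byte], b: Array[Byte]): Int = {
    var i = 0
    while (i < math.min(a.length, b.length)) {
      val c = (a(i) & 0xff) - (b(i) & 0xff)
      if (c != 0) return c
      i += 1
    }
    a.length - b.length
  }
}

// Deterministically order a post-shuffle iterator by its serialized bytes.
// Note this buffers the whole input, part of the performance cost mentioned.
def sortBySerializedBytes[T <: AnyRef](it: Iterator[T]): Iterator[T] =
  it.toSeq.sortBy(toBytes)(bytesOrdering).iterator
```

Whatever order the rows arrive in, the output order is the same, which is the property the retry path needs; the cost is an extra serialization pass unless the sort is pushed down to where the bytes already exist.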