GitHub user squito commented on the issue: https://github.com/apache/spark/pull/21698

I also think @tgravescs's solution of using the HashPartitioner is an acceptable one, though as you've noted it doesn't deal with skew (which may account for a lot of the existing use of `repartition()`). I think we'd probably see a bunch of users complain that their jobs started crashing after upgrading to 2.4 if that's the best we can offer, but IMO a crash is way better than silent data loss.
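To make the skew trade-off concrete, here is a minimal standalone sketch in plain Python (not Spark code). It assumes a simplified model in which hash partitioning places a record by `hash(key) % n` and round-robin places it by arrival position; the function names and the sample data are hypothetical and only meant to illustrate the point, not to reproduce Spark's implementation.

```python
from collections import Counter


def hash_partition(keys, num_partitions):
    """Place each record by its key's hash, so placement ignores input order."""
    return [hash(k) % num_partitions for k in keys]


def round_robin_partition(keys, num_partitions):
    """Place each record by its arrival position, ignoring the key."""
    return [i % num_partitions for i in range(len(keys))]


# Hypothetical skewed input: 90 records share key 7, 10 records have distinct keys.
keys = [7] * 90 + list(range(100, 110))
num_partitions = 4

# Partition sizes under each scheme.
hash_sizes = Counter(hash_partition(keys, num_partitions))
rr_sizes = Counter(round_robin_partition(keys, num_partitions))

# Where does the record with key 100 land if the same input arrives reversed?
reversed_keys = list(reversed(keys))
hash_before = hash_partition(keys, num_partitions)[keys.index(100)]
hash_after = hash_partition(reversed_keys, num_partitions)[reversed_keys.index(100)]
rr_before = round_robin_partition(keys, num_partitions)[keys.index(100)]
rr_after = round_robin_partition(reversed_keys, num_partitions)[reversed_keys.index(100)]

print("hash partition sizes:", dict(hash_sizes))
print("round-robin partition sizes:", dict(rr_sizes))
print("key 100 under hash:", hash_before, "->", hash_after)
print("key 100 under round-robin:", rr_before, "->", rr_after)
```

In this toy model the hot key piles almost every record into one hash partition, while round-robin stays balanced but moves a record to a different partition when the input order changes. That is the tension described above: the order-independent scheme is the one that suffers from skew.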