GitHub user ala commented on the issue:

    https://github.com/apache/spark/pull/20664
  
    Thanks for the comments.
    
    I don't think users should be impacted by changing execution time. If 
the parameters of the job are constant, the partition allocation should 
also be deterministic, since the seed is fixed in `CoalescedRDD.scala`. There 
was already a degree of randomization in `DefaultPartitionCoalescer.pickBin()` 
that could lead to some fluctuation, so this is not a big difference.
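
    As a rough illustration of the fixed-seed argument (a minimal sketch, not 
Spark's code: `allocate` and `pick_bin` are hypothetical names, and the real 
`pickBin()` logic is more involved), a randomized choice driven by a generator 
with a constant seed yields the same allocation on every run:

```python
import random


def pick_bin(rng, num_bins):
    # Hypothetical stand-in for a randomized bin choice;
    # this is NOT the logic of DefaultPartitionCoalescer.pickBin().
    return rng.randrange(num_bins)


def allocate(num_partitions, num_bins, seed):
    # A fresh generator with a fixed seed produces the same sequence each time,
    # so the resulting partition-to-bin assignment is reproducible.
    rng = random.Random(seed)
    return [pick_bin(rng, num_bins) for _ in range(num_partitions)]


first = allocate(num_partitions=8, num_bins=3, seed=42)
second = allocate(num_partitions=8, num_bins=3, seed=42)
print(first == second)  # same job parameters and seed give the same allocation
```

    The assignment only changes if the job parameters (here the partition and 
bin counts) or the seed change.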
    
    TBH, I'm just trying to merge upstream a fix we've implemented for the 
client. I agree much more could be done to improve coalesce, and if someone 
is interested in looking into it, I'm all for it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
