Min Shen created SPARK-36423:
--------------------------------

             Summary: Randomize blocks within a push request before pushing to 
improve block merge ratio
                 Key: SPARK-36423
                 URL: https://issues.apache.org/jira/browse/SPARK-36423
             Project: Spark
          Issue Type: Sub-task
          Components: Shuffle, Spark Core
    Affects Versions: 3.2.0
            Reporter: Min Shen


On the client side, we currently randomize the order of push requests before 
processing each request. In addition, we can further randomize the order of 
blocks within each push request before pushing them.
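
A minimal sketch of the two levels of randomization described above, written 
in Python for brevity rather than as Spark code. The function name 
randomize_push_order, the representation of a push request as a list of block 
ID strings, and the seed parameter are all assumptions made for illustration 
and are not taken from the Spark code base.

```python
import random
from typing import List, Optional, Sequence


def randomize_push_order(
    requests: Sequence[Sequence[str]],
    seed: Optional[int] = None,
) -> List[List[str]]:
    """Return a copy of the push requests with both levels shuffled.

    Hypothetical helper: each request is modeled as a list of block IDs.
    """
    rng = random.Random(seed)

    # Copy so the caller's requests are left untouched.
    shuffled = [list(blocks) for blocks in requests]

    # Existing behavior: randomize the order of the push requests.
    rng.shuffle(shuffled)

    # Proposed addition: randomize the blocks inside each request as well.
    for blocks in shuffled:
        rng.shuffle(blocks)

    return shuffled


if __name__ == "__main__":
    # Illustrative block IDs only; real Spark block IDs look different.
    demo = [["a0", "a1", "a2"], ["b0", "b1"], ["c0"]]
    for request in randomize_push_order(demo):
        print(request)
```

Each request keeps exactly the same set of blocks; only the order in which 
they are pushed changes. Our reading of the intent, which the issue does not 
spell out, is that clients pushing at the same time are then less likely to 
send blocks for the same partition at the same moment.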

In our benchmark, this reduced the number of blocks that fail to be merged 
due to block collision by 60%-70% (the existing block merge ratio is already 
good in general, and this improves it further).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
