[jira] [Updated] (SPARK-38010) Push-based shuffle disabled due to insufficient mergeLocations

gaoyajun02 (Jira) Mon, 24 Jan 2022 18:59:05 -0800


     [ 
https://issues.apache.org/jira/browse/SPARK-38010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


gaoyajun02 updated SPARK-38010:
-------------------------------
        Parent: SPARK-33235
    Issue Type: Sub-task  (was: Improvement)

> Push-based shuffle disabled due to insufficient mergeLocations
> --------------------------------------------------------------
>
>                 Key: SPARK-38010
>                 URL: https://issues.apache.org/jira/browse/SPARK-38010
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Shuffle, Spark Core
>    Affects Versions: 3.1.0
>            Reporter: gaoyajun02
>            Priority: Major
>
> The current shuffle merger locations are obtained from the hosts of 
> active or dead executors.
> When dynamic executor allocation is enabled, an application often does not 
> have enough locations to satisfy push merge while it submits its first few 
> stages, so most shuffles do not benefit from push-based shuffle.
> The first few shuffle write stages of Spark applications are generally the 
> stages that read tables or data sources, and they account for a large share 
> of the shuffled data. Because push-merge shuffle is disabled for these 
> stages, the end-to-end improvement for Spark applications is very limited.
> I have thought of one possible approach, but I am not sure whether it is 
> feasible:
>  *  Lazily initialize shuffle merger locations: after the mapper writes the 
> local shuffle data, it obtains the merge locations in the push thread.
> I am looking for advice and solutions on this issue.
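
The behavior described above can be illustrated with a small, self-contained model. This is a sketch under stated assumptions, not Spark's implementation: the function and class names are hypothetical, the default values 0.05 and 5 follow the documented defaults of spark.shuffle.push.mergersMinThresholdRatio and spark.shuffle.push.mergersMinStaticThreshold, and the rounding rule is an assumption. It shows why a stage that starts with only a few executor hosts fails the merger threshold, and how a lazy lookup (the idea proposed in the issue) could succeed once more executors have started.

```python
import math


def required_mergers(num_reduce_partitions, min_ratio=0.05, min_static=5):
    """Minimum number of merger locations needed for a stage.

    Modeled on the documented meaning of the two threshold configs:
    the larger of the static threshold and ratio * partitions.
    Rounding down is an assumption of this sketch.
    """
    return max(min_static, math.floor(num_reduce_partitions * min_ratio))


def merger_hosts(executor_hosts):
    """Candidate merger locations: the distinct hosts of known executors."""
    return sorted(set(executor_hosts))


def push_merge_enabled(executor_hosts, num_reduce_partitions):
    """Eager check: decided once, with the hosts known at that moment."""
    needed = required_mergers(num_reduce_partitions)
    return len(merger_hosts(executor_hosts)) >= needed


class LazyMergerLocations:
    """Hypothetical lazy variant: resolve locations on first successful lookup.

    `list_executor_hosts` is a callable returning the hosts known right now,
    so a later call can see executors that started after the stage began.
    """

    def __init__(self, list_executor_hosts, num_reduce_partitions):
        self._list_hosts = list_executor_hosts
        self._partitions = num_reduce_partitions
        self._resolved = None

    def get(self):
        # Retry until enough hosts exist, then cache the result.
        if self._resolved is None:
            hosts = merger_hosts(self._list_hosts())
            if len(hosts) >= required_mergers(self._partitions):
                self._resolved = hosts
        return self._resolved or []


# Early in the application only two hosts run executors.
cluster = ["host1", "host2", "host1"]
lazy = LazyMergerLocations(lambda: cluster, num_reduce_partitions=100)
eager_decision = push_merge_enabled(cluster, 100)   # too few hosts
early_lookup = lazy.get()                           # still empty

# Dynamic allocation brings up executors on five more hosts.
cluster.extend(f"host{i}" for i in range(3, 8))
late_lookup = lazy.get()                            # now resolvable
```

In this model the eager decision is fixed when the stage is submitted, while the lazy lookup is repeated until the threshold is met. Whether the push thread in Spark can safely obtain locations this late is exactly the open question raised in the issue.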



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
