[ https://issues.apache.org/jira/browse/SPARK-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074521#comment-14074521 ]
Josh Rosen commented on SPARK-2387:
-----------------------------------

{quote}
For example, in a push-style shuffle, the pushed data won't have to be stored to disk if the reducers have started on the destination node.
{quote}

One of the reasons to materialize shuffle outputs to disk on the map side is fault-tolerance: an individual reducer depends on all mappers, so without materialization the failure of any reducer would cause re-computation of all mappers.

> Remove the stage barrier for better resource utilization
> --------------------------------------------------------
>
>                 Key: SPARK-2387
>                 URL: https://issues.apache.org/jira/browse/SPARK-2387
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Rui Li
>
> DAGScheduler divides a Spark job into multiple stages according to RDD
> dependencies. Whenever there's a shuffle dependency, DAGScheduler creates a
> shuffle map stage on the map side, and another stage depending on that stage.
> Currently, the downstream stage cannot start until all the stages it depends
> on have finished. This barrier between stages leaves slots idle while the
> last few upstream tasks finish, wasting cluster resources. We therefore
> propose to remove the barrier and pre-start the reduce stage once there are
> free slots. This achieves better resource utilization and improves overall
> job performance, especially when the application has been granted many
> executors.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
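The idle-slot cost the issue describes can be illustrated with a toy Python model (an illustrative sketch, not Spark code): map tasks are greedily assigned to the earliest-free slot, and the reduce stage may only begin once the last map task finishes. The task durations and slot count below are made-up numbers chosen to show a straggler.

```python
import heapq

def schedule(durations, slots):
    """Greedily assign each task to the earliest-free slot;
    return the finish time of every slot."""
    free = [0.0] * slots
    heapq.heapify(free)
    for d in durations:
        t = heapq.heappop(free)
        heapq.heappush(free, t + d)
    return list(free)

# Hypothetical map-task durations with one straggler, on 4 slots.
map_durations = [5, 5, 5, 5, 20]
finish = schedule(map_durations, slots=4)

barrier = max(finish)                    # reduce stage may only start here
idle = sum(barrier - f for f in finish)  # slot-time wasted waiting at the barrier

print(barrier)  # 25.0
print(idle)     # 60.0 slot-seconds a pre-started reduce stage could reclaim
```

In this toy run, three of the four slots sit idle for 20 time units while the straggler finishes; pre-starting reduce tasks in those slots is exactly the utilization win the proposal targets, at the fault-tolerance cost Josh's comment points out.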