[ 
https://issues.apache.org/jira/browse/SPARK-26268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-26268:
----------------------------------
    Affects Version/s:     (was: 2.4.0)
                       3.0.0

> Decouple shuffle data from Spark deployment
> -------------------------------------------
>
>                 Key: SPARK-26268
>                 URL: https://issues.apache.org/jira/browse/SPARK-26268
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle
>    Affects Versions: 3.0.0
>            Reporter: Ben Sidhom
>            Priority: Major
>
> Right now the batch scheduler assumes that shuffle data is tied to executors. 
> As a result, when an executor is lost, any map tasks that ran on that 
> executor are rescheduled unless the "external" shuffle service is being used. 
> Note that this service is only external in the sense that it does not live 
> within executors themselves; its implementation cannot be swapped out and it 
> is assumed to speak the BlockManager language.
> The following changes would facilitate external shuffle (see SPARK-25299 for 
> motivation):
>  * Do not rerun map tasks on lost executors when shuffle data is stored 
> externally. For example, this could be determined by a property or by an 
> additional method that all ShuffleManagers implement.
>  * Do not assume that shuffle data is stored in the standard BlockManager 
> format or that a BlockManager is or must be available to ShuffleManagers.
> Note that only the first change is strictly required to realize the benefits 
> of remote shuffle implementations, since shuffle implementations can supply 
> a phony (or null) BlockManager.
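
As an illustration of the first proposed change (not part of the original
report), here is a minimal Java sketch of a scheduler decision that skips
rerunning map tasks when a shuffle manager reports that its data is stored
externally. Every name in it (ShuffleRescheduleSketch, ShuffleManagerLike,
shuffleDataStoredExternally, mapTasksToRerun) is hypothetical and is not
taken from Spark's actual API; it only models the "additional method that
all ShuffleManagers implement" idea under those assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ShuffleRescheduleSketch {

    /**
     * Hypothetical stand-in for the "additional method" mentioned in the
     * issue. This is NOT Spark's real ShuffleManager interface.
     */
    public interface ShuffleManagerLike {
        /** True if shuffle output survives the loss of the executor that wrote it. */
        boolean shuffleDataStoredExternally();
    }

    /**
     * Returns the ids of map tasks that must be rerun after an executor is lost.
     *
     * @param lostExecutor      id of the executor that was lost
     * @param mapTaskToExecutor map task id -> id of the executor that ran it
     * @param manager           shuffle manager consulted for the storage location
     */
    public static List<Integer> mapTasksToRerun(
            String lostExecutor,
            Map<Integer, String> mapTaskToExecutor,
            ShuffleManagerLike manager) {
        List<Integer> rerun = new ArrayList<>();
        // Shuffle output lives outside the executor: nothing was lost with it.
        if (manager.shuffleDataStoredExternally()) {
            return rerun;
        }
        // Otherwise every map task that ran on the lost executor is rescheduled.
        for (Map.Entry<Integer, String> entry : mapTaskToExecutor.entrySet()) {
            if (entry.getValue().equals(lostExecutor)) {
                rerun.add(entry.getKey());
            }
        }
        return rerun;
    }
}
```

Because ShuffleManagerLike has a single method, a caller could pass
`() -> true` to model an external shuffle store and `() -> false` to model
executor-local shuffle files. The same decision could instead be driven by a
configuration property, which is the other option the issue mentions.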



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

