[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555080#comment-16555080 ]

Saisai Shao commented on SPARK-24615:
-------------------------------------

Hi [~tgraves] [~irashid], thanks a lot for your comments.

Currently, my design doesn't insert a specific stage boundary for different 
resources; the stage boundaries stay the same (by shuffle or by result). So 
{{withResources}} is not an eval()-style action that triggers a stage. Instead, 
it just adds a resource hint to the RDD.
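
As a minimal sketch of what I mean (using the proposed {{withResources}} API 
from this discussion, nothing here is existing Spark API):

{code}
// withResources is the proposed API under discussion: it only annotates
// the RDD with a resource hint and triggers no computation by itself.
// f is a placeholder per-partition function.
val hinted = rdd.withResources              // lazy: just attaches the hint
val out = hinted.mapPartitions(f).collect() // collect() triggers the stage
{code}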

This means RDDs with different resource requirements in one stage may 
conflict. For example, in {{rdd1.withResources.mapPartitions \{ xxx 
\}.withResources.mapPartitions \{ xxx \}.collect}}, the resources of rdd1 may 
differ from those of the mapped RDD. Currently the options I can think of are:

1. Always pick the latter, with a warning log saying that multiple different 
resource requirements in one stage are illegal.
2. Fail the stage, with a warning log saying that multiple different resource 
requirements in one stage are illegal.
3. Merge conflicts by taking the maximum resource needs. For example, if rdd1 
requires 3 GPUs per task and rdd2 requires 4 GPUs per task, the merged 
requirement would be 4 GPUs per task. (This is the high-level description; the 
details will be per-partition merging, as sketched below.) [chosen]
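
To make option 3 concrete, here is a rough sketch of the max-merge idea; 
{{ResourceRequest}} and {{merge}} are illustrative names I'm using here, not 
the actual implementation:

{code}
// Illustrative only: merge two conflicting per-task resource requirements
// by keeping, for each resource type, the larger of the two amounts.
case class ResourceRequest(amounts: Map[String, Int])

def merge(a: ResourceRequest, b: ResourceRequest): ResourceRequest = {
  val names = a.amounts.keySet ++ b.amounts.keySet
  ResourceRequest(names.map { n =>
    n -> math.max(a.amounts.getOrElse(n, 0), b.amounts.getOrElse(n, 0))
  }.toMap)
}

// rdd1 asks for 3 GPUs per task, rdd2 for 4: the merged requirement is 4.
val merged = merge(ResourceRequest(Map("gpu" -> 3)),
                   ResourceRequest(Map("gpu" -> 4)))
assert(merged.amounts("gpu") == 4)
{code}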

Take join for example: rdd1 and rdd2 may have different resource 
requirements, and the joined RDD will potentially have yet other resource 
requirements.

For example:

{code}
// Assume rdd is a pair RDD (RDD[(K, V)]) so that the join below is valid;
// fA and fB are placeholder per-partition functions.
val rddA = rdd.mapPartitions(fA).withResources
val rddB = rdd.mapPartitions(fB).withResources
val rddC = rddA.join(rddB).withResources
rddC.collect()
{code}

Here these three {{withResources}} calls may have different requirements. 
Since {{rddC}} runs in a different stage, there's no need to merge its 
resource requirements with the others. But {{rddA}} and {{rddB}} run in the 
same stage as different tasks (partitions). So the merging strategy I'm 
thinking of is per partition: tasks running partitions from {{rddA}} will use 
the resources specified by {{rddA}}, and likewise for {{rddB}}.
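
As a rough illustration of that per-partition resolution (the bookkeeping 
below is an assumption for the example, not actual Spark internals):

{code}
// Illustrative only: record each RDD's resource hint by its id and let a
// task resolve its resources via the RDD whose partitions it computes.
val resourceHints: Map[Int, Map[String, Int]] = Map(
  rddA.id -> Map("gpu" -> 3), // hint attached via rddA.withResources
  rddB.id -> Map("gpu" -> 4)  // hint attached via rddB.withResources
)

// A task computing partitions of rddA resolves to 3 GPUs per task, and a
// task computing partitions of rddB resolves to 4 GPUs per task; no
// cross-RDD merge is needed within the stage.
def resourcesForTask(owningRddId: Int): Map[String, Int] =
  resourceHints.getOrElse(owningRddId, Map.empty) // empty: plain CPU task
{code}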





> Accelerator-aware task scheduling for Spark
> -------------------------------------------
>
>                 Key: SPARK-24615
>                 URL: https://issues.apache.org/jira/browse/SPARK-24615
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Saisai Shao
>            Assignee: Saisai Shao
>            Priority: Major
>              Labels: Hydrogen, SPIP
>
> In the machine learning area, accelerator cards (GPU, FPGA, TPU) are 
> predominant compared to CPUs. To make the current Spark architecture work 
> with accelerator cards, Spark itself should understand the existence of 
> accelerators and know how to schedule tasks onto the executors where 
> accelerators are equipped.
> Currently, Spark's scheduler schedules tasks based on the locality of the 
> data plus the availability of CPUs. This introduces some problems when 
> scheduling tasks that require accelerators.
>  # CPU cores usually outnumber accelerators on one node, so using CPU cores 
> to schedule accelerator-required tasks introduces a mismatch.
>  # In one cluster, we always assume that CPUs are equipped in each node, but 
> this is not true of accelerator cards.
>  # The existence of heterogeneous tasks (accelerator required or not) 
> requires the scheduler to schedule tasks in a smart way.
> So here we propose to improve the current scheduler to support heterogeneous 
> tasks (accelerator required or not). This can be part of the work of Project 
> Hydrogen.
> Details are attached in a Google doc. It doesn't cover all the implementation 
> details, just highlights the parts that should be changed.
>  
> CC [~yanboliang] [~merlintang]


