[ 
https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-24615:
--------------------------------
    Description: 
In the machine learning area, accelerator cards (GPU, FPGA, TPU) are predominant 
compared to CPUs. To make the current Spark architecture work with accelerator 
cards, Spark itself should understand the existence of accelerators and know how 
to schedule tasks onto executors that are equipped with them.

Spark's current scheduler schedules tasks based on data locality plus the 
availability of CPU cores. This introduces several problems when scheduling 
tasks that require accelerators:
 # A node usually has more CPU cores than accelerators, so using CPU core 
counts to schedule accelerator-required tasks creates a mismatch (e.g., a node 
with 32 cores but 2 GPUs could be assigned 32 concurrent accelerator tasks 
while only 2 of them can actually acquire a GPU).
 # Within a cluster, we can always assume that every node has CPUs, but the 
same is not true of accelerator cards.
 # The existence of heterogeneous tasks (accelerator-required or not) requires 
the scheduler to schedule tasks in a smarter way.

So here we propose to improve the current scheduler to support heterogeneous 
tasks (accelerator-required or not). This can be part of the work on Project 
Hydrogen.
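
For illustration only, below is a minimal sketch of how accelerator-aware 
scheduling might look from the user side. The configuration keys and the 
TaskContext resource accessor shown here are assumptions made up for this 
sketch, not an existing or final API; the concrete design is in the attached 
doc.

{code:scala}
import org.apache.spark.TaskContext
import org.apache.spark.sql.SparkSession

object AcceleratorAwareSketch {
  def main(args: Array[String]): Unit = {
    // Assumed configuration keys: tell the scheduler how many GPUs each
    // executor owns and how many each task needs, so tasks are matched to
    // executors by accelerator availability rather than CPU cores alone.
    val spark = SparkSession.builder()
      .appName("accelerator-aware-sketch")
      .config("spark.executor.resource.gpu.amount", "2") // assumed key
      .config("spark.task.resource.gpu.amount", "1")     // assumed key
      .getOrCreate()

    val result = spark.sparkContext
      .parallelize(1 to 100, numSlices = 4)
      .mapPartitions { iter =>
        // Assumed accessor: the scheduler would expose the accelerator
        // addresses it assigned to this task, so the task can pin its
        // kernels to the right device. CPU-only stages need no change.
        val gpuAddresses = TaskContext.get().resources()("gpu").addresses
        iter.map(_ * 2) // placeholder for GPU work bound to gpuAddresses
      }
      .collect()

    println(result.length)
    spark.stop()
  }
}
{code}

With slot accounting like this, the scheduler could cap concurrent accelerator 
tasks per node at the number of free GPUs rather than the number of free cores, 
which addresses the mismatch in point 1 above.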

Details are attached in a Google doc.

 

CC [~yanboliang] [~merlintang]

> Accelerator aware task scheduling for Spark
> -------------------------------------------
>
>                 Key: SPARK-24615
>                 URL: https://issues.apache.org/jira/browse/SPARK-24615
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Saisai Shao
>            Priority: Major
>


