[ https://issues.apache.org/jira/browse/SPARK-29762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972631#comment-16972631 ]

Imran Rashid commented on SPARK-29762:
--------------------------------------

I don't really understand the complication.  I know there would be some special 
casing for GPUs in the config parsing code (e.g. in 
{{org.apache.spark.resource.ResourceUtils#parseResourceRequirements}}), but it 
doesn't seem too bad.

I did think about this more, and I realize it gets a bit confusing when you add 
in task-level resource constraints.  You won't schedule optimally for tasks 
that don't need a GPU, and you won't have GPUs left over for the tasks that do 
need them.  E.g., say you had each executor set up with 4 cores and 2 GPUs.  If 
one task set came in which only needed CPU, you would only run 2 concurrent 
copies.  And then if another task set came in which did need the GPUs, you 
wouldn't be able to schedule it.
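The arithmetic in that example can be sketched like this (an illustrative model, not Spark's actual scheduler code; the function name is made up):

```python
def task_slots(executor_cores, executor_gpus, task_cpus, task_gpus):
    """Concurrent tasks one executor can run, limited by its scarcest resource."""
    limits = [executor_cores // task_cpus]
    if task_gpus > 0:
        limits.append(executor_gpus // task_gpus)
    return min(limits)

# Executor with 4 cores and 2 GPUs.  If the task GPU amount defaults to 1,
# even a CPU-only task set is capped at 2 concurrent tasks:
print(task_slots(4, 2, 1, 1))  # 2
# Without the GPU default, the same task set could use all 4 cores:
print(task_slots(4, 2, 1, 0))  # 4
```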

You can't end up in that situation unless you have task-specific resource 
constraints.  But does it get too messy to have sensible defaults in that 
situation?  Maybe the user specifies GPUs as an executor resource up front, for 
the whole cluster, because they have them available and they know some 
significant fraction of the workloads need them.  They might expect that the 
regular tasks will just ignore the GPUs, and that the tasks which do need GPUs 
would specify them as task-level constraints.
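Concretely, the scenario above (GPUs declared at the executor level for the whole cluster, task-level amounts left to the default) would look something like this; the discovery-script path and app name are placeholders:

```shell
# GPUs declared cluster-wide at the executor level; with the proposed
# default, spark.task.resource.gpu.amount implicitly becomes 1 for
# every task, including CPU-only ones.
spark-submit \
  --conf spark.executor.cores=4 \
  --conf spark.executor.resource.gpu.amount=2 \
  --conf spark.executor.resource.gpu.discoveryScript=/path/to/getGpus.sh \
  my_app.py
```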

I guess this might have been a bad suggestion after all, sorry.

> GPU Scheduling - default task resource amount to 1
> --------------------------------------------------
>
>                 Key: SPARK-29762
>                 URL: https://issues.apache.org/jira/browse/SPARK-29762
>             Project: Spark
>          Issue Type: Story
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Thomas Graves
>            Priority: Major
>
> Default the task-level resource configs (for GPU/FPGA, etc.) to 1.  So if the 
> user specifies the executor resource, then to make it more user-friendly let's 
> have the task resource config default to 1.  This is OK right now since we 
> require resources to have an address.  It also matches what we do for the 
> spark.task.cpus config.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
