[ https://issues.apache.org/jira/browse/SPARK-32429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165903#comment-17165903 ]
Xiangrui Meng commented on SPARK-32429:
---------------------------------------

Couple questions:

1. Which GPU resource name do we use? "spark.task.resource.gpu" does not have special meaning in the current implementation.
2. I think we can do this for PySpark workers if 1) gets resolved. However, for executors running inside the same JVM, is there a way to set CUDA_VISIBLE_DEVICES differently per executor thread?

> Standalone Mode allow setting CUDA_VISIBLE_DEVICES on executor launch
> ---------------------------------------------------------------------
>
> Key: SPARK-32429
> URL: https://issues.apache.org/jira/browse/SPARK-32429
> Project: Spark
> Issue Type: Improvement
> Components: Deploy
> Affects Versions: 3.0.0
> Reporter: Thomas Graves
> Priority: Major
>
> It would be nice if standalone mode could allow users to set CUDA_VISIBLE_DEVICES before launching an executor. This has multiple benefits.
> * A kind of isolation, in that the executor can only see the GPUs set there.
> * If your GPU application doesn't support explicitly setting the GPU device id, setting this will make any GPU look like the default (id 0), and things generally just work without any explicit setting.
> * New features being added on newer GPUs, like MIG ([https://www.nvidia.com/en-us/technologies/multi-instance-gpu/]), require explicitly setting CUDA_VISIBLE_DEVICES.
>
> The code changes to just set this are very small. Once we set it, we would possibly also need to change the GPU addresses, since setting it makes them start from device id 0 again.
>
> The easiest implementation would just specifically support this behind a config, and set the variable when the config is on and GPU resources are allocated.
>
> Note we probably want to have this same thing set when we launch a Python process as well, so that it gets the same env.
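For illustration only, here is a minimal Python sketch of the idea in the description: set CUDA_VISIBLE_DEVICES from the allocated GPU addresses before launching a child process, and remap the addresses so they start from 0 again. The function name `build_gpu_env` and its parameters are hypothetical and not part of Spark; the sketch assumes the allocated addresses are plain device ids passed as strings.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple


def build_gpu_env(
    assigned_addresses: List[str],
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[Dict[str, str], List[str]]:
    """Hypothetical helper, not a Spark API.

    Returns a copy of the environment with CUDA_VISIBLE_DEVICES set to the
    assigned GPU addresses, plus the addresses as the child process will see
    them (renumbered from 0).
    """
    # Copy so the caller's (or this process's) environment is not mutated.
    env = dict(os.environ if base_env is None else base_env)

    if not assigned_addresses:
        # No GPUs allocated: leave the environment untouched.
        return env, []

    # Restrict the child process to the allocated devices only.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(assigned_addresses)

    # Inside the child, the visible devices are renumbered 0..n-1, so the
    # addresses handed to application code need the same remapping.
    remapped = [str(i) for i in range(len(assigned_addresses))]
    return env, remapped


if __name__ == "__main__":
    env, remapped = build_gpu_env(["2", "3"], {"PATH": "/usr/bin"})
    print(env["CUDA_VISIBLE_DEVICES"], remapped)
```

The returned environment could then be passed to a child process, for example through the `env` argument of `subprocess.Popen`. Because an environment is a per-process property, this approach covers a separately launched process such as a Python worker; it does not by itself give different threads inside one JVM different values.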