[ https://issues.apache.org/jira/browse/YARN-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wangda Tan updated YARN-7224:
-----------------------------
    Attachment: YARN-7224.008.patch

Thanks [~sunilg] for the comments,

bq. In assignGpus, do we also need to update the assigned gpus to container's resource mapping list ?

I would prefer to keep this in NMStateStore#storeAssignedResources; otherwise every new resource plugin would need to implement the same logic.

bq. In general dockerCommandPlugin.updateDockerRunCommand helps to update docker command for volume etc. However is its better to have an api named sanitize/verifyCommand in dockerCommandPlugin so that incoming/created command will validated and logged based on system parameters

I'm not quite sure about this, could you explain?

bq. Once a docker volume is created, when this volume will be cleaned or unmounted ? in case when container crashes or force stopping container from external docker commands etc
bq. With container upgrades or partially using GPU device for a timeslice of container lifetime, how volumes could be mounted/re-mounted ?

For the GPU docker integration, we don't need to do this: all launched containers share the same docker volume, so we don't need to create the docker volume again and again. I agree that we may need this in the future, so I added one method (getCleanupDockerVolumeCommand) to the DockerCommandPlugin interface.

bq. In GpuDevice, do we also need to add make (like nvidia with version etc ? )

We don't need it for now; we can easily add it in the future when required.

bq. In initializeWhenGpuRequested, we do a lazy initialization. However if docker end point is down(default port), this could cause delay in container launch. Do we need a health mechanism to get this data updated ?

To me this is the same as the docker daemon being down. Since containers will fail fast, the admin should be able to fix the issue.

bq. Once docker volume is created, its better to dump the docker volume inspect o/p on created volume. Could help for debugging later.
I like this idea, but considering the size of this patch, can we do this in a follow-up JIRA?

Attached ver.8 patch.

> Support GPU isolation for docker container
> ------------------------------------------
>
>                 Key: YARN-7224
>                 URL: https://issues.apache.org/jira/browse/YARN-7224
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-7224.001.patch, YARN-7224.002-wip.patch,
> YARN-7224.003.patch, YARN-7224.004.patch, YARN-7224.005.patch,
> YARN-7224.006.patch, YARN-7224.007.patch, YARN-7224.008.patch
>
>
> This patch is to address issues when a docker container is being used:
> 1. GPU driver and nvidia libraries: If GPU drivers and NV libraries are
> pre-packaged inside the docker image, they could conflict with the driver and
> nvidia libraries installed on the Host OS. An alternative solution is to detect
> the Host OS's installed drivers and devices, and mount them when launching the
> docker container. Please refer to \[1\] for more details.
> 2. Image detection:
> From \[2\], the challenge is:
> bq. Mounting user-level driver libraries and device files clobbers the
> environment of the container, it should be done only when the container is
> running a GPU application. The challenge here is to determine if a given
> image will be using the GPU or not. We should also prevent launching
> containers based on a Docker image that is incompatible with the host NVIDIA
> driver version, you can find more details on this wiki page.
> 3. GPU isolation.
> *Proposed solution*:
> a. Use nvidia-docker-plugin \[3\] to address issue #1; this is the same
> solution used by K8S \[4\]. Issue #2 could be addressed in a separate JIRA.
> We won't ship nvidia-docker-plugin with our releases, and we require the cluster
> admin to preinstall nvidia-docker-plugin to use GPU+docker support on YARN.
> "nvidia-docker" is a wrapper of the docker binary which can address #3 as well;
> however, "nvidia-docker" doesn't provide the same semantics as docker, and it
> needs additional environment setup such as PATH/LD_LIBRARY_PATH to use
> it. To avoid introducing additional issues, we plan to use the
> nvidia-docker-plugin + docker binary approach.
> b. To address GPU driver and nvidia libraries, we use nvidia-docker-plugin
> \[3\] to create a volume which includes GPU-related libraries and mount it
> when the docker container is launched. Changes include:
> - Instead of using {{volume-driver}}, this patch adds a {{docker volume
> create}} command to c-e and the NM Java side. The reason is that {{volume-driver}}
> can only use a single volume driver for each launched docker container.
> - Updated {{c-e}} and the Java side: if a mounted volume is a named volume in
> docker, skip checking file existence. (A named volume still needs to be added to
> the permitted list of container-executor.cfg.)
> c. To address the isolation issue:
> We found that cgroup + docker doesn't work under newer docker versions which
> use {{runc}} as the default runtime. Setting {{--cgroup-parent}} to a cgroup
> which includes any {{devices.deny}} prevents the docker container from launching.
> Instead, this patch passes the allowed GPU devices via {{--device}} to the docker
> launch command.
> References:
> \[1\] https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver
> \[2\] https://github.com/NVIDIA/nvidia-docker/wiki/Image-inspection
> \[3\] https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker-plugin
> \[4\] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
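
For illustration only, points (b) and (c) of the proposed solution can be sketched as plain command construction. This is not code from the patch: the class name {{GpuDockerCommandSketch}}, the volume name {{nvidia_driver_375.66}}, the driver name {{nvidia-docker}} and the mount point {{/usr/local/nvidia}} are assumptions chosen for the example, and a real launch would likely also need to expose control devices such as {{/dev/nvidiactl}}, which are left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and paths are illustrative assumptions,
// not the classes or values used by the YARN-7224 patch.
class GpuDockerCommandSketch {

  // Point (b): create a named volume once through a volume driver,
  // instead of passing --volume-driver on every "docker run".
  static List<String> volumeCreateCommand(String driver, String volumeName) {
    List<String> cmd = new ArrayList<>();
    cmd.addAll(Arrays.asList("docker", "volume", "create"));
    cmd.add("--driver=" + driver);
    cmd.add(volumeName);
    return cmd;
  }

  // Point (c): expose only the allowed GPUs via --device rather than
  // relying on a cgroup with devices.deny entries.
  static List<String> runCommand(String image, List<Integer> allowedGpuMinors,
      String volumeName, String mountPoint) {
    List<String> cmd = new ArrayList<>();
    cmd.addAll(Arrays.asList("docker", "run"));
    for (int minor : allowedGpuMinors) {
      String dev = "/dev/nvidia" + minor;
      cmd.add("--device=" + dev + ":" + dev);
    }
    // Mount the shared driver volume read-only into the container.
    cmd.add("-v");
    cmd.add(volumeName + ":" + mountPoint + ":ro");
    cmd.add(image);
    return cmd;
  }

  public static void main(String[] args) {
    String volume = "nvidia_driver_375.66"; // assumed volume name
    System.out.println(String.join(" ",
        volumeCreateCommand("nvidia-docker", volume)));
    System.out.println(String.join(" ",
        runCommand("my-gpu-image", Arrays.asList(0, 2), volume,
            "/usr/local/nvidia")));
  }
}
```

Because the volume is shared by all launched containers, the create command only needs to run once per host, while the {{--device}} list is rebuilt for each container from the GPUs assigned to it.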