[ https://issues.apache.org/jira/browse/YARN-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wangda Tan updated YARN-7224:
-----------------------------
    Attachment: YARN-7224.002-wip.patch

Attached ver.2 work-in-progress patch.

The major change in this patch: I found that cgroup + docker doesn't work under newer docker versions, which use {{runc}} as the default runtime. Setting {{--cgroup-parent}} to a cgroup that includes any {{devices.deny}} entry prevents the docker container from launching. Instead, this patch passes the allowed GPU devices to the docker launch command via {{--device}}.

I tested this patch on a CentOS 7 machine with 2 GPU devices, and it works fine.

Some cleanup still needs to be done and more unit tests need to be added, so this is marked as WIP.

> Support GPU isolation for docker container
> ------------------------------------------
>
>                 Key: YARN-7224
>                 URL: https://issues.apache.org/jira/browse/YARN-7224
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-7224.001.patch, YARN-7224.002-wip.patch
>
>
> YARN-6620 added support for GPU isolation on the NM side, which only supports
> non-docker containers. We need to add support so that docker containers
> launched by YARN can utilize GPUs.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
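
For illustration only, the {{--device}} approach described in the comment could be sketched roughly as follows. This is not code from the patch: the function name build_docker_run_command, the image name, and the /dev/nvidia0 style device paths are assumptions made for the example. The only thing taken from docker itself is the --device=HOST_PATH:CONTAINER_PATH option of docker run.

```python
from typing import List


def build_docker_run_command(image: str, allowed_devices: List[str]) -> List[str]:
    """Return a `docker run` argv that exposes only the listed host devices.

    Hypothetical helper for illustration; the real patch builds its launch
    command inside the NodeManager and may differ in every detail.
    """
    command = ["docker", "run", "--rm"]
    for device in allowed_devices:
        # Map each allowed host device to the same path inside the container.
        # Devices that are not listed are simply never passed to docker.
        command.append(f"--device={device}:{device}")
    command.append(image)
    return command


if __name__ == "__main__":
    # Assumed scenario: the container was allocated GPU 0 on a 2-GPU host.
    print(" ".join(build_docker_run_command("centos:7", ["/dev/nvidia0"])))
```

The design point is that isolation comes from listing what the container may see rather than from a {{devices.deny}} entry in a parent cgroup, which is the case the comment reports as failing with {{runc}}.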