Zhankun Tang created YARN-9073:
----------------------------------

             Summary: GPU/FPGA whitelist configuration in container-executor.cfg won't work when yarn-site.xml's allowed devices doesn't align with it
                 Key: YARN-9073
                 URL: https://issues.apache.org/jira/browse/YARN-9073
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Zhankun Tang
            Assignee: Zhankun Tang

The current GPU/FPGA behavior may have an issue when container-executor.cfg (c-e.cfg) doesn't align with yarn-site.xml. Take GPU as an example:

A host has GPUs 1,2,3,4,5, and "GPU.allowed = 1,2,3" is configured in c-e.cfg. But yarn-site.xml is configured with auto, which means 1,2,3,4,5. An application requests 4 GPUs and the scheduler allocates 1,2,4,5, so --excluded-gpus is "3". The container-executor (c-e) checks that 3 is in the allowed list (1,2,3) and then denies only 3 in cgroups.

In this case, c-e's allowed list (1,2,3) has no effect, because the application can still access 4 and 5.
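
The scenario in this report can be traced with a small set-based model. This is only a sketch of the behavior as described here, not the actual container-executor code: the function names and the use of Python sets are assumptions made for illustration.

```python
def denied_by_container_executor(excluded, ce_allowed):
    """Hypothetical model of the check described in this report.

    Each excluded device is validated against the c-e allowed list,
    and only those excluded devices end up denied in cgroups.
    """
    for dev in excluded:
        if dev not in ce_allowed:
            raise ValueError(f"device {dev} is not in the c-e allowed list")
    return set(excluded)


def accessible_devices(host_devices, denied):
    """Devices the container can still reach once `denied` is blocked."""
    return set(host_devices) - set(denied)


host_gpus = {1, 2, 3, 4, 5}   # what yarn-site.xml "auto" discovers
ce_allowed = {1, 2, 3}        # "GPU.allowed = 1,2,3" in c-e.cfg
allocated = {1, 2, 4, 5}      # scheduler's choice for a 4-GPU request

excluded = host_gpus - allocated                       # passed as --excluded-gpus
denied = denied_by_container_executor(excluded, ce_allowed)
accessible = accessible_devices(host_gpus, denied)
leaked = accessible - ce_allowed                       # reachable but not allowed

print(sorted(excluded))    # [3]
print(sorted(accessible))  # [1, 2, 4, 5]
print(sorted(leaked))      # [4, 5]
```

The check passes because 3 is in the allowed list, yet devices 4 and 5 are never denied, which is the mismatch this issue reports.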