[ https://issues.apache.org/jira/browse/YARN-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722443#comment-16722443 ]
Zhankun Tang commented on YARN-9120:
------------------------------------

[~snemeth], I double-checked: if we remove "yarn.io/gpu" from the property "nm.resource-plugins" while leaving the other GPU-related configuration in place, the server's GPU resources are not discovered or used. In other words, GPU is disabled. I also verified that an application requesting GPU fails, while an application that does not request GPU resources runs.

[~pbacsko], adding a new "off" value to "yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices" probably has no obvious benefit compared to removing "yarn.io/gpu" from "yarn.nodemanager.resource-plugins". To me, both approaches require the admin to configure a different yarn-site.xml on the affected servers. I guess your point is about how YARN can manage configuration on a heterogeneous cluster? I'm not sure whether Ambari or any other tool can keep a different configuration for each node. This does not seem to be YARN's responsibility.

[~rohithsharma], any idea?

> Need to have a way to turn off GPU auto-discovery in GpuDiscoverer
> ------------------------------------------------------------------
>
>                 Key: YARN-9120
>                 URL: https://issues.apache.org/jira/browse/YARN-9120
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>
> GpuDiscoverer.getGpusUsableByYarn either parses the user-defined GPU devices
> or should have the value 'auto' (from property:
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices)
> In some circumstances, users would want to exclude a node from scheduling, so
> they should have an option to turn off auto-discovery.
> It's straightforward that this is possible by removing the GPU
> resource-plugin from YARN's config along with GPU-related config in
> container-executor.cfg, but doing that with a dedicated value for
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices is a more
> lightweight approach.
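
For illustration, the two options compared in the comment above might look roughly like the following yarn-site.xml fragments on a node where GPU should be disabled. This is only a sketch: the first fragment assumes no other resource plugins are configured on the node, and the "off" value in the second fragment is the value proposed in this discussion, not something the thread confirms as implemented.

```xml
<!-- Option 1 (verified in the comment): "yarn.io/gpu" is left out of the
     plugin list, so GPUs on this node are not discovered or used.
     Assumption: no other resource plugins are needed on this node. -->
<configuration>
  <property>
    <name>yarn.nodemanager.resource-plugins</name>
    <value></value>
  </property>
</configuration>
```

```xml
<!-- Option 2 (proposal only, hypothetical): keep the GPU plugin enabled
     but switch discovery off with a dedicated value. "off" is the value
     suggested in this thread; the values named in the issue are 'auto'
     or a user-defined device list. -->
<configuration>
  <property>
    <name>yarn.nodemanager.resource-plugins</name>
    <value>yarn.io/gpu</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices</name>
    <value>off</value>
  </property>
</configuration>
```

Either way, the node needs a yarn-site.xml that differs from the GPU-enabled nodes, which is the point raised in the comment.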