Also, when we run the container-executor command directly to write entries
into devices.deny, it reports an unexpected operation code.

test@ip:/opt/hadoop-3.3.0$ sudo -U yarn /opt/hadoop-3.3.0/bin/container-executor --module-gpu --container_id container_e57_1667177358230_0650_01_000001 -excluded_gpus 1,2,3,4,5,6,7
[sudo] password for alpha:
CGroups: Updating cgroups, path=/sys/fs/cgroup/devices/yarn/container_e57_1667177358230_0650_01_000001/devices.deny, value=c 195:1 rwm
CGroups: Updating cgroups, path=/sys/fs/cgroup/devices/yarn/container_e57_1667177358230_0650_01_000001/devices.deny, value=c 195:2 rwm
CGroups: Updating cgroups, path=/sys/fs/cgroup/devices/yarn/container_e57_1667177358230_0650_01_000001/devices.deny, value=c 195:3 rwm
CGroups: Updating cgroups, path=/sys/fs/cgroup/devices/yarn/container_e57_1667177358230_0650_01_000001/devices.deny, value=c 195:4 rwm
CGroups: Updating cgroups, path=/sys/fs/cgroup/devices/yarn/container_e57_1667177358230_0650_01_000001/devices.deny, value=c 195:5 rwm
CGroups: Updating cgroups, path=/sys/fs/cgroup/devices/yarn/container_e57_1667177358230_0650_01_000001/devices.deny, value=c 195:6 rwm
CGroups: Updating cgroups, path=/sys/fs/cgroup/devices/yarn/container_e57_1667177358230_0650_01_000001/devices.deny, value=c 195:7 rwm
Unexpected operation code: -1
Nonzero exit code=3, error message='Invalid command provided'
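
For reference, each value written to devices.deny above follows the cgroup v1
devices format "type major:minor access" (c = character device, rwm = read,
write, mknod). The following is only a minimal sketch for pulling those
entries out of the log so they can be compared against the requested GPU
list; the names LOG, ENTRY and parse_updates are made up for illustration,
and only two of the seven log lines are included.

```python
import re

# Two of the log lines printed by container-executor above, plus the error.
LOG = """\
CGroups: Updating cgroups, path=/sys/fs/cgroup/devices/yarn/container_e57_1667177358230_0650_01_000001/devices.deny, value=c 195:1 rwm
CGroups: Updating cgroups, path=/sys/fs/cgroup/devices/yarn/container_e57_1667177358230_0650_01_000001/devices.deny, value=c 195:7 rwm
Unexpected operation code: -1
"""

# Matches "path=<file>, value=<type> <major>:<minor> <access>".
ENTRY = re.compile(
    r"path=(?P<path>\S+), value=(?P<type>[abc]) "
    r"(?P<major>\d+):(?P<minor>\d+|\*) (?P<access>[rwm]+)"
)


def parse_updates(log):
    """Return (path, type, major, minor, access) for every cgroup update line."""
    return [
        m.group("path", "type", "major", "minor", "access")
        for m in ENTRY.finditer(log)
    ]


updates = parse_updates(LOG)
for path, dev_type, major, minor, access in updates:
    print(f"{dev_type} {major}:{minor} {access} -> {path}")
```

In the full log, all seven minors (1 to 7) are written before the
"Unexpected operation code" message is printed.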


Thanks,
Xiong


> On Oct 31, 2022, at 22:21, zxcs <zhuxion...@163.com> wrote:
> 
> Hi, experts,
> 
> We are using hadoop-3.3.0 and are also trying to enable GPU isolation,
> following the guide at
> https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/UsingGpus.html
> 
> But when we start a YARN job, the NodeManager always fails with
> "Unexpected operation code: -1". Could any experts shed some light on
> this? Thanks in advance!
> 
> (Sorry for the picture; we are not allowed to copy anything from the
> testbed to the outside.)
> 
> <Pasted Graphic-4.tiff>
> 
> 
> 
> Here is the yarn-site.xml config:
> <property>
> <name>yarn.resource-types</name>
> <value>yarn.io/gpu</value>
> </property>
> <property>
> <name>yarn.nodemanager.resource-plugins</name>
> <value>yarn.io/gpu</value>
> </property>
> 
> And below is container-executor.cfg:
> yarn.nodemanager.linux-container-executor.group=hadoop
> banned.users=root
> min.user.id=500
> allowed.system.users=yarn
> [gpu]
> module.enabled=true
> [cgroups]
> root=/sys/fs/cgroup
> yarn-hierarchy=yarn
> 
> Below is the directory listing of /sys/fs/cgroup:
> <Pasted Graphic-3.tiff>
> 
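
The two yarn-site.xml properties quoted above can be sanity-checked with a
short script. This is only a sketch: it assumes the properties sit inside the
usual <configuration> root element (the excerpt shows just the <property>
elements), and read_properties and gpu_enabled are hypothetical helper names,
not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# The two properties from the quoted message, wrapped in an assumed
# <configuration> root so the snippet parses as one document.
YARN_SITE = """\
<configuration>
  <property>
    <name>yarn.resource-types</name>
    <value>yarn.io/gpu</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource-plugins</name>
    <value>yarn.io/gpu</value>
  </property>
</configuration>
"""

REQUIRED_KEYS = ("yarn.resource-types", "yarn.nodemanager.resource-plugins")


def read_properties(xml_text):
    """Collect <name>/<value> pairs from every <property> element."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        props[name] = value
    return props


def gpu_enabled(props):
    """True if yarn.io/gpu appears in both comma-separated property values."""
    return all(
        "yarn.io/gpu" in [v.strip() for v in props.get(key, "").split(",")]
        for key in REQUIRED_KEYS
    )


props = read_properties(YARN_SITE)
print(gpu_enabled(props))
```

This only confirms that both settings name yarn.io/gpu; it does not check
container-executor.cfg or the cgroup mount.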
