Oh, I've solved the problem. While searching Google for "cgroup monitoring", I came across a program called ctop. I ran it inside my slurm-docker image and it said:
# ctop
[WARN] Failed to find any relevant cgroup/container.
Hint: It seems you are running inside a Docker container. Please make sure to
expose host's cgroups with '--volume=/sys/fs/cgroup:/sys/fs/cgroup:ro'

So I just mounted that volume and the problem was solved.
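In case it helps anyone else, the fix was just adding that one volume flag when
starting the container; something along these lines (the image name and the
command below are only placeholders, not the exact setup):

  # Expose the host's cgroup hierarchy, as ctop's hint suggests.
  docker run -d \
    --volume=/sys/fs/cgroup:/sys/fs/cgroup:ro \
    my-slurm-image \
    slurmd -D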
Thank you anyway!

Sincerely,
Sumin.

Sumin Han
Undergraduate '13, School of Computing
Korea Advanced Institute of Science and Technology
Daehak-ro 291
Yuseong-gu, Daejeon
Republic of Korea 305-701
Tel. +82-10-2075-6911

2017-08-02 16:56 GMT+09:00 한수민 <hsm6...@gmail.com>:

> Well... now I've changed it to
>
> cgroup.conf:
> ###
> # Slurm cgroup support configuration file
> ###
> CgroupAutomount=no
> CgroupMountpoint=/sys/fs/cgroup
> #CgroupReleaseAgentDir="/etc/slurm/cgroup"
> ConstrainCores=yes
>
> #TaskAffinity=no
> #
>
> but it still doesn't work.
>
> In fact, I'm running Slurm inside a Docker container. Could that cause
> problems when using Slurm with cgroups?
>
> Sumin Han
> Undergraduate '13, School of Computing
> Korea Advanced Institute of Science and Technology
> Daehak-ro 291
> Yuseong-gu, Daejeon
> Republic of Korea 305-701
> Tel. +82-10-2075-6911
>
> 2017-08-02 13:34 GMT+09:00 Lachlan Musicman <data...@gmail.com>:
>
>> You will see here
>>
>> https://groups.google.com/forum/#!msg/slurm-devel/lKX8st9aztI/dF5Kvz4gDAAJ
>>
>> that you need to set
>>
>> CgroupAutomount=no
>>
>> in cgroup.conf if you are running a system using systemd.
>>
>> cheers
>> L.
>>
>> ------
>> "The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic
>> civics is the insistence that we cannot ignore the truth, nor should we
>> panic about it. It is a shared consciousness that our institutions have
>> failed and our ecosystem is collapsing, yet we are still here — and we are
>> creative agents who can shape our destinies. Apocalyptic civics is the
>> conviction that the only way out is through, and the only way through is
>> together."
>>
>> *Greg Bloom* @greggish https://twitter.com/greggish/status/873177525903609857
>>
>> On 2 August 2017 at 14:27, 한수민 <hsm6...@gmail.com> wrote:
>>
>>> My slurmd.log says:
>>>
>>> [2017-08-02T04:25:45.453] debug2: _file_read_content: unable to open '/sys/fs/cgroup/freezer//release_agent' for reading : No such file or directory
>>> [2017-08-02T04:25:45.453] debug2: xcgroup_get_param: unable to get parameter 'release_agent' for '/sys/fs/cgroup/freezer/'
>>> [2017-08-02T04:25:45.453] error: unable to mount freezer cgroup namespace: Device or resource busy
>>> [2017-08-02T04:25:45.453] error: unable to create freezer cgroup namespace
>>> [2017-08-02T04:25:45.453] error: Couldn't load specified plugin name for proctrack/cgroup: Plugin init() callback failed
>>> [2017-08-02T04:25:45.453] error: cannot create proctrack context for proctrack/cgroup
>>> [2017-08-02T04:25:45.453] error: slurmd initialization failed
>>>
>>> hmm...
>>>
>>> Sumin Han
>>> Undergraduate '13, School of Computing
>>> Korea Advanced Institute of Science and Technology
>>> Daehak-ro 291
>>> Yuseong-gu, Daejeon
>>> Republic of Korea 305-701
>>> Tel. +82-10-2075-6911
>>>
>>> 2017-08-02 13:05 GMT+09:00 Lachlan Musicman <data...@gmail.com>:
>>>
>>>>> [root@n6 /]# si
>>>>>
>>>>> PARTITION  NODES  NODES(A/I/O/T)  S:C:T  MEMORY  TMP_DISK  TIMELIMIT  AVAIL_FEATURES  NODELIST
>>>>> debug*     6      0/6/0/6         1:4:2  7785    113264    infinite   (null)          c[1-6]
>>>>>
>>>>> (for a moment)
>>>>>
>>>>> [root@n6 /]# si
>>>>>
>>>>> PARTITION  NODES  NODES(A/I/O/T)  S:C:T  MEMORY  TMP_DISK  TIMELIMIT  AVAIL_FEATURES  NODELIST
>>>>> debug*     6      0/0/6/6         1:4:2  7785    113264    infinite   (null)          c[1-6]
>>>>
>>>> 0/0/6/6 means your nodes are dying.
>>>>
>>>> You need to look into /var/log/slurm/slurmd.log (or wherever you put the
>>>> slurmd logs on the machine, as dictated by SlurmdLogFile= ) on each of
>>>> the nodes.
>>>>
>>>> I would predict that there is something wrong with your cgroup.conf.
>>>>
>>>> Try:
>>>>
>>>> - confirming that the /etc/slurm/cgroup directory exists on all nodes
>>>>   (as per your cgroup.conf)
>>>> - commenting out everything in cgroup.conf except CgroupAutomount=yes
>>>>   and ConstrainCores=yes
>>>>
>>>> Cheers
>>>> L.
>>>>
>>>> ------
>>>> "The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic
>>>> civics is the insistence that we cannot ignore the truth, nor should we
>>>> panic about it. It is a shared consciousness that our institutions have
>>>> failed and our ecosystem is collapsing, yet we are still here — and we are
>>>> creative agents who can shape our destinies. Apocalyptic civics is the
>>>> conviction that the only way out is through, and the only way through is
>>>> together."
>>>>
>>>> *Greg Bloom* @greggish https://twitter.com/greggish/status/873177525903609857
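A side note on the `si` command in the output above: it is not a standard Slurm
binary, so it is presumably a local alias for sinfo with a custom output format.
A rough guess at such an alias, matching the columns shown:

  # Presumed alias behind 'si'; the exact format string is a guess.
  alias si='sinfo -o "%9P %6D %14F %8z %10m %10d %11l %16f %N"'

With sinfo's -o option, %P, %D, %F, %z, %m, %d, %l, %f and %N print the
PARTITION, NODES, NODES(A/I/O/T), S:C:T, MEMORY, TMP_DISK, TIMELIMIT,
AVAIL_FEATURES and NODELIST columns respectively.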