You will see here

https://groups.google.com/forum/#!msg/slurm-devel/lKX8st9aztI/dF5Kvz4gDAAJ

that you need to set

CgroupAutomount=no

in cgroup.conf

if you are running a system that uses systemd.
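
For example, a minimal cgroup.conf along those lines might look like
this (only a sketch; keep whatever Constrain* settings your site
actually needs):

    CgroupAutomount=no
    ConstrainCores=yes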

cheers
L.

------
"The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic civics
is the insistence that we cannot ignore the truth, nor should we panic
about it. It is a shared consciousness that our institutions have failed
and our ecosystem is collapsing, yet we are still here — and we are
creative agents who can shape our destinies. Apocalyptic civics is the
conviction that the only way out is through, and the only way through is
together. "

*Greg Bloom* @greggish
https://twitter.com/greggish/status/873177525903609857

On 2 August 2017 at 14:27, 한수민 <hsm6...@gmail.com> wrote:

> My slurmd.log says:
>
> [2017-08-02T04:25:45.453] debug2: _file_read_content: unable to open
> '/sys/fs/cgroup/freezer//release_agent' for reading : No such file or
> directory
> [2017-08-02T04:25:45.453] debug2: xcgroup_get_param: unable to get
> parameter 'release_agent' for '/sys/fs/cgroup/freezer/'
> [2017-08-02T04:25:45.453] error: unable to mount freezer cgroup namespace:
> Device or resource busy
> [2017-08-02T04:25:45.453] error: unable to create freezer cgroup namespace
> [2017-08-02T04:25:45.453] error: Couldn't load specified plugin name for
> proctrack/cgroup: Plugin init() callback failed
> [2017-08-02T04:25:45.453] error: cannot create proctrack context for
> proctrack/cgroup
> [2017-08-02T04:25:45.453] error: slurmd initialization failed
>
>
> hmm...
>
> Sumin Han
> Undergraduate '13, School of Computing
> Korea Advanced Institute of Science and Technology
> Daehak-ro 291
> Yuseong-gu, Daejeon
> Republic of Korea 305-701
> Tel. +82-10-2075-6911
>
> 2017-08-02 13:05 GMT+09:00 Lachlan Musicman <data...@gmail.com>:
>
>> [root@n6 /]# si
>>>
>>> PARTITION            NODES NODES(A/I/O/T) S:C:T    MEMORY     TMP_DISK
>>> TIMELIMIT   AVAIL_FEATURES   NODELIST
>>>
>>> debug*               6     0/6/0/6        1:4:2    7785       113264
>>> infinite    (null)           c[1-6]
>>>
>>> (for a moment)
>>>
>>> [root@n6 /]# si
>>>
>>> PARTITION            NODES NODES(A/I/O/T) S:C:T    MEMORY     TMP_DISK
>>> TIMELIMIT   AVAIL_FEATURES   NODELIST
>>>
>>> debug*               6     0/0/6/6        1:4:2    7785       113264
>>> infinite    (null)           c[1-6]
>>>
>>>
>>
>>
>>
>> 0/0/6/6 means your nodes are dying. Those columns are
>> Allocated/Idle/Other/Total, so all six nodes are in the "other"
>> (unavailable) state.
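>>
>> To see why they are down, you can run something like this on the head
>> node (c1 is just one of your nodes, per the nodelist above):
>>
>>     sinfo -R
>>     scontrol show node c1 | grep -i Reason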
>>
>> You need to look into /var/log/slurm/slurmd.log (or wherever you put
>> the slurmd logs on that machine, as dictated by SlurmdLogFile= in
>> slurm.conf) on each of the nodes.
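>>
>> For example, on one of the nodes (the path below is only the common
>> default, so check your own setting first):
>>
>>     scontrol show config | grep -i SlurmdLogFile
>>     tail -n 50 /var/log/slurm/slurmd.log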
>>
>> I would predict that there is something wrong with your cgroup.conf.
>>
>> Try:
>>
>>  - confirming that the /etc/slurm/cgroup directory exists on all nodes
>> (as per your cgroup.conf)
>>  - commenting out everything in cgroup.conf except CgroupAutomount=yes
>> and ConstrainCores=yes, as in the snippet below
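>>
>> i.e. something like this (just a sketch; leave everything else in the
>> file commented out while you test):
>>
>>     CgroupAutomount=yes
>>     ConstrainCores=yes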
>>
>> Cheers
>> L.
>>
>>
>
