Hi Marcus,

Thank you for your reply. Your comments regarding the oom_killer sound 
interesting. Looking at the slurmd logs on the serial nodes, I can see that the 
oom_killer is very active on a typical day, so I suspect you're on to something 
there. As you might expect, memory is configured as a resource on these shared 
nodes, and users should take care to request sufficient memory for their jobs. 
More often than not, I suspect users are wrongly assuming that the default 
memory allocation is sufficient.
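
For what it's worth, the sort of thing we encourage users to put in their 
submission scripts looks roughly like the below (the partition name, the 6G 
figure and the program name are only illustrative, not our actual defaults):

  #!/bin/bash
  #SBATCH --partition=serial   # the shared serial queue
  #SBATCH --ntasks=1           # single-core job
  #SBATCH --mem=6G             # request memory explicitly rather than relying on the default

  ./my_serial_program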

Best regards,
David
________________________________
From: Marcus Wagner <wag...@itc.rwth-aachen.de>
Sent: 06 November 2019 09:53
To: David Baker <d.j.ba...@soton.ac.uk>; slurm-users@lists.schedmd.com 
<slurm-users@lists.schedmd.com>; juergen.s...@uni-ulm.de 
<juergen.s...@uni-ulm.de>
Subject: Re: [slurm-users] Running job using our serial queue

Hi David,

if I remember right (we have had swap disabled for years now), swapping out 
processes seems to slow down the whole system.
But I do know that when the oom_killer does its job (killing processes that 
exceed their memory limit), the whole system stalls until it has finished. That 
might be the issue your users are seeing.

hwloc should at least help the scheduler decide where to place processes, but 
if I remember right, Slurm has to be built with hwloc support (meaning at least 
hwloc-devel has to be installed at build time).
That part is more guessing than knowing, though.
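
If you want to check whether your slurmd was actually built against hwloc, 
something along these lines should do it (just a sketch, the binary path may 
differ on your systems):

  # see whether the slurmd binary is dynamically linked against libhwloc
  ldd $(which slurmd) | grep -i hwloc

  # and double-check which task plugin Slurm thinks it is using
  scontrol show config | grep -i TaskPlugin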

Best
Marcus

On 11/5/19 11:58 AM, David Baker wrote:
Hello,

Thank you for your replies. I double-checked that the "task/" prefix in, for 
example, TaskPlugin=task/affinity is optional, so it is good to know that we 
have the correct cgroups setup. In theory users should only disturb themselves, 
however in reality we find that there is often a knock-on effect on other 
users' jobs. For example, users have complained that their jobs sometimes 
stall. I can only vaguely think that something odd is perhaps going on at the 
kernel level.

One additional thing that I need to ask is: should we have hwloc installed on 
our compute nodes? Does that help? Whenever I check which processes are not 
being constrained by cgroups, I only ever find a small group of system processes.
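
For reference, the check I run is roughly the following -- treat it as a 
sketch rather than anything polished:

  # list processes whose cgroup hierarchy does not mention slurm
  for p in /proc/[0-9]*; do
      grep -qs slurm "$p/cgroup" || echo "unconstrained: $(cat $p/comm 2>/dev/null) [$(basename $p)]"
  done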

Best regards,
David




________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Marcus Wagner <wag...@itc.rwth-aachen.de>
Sent: 05 November 2019 07:47
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Running job using our serial queue

Hi David,

we do it the same way you do.

When the Matlab job asks for one CPU, it only gets one CPU this way, which 
means that all of its processes are bound to that one CPU. So, theoretically, 
the user is only disturbing himself if he uses more.
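
You can see the binding directly on the node if you want to convince yourself, 
e.g. (the PID and the cgroup path are just examples; the path depends on your 
cgroup setup):

  # show the CPU affinity of one of the job's processes
  taskset -cp 12345

  # or look at the cpuset Slurm created for the job
  cat /sys/fs/cgroup/cpuset/slurm/uid_*/job_*/cpuset.cpus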

But with Matlab in particular, there is more to do. It does not suffice to add 
'-singleCompThread' to the command line. Matlab is not the only tool that tries 
to use every core it finds on the node.
The same is true of CPLEX and Gurobi, both of which are often called from 
Matlab. So even if the user sets '-singleCompThread' for Matlab, that does not 
mean at all that the job only uses one CPU.
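
What we tell our users, roughly, is to pin the usual threading knobs as well 
(take this as a sketch: the solvers have their own thread settings on top of 
this, and "my_script" below is just a placeholder):

  # limit the common threading libraries to one thread
  export OMP_NUM_THREADS=1
  export MKL_NUM_THREADS=1

  matlab -singleCompThread -nodisplay -nosplash -r "my_script; exit"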


Best
Marcus

On 11/4/19 4:14 PM, David Baker wrote:
Hello,

We decided to route all jobs requesting from 1 to 20 cores to our serial queue. 
Furthermore, the nodes controlled by the serial queue are shared by multiple 
users. We did this to try to reduce the level of fragmentation across the 
cluster -- our default "batch" queue provides exclusive access to compute nodes.

It looks like the downside of the serial queue is that jobs from different 
users can interact quite badly. To some extent this is an education issue -- 
for example, Matlab users need to be told to add the "-singleCompThread" option 
to their command line. On the other hand, I wonder whether our cgroups setup is 
optimal for the serial queue. Our cgroup.conf contains...

CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"

ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainDevices=yes
TaskAffinity=no

CgroupMountpoint=/sys/fs/cgroup

The relevant cgroup configuration in the slurm.conf is...
ProctrackType=proctrack/cgroup
TaskPlugin=affinity,cgroup

Could someone please advise us on the required/recommended cgroup setup for the 
above scenario? For example, should we really set "TaskAffinity=yes"? I assume 
the interaction between jobs (jobs can sometimes stall) is due to context 
switching at the kernel level, but, apart from educating users, how can we 
minimise that switching on the serial nodes?

Best regards,
David



--
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de



--
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de
