Re: [slurm-users] Nagios or Other Monitoring Plugins

2018-01-18 Thread Marcin Stolarek
We're using icinga2, storing accounting data in InfluxDB for Grafana dashboards. In terms of monitoring I prefer to check end-user functionality, so apart from services we also have a plugin that submits a job to the cluster (to idle nodes, with a deadline of a few minutes). The job simply creates files on

[slurm-users] Nagios or Other Monitoring Plugins

2018-01-18 Thread Ryan Novosielski
Hi all, Looked back at the mailing list to see if there was a question about this already. There was some mention of /using/ Nagios, but no real mention of specifics. What do people monitor with Nagios? We monitor, so far, slurmctld, slurmdbd, and MySQL, but there are probably some others.
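
As an illustration of the kind of slurmctld check discussed here, a minimal sketch of a Nagios-style plugin in Python follows. It is not taken from any poster's setup. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard Nagios plugin convention; the names `evaluate_ping` and `main` are hypothetical, and the line shape "Slurmctld(primary) at head1 is UP" is an assumption about what `scontrol ping` prints, which may differ between Slurm versions.

```python
import re
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed line shape (may vary by Slurm version):
#   Slurmctld(primary) at head1 is UP
PING_LINE = re.compile(r"Slurmctld\((\w+)\) at (\S+) (?:is|are) (UP|DOWN)")


def evaluate_ping(output):
    """Map `scontrol ping`-style text to a Nagios (exit_code, message) pair."""
    states = PING_LINE.findall(output)
    if not states:
        # Nothing recognisable: report UNKNOWN rather than guessing.
        return UNKNOWN, "UNKNOWN - could not parse controller status"
    down = [f"{role}@{host}" for role, host, state in states if state == "DOWN"]
    if not down:
        return OK, f"OK - {len(states)} controller(s) up"
    if len(down) < len(states):
        # At least one controller still answers, so only warn.
        return WARNING, "WARNING - down: " + ", ".join(down)
    return CRITICAL, "CRITICAL - down: " + ", ".join(down)


def main():
    """Run `scontrol ping`, print one status line, return the exit code."""
    try:
        result = subprocess.run(
            ["scontrol", "ping"], capture_output=True, text=True, timeout=30
        )
        code, message = evaluate_ping(result.stdout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        code, message = UNKNOWN, f"UNKNOWN - {exc}"
    print(message)
    return code
```

To use it as a plugin, the script would end with `sys.exit(main())` so that Nagios or icinga2 sees the exit code. Keeping the parsing in a separate function makes it testable on a machine without Slurm installed.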

Re: [slurm-users] howto limit the cpu resource for each user

2018-01-18 Thread Colas Rivière
Hello Arielle, I don't have a full answer, but here is a start: Yes, you first need at least "AccountingStorageEnforce=associations,limits" (and qos if you want to use it) so that the limits you set are enforced (see https://slurm.schedmd.com/resource_limits.html). Then you can set limits

[slurm-users] howto limit the cpu resource for each user

2018-01-18 Thread Arielle Willm
Hi, Slurm is installed in a minimal configuration for a cluster of 3000 cores/170 nodes. We have 4 partitions, one for each type of node; each partition is available to all users. We want to prevent each user from using more than 1000 cores at once, across up to 50 running jobs, over the whole cluster, and

Re: [slurm-users] Slurm and available libraries

2018-01-18 Thread Elisabetta Falivene
So EasyBuild + Lmod seems the best solution. I'll try. :) Thank you all! betta 2018-01-17 17:53 GMT+01:00 Christopher Samuel: > On 18/01/18 03:50, Patrick Goetz wrote: > >> Can anyone shed some light on the situation? I'm very surprised that >> a module script isn't just an