[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-09 Thread Sean McGrath
Hi, We use cgroups to limit usage to 3 cores and 4G of memory on the head nodes. I didn't do it but will copy and paste in our documentation below. Those limits, 3 cores and 4G, are global to all non-root users, I think, as they apply to a group. We obviously don't do this on the nodes. We also mo
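
The documentation referred to above is cut off in this archive, so the following is only a rough sketch of how "3 cores and 4G" could map onto cgroup v1 control files. It assumes the CPU cap is a bandwidth quota (`cpu.cfs_quota_us` over `cpu.cfs_period_us`) rather than a cpuset, and that "4G" means 4 GiB; the site may well have done it differently. The function name is hypothetical.

```python
GIB = 1024 ** 3  # assumption: "4G" is read as 4 GiB


def cgroup_v1_settings(cores: int, mem_gib: int, period_us: int = 100_000) -> dict:
    """Return cgroup v1 control-file values for a CPU and memory cap.

    A quota of cores * period lets the group use up to `cores` CPUs'
    worth of time in every scheduling period.
    """
    return {
        "cpu.cfs_period_us": period_us,
        "cpu.cfs_quota_us": cores * period_us,
        "memory.limit_in_bytes": mem_gib * GIB,
    }


if __name__ == "__main__":
    # Print the values one would write into the group's control files.
    for name, value in cgroup_v1_settings(cores=3, mem_gib=4).items():
        print(f"{name} = {value}")
```

Because the values apply to the group as a whole, all members share the 3 cores and 4G, which matches the "global to all non-root users" remark.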

[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-09 Thread Ryan Cox
John, We use /etc/security/limits.conf to set cputime limits on processes:

    * hard cpu 60
    root hard cpu unlimited

It works pretty well but long running file transfers can get killed. We have a script that looks for whitelisted programs to remove the limit from on a periodic basis. We haven't
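
For reference, pam_limits expresses the `cpu` item in minutes, while the underlying `RLIMIT_CPU` resource limit is in seconds. The sketch below shows that conversion and how the same limit could be applied from a wrapper process; the function names are hypothetical and this is not the script mentioned in the message.

```python
import resource

CPU_LIMIT_MINUTES = 60  # the "cpu 60" value from the limits.conf lines quoted above


def cpu_limit_seconds(minutes: int) -> int:
    """Convert a limits.conf 'cpu' value (minutes) to RLIMIT_CPU seconds."""
    return minutes * 60


def apply_cpu_limit(minutes: int) -> None:
    """Cap CPU time for this process and anything it later starts.

    An unprivileged process can only lower its hard limit, so the
    requested value is clamped to the current hard limit.
    """
    seconds = cpu_limit_seconds(minutes)
    _soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    if hard != resource.RLIM_INFINITY:
        seconds = min(seconds, hard)
    resource.setrlimit(resource.RLIMIT_CPU, (seconds, seconds))


if __name__ == "__main__":
    print(cpu_limit_seconds(CPU_LIMIT_MINUTES), "seconds of CPU time per process")
```

The limit counts CPU time, not wall-clock time, which is why an idle shell survives while a long transfer that burns CPU on encryption can be killed.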

[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-09 Thread John Hearns
@byu.edu] Sent: 09 February 2017 15:31 To: slurm-dev Subject: [slurm-dev] Re: Stopping compute usage on login nodes

John, We use /etc/security/limits.conf to set cputime limits on processes:

    * hard cpu 60
    root hard cpu unlimited

It works pretty well but long running file transfers can get killed.

[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-09 Thread Jason Bacon
We simply make it impossible to run computational software on the head nodes.

1. No scientific software packages are installed on the local disk.
2. Our NFS-mounted application directory is mounted with noexec.

Regards, Jason

On 02/09/17 07:09, John Hearns wrote: Does anyone ha

[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-09 Thread Ole Holm Nielsen
We limit CPU time in /etc/security/limits.conf so that each user process gets at most 10 minutes. It doesn't eliminate the problem completely, but it's fairly effective on users who misunderstood the role of login nodes. On Thu, Feb 9, 2017 at 6:38 PM +0100, "Jason Bacon" mailto:bacon4

[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-09 Thread Nicholas McCollum
While this isn't a SLURM issue, it's something we all face. Because my system's users are primarily students, it's something I face a lot. I second the use of ulimits, although this can kill off long-running file transfers. What you can do to help out users is set a low soft limit and a somewhat larger
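
The message is cut off here, so the following only illustrates the general soft/hard mechanism, on the assumption that the sentence goes on to describe a larger hard limit. A process may raise its own soft limit up to the hard limit without privileges, which gives users a way to extend a legitimate long-running job. The numbers and function names are illustrative, not values from the thread.

```python
import resource


def set_cpu_limits(soft_seconds: int, hard_seconds: int) -> tuple:
    """Set a low soft and a larger hard CPU-time limit on this process.

    Both values are clamped so an unprivileged process never tries to
    exceed its existing hard limit.
    """
    _cur_soft, cur_hard = resource.getrlimit(resource.RLIMIT_CPU)
    if cur_hard != resource.RLIM_INFINITY:
        hard_seconds = min(hard_seconds, cur_hard)
    soft_seconds = min(soft_seconds, hard_seconds)
    resource.setrlimit(resource.RLIMIT_CPU, (soft_seconds, hard_seconds))
    return resource.getrlimit(resource.RLIMIT_CPU)


def raise_soft_to_hard() -> tuple:
    """What a user can do without root: lift the soft limit to the hard one."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    resource.setrlimit(resource.RLIMIT_CPU, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CPU)


if __name__ == "__main__":
    print("initial:", set_cpu_limits(soft_seconds=600, hard_seconds=3600))
    print("after user raises it:", raise_soft_to_hard())
```

In a shell the equivalent user action is `ulimit -S -t` with a value no larger than the hard limit.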

[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-09 Thread Jason Bacon
That reminds me, we also don't allow file transfers through the head node:

    chmod 750 /usr/bin/sftp /usr/bin/scp /usr/bin/rsync

All file transfer operations must go through one of the file servers.

On 02/09/17 12:13, Nicholas McCollum wrote: While this isn't a SLURM issue, it's something we a
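
To make the effect of mode 750 concrete, here is a small sketch that decodes the permission bits. It assumes the binaries are owned by root with a group that ordinary users do not belong to, which the message does not state; the helper name is hypothetical.

```python
import stat

MODE = 0o750  # the mode from the chmod command quoted above


def others_can_execute(mode: int) -> bool:
    """True if users who are neither owner nor group member may execute."""
    return bool(mode & stat.S_IXOTH)


if __name__ == "__main__":
    # S_IFREG marks a regular file so filemode() renders a leading '-'.
    print(stat.filemode(stat.S_IFREG | MODE))
    print("others can execute:", others_can_execute(MODE))
```

With 750 the owner keeps full access and the group can read and execute, while everyone else has no access at all.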

[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-09 Thread Ryan Cox
If you're interested in the programmatic method I mentioned to increase limits for file transfers, https://github.com/BYUHPC/uft/tree/master/cputime_controls might be worth looking at. It works well for us, though a user will occasionally start using a new file transfer program that you migh

[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-09 Thread Ryan Novosielski
I have used ulimits in the past to limit users to 768MB of RAM per process. This seemed to be enough to run anything they were actually supposed to be running. I would use cgroups on a more modern system (this was RHEL5). A related question: we used cgroups on a CentOS 6 system, but then switched our
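
The message does not say which ulimit was used. A common choice for a per-process memory cap on Linux is the address-space limit (`RLIMIT_AS`, the `as` item in limits.conf, which is given in KB), so the sketch below assumes that and shows the unit conversions for 768 MB. The function names are hypothetical.

```python
import resource

MIB = 1024 * 1024


def address_space_bytes(mib: int) -> int:
    """Value for RLIMIT_AS, which is expressed in bytes."""
    return mib * MIB


def limits_conf_as_kb(mib: int) -> int:
    """Value for the limits.conf 'as' item, which is expressed in KB."""
    return mib * 1024


def apply_address_space_limit(mib: int) -> None:
    """Cap the address space of this process and its future children."""
    limit = address_space_bytes(mib)
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        limit = min(limit, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))


if __name__ == "__main__":
    print("limits.conf as value (KB):", limits_conf_as_kb(768))
    print("RLIMIT_AS value (bytes):", address_space_bytes(768))
```

An address-space limit counts virtual memory, so programs that map large files or reserve memory they never touch can hit it earlier than their resident usage would suggest.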

[slurm-dev] Re: Stopping compute usage on login nodes

2017-02-10 Thread Marcin Stolarek
On the cluster I've been managing, we had a solution with pam_script that chose two random cores for each user and bound the user's session to them (a second session reuses the same cores). I think it's quite a good solution, since 1) User is not able to take all server resources 2) The prob
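
The core-selection logic described above can be sketched as follows. This is not the pam_script itself: the in-memory `sessions` dictionary stands in for whatever state the real script kept between logins, and the function names are hypothetical. Pinning uses `os.sched_setaffinity`, which is available on Linux only.

```python
import os
import random


def choose_cores(user: str, sessions: dict, available: set, k: int = 2) -> frozenset:
    """Pick k random cores for a user's first session and reuse them afterwards."""
    if user in sessions:
        return sessions[user]
    cores = frozenset(random.sample(sorted(available), k))
    sessions[user] = cores
    return cores


def pin_current_process(cores: frozenset) -> None:
    """Bind the calling process (pid 0 means "self") to the given cores."""
    os.sched_setaffinity(0, cores)


if __name__ == "__main__":
    sessions = {}
    available = os.sched_getaffinity(0)  # cores this process may currently use
    k = min(2, len(available))
    first = choose_cores("alice", sessions, available, k)
    second = choose_cores("alice", sessions, available, k)
    print(sorted(first), first == second)
    pin_current_process(first)
```

Child processes inherit the affinity mask, so everything started from the pinned session stays on the same cores.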