[slurm-users] Make "srun --pty bash -i" always schedule immediately

2020-06-11 Thread Holtgrewe, Manuel
Hi, is there a way to make interactive logins where users will use almost no resources "always succeed"? In most of these interactive sessions, users will have mostly idle shells running and do some batch job submissions. Is there a way to allocate "infinite virtual cpus" on each node that can

Re: [slurm-users] Make "srun --pty bash -i" always schedule immediately

2020-06-11 Thread Loris Bennett
Hi Manuel,
"Holtgrewe, Manuel" writes:
> Hi,
>
> is there a way to make interactive logins where users will use almost no
> resources "always succeed"?
>
> In most of these interactive sessions, users will have mostly idle shells
> running and do some batch job submissions. Is there a way to a

Re: [slurm-users] Make "srun --pty bash -i" always schedule immediately

2020-06-11 Thread Paul Edmon
Generally the way we've solved this is to set aside a specific set of nodes in a partition for interactive sessions.  We deliberately scale the size of the resources so that users will always run immediately and we also set a QoS on the partition to make it so that no one user can dominate the
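A minimal sketch of what such a setup might look like in slurm.conf, assuming a hypothetical pair of dedicated nodes (`inter[01-02]`), a QoS named `interactive`, and illustrative hardware and limit values. None of these names or numbers come from the thread.

```conf
# slurm.conf fragment (illustrative; node names, QoS name and limits are assumptions)

# QoS limits are only enforced when accounting enforcement includes them.
AccountingStorageEnforce=limits,qos

# Nodes set aside for interactive sessions only.
NodeName=inter[01-02] CPUs=32 RealMemory=128000 State=UNKNOWN

# The partition-level QoS caps what any single user can hold in this partition.
PartitionName=interactive Nodes=inter[01-02] Default=NO DefaultTime=01:00:00 MaxTime=08:00:00 QOS=interactive State=UP

# The QoS itself lives in the accounting database, created with e.g.:
#   sacctmgr add qos interactive
#   sacctmgr modify qos interactive set MaxTRESPerUser=cpu=4 MaxJobsPerUser=2
```

With something like this in place, a user would request a shell with `srun -p interactive --pty bash -i`.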

Re: [slurm-users] Make "srun --pty bash -i" always schedule immediately

2020-06-11 Thread Renfro, Michael
That’s close to what we’re doing, but without dedicated nodes. We have three back-end partitions (interactive, any-interactive, and gpu-interactive), but the users typically don’t have to consider that, due to our job_submit.lua plugin. All three partitions have a default of 2 hours, 1 core, 2
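As a rough illustration of partitions without dedicated nodes, the fragment below defines the three partition names mentioned above on nodes that also belong to a batch partition. The node lists and the `batch` partition are hypothetical; only the partition names and the 2-hour default come from the message. The 1-core default needs no partition setting here, since `srun` requests one task with one CPU unless told otherwise.

```conf
# slurm.conf fragment (illustrative; node lists and the batch partition are assumptions)
# The interactive partitions reuse nodes that also serve batch work.
PartitionName=batch            Nodes=node[001-040]    Default=YES MaxTime=7-00:00:00 State=UP
PartitionName=interactive      Nodes=node[001-004]    DefaultTime=02:00:00 State=UP
PartitionName=any-interactive  Nodes=node[001-040]    DefaultTime=02:00:00 State=UP
PartitionName=gpu-interactive  Nodes=gpunode[001-004] DefaultTime=02:00:00 State=UP
```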

Re: [slurm-users] Make "srun --pty bash -i" always schedule immediately

2020-06-11 Thread Paul Edmon
That's pretty slick. We just have a test, gpu_test, and remotedesktop partition set up for those purposes. The real trick is making sure you have sufficient spare capacity that you can deliberately idle for these purposes. If we were a smaller shop with less hardware I wouldn't be able

Re: [slurm-users] Make "srun --pty bash -i" always schedule immediately

2020-06-11 Thread Renfro, Michael
Spare capacity is critical. At our scale, the few dozen cores that were typically left idle in our GPU nodes handle the vast majority of interactive work.
> On Jun 11, 2020, at 8:38 AM, Paul Edmon wrote: