Sounds pretty safe, but with the current COVID-19 difficulties, including
Spring quarter classes being taught remotely (starts tomorrow, fun times
ahead), I'm a bit reluctant to poke a running system. This will get put on
the giant list of things waiting for a scheduled downtime once campus
re-open
When I changed this on a running system, no jobs were killed, but
Slurm lost track of jobs on nodes and was unable to kill them or tell
when they were finished until slurmd on each node was restarted. I
let running jobs complete and monitored them manually, and restarted
slurmd on each node as the
Hi Nate,
On Fri, 2020-02-21 at 11:38 -0800, Nathan R Crawford wrote:
> If it just requires restarting slurmctld and the slurmd processes
> on the nodes, I will be happy! Can you confirm that no running or
> pending jobs were lost in the transition?
Did you change your SelectType to cons_tres? H
Hi Chris,
If it just requires restarting slurmctld and the slurmd processes on the
nodes, I will be happy! Can you confirm that no running or pending jobs
were lost in the transition?
Thanks,
Nate
On Thu, Feb 20, 2020 at 6:54 PM Chris Samuel wrote:
> On 20/2/20 2:16 pm, Nathan R Crawford wro
On 20/2/20 2:16 pm, Nathan R Crawford wrote:
> I interpret this as, in general, changing SelectType will nuke
> existing jobs, but that since cons_tres uses the same state format as
> cons_res, it should work.
We got caught with just this on our GPU nodes (though it was fixed
before I got to se
Hi All,
I have 19.05.4 and want to change SelectType from select/cons_res to
select/cons_tres without losing running or pending jobs. The documentation
is a bit conflicting.
From the man page:
SelectType
Identifies the type of resource selection algorithm to be used. Changing
this value can