[slurm-dev] Re: Running jobs are stopped and requeued when adding new nodes

2017-10-22 Thread Douglas Jacobsen
You cannot change the nodelist without draining the system of running jobs (terminating all slurmstepd) and restarting all slurmd and slurmctld. This is because Slurm represents the nodelist as a bitmask and uses a hierarchical overlay communication network. If all daemons don't have t
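
To illustrate why an index-based bitmask is sensitive to the nodelist, here is a minimal Python sketch. It is not Slurm's code: the node names, the helper functions `encode` and `decode`, and the use of a plain integer as the mask are all assumptions made for illustration. It only shows that a mask built against one nodelist can name different nodes when read against another.

```python
# Illustrative model only; this is NOT how Slurm is implemented internally.
# Assumption: each node's bit position is its index in the configured nodelist.

def encode(nodes, nodelist):
    """Build an integer bitmask with one bit set per node in `nodes`."""
    index = {name: i for i, name in enumerate(nodelist)}
    mask = 0
    for name in nodes:
        mask |= 1 << index[name]
    return mask


def decode(mask, nodelist):
    """Return the node names whose bits are set in `mask`."""
    return [name for i, name in enumerate(nodelist) if mask & (1 << i)]


# Hypothetical cluster before and after a node is added to the config.
old_nodelist = ["node01", "node02", "node04"]
new_nodelist = ["node01", "node02", "node03", "node04"]

# A job running on node02 and node04, encoded by a daemon using the old list.
job_nodes = ["node02", "node04"]
mask = encode(job_nodes, old_nodelist)

# A daemon with the same nodelist recovers the same nodes.
same_view = decode(mask, old_nodelist)

# A daemon that already loaded the new nodelist reads the same bits differently.
mixed_view = decode(mask, new_nodelist)

print(bin(mask), same_view, mixed_view)
```

In this toy model the two daemons disagree about which nodes the job occupies as soon as their nodelists differ, which is the kind of inconsistency the reply above warns about.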

[slurm-dev] Running jobs are stopped and requeued when adding new nodes

2017-10-22 Thread JinSung Kang
Hello,

I am having trouble adding new nodes to a Slurm cluster without killing the jobs that are currently running. Right now I:

1. Update slurm.conf and add the new node to it
2. Copy the new slurm.conf to all the nodes
3. Restart slurmd on all nodes
4. Restart slurmctld

But when I