Hi Thekla,

Thekla Loizou <t.loi...@cyi.ac.cy> writes:

> Dear all,
>
> I have noticed that SLURM schedules several jobs from a job array on the
> same node with the same start time and end time.
>
> Each of these jobs requires the full node. You can see the squeue output
> below:
>
>     JOBID PARTITION ST          START_TIME NODES SCHEDNODES NODELIST(REASON)
>  124841_1       cpu PD 2021-12-11T03:58:00     1       cn06       (Priority)
>  124841_2       cpu PD 2021-12-11T03:58:00     1       cn06       (Priority)
>  124841_3       cpu PD 2021-12-11T03:58:00     1       cn06       (Priority)
>  124841_4       cpu PD 2021-12-11T03:58:00     1       cn06       (Priority)
>  124841_5       cpu PD 2021-12-11T03:58:00     1       cn06       (Priority)
>  124841_6       cpu PD 2021-12-11T03:58:00     1       cn06       (Priority)
>  124841_7       cpu PD 2021-12-11T03:58:00     1       cn06       (Priority)
>  124841_8       cpu PD 2021-12-11T03:58:00     1       cn06       (Priority)
>  124841_9       cpu PD 2021-12-11T03:58:00     1       cn06       (Priority)
>
> Is this a bug, or am I missing something? Is this because the jobs have the
> same JOBID and are still in the pending state? I am aware that the jobs will
> not actually all run on the same node at the same time, and that the
> scheduler somehow takes into account that this job array has 9 jobs that
> will need 9 nodes. I am creating a timeline with the start time of all jobs,
> and when the job array jobs start running, no other jobs are set to run on
> the remaining nodes (so it "saves" the other nodes for the jobs of the
> array, even though they are all scheduled to run on the same node according
> to squeue or scontrol).

In general, jobs from an array will be scheduled on whatever nodes fulfil
their requirements. The fact that all the jobs have cn06 as SCHEDNODES,
however, seems to suggest that you have either specified cn06 as the node the
jobs should run on, or that cn06 is the only node which fulfils the job
requirements.

I'm not sure what you mean about '"saving" the other nodes'.

Cheers,

Loris

-- 
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin         Email loris.benn...@fu-berlin.de
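[Editor's note for readers of the archive] A whole-node job array like the one discussed above might be submitted with a batch script along these lines. This is only a sketch: the job name, the program name `./my_program`, and the use of `--exclusive` to claim the full node are assumptions, not taken from the original poster's actual script.

```shell
#!/bin/bash
#SBATCH --job-name=array-example
#SBATCH --array=1-9     # nine array tasks, matching the squeue output above
#SBATCH --nodes=1       # each array task runs on one node
#SBATCH --exclusive     # each task requires the whole node (assumed)

# ./my_program is a placeholder for the real workload
srun ./my_program "${SLURM_ARRAY_TASK_ID}"
```

To see each pending array task on its own line together with the nodes the scheduler currently expects to use for it, squeue's `-r`/`--array` option can be combined with `--Format`, e.g.:

    squeue -r -j 124841 --Format=JobArrayID,StartTime,SchedNodes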