Hi,

I'm running Slurm version 15.08.1.

When I submitted a job array with 5000 tasks, only the first 102 tasks were
scheduled, even though plenty of slots were available.

sbatch --array=1-5000 -o /dev/null --wrap="/bin/sleep 120"

The slurmctld log says:

[2016-01-06T12:43:43.496] debug:  sched: already tested 102 jobs, breaking out

Then, after a while, the scheduler dispatched about 1000 more tasks and logged:

[2016-01-06T12:44:24.003] debug:  sched: loop taking too long, breaking out
[2016-01-06T12:44:24.004] debug:  Note large processing time from schedule: usec=1439516 began=12:44:22.564
[2016-01-06T12:44:24.070] debug:  Note large processing time from _slurmctld_background: usec=1531381 began=12:44:22.538

After that, Slurm schedules the remaining tasks on only one compute node.

Has anyone seen this behavior?

Currently we've set the following Slurm parameters:
MaxArraySize=100000
MaxJobCount=2500000
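I suspect this may be related to the scheduler's per-pass limits. As I
understand the slurm.conf man page, the "already tested N jobs" message is
governed by default_queue_depth (default 100), and the "loop taking too long"
message by max_sched_time. A sketch of what I imagine tuning (the values below
are just examples, not our current settings):

```
# slurm.conf — hypothetical tuning sketch, not our current configuration.
# default_queue_depth: how many jobs the main scheduling loop examines
#   per pass (matches the "already tested N jobs" message).
# max_sched_time: how many seconds one pass of the main scheduling loop
#   may run (matches the "loop taking too long" message).
SchedulerParameters=default_queue_depth=1000,max_sched_time=4
```

Can anyone confirm whether these are the right knobs for this symptom?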

Thanks,
- Chansup
