Hi slurm-dev,

I'm working on a simulation project that requires running a large number of simulations (one million). Array jobs on my cluster currently let me submit only 1,000 tasks at a time. I wrote a bash loop that submits that array 1,000 times to reach a million, but it stops after 9 iterations and I get this error:
sbatch: error: Slurm temporarily unable to accept job, sleeping and retrying.

Previous replies to this type of problem suggest changing MaxJobCount in slurm.conf, but I'm on a shared cluster and don't have access to that. I also don't want to flood the cluster with only my jobs.

Is there a way to submit some number of jobs, say 5,000, and then submit another each time one finishes, until I reach a million?

Thanks a lot!
Arun

--
Arun Durvasula
[email protected]
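
To make the request concrete, the throttling pattern described above could be sketched roughly as follows. This is only a sketch under assumptions, not a tested setup: `sim.sh`, `OFFSET`, and the variables `TOTAL`, `CHUNK`, `MAX_QUEUED`, and `POLL_SECONDS` are hypothetical names, and it assumes that polling `squeue` once a minute is acceptable on the cluster.

```shell
#!/usr/bin/env bash
# Sketch: keep at most MAX_QUEUED of my tasks in the queue and top up as
# earlier ones finish. All names below are hypothetical placeholders.

TOTAL=${TOTAL:-1000000}          # total simulations wanted
CHUNK=${CHUNK:-1000}             # tasks per array job
MAX_QUEUED=${MAX_QUEUED:-5000}   # cap on my pending + running tasks
POLL_SECONDS=${POLL_SECONDS:-60} # wait between queue checks

# Count my pending and running tasks (-r prints one line per array task).
queued_jobs() {
    squeue -u "$USER" -h -r -t pending,running | wc -l
}

# Submit one array of CHUNK tasks. OFFSET is exported so that sim.sh
# (hypothetical) can compute a global index as OFFSET + SLURM_ARRAY_TASK_ID.
submit_chunk() {
    local offset=$1
    sbatch --array=0-$((CHUNK - 1)) --export=ALL,OFFSET="$offset" sim.sh
}

# Submit chunks until TOTAL is reached, sleeping whenever the queue is
# full or a submission fails.
throttled_submit() {
    local submitted=0
    while (( submitted < TOTAL )); do
        if (( $(queued_jobs) + CHUNK <= MAX_QUEUED )) && submit_chunk "$submitted"; then
            submitted=$((submitted + CHUNK))
        else
            sleep "$POLL_SECONDS"
        fi
    done
}
```

Calling `throttled_submit` at the end of the script would start the loop. It tops up in whole chunks rather than one job at a time, which keeps the number of `sbatch` calls at about 1,000 for a million simulations.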
