Hi all,

I guess this is a simple matter, but I still find it confusing.
I have to run 20 jobs on our supercomputer. Each job takes about 8 hours, and each one needs the previous one to be completed. The queue time limit for jobs is 10 hours.

My first approach was to launch them serially in a loop using srun:

    #!/bin/bash
    for i in {1..20}; do
        srun --time 08:10:00 [options]
    done

However, the SLURM literature keeps saying that srun should only be used for short command-line tests, so some sysadmins would consider this bad practice (see <https://stackoverflow.com/questions/43767866/slurm-srun-vs-sbatch-and-their-parameters>).

My second approach switched to sbatch:

    #!/bin/bash
    for i in {1..20}; do
        sbatch --time 08:10:00 [options]
        [poll the queue until the job is done]
    done

But since sbatch returns the prompt immediately, I had to add code to check for job termination (my current attempt at the polling is sketched in the P.S. below). The polling relies on the sleep command and is prone to race conditions, so sysadmins don't like it either. I gather there is an sbatch --wait option in recent versions of SLURM (see <https://bugs.schedmd.com/show_bug.cgi?id=1685>), but it is not yet available on our system.

Is there any preferable/canonical/friendly way to do this?

Any thoughts would be really appreciated.

Regards,
Nigella
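P.S. In case it helps to see it, here is roughly what my polling loop looks like right now. It is only a sketch: "my_job.sbatch" and the 30-second sleep are placeholders for my actual script and interval, and I parse the job id from sbatch's "Submitted batch job <id>" message because our SLURM predates the --parsable flag.

    #!/bin/bash
    for i in {1..20}; do
        # sbatch prints "Submitted batch job <id>"; grab the 4th field.
        jobid=$(sbatch --time 08:10:00 my_job.sbatch | awk '{print $4}')

        # Poll until the job no longer appears in the queue. This is the
        # racy part I mentioned: the job can vanish between checks, and a
        # crashed job looks the same as a successful one from here.
        while squeue -h -j "$jobid" 2>/dev/null | grep -q "$jobid"; do
            sleep 30
        done
    done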