Hi All, our team is using Slurm to distribute tasks across a cluster, but our implementation may be a little different from what the typical person is doing... maybe?
We'll submit a very simple sbatch script, like so:

```
#!/bin/bash
#SBATCH --error=/tmp/error.log
#SBATCH --output=/tmp/output.log

execute_algorithm arg1 arg2
```

`execute_algorithm` is where things get a bit funny: it can be some variant of a complex C algorithm of ours that spawns potentially thousands upon thousands of subprocess invocations. Each of these subprocesses is executed with `srun`, and Slurm is successfully recognizing them as job steps. It should also be noted that `execute_algorithm` waits for all of its threads to exit before exiting itself.

The question here is: can Slurm handle this sort of srun task allotment, with all of the steps fired off at once? In testing, this has appeared to work on very small jobs that are just above the limits of our node resources. I can actually see entries in the job error log showing that Slurm recognizes it's hitting task capacity and is waiting:

```
srun: Job step creation temporarily disabled, retrying
```

We have yet to get to the point where we can run the "thousands" of tasks I speak of, but that will be coming up at the end of the month, and frankly I'm skeptical. Is this a common approach? If we are stuck with this approach and there is no other way to do it, do we just build some internal scheduling logic into `execute_algorithm`? Thanks!
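For what it's worth, here is a rough sketch of what I mean by "internal scheduling logic," done in bash for illustration (our real launcher is C, so this is just the shape of the idea, not our code). `run_step` is a hypothetical stand-in for `srun --exclusive -n1 <subtask>`, and `MAX_CONCURRENT` is an assumed tuning knob; the point is to cap how many steps are in flight instead of firing them all at once:

```shell
#!/usr/bin/env bash
# Hypothetical throttle: never have more than MAX_CONCURRENT steps
# in flight at a time, rather than launching thousands at once.
MAX_CONCURRENT=4
results=$(mktemp)

run_step() {
    # Placeholder for: srun --exclusive -n1 subtask "$1"
    sleep 0.05
    echo "step $1 done" >> "$results"
}

for i in $(seq 1 20); do
    # Block once MAX_CONCURRENT background jobs are running;
    # `wait -n` (bash 4.3+) returns as soon as any one finishes.
    while (( $(jobs -rp | wc -l) >= MAX_CONCURRENT )); do
        wait -n
    done
    run_step "$i" &
done

wait   # as in our setup: wait for every step before exiting
```

The same pattern in C would be a semaphore or counter around the fork/exec of each `srun`, decremented as children are reaped.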
