Hi All
Hope you are all keeping well in these difficult times.
I have set up a small Slurm cluster of 8 compute nodes (4 x 1-core CPUs,
16 GB RAM) without scheduling or accounting, as it isn't really needed.
I'm just looking for confirmation that it's configured correctly to allow
the controller to
Hi,
I'm wondering if it's possible to gracefully terminate a solver that is
running under MPI. If srun starts the MPI processes for me, can it tell the
solver to terminate and then wait n seconds before it tells MPI to terminate?
Or is the only way of handling this to use scancel -b and trap the
signal?
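The trap approach can work roughly like this (a sketch, not tested on your setup): run srun in the background of the batch script, trap the signal, forward it to srun, which passes it on to the MPI ranks, wait a grace period, then force-kill. The solver binary name and the 30-second grace period below are placeholders.

```shell
#!/bin/bash
#SBATCH --ntasks=4

GRACE=30  # seconds to wait for a clean shutdown (adjust to taste)

terminate() {
    # Forward SIGTERM to srun, which propagates it to the MPI tasks
    # so the solver can shut down cleanly.
    kill -TERM "$SRUN_PID" 2>/dev/null
    sleep "$GRACE"
    # Force-kill if the step is still running after the grace period.
    kill -KILL "$SRUN_PID" 2>/dev/null
}
trap terminate TERM

srun ./solver &   # ./solver is a placeholder for your MPI application
SRUN_PID=$!
wait "$SRUN_PID"
```

Then `scancel --batch --signal=TERM <jobid>` delivers the signal to the batch shell only, so the trap fires before anything is torn down; a plain `scancel <jobid>` would signal the step directly. `sbatch --signal=B:TERM@60` can arrange the same thing automatically shortly before the time limit.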
Depending on the users who will be on this cluster, I'd probably adjust the
partition to have a defined, non-infinite MaxTime, and maybe a lower
DefaultTime. Otherwise, it would be very easy for someone to start a job that
reserves all cores until the nodes get rebooted, since all they have to
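For reference, a slurm.conf fragment along those lines (partition and node names are hypothetical; pick limits that suit your users):

```
# Cap every job at 2 days; jobs that request no time limit get 1 hour.
PartitionName=batch Nodes=node[1-8] Default=YES State=UP MaxTime=2-00:00:00 DefaultTime=01:00:00
```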
On 02/10/20 09:06, Diego Zuccato wrote:
> But IIUC, even if there's no default partition and the user did not
> select one explicitly, slurm can automatically select one containing all
> the requested resources, right?
I'm also experimenting with heterogeneous jobs.
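For what it's worth, a heterogeneous job can be submitted as a single batch script with components separated by a `#SBATCH hetjob` line (Slurm >= 20.02; older releases used `packjob`). Partition and binary names below are made up:

```shell
#!/bin/bash
#SBATCH --partition=bigmem --ntasks=1
#SBATCH hetjob
#SBATCH --partition=compute --ntasks=8

# Component 0 runs the coordinator, component 1 the workers.
srun --het-group=0 ./coordinator : --het-group=1 ./worker
```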
On 01/10/20 16:01, Relu Patrascu wrote:
>
> Besides having a separate partition for each type of node, you can also
> have a partition which includes all the nodes, and use the Default=yes
> option in its definition.

This is how it's currently configured, but being composed of
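A sketch of that layout in slurm.conf, with per-type partitions plus a catch-all default (node and partition names are hypothetical):

```
PartitionName=cpu Nodes=cpu[01-04] State=UP
PartitionName=gpu Nodes=gpu[01-04] State=UP
# Catch-all partition covering every node; used when none is requested.
PartitionName=all Nodes=cpu[01-04],gpu[01-04] Default=YES State=UP
```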