On 24/12/20 4:42 pm, Erik Bryer wrote:
I made sure my slurm.conf is synchronized across machines. My intention
is to add some arbitrary gres for testing purposes.
Did you update your gres.conf on all the nodes to match?
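For an arbitrary test GRES the two files just need to agree; roughly along these lines (untested, the File= devices are only placeholders, and the dots stand for your existing node settings):

# slurm.conf (identical on every node)
GresTypes=gpu
NodeName=saga-test02 ... Gres=gpu:foolsgold:4

# gres.conf on saga-test02
Name=gpu Type=foolsgold File=/dev/tty[0-3]

Then restart slurmd on the node and slurmctld on the controller so both pick up the change.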
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
Hello List,
I am trying to change:
NodeName=saga-test02 CPUS=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=1800 State=UNKNOWN
to
NodeName=saga-test02 CPUS=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=1800 State=UNKNOWN Gres=gpu:foolsgold:4
But I get an error when I make this change.
On 24/12/20 6:24 am, Paul Edmon wrote:
We then have a test cluster that we install the release on and run a few
test jobs to make sure things are working, usually MPI jobs as they tend
to hit most of the features of the scheduler.
One thing I meant to mention last night was that we use Reframe.
We are the same way, though we tend to keep pace with minor releases.
We typically wait until the .1 release of a new major release before
considering an upgrade so that many of the bugs are worked out. We then
have a test cluster that we install the release on and run a few test jobs
to make sure things are working, usually MPI jobs as they tend to hit
most of the features of the scheduler.
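For what it's worth, the MPI smoke test we mean is nothing elaborate; a minimal sketch (the module name, benchmark binary and sizes are placeholders for whatever you have handy):

#!/bin/bash
#SBATCH --job-name=upgrade-smoke-test
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

# Placeholder MPI stack and benchmark; any small MPI program will do here.
module load openmpi
srun --mpi=pmix ./osu_allreduce

If that launches cleanly across both nodes, the scheduler, the step launcher and the MPI plumbing have all been exercised at least once.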
Dear all,
We tested an MPI allreduce job in three modes (srun-dtcp, mpirun-slurm,
mpirun-ssh), and found that the job's running time in the mpirun-ssh mode
is shorter than in the other two modes.
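(To be concrete about the three modes, they are roughly along these lines; shown here with OpenMPI-style options, and the binary name and task counts are placeholders rather than our exact command lines:)

# srun launch (needs Slurm built with PMIx or PMI2 support)
srun -N 2 -n 8 --mpi=pmix ./allreduce_test

# mpirun inside a Slurm allocation, using OpenMPI's slurm launch module
salloc -N 2 -n 8 mpirun --mca plm slurm ./allreduce_test

# mpirun inside the same allocation, but forcing ssh-based launching
salloc -N 2 -n 8 mpirun --mca plm rsh ./allreduce_test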
We've set the following parameters:
/usr/lib/systemd/system/slurmd.service:
LimitMEMLOCK=infinity
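(The same limit can also be applied as a systemd drop-in override rather than by editing the packaged unit, and it is worth checking that it actually reaches job steps:)

# /etc/systemd/system/slurmd.service.d/limits.conf
[Service]
LimitMEMLOCK=infinity

# reload systemd and restart slurmd so the new limit takes effect
systemctl daemon-reload
systemctl restart slurmd

# verify from inside a job step
srun -N 1 bash -c 'ulimit -l'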