Has anyone been able to run an MPI job using PMIX and heterogeneous jobs
successfully with 19.05 (or even 18.08)? I can run without heterogeneous jobs
but get all sorts of errors when I try and split the job up.
I haven't used MPI/PMIX much so maybe I'm missing something? Any ideas?
[slurm@tre
Works here on Slurm 18.08.8, PMIx 3.1.2. The MPI world ranks are unified, as
they should be.
$ srun --mpi=pmix -n2 -wathos ./hello : -n8 -wporthos ./hello
srun: job 586 queued and waiting for resources
srun: job 586 has been allocated resources
Hello world from processor athos, rank 1 out of 10 pr
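
As a quick sanity check on the "rank 1 out of 10" line above: if the world ranks really are unified, the reported world size should equal the sum of the -n counts of the colon-separated components (2 + 8 here). The sketch below only does that arithmetic on the command string. It is not part of Slurm or PMIx; the function name is made up for illustration, it only understands the "-n N" and "-nN" spellings, and it assumes a component with no -n counts as one task.

```python
import shlex


def component_task_counts(command):
    """Return the -n task count of each colon-separated srun component.

    Hypothetical helper for illustration only; it handles "-n N" and "-nN"
    and nothing else (no --ntasks, no environment variables).
    """
    # Split the command line into components at each standalone ":" token.
    components, current = [], []
    for tok in shlex.split(command):
        if tok == ":":
            components.append(current)
            current = []
        else:
            current.append(tok)
    components.append(current)

    counts = []
    for comp in components:
        n = 1  # assumption: a component without -n counts as one task
        for i, tok in enumerate(comp):
            if tok == "-n" and i + 1 < len(comp):
                n = int(comp[i + 1])
            elif tok.startswith("-n") and tok[2:].isdigit():
                n = int(tok[2:])
        counts.append(n)
    return counts


cmd = "srun --mpi=pmix -n2 -wathos ./hello : -n8 -wporthos ./hello"
counts = component_task_counts(cmd)
world_size = sum(counts)
print(counts, world_size)  # prints: [2, 8] 10
```

If the size printed by the MPI program matches only one component's count instead of the sum, each component most likely ended up with its own separate world.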
Well, it looks like it does fail as often as it works.
srun --mpi=pmix -n1 -wporthos : -n1 -wathos ./hello
srun: job 681 queued and waiting for resources
srun: job 681 has been allocated resources
slurmstepd: error: athos [0] pmixp_coll_ring.c:613 [pmixp_coll_ring_check] mpi/pmix: ERROR:
0x153ab
Hello world, I am 3 of 4 - running on trek9
From: slurm-users On Behalf Of Philip Kovacs
Sent: Tuesday, July 16, 2019 12:03 PM
To: Slurm User Community List
Subject: Re: [slurm-users] PMIX with heterogeneous jobs
Well, it looks like it does fail as often as it works.
srun --mpi=pmix -n1