[slurm-users] PMIX with heterogeneous jobs

2019-07-16 Thread Mehlberg, Steve
Has anyone been able to run an MPI job using PMIX and heterogeneous jobs successfully with 19.05 (or even 18.08)? I can run without heterogeneous jobs but get all sorts of errors when I try to split the job up. I haven't used MPI/PMIX much, so maybe I'm missing something. Any ideas? [slurm@tre

Re: [slurm-users] PMIX with heterogeneous jobs

2019-07-16 Thread Philip Kovacs
Works here on slurm 18.08.8, pmix 3.1.2. The MPI world ranks are unified, as they should be.
$ srun --mpi=pmix -n2 -wathos ./hello : -n8 -wporthos ./hello
srun: job 586 queued and waiting for resources
srun: job 586 has been allocated resources
Hello world from processor athos, rank 1 out of 10 pr

Re: [slurm-users] PMIX with heterogeneous jobs

2019-07-16 Thread Philip Kovacs
Well it looks like it does fail as often as it works.
srun --mpi=pmix -n1 -wporthos : -n1 -wathos ./hello
srun: job 681 queued and waiting for resources
srun: job 681 has been allocated resources
slurmstepd: error: athos [0] pmixp_coll_ring.c:613 [pmixp_coll_ring_check] mpi/pmix: ERROR: 0x153ab

Re: [slurm-users] PMIX with heterogeneous jobs

2019-07-16 Thread Mehlberg, Steve
Hello world, I am 3 of 4 - running on trek9
From: slurm-users On Behalf Of Philip Kovacs
Sent: Tuesday, July 16, 2019 12:03 PM
To: Slurm User Community List
Subject: Re: [slurm-users] PMIX with heterogeneous jobs
Well it looks like it does fail as often as it works. srun --mpi=pmix -n1