Sent: Wednesday, April 17, 2024 5:11 PM
To: Open MPI Users
Cc: Greg Samonds ; Adnane Khattabi
; Philippe Rouchon
Subject: Re: [OMPI users] "MCW rank 0 is not bound (or bound to all available
processors)" when running multiple jobs concurrently
Hi Greg,
I am not an openmpi expert but I just wanted to share my experience with HPC-X.
1. Default
Regards,
Mehmet
From: users on behalf of Greg Samonds via
users
Sent: Tuesday, April 16, 2024 5:50 PM
To: Open MPI Users
Cc: Greg Samonds ; Adnane Khattabi
; Philippe Rouchon
Subject: Re: [OMPI users] "MCW rank 0 is not bound (or bound to all available
processors)" when running multiple jobs concurrently
.
Thanks again!
Regards,
Greg
From: users On Behalf Of Gilles Gouaillardet
via users
Sent: Tuesday, April 16, 2024 12:59 AM
To: Open MPI Users
Cc: Gilles Gouaillardet
Subject: Re: [OMPI users] "MCW rank 0 is not bound (or bound to all available
processors)" when running multiple jobs concurrently
Greg,
If Open MPI was built with UCX, your jobs will likely use UCX (and the
shared memory provider) even if running on a single node.
You can
mpirun --mca pml ob1 --mca btl self,sm ...
if you want to avoid using UCX.
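As a quick sanity check before changing MCA parameters, you can confirm whether your Open MPI build actually includes UCX support with ompi_info. A sketch (component names can vary by Open MPI version, and ./your_app below is a placeholder for your actual test binary):

```shell
# List any UCX-related components in this Open MPI build
# (requires an Open MPI installation on PATH)
ompi_info | grep -i ucx

# Run with the ob1 PML and the shared-memory/self BTLs only,
# bypassing UCX entirely (./your_app is a placeholder binary)
mpirun --mca pml ob1 --mca btl self,sm -np 4 ./your_app
```

If the grep prints nothing, the build has no UCX components and the --mca overrides are unnecessary.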
What is a typical mpirun command line used under the hood by your "make
test"?
Hello,
We're running into issues with jobs failing in a non-deterministic way when
running multiple jobs concurrently within a "make test" framework.
Make test is launched from within a shell script running inside a Podman
container, and we're typically running with "-j 20" and "-np 4" (20 jobs