It took some time but my colleague was able to build OpenMPI and get it
working with OmniPath; however, the performance is quite disappointing.
The configuration line used was the following:
  ./configure --prefix=$INSTALL_PATH --build=x86_64-pc-linux-gnu \
      --host=x86_64-pc-linux-gnu --enable-shared --without-orte
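One way to check which transport is actually selected at run time (a
diagnostic sketch only; the knobs use OMPI's standard <framework>_base_verbose
MCA parameters, and the launch line is a placeholder):

  export OMPI_MCA_pml_base_verbose=100
  export OMPI_MCA_mtl_base_verbose=100
  srun -n 2 ./my_mpi_app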
- Original Message -
> From: "Pavel Mezentsev via users"
> To: users@lists.open-mpi.org
> CC: "Pavel Mezentsev"
> Sent: Wednesday, May 19, 2021 10:53:50
> Subject: Re: [OMPI users] unable to launch a job on a system with OmniPath
>
> It took some time but my colleague was able to build OpenMPI and get it
> working with OmniPath; however, the performance is quite disappointing.
The original configure line is correct ("--without-orte"); the later text just
contains a typo.
You may be running into some issues with Slurm's built-in support for OMPI. Try
running it with OMPI's "mpirun" instead and see if you get better performance.
You'll have to reconfigure to remove the "--without-orte" option first.
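A minimal sketch of that workflow, assuming the configure arguments quoted
earlier and a placeholder binary name and rank count:

  # Rebuild without "--without-orte" so mpirun's runtime support is included
  ./configure --prefix=$INSTALL_PATH --build=x86_64-pc-linux-gnu \
      --host=x86_64-pc-linux-gnu --enable-shared
  make -j && make install

  # Launch through OMPI's own launcher instead of Slurm's srun
  $INSTALL_PATH/bin/mpirun -np 64 ./my_mpi_app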
So, the bad news is that the PSM2 MTL requires ORTE: ORTE generates a UUID to
identify the job across all nodes in the fabric, allowing processes to find
each other over OPA at init time.
I believe the reason this works when you use OFI/libfabric is that libfabric
generates its own UUIDs.
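If that's the mechanism at play, one experiment is to force the OFI MTL
explicitly. A sketch using OMPI's standard OMPI_MCA_* environment variables
(the rank count and binary name are placeholders):

  # Request the cm PML with the OFI (libfabric) MTL instead of PSM2
  export OMPI_MCA_pml=cm
  export OMPI_MCA_mtl=ofi
  srun -n 64 ./my_mpi_app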
After thinking about this for a few more minutes, it occurred to me that you
might be able to "fake" the required UUID support by passing it as a shell
variable. For example:
export OMPI_MCA_orte_precondition_transports="0123456789ABCDEF-0123456789ABCDEF"
would probably do it. However, note that
Right. There was a reference-counting issue in OMPI that required a change to
PSM2 to fix properly. There's a configure option to disable the reference-count
check at build time, although I don't recall what the option is off the top of
my head.
From: Carlson, Timothy S
Sent: Wednesday, May 19, 2021
On Wed, 19 May 2021 15:53:50 +0200
Pavel Mezentsev via users wrote:
> It took some time but my colleague was able to build OpenMPI and get
> it working with OmniPath; however, the performance is quite
> disappointing. The configuration line used was the
> following: ./configure --prefix=$INSTALL_PATH
To answer your specific questions:
The backend daemons (orted) will not exit until all locally spawned procs exit.
This is not configurable: for one thing, OMPI procs will suicide if they see
the daemon depart, so it makes no sense to have the daemon fail if a proc
terminates. The logic behind