Hello folks

PR https://github.com/open-mpi/ompi/pull/2916 contains modifications that will significantly improve launch performance when launching via mpirun at scale. It contains two changes:

1. It pushes all mapping operations to the backend compute daemons. This reduces the launch message to less than 1k bytes and makes its size essentially independent of the number of nodes and procs in the job.

2. It adds a cmd line option/mca param “--fwd-mpirun-port” that allows mpirun to dynamically select a port, but then forces all other daemons to statically use that port. This allows the daemons to report back to mpirun through the OOB overlay network instead of having to connect directly back to mpirun.

The prior --novm option continues to map on the frontend, but that should be okay since that option isn’t intended to operate at scale.

To summarize where we are on OMPI master: the best launch performance can be obtained by putting “--mca pmix_base_async_modex 1 --fwd-mpirun-port” on the mpirun cmd line, or by adding the “pmix_base_async_modex” mca param to your environment for direct launch. Everything else should be automatic now.

Ralph
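
For reference, the two ways of setting these options could look roughly like the sketch below. The application name and process count are hypothetical placeholders, and the environment form assumes the usual OMPI_MCA_ prefix convention for passing MCA params through the environment. The command is only printed here, not executed, so the snippet can be tried without an Open MPI install.

```shell
# Hypothetical application binary and process count; substitute your own.
APP=./my_mpi_app
NPROCS=4

# mpirun launch: both options from this message go on the command line.
MPIRUN_CMD="mpirun --mca pmix_base_async_modex 1 --fwd-mpirun-port -np $NPROCS $APP"

# Direct launch: set the MCA param in the environment instead
# (assumes the standard OMPI_MCA_<param> naming convention).
export OMPI_MCA_pmix_base_async_modex=1

# Print the command instead of running it (dry run).
echo "$MPIRUN_CMD"
```

With the environment variable exported, a direct launcher would start the application without mpirun; which launcher that is depends on your resource manager.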