That warning is an annoying bit of cruft from the openib / verbs provider and can be ignored. (Actually, I recommend using "-mca btl ^openib" to suppress the warning.)
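For reference, here is a minimal sketch of two equivalent ways to exclude the openib BTL. It assumes a POSIX shell. The process count and application name in the commented command are placeholders, not values from this thread. The environment-variable form relies on Open MPI's `OMPI_MCA_<parameter>` naming convention, the same one used in the quoted message below.

```shell
# Option 1: exclude the openib BTL on the mpirun command line.
# (Shown as a comment only; "-n 2" and "./my_application" are placeholders.)
#   mpirun -mca btl ^openib -n 2 ./my_application

# Option 2: set the same MCA parameter through the environment, so that
# every subsequent mpirun started from this shell picks it up.
export OMPI_MCA_btl='^openib'

# Print the setting so it can be checked before launching a job.
echo "OMPI_MCA_btl=${OMPI_MCA_btl}"
```

The leading `^` means "everything except", so the remaining BTLs stay available for selection. Note that a value given on the command line takes precedence over the environment variable.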
That said, there is a known issue with selecting PSM2 and OMPI 4.1.0. I'm not sure that that's the problem you're hitting, though, because you really haven't provided a lot of information.

I would suggest trying the following to see what happens:

    ${PATH_TO_OMPI}/mpirun -mca mtl psm2 -mca btl ^openib -mca mtl_base_verbose 99 -mca btl_base_verbose 99 -n ${N} -H ${HOSTS} my_application

This should give you detailed information on what transports were selected and what happened next.

Oh - and make sure your fabric is up with an opainfo or opareport command, just to make sure.

From: users <users-boun...@lists.open-mpi.org> On Behalf Of Pavel Mezentsev via users
Sent: Monday, May 10, 2021 8:41 AM
To: users@lists.open-mpi.org
Cc: Pavel Mezentsev <pavel.mezent...@gmail.com>
Subject: [OMPI users] unable to launch a job on a system with OmniPath

Hi!

I'm working on a system with KNL and OmniPath and I'm trying to launch a job but it fails. Could someone please advise what parameters I need to add to make it work properly? At first I need to make it work within one node, however later I need to use multiple nodes and eventually I may need to switch to TCP to run a hybrid job where some nodes are connected via Infiniband and some nodes are connected via OmniPath.

So far without any extra parameters I get:

```
By default, for Open MPI 4.0 and later, infiniband ports on a device
are not used by default. The intent is to use UCX for these devices.
You can override this policy by setting the btl_openib_allow_ib MCA parameter
to true.

  Local host:    XXXXXX
  Local adapter: hfi1_0
  Local port:    1
```

If I add `OMPI_MCA_btl_openib_allow_ib="true"` then I get:

```
Error obtaining unique transport key from ORTE (orte_precondition_transports not present in
the environment).

  Local host: XXXXXX
```

Then I tried adding OMPI_MCA_mtl="psm2" or OMPI_MCA_mtl="ofi" to make it use omnipath or OMPI_MCA_btl="sm,self" to make it use only shared memory. But these parameters did not make any difference.

There does not seem to be much omni-path related documentation, at least I was not able to find anything that would help me but perhaps I missed something:

https://www.open-mpi.org/faq/?category=running#opa-support
https://www.open-mpi.org/faq/?category=opa

This is the `configure` line:

```
./configure --prefix=XXXXX \
    --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu \
    --enable-shared --with-hwloc=$EBROOTHWLOC --with-psm2 \
    --with-libevent=$EBROOTLIBEVENT --without-orte --disable-oshmem \
    --with-cuda=$EBROOTCUDA --with-gpfs --with-slurm \
    --with-pmix=external --with-libevent=external --with-ompi-pmix-rte
```

Which also raises another question: if it was built with `--without-orte` then why do I get an error about failing to get something from ORTE.

The OpenMPI version is `4.1.0rc1` built with `gcc-9.3.0`.

Thank you in advance!

Regards,
Pavel Mezentsev.