Hi Daniele,

I bet this psm2 got installed as part of MPSS 3.7; I see something in the readme for that about an MPSS install with OFED support. If you want to go the route of using the RHEL Open MPI RPMs, you could use the mca-params.conf file approach to disable the use of psm2.
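A minimal sketch of that approach, assuming a per-user parameter file (the FAQ page covers the full search order, and a system-wide <prefix>/etc/openmpi-mca-params.conf works too):

```shell
# Create/append to the per-user MCA parameter file that Open MPI
# reads at startup.
mkdir -p "$HOME/.openmpi"
cat >> "$HOME/.openmpi/mca-params.conf" <<'EOF'
# Force the ob1 point-to-point layer so the PSM2-backed cm path
# is never selected.
pml = ob1
EOF
```

This is equivalent to passing `--mca pml ob1` on every mpirun command line, but applies automatically to all of that user's jobs.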
This file, and a lot of other material about MCA parameters, is described here:
https://www.open-mpi.org/faq/?category=tuning

Alternatively, you could try to build/install Open MPI yourself from the download page:
https://www.open-mpi.org/software/ompi/v1.10/

The simplest solution (but you need to be confident that nothing is using the PSM2 software) would be to just use yum to uninstall the psm2 rpm.

Good luck,

Howard

2016-12-08 14:17 GMT-07:00 Daniele Tartarini <d.tartar...@sheffield.ac.uk>:

> Hi,
> many thanks for your reply.
>
> I have an S2600IP Intel motherboard. It is a stand-alone server; I cannot
> see any Omni-Path device, and so no such modules.
> opainfo is not available on my system.
>
> Am I missing anything?
>
> cheers
> Daniele
>
> On 8 December 2016 at 17:55, Cabral, Matias A <matias.a.cab...@intel.com> wrote:
>
>> > Anyway, /dev/hfi1_0 doesn't exist.
>>
>> Make sure you have the hfi1 module/driver loaded.
>>
>> In addition, please confirm the links are in the active state on all the
>> nodes: `opainfo`
>>
>> _MAC
>>
>> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Howard Pritchard
>> Sent: Thursday, December 08, 2016 9:23 AM
>> To: Open MPI Users <users@lists.open-mpi.org>
>> Subject: Re: [OMPI users] device failed to appear .. Connection timed out
>>
>> hello Daniele,
>>
>> Could you post the output from the ompi_info command? I'm noticing on the
>> RPMs that came with the RHEL 7.2 distro on one of our systems that it was
>> built to support psm2/hfi-1.
>>
>> Two things. First, could you try running applications with
>>
>> mpirun --mca pml ob1 (all the rest of your args)
>>
>> and see if that works?
>>
>> Second, what sort of system are you using? Is this a cluster?
>> If it is, you may want to check whether you have a situation where it's
>> an Omni-Path interconnect and you have the psm2/hfi1 packages installed,
>> but for some reason the Omni-Path HCAs themselves are not active.
>>
>> On one of our Omni-Path systems the following hfi1-related rpms are
>> installed:
>>
>> hfidiags-0.8-13.x86_64
>> hfi1-psm-devel-0.7-244.x86_64
>> libhfi1verbs-0.5-16.el7.x86_64
>> hfi1-psm-0.7-244.x86_64
>> hfi1-firmware-0.9-36.noarch
>> hfi1-psm-compat-0.7-244.x86_64
>> libhfi1verbs-devel-0.5-16.el7.x86_64
>> hfi1-0.11.3.10.0_327.el7.x86_64-245.x86_64
>> hfi1-firmware_debug-0.9-36.noarch
>> hfi1-diagtools-sw-0.8-13.x86_64
>>
>> Howard
>>
>> 2016-12-08 8:45 GMT-07:00 r...@open-mpi.org <r...@open-mpi.org>:
>>
>> Sounds like something didn't quite get configured right, or maybe you
>> have a library installed that isn't quite set up correctly, or...
>>
>> Regardless, we generally advise building from source to avoid such
>> problems. Is there some reason not to just do so?
>>
>> On Dec 8, 2016, at 6:16 AM, Daniele Tartarini <d.tartar...@sheffield.ac.uk> wrote:
>>
>> Hi,
>> I've installed on Red Hat 7.2 the Open MPI distributed via yum:
>>
>> openmpi-devel.x86_64 1.10.3-3.el7
>>
>> With any code I try to run (including the mpitests-*) I get the following
>> message, with slight variants:
>>
>> my_machine.171619 hfi_wait_for_device: The /dev/hfi1_0 device
>> failed to appear after 15.0 seconds: Connection timed out
>>
>> Is anyone able to help me identify the source of the problem?
>> In any case, /dev/hfi1_0 doesn't exist.
>>
>> If I use an Open MPI version compiled from source I have no issue
>> (gcc 4.8.5).
>>
>> many thanks in advance.
>>
>> cheers
>> Daniele
>>
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
> --
> Daniele Tartarini
>
> Post-Doctoral Research Associate
> Dept. Mechanical Engineering &
> INSIGNEO, institute for *in silico* medicine,
> University of Sheffield, Sheffield, UK
> LinkedIn <http://uk.linkedin.com/in/danieletartarini>