Hi Daniele,

I bet this psm2 package got installed as part of MPSS 3.7; the readme for
that release mentions an MPSS install with OFED support.
If you want to go the route of using the RHEL Open MPI RPMs, you could use
the mca-params.conf file approach to disable the use of psm2.

This file, and MCA parameters in general, are described here:

https://www.open-mpi.org/faq/?category=tuning
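
For example, a minimal sketch of what such a file could contain (the
component names below are the usual ones, but double-check against your
build with ompi_info):

    # $HOME/.openmpi/mca-params.conf (per user), or
    # <prefix>/etc/openmpi-mca-params.conf (system wide)

    # force the ob1 PML so the PSM2-backed cm PML is never selected
    pml = ob1

    # ...or, alternatively, exclude just the psm2 MTL component
    # mtl = ^psm2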

Alternatively, you could build and install Open MPI yourself from the
download page:

https://www.open-mpi.org/software/ompi/v1.10/
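
Something along these lines usually does it (the version and prefix below
are just examples - pick whatever tarball is current on that page):

    tar xzf openmpi-1.10.4.tar.gz
    cd openmpi-1.10.4
    ./configure --prefix=$HOME/opt/openmpi-1.10.4
    make -j 4 all
    make install
    # then add $HOME/opt/openmpi-1.10.4/bin to PATH and
    # $HOME/opt/openmpi-1.10.4/lib to LD_LIBRARY_PATH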

The simplest solution - but only if you're confident that nothing is using
the PSM2 software - would be to use yum to uninstall the psm2 RPM.
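
Roughly (list first, since the exact package name may differ on your box):

    rpm -qa | grep -i psm2                # see what's actually installed
    yum remove <psm2-package-from-above>  # as root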

Good luck,

Howard




2016-12-08 14:17 GMT-07:00 Daniele Tartarini <d.tartar...@sheffield.ac.uk>:

> Hi,
> many thanks for your reply.
>
> I have an Intel S2600IP motherboard. It is a stand-alone server and I
> cannot see any Omni-Path device, and hence no such modules.
> opainfo is not available on my system.
>
> Am I missing anything?
> cheers
> Daniele
>
> On 8 December 2016 at 17:55, Cabral, Matias A <matias.a.cab...@intel.com>
> wrote:
>
>> > Anyway, /dev/hfi1_0 doesn't exist.
>>
>> Make sure you have the hfi1 module/driver loaded.
>>
>> In addition, please confirm the links are in the active state on all the
>> nodes with `opainfo`.
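>>
>> Roughly along these lines (opainfo ships with the Intel OPA tools, so the
>> exact output may vary):
>>
>>     lsmod | grep hfi1      # is the driver loaded?
>>     modprobe hfi1          # if not, try loading it (as root)
>>     opainfo                # port state should read Active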
>>
>>
>>
>> _MAC
>>
>>
>>
>> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Howard Pritchard
>> Sent: Thursday, December 08, 2016 9:23 AM
>> To: Open MPI Users <users@lists.open-mpi.org>
>> Subject: Re: [OMPI users] device failed to appear .. Connection timed out
>>
>>
>>
>> Hello Daniele,
>>
>>
>>
>> Could you post the output of the ompi_info command?  I'm noticing from the
>> RPMs that came with the RHEL 7.2 distro on one of our systems that it was
>> built to support psm2/hfi1.
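>>
>> A quick way to check is something like (the psm2 MTL name here is an
>> assumption; adjust to what your build reports):
>>
>>     ompi_info | grep -i psm2      # any psm2 components built in?
>>     ompi_info --param mtl psm2    # details of the psm2 MTL, if present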
>>
>>
>>
>> Two things. First, could you try running applications with
>>
>>
>>
>> mpirun --mca pml ob1 (all the rest of your args)
>>
>>
>>
>> and see if that works?
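>>
>> For instance, with one of the mpitests binaries (the binary name below is
>> just a placeholder):
>>
>>     mpirun --mca pml ob1 -np 4 ./osu_latency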
>>
>>
>>
>> Second, what sort of system are you using?  Is this a cluster?  If it is,
>> you may want to check whether you have a situation where it's an Omni-Path
>> interconnect with the psm2/hfi1 packages installed, but for some reason
>> the Omni-Path HCAs themselves are not active.
>>
>>
>>
>> On one of our Omni-Path systems the following hfi1-related RPMs are
>> installed:
>>
>>
>>
>> hfidiags-0.8-13.x86_64
>> hfi1-psm-devel-0.7-244.x86_64
>> libhfi1verbs-0.5-16.el7.x86_64
>> hfi1-psm-0.7-244.x86_64
>> hfi1-firmware-0.9-36.noarch
>> hfi1-psm-compat-0.7-244.x86_64
>> libhfi1verbs-devel-0.5-16.el7.x86_64
>> hfi1-0.11.3.10.0_327.el7.x86_64-245.x86_64
>> hfi1-firmware_debug-0.9-36.noarch
>> hfi1-diagtools-sw-0.8-13.x86_64
>>
>>
>>
>> Howard
>>
>>
>>
>> 2016-12-08 8:45 GMT-07:00 r...@open-mpi.org <r...@open-mpi.org>:
>>
>> Sounds like something didn't quite get configured right, or maybe you
>> have a library installed that isn't quite set up correctly, or...
>>
>>
>>
>> Regardless, we generally advise building from source to avoid such
>> problems. Is there some reason not to just do so?
>>
>>
>>
>> On Dec 8, 2016, at 6:16 AM, Daniele Tartarini <
>> d.tartar...@sheffield.ac.uk> wrote:
>>
>>
>>
>> Hi,
>>
>> I've installed, on Red Hat 7.2, the Open MPI distributed via yum:
>>
>>         openmpi-devel.x86_64                 1.10.3-3.el7
>>
>>
>>
>> For any code I try to run (including the mpitests-*), I get the following
>> message with slight variants:
>>
>>
>>
>>          my_machine.171619 hfi_wait_for_device: The /dev/hfi1_0 device
>> failed to appear after 15.0 seconds: Connection timed out
>>
>>
>>
>> Is anyone able to help me in identifying the source of the problem?
>>
>> Anyway, /dev/hfi1_0 doesn't exist.
>>
>>
>>
>> If I use an Open MPI version compiled from source (gcc 4.8.5), I have no
>> issue.
>>
>>
>>
>> many thanks in advance.
>>
>>
>>
>> cheers
>>
>> Daniele
>>
>
>
>
> --
> Daniele Tartarini
>
> Post-Doctoral Research Associate
> Dept. Mechanical Engineering &
> INSIGNEO, institute for *in silico* medicine,
> University of Sheffield, Sheffield, UK
> LinkedIn <http://uk.linkedin.com/in/danieletartarini>
>
>
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
