Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-02-08 Thread Patrick Begou via users
version of IFS are you running? > 2. Are you using CUDA cards by any chance? If so, what version of CUDA? > > -Original Message- > From: Heinz, Michael William > Sent: Wednesday, January 27, 2021 3:45 PM > To: Open MPI Users > Subject: RE: [OMPI users] [EXTERNAL] Re: Op

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-28 Thread Heinz, Michael William via users
27, 2021 3:37 PM To: Open MPI Users Cc: Heinz, Michael William Subject: Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path Unfortunately, OPA/PSM support for Debian isn't handled by Intel directly or by Cornelis Networks - but I should point out you can download the l

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-28 Thread Peter Kjellström via users
On Wed, 27 Jan 2021 15:31:40 -0500 Michael Di Domenico via users wrote: > if you have OPA cards, for openmpi you only need --with-ofi, you don't > need psm/psm2/verbs/ucx. I agree with Michael and would add for clarity that on the system you always need PSM2 and optionally libfabric (if you go t

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-27 Thread Heinz, Michael William via users
, Michael William Subject: Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path Unfortunately, OPA/PSM support for Debian isn't handled by Intel directly or by Cornelis Networks - but I should point out you can download the latest official source for PSM2 and the drivers from G

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-27 Thread Heinz, Michael William via users
Sent: Wednesday, January 27, 2021 3:32 PM To: Open MPI Users Cc: Michael Di Domenico Subject: Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path if you have OPA cards, for openmpi you only need --with-ofi, you don't need psm/psm2/verbs/ucx. but this assumes you're runni

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-27 Thread Michael Di Domenico via users
if you have OPA cards, for openmpi you only need --with-ofi, you don't need psm/psm2/verbs/ucx. but this assumes you're running a rhel based distro and have installed the OPA fabric suite of software from Intel/CornelisNetworks. which is what i have. perhaps there's something really odd in debia

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-27 Thread Patrick Begou via users
Hi Howard and Michael, first, many thanks for testing with my short application. Yes, when the test code runs fine it just shows the max RSS size of the rank 0 process. When it runs wrong it prints a message about each invalid value found. As I said, I have also deployed OpenMPI on various clusters (in DEL

Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-27 Thread Pritchard Jr., Howard via users
Hi Folks, I'm also having problems reproducing this on one of our OPA clusters: libpsm2-11.2.78-1.el7.x86_64 libpsm2-devel-11.2.78-1.el7.x86_64 cluster runs RHEL 7.8 hca_id: hfi1_0 transport: InfiniBand (0) fw_ver: 1.27.0 node_g