version of IFS are you running?
> 2. Are you using CUDA cards by any chance? If so, what version of CUDA?
>
> -Original Message-
> From: Heinz, Michael William
> Sent: Wednesday, January 27, 2021 3:45 PM
> To: Open MPI Users
> Subject: RE: [OMPI users] [EXTERNAL] Re: Op
27, 2021 3:37 PM
To: Open MPI Users
Cc: Heinz, Michael William
Subject: Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path
Unfortunately, OPA/PSM support for Debian isn't handled by Intel directly or by
Cornelis Networks - but I should point out you can download the l
On Wed, 27 Jan 2021 15:31:40 -0500
Michael Di Domenico via users wrote:
> if you have OPA cards, for openmpi you only need --with-ofi, you don't
> need psm/psm2/verbs/ucx.
I agree with Michael and would add for clarity that on the system you
always need PSM2 and optionally libfabric (if you go t
, Michael William
Subject: Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path
Unfortunately, OPA/PSM support for Debian isn't handled by Intel directly or by
Cornelis Networks - but I should point out you can download the latest official
source for PSM2 and the drivers from G
Sent: Wednesday, January 27, 2021 3:32 PM
To: Open MPI Users
Cc: Michael Di Domenico
Subject: Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path
if you have OPA cards, for openmpi you only need --with-ofi, you don't
need psm/psm2/verbs/ucx. but this assumes you're running a rhel based
distro and have installed the OPA fabric suite of software from
Intel/CornelisNetworks. which is what i have. perhaps there's
something really odd in debia
Hi Howard and Michael
First, many thanks for testing with my short application. Yes, when the
test code runs fine it just shows the max RSS size of the rank 0 process.
When it runs wrong it prints a message about each invalid value found.
As I said, I have also deployed OpenMPI on various clusters (in DEL
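
For readers without the test program at hand, the reporting behaviour described above (one message per invalid value, otherwise only the max RSS) could be sketched roughly as follows. This is not the poster's application: the names `max_rss_kb`, `find_invalid` and `report` are hypothetical, there is no MPI in it, and the only assumption is that the check compares received values against a known expected value.

```python
import resource


def max_rss_kb():
    """Peak resident set size of this process as reported by getrusage.

    On Linux the unit is kilobytes (on macOS it is bytes).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def find_invalid(values, expected):
    """Return one message per element that differs from the expected value."""
    return [
        f"invalid value at index {i}: got {v!r}, expected {expected!r}"
        for i, v in enumerate(values)
        if v != expected
    ]


def report(values, expected):
    """Hypothetical reporting step: list problems, else only the max RSS."""
    problems = find_invalid(values, expected)
    if problems:
        return problems
    return [f"max RSS: {max_rss_kb()}"]


if __name__ == "__main__":
    # A clean buffer yields a single max RSS line; a corrupted one
    # yields one line per bad element.
    for line in report([7, 7, 7], expected=7):
        print(line)
    for line in report([7, 0, 7], expected=7):
        print(line)
```

In a real MPI run the buffer would be the data received from another rank, and only rank 0 would print the RSS line.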
Hi Folks,
I'm also having problems reproducing this on one of our OPA clusters:
libpsm2-11.2.78-1.el7.x86_64
libpsm2-devel-11.2.78-1.el7.x86_64
cluster runs RHEL 7.8
hca_id: hfi1_0
transport: InfiniBand (0)
fw_ver: 1.27.0
node_g