Daniel,

keep in mind that PMIx was designed for cross-version compatibility,
so a PMIx 3.0.2 client (read: an Open MPI 4.0.0 app with the internal 3.0.2 PMIx) should be able
to interact with a PMIx 3.1.2 server (read: the SLURM pmix plugin built on top of PMIx 3.1.2).

So unless you have a specific reason not to mix both, you might also give the internal PMIx a try.


The 4.0.1 release candidate 1 was released a few days ago, and based on the feedback we receive,
the final 4.0.1 should be released in the very near future.


Cheers,


Gilles

On 3/4/2019 1:08 AM, Daniel Letai wrote:

On 3 Mar 2019, at 16:31, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

Daniel,

PMIX_MODEX and PMIX_INFO_ARRAY have been removed from PMIx 3.1.2, and
Open MPI 4.0.0 was not ready for this.

You can either use the internal PMIx (3.0.2), or try 4.0.1rc1 (with
the external PMIx 3.1.2) that was published a few days ago.

Thanks, will try that tomorrow. I can’t use internal due to Slurm dependency, but I will try the rc.
Any idea when 4.0.1 will be released?

FWIW, you are right to use --with-pmix=external (and not --with-pmix=/usr).

Cheers,

Gilles

On Sun, Mar 3, 2019 at 10:57 PM Daniel Letai <d...@letai.org.il> wrote:

Hello,


I have built the following stack :

centos 7.5 (gcc 4.8.5-28, libevent 2.0.21-4)
MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tgz built with --all --without-32bit (this includes ucx 1.5.0)
hwloc from centos 7.5 : 1.11.8-4.el7
pmix 3.1.2
slurm 18.08.5-2 built --with-ucx --with-pmix
openmpi 4.0.0 : configure --with-slurm --with-pmix=external --with-pmi --with-libevent=external --with-hwloc=external --with-knem=/opt/knem-1.1.3.90mlnx1 --with-hcoll=/opt/mellanox/hcoll

The configure part succeeds, however 'make' errors out with:

ext3x.c: In function 'ext3x_value_unload':

ext3x.c:1109:10: error: 'PMIX_MODEX' undeclared (first use in this function)


And same for 'PMIX_INFO_ARRAY'


However, both are declared in the opal/mca/pmix/pmix3x/pmix/include/pmix_common.h file.

opal/mca/pmix/ext3x/ext3x.c does include pmix_common.h, but as a system include (#include <pmix_common>), while ext3x.h includes it as a local include (#include "pmix_common"). Neither seems to pull from the correct path.
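
One way to see which copy of pmix_common.h actually declares these symbols is to grep each candidate header. What follows is only a minimal sketch and is not part of the Open MPI or PMIx trees: the helper names header_declares and report_symbols are made up for illustration, and the sketch assumes a grep that supports -w (GNU and BSD grep do).

```shell
#!/bin/sh
# header_declares SYMBOL FILE
# Hypothetical helper: succeeds if FILE mentions SYMBOL as a whole word.
# It only searches the text and does not run the preprocessor, so a symbol
# that is commented out or guarded by #if still counts as present.
header_declares() {
    grep -qw -- "$1" "$2"
}

# report_symbols FILE SYMBOL...
# Print one "present" or "missing" line per symbol for the given header.
report_symbols() {
    header=$1
    shift
    for sym in "$@"; do
        if header_declares "$sym" "$header"; then
            echo "$sym present in $header"
        else
            echo "$sym missing in $header"
        fi
    done
}
```

For example, run report_symbols opal/mca/pmix/pmix3x/pmix/include/pmix_common.h PMIX_MODEX PMIX_INFO_ARRAY from the Open MPI source tree, then run it again against the external header (its location depends on where PMIx 3.1.2 was installed; /usr/include/pmix_common.h is only a guess).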


Regards,

Dani_L.


On 2/24/19 3:09 AM, Gilles Gouaillardet wrote:

Passant,

you have to manually download and apply
https://github.com/pmix/pmix/commit/2e2f4445b45eac5a3fcbd409c81efe318876e659.patch
to PMIx 2.2.1.
That should likely fix your problem.

As a side note, it is bad practice to configure --with-FOO=/usr,
since it might have some unexpected side effects.
Instead, you can replace

configure --with-slurm --with-pmix=/usr --with-pmi=/usr --with-libevent=/usr

with

configure --with-slurm --with-pmix=external --with-pmi --with-libevent=external

To be on the safe side, I also invite you to pass --with-hwloc=external
to the configure command line.
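
The --with-FOO=/usr advice above can be checked mechanically before running configure. This is only a sketch: flag_usr_prefixes is a hypothetical helper, not an Open MPI tool, and it merely reports the offending options without rewriting them, since the right replacement differs per option (for example --with-pmix=external versus a bare --with-pmi).

```shell
#!/bin/sh
# flag_usr_prefixes ARG...
# Hypothetical helper: print every argument of the form --with-FOO=/usr
# (with or without a trailing slash) and return non-zero if any was found.
flag_usr_prefixes() {
    found=0
    for arg in "$@"; do
        case $arg in
            --with-*=/usr|--with-*=/usr/)
                echo "avoid: $arg"
                found=1
                ;;
        esac
    done
    return "$found"
}
```

Called with the first configure line above, it prints the three =/usr options; called with the second one, it prints nothing and returns 0.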


Cheers,

Gilles

On Sun, Feb 24, 2019 at 1:54 AM Passant A. Hafez
<passant.ha...@kaust.edu.sa> wrote:

Hello Gilles,

Here are some details:

Slurm 18.08.4

PMIx 2.2.1 (as shown in /usr/include/pmix_version.h)

Libevent 2.0.21

srun --mpi=list
srun: MPI types are...
srun: none
srun: openmpi
srun: pmi2
srun: pmix
srun: pmix_v2
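
A listing like the one above can be filtered so that only the PMIx plugins remain, which is handy in a job script that should stop early when Slurm has no PMIx support. This is a minimal sketch: list_pmix_plugins is a hypothetical helper, and it assumes the "srun: name" line format shown above, which may differ in other Slurm versions.

```shell
#!/bin/sh
# list_pmix_plugins
# Read `srun --mpi=list` output on stdin and print only the PMIx plugin
# names (pmix, pmix_v2, ...). Returns non-zero if none is listed.
list_pmix_plugins() {
    # Keep lines such as "srun: pmix_v2", drop the "srun: " prefix,
    # then let grep set the exit status (1 when nothing was printed).
    sed -n 's/^srun: *\(pmix[_a-z0-9]*\) *$/\1/p' | grep .
}
```

Typical use would be srun --mpi=list 2>&1 | list_pmix_plugins; the 2>&1 is there on the assumption that srun may write the list to stderr.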

Open MPI versions tested: 4.0.0 and 3.1.2


A different MPI Hello World program was compiled for each installation mentioned below. Jobs were submitted with sbatch (2 nodes * 2 tasks per node) and then launched with srun --mpi=pmix program

File 400ext_2x2.out (attached) is for OMPI 4.0.0 installation with configure options:
--with-slurm --with-pmix=/usr --with-pmi=/usr --with-libevent=/usr
and configure log:
Libevent support: external
PMIx support: External (2x)

File 400int_2x2.out (attached) is for OMPI 4.0.0 installation with configure options:
--with-slurm --with-pmix
and configure log:
Libevent support: internal (external libevent version is less that internal version 2.0.22)
PMIx support: Internal

I also tested different installations of Open MPI 3.1.2 and got errors similar to those in 400ext_2x2.out
(NOT-SUPPORTED in file event/pmix_event_registration.c at line 101)





All the best,
--
Passant A. Hafez | HPC Applications Specialist
KAUST Supercomputing Core Laboratory (KSL)
King Abdullah University of Science and Technology
Building 1, Al-Khawarizmi, Room 0123
Mobile : +966 (0) 55-247-9568
Mobile : +20 (0) 106-146-9644
Office : +966 (0) 12-808-0367

________________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
Sent: Saturday, February 23, 2019 5:17 PM
To: Open MPI Users
Subject: Re: [OMPI users] Building PMIx and Slurm support

Hi,

PMIx has cross-version compatibility, so as long as the PMIx library
used by SLURM is compatible with the one (internal or external) used
by Open MPI, you should be fine.
If you want to minimize the risk of cross-version incompatibility,
then I encourage you to use the same (and hence external) PMIx that
was used to build SLURM with Open MPI.

Can you tell us a bit more than "it didn't work"?
(Open MPI version, PMIx version used by SLURM, PMIx version used by
Open MPI, error message, ...)

Cheers,

Gilles

On Sat, Feb 23, 2019 at 9:46 PM Passant A. Hafez
<passant.ha...@kaust.edu.sa> wrote:

Good day everyone,

I've been trying to build and use the PMIx support for Open MPI. I tried many things, which I can list if needed, but with no luck. I was able to test the PMIx client, but when I used OMPI with srun --mpi=pmix it didn't work.

So if you could please advise me on the versions of PMIx and Open MPI that should work well with Slurm 18.08, that would be great.

Also, what is the difference between using internal vs external PMIx installations?



All the best,

--

Passant A. Hafez | HPC Applications Specialist
KAUST Supercomputing Core Laboratory (KSL)
King Abdullah University of Science and Technology
Building 1, Al-Khawarizmi, Room 0123
Mobile : +966 (0) 55-247-9568
Mobile : +20 (0) 106-146-9644
Office : +966 (0) 12-808-0367
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
