Here is some additional information about a case in which using
`--with-hwloc=/usr` seems to be necessary.

Using the tarball for the released, stable 1.10.7,

13bb410b52becbfa140f5791bd50d580  /sw/src/arcts/ompi/openmpi-1.10.7.tar.gz

and with

  $ ./configure --prefix=/sw/arcts/centos7/intel_14_0_2/openmpi/1.10.7
--mandir=/sw/arcts/centos7/intel_14_0_2/openmpi/1.10.7/share/man
--with-slurm --without-tm --with-verbs --with-hwloc --disable-dlopen
--enable-shared CC=icc CXX=icpc FC=ifort F77=ifort

config.log reports

configure:67748: checking hwloc.h usability
configure:67748: icc -std=gnu99 -c -O3 -DNDEBUG -finline-functions
-fno-strict-aliasing -restrict -Qoption,cpp,--extended_float_types
-pthread   conftest.c >&5
configure:67748: $? = 0
configure:67748: result: yes
configure:67748: checking hwloc.h presence
configure:67748: icc -E   conftest.c
configure:67748: $? = 0
configure:67748: result: yes
configure:67748: checking for hwloc.h
configure:67748: result: yes

but make reports

Making all in mca/hwloc
make[2]: Entering directory `/tmp/bennet/build/openmpi-1.10.7/opal/mca/hwloc'
  CC       base/hwloc_base_frame.lo
  CC       base/hwloc_base_util.lo
  CC       base/hwloc_base_dt.lo
  CC       base/hwloc_base_maffinity.lo
In file included from ../../../opal/mca/hwloc/hwloc.h(134),
                 from base/hwloc_base_frame.c(23):
../../../opal/mca/hwloc/external/external.h(20): catastrophic error:
cannot open source file "/include/hwloc.h"
  #include MCA_hwloc_external_header
                                    ^
compilation aborted for base/hwloc_base_frame.c (code 4)


It looks to me like configure is not prepending `/usr` to the header
path, which ends up as just `/include/hwloc.h`, when the bare
`--with-hwloc` is used on the configure line, and therefore
`--with-hwloc=/usr` is called for if one wants the external hwloc
header to be found.
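
For the record, this is the invocation that seems to be called for in
that case, i.e. the same flags as above with the hwloc prefix spelled
out explicitly (the --prefix and --mandir paths are just our local
install tree):

  $ ./configure --prefix=/sw/arcts/centos7/intel_14_0_2/openmpi/1.10.7 \
        --mandir=/sw/arcts/centos7/intel_14_0_2/openmpi/1.10.7/share/man \
        --with-slurm --without-tm --with-verbs --with-hwloc=/usr \
        --disable-dlopen --enable-shared CC=icc CXX=icpc FC=ifort F77=ifort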

Thanks,    -- bennet

Note: the following test also fails, but it isn't directly related to finding hwloc.h.

configure:67840: result: looking for library without search path
configure:67842: checking for library containing hwloc_topology_init
configure:67873: icc -std=gnu99 -o conftest -O3 -DNDEBUG
-finline-functions -fno-strict-aliasing -restrict
-Qoption,cpp,--extended_float_types -pthread     conftest.c -lutil
>&5
/tmp/iccu8nO7c.o: In function `main':
conftest.c:(.text+0x35): undefined reference to `hwloc_topology_init'
configure:67873: $? = 1

There is too much more output to include here.
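
In case it helps anyone reproduce that probe outside of configure, a
rough hand-run version of the same link check would be something like
the following (my own sketch, not the exact conftest that configure
generates, and it assumes the system hwloc development files live
under /usr):

$ cat > conftest.c <<'EOF'
/* declaration-only probe, in the style autoconf uses for
   "checking for library containing hwloc_topology_init" */
char hwloc_topology_init ();
int main () { return hwloc_topology_init (); }
EOF
$ icc -std=gnu99 -o conftest conftest.c          # no library given: undefined reference
$ icc -std=gnu99 -o conftest conftest.c -lhwloc  # should link once -lhwloc is supplied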

It seems to be the same situation with the last available nightly build:

bcea63d634d05c0f5a821ce75a1eb2b2  openmpi-v1.10-201705170239-5e373bf.tar.gz


On Sun, Feb 24, 2019 at 8:11 AM Bennet Fauber <ben...@umich.edu> wrote:
>
> Hi, Gilles,
>
> With respect to your comment about not using --with-FOO=/usr: it is bad
> practice, sure, and it should be unnecessary, but we have had at least
> one instance where it was also necessary for the requested feature to
> actually work.  The case I am thinking of was, in particular, Open MPI
> 1.10.2, where OMPI did not properly bind processes to cores unless we
> built with --with-hwloc=/usr.
>
> When it wasn't used, things mostly ran fine, but for one program it would
> occasionally result in 'hopping processes' and very poor performance.
> After rebuilding with --with-hwloc=/usr, that is no longer a problem.  It
> was difficult to pin down, because it did not seem to be an issue all
> the time; maybe one run in five showed the dragging performance.
>
> So, while it should not be necessary and is not good practice, sometimes
> it does seem to be necessary anyway.
>
> -- bennet
>
> On Sun, Feb 24, 2019 at 5:21 AM Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
> >
> > Passant,
> >
> > The fix is included in PMIx 2.2.2
> >
> > The bug is in a public header file, so you might indeed have to
> > rebuild the SLURM plugin for PMIx.
> > I did not check the SLURM sources though, so assuming PMIx was built
> > as a shared library, there is still a chance
> > it might work even if you do not rebuild the SLURM plugin. I'd rebuild
> > at least the SLURM plugin for PMIx to be on the safe side though.
> >
> > Cheers,
> >
> > Gilles
> >
> > On Sun, Feb 24, 2019 at 4:07 PM Passant A. Hafez
> > <passant.ha...@kaust.edu.sa> wrote:
> > >
> > > Thanks Gilles.
> > >
> > > So do we have to rebuild Slurm after applying this patch?
> > >
> > > Another question: is this fix included in PMIx 2.2.2
> > > (https://github.com/pmix/pmix/releases/tag/v2.2.2)?
> > >
> > >
> > >
> > >
> > > All the best,
> > >
> > >
> > > ________________________________________
> > > From: users <users-boun...@lists.open-mpi.org> on behalf of Gilles 
> > > Gouaillardet <gilles.gouaillar...@gmail.com>
> > > Sent: Sunday, February 24, 2019 4:09 AM
> > > To: Open MPI Users
> > > Subject: Re: [OMPI users] Building PMIx and Slurm support
> > >
> > > Passant,
> > >
> > > you have to manually download and apply
> > > https://github.com/pmix/pmix/commit/2e2f4445b45eac5a3fcbd409c81efe318876e659.patch
> > > to PMIx 2.2.1; that should likely fix your problem.
> > >
> > > As a side note, it is bad practice to configure with --with-FOO=/usr,
> > > since it might have some unexpected side effects.
> > > Instead, you can replace
> > >
> > > configure --with-slurm --with-pmix=/usr --with-pmi=/usr 
> > > --with-libevent=/usr
> > >
> > > with
> > >
> > > configure --with-slurm --with-pmix=external --with-pmi 
> > > --with-libevent=external
> > >
> > > To be on the safe side, I also invite you to pass --with-hwloc=external
> > > on the configure command line.
> > >
> > >
> > > Cheers,
> > >
> > > Gilles
> > >
> > > On Sun, Feb 24, 2019 at 1:54 AM Passant A. Hafez
> > > <passant.ha...@kaust.edu.sa> wrote:
> > > >
> > > > Hello Gilles,
> > > >
> > > > Here are some details:
> > > >
> > > > Slurm 18.08.4
> > > >
> > > > PMIx 2.2.1 (as shown in /usr/include/pmix_version.h)
> > > >
> > > > Libevent 2.0.21
> > > >
> > > > srun --mpi=list
> > > > srun: MPI types are...
> > > > srun: none
> > > > srun: openmpi
> > > > srun: pmi2
> > > > srun: pmix
> > > > srun: pmix_v2
> > > >
> > > > Open MPI versions tested: 4.0.0 and 3.1.2
> > > >
> > > >
> > > > For each installation mentioned below, a separate MPI Hello World
> > > > program was compiled.
> > > > Jobs were submitted with sbatch (2 nodes * 2 tasks per node) and then
> > > > launched with srun --mpi=pmix program.
> > > >
> > > > File 400ext_2x2.out (attached) is for the OMPI 4.0.0 installation with
> > > > configure options:
> > > > --with-slurm --with-pmix=/usr --with-pmi=/usr --with-libevent=/usr
> > > > and configure log:
> > > > Libevent support: external
> > > > PMIx support: External (2x)
> > > >
> > > > File 400int_2x2.out (attached) is for the OMPI 4.0.0 installation with
> > > > configure options:
> > > > --with-slurm --with-pmix
> > > > and configure log:
> > > > Libevent support: internal (external libevent version is less than
> > > > internal version 2.0.22)
> > > > PMIx support: Internal
> > > >
> > > > I also tested different installations of 3.1.2 and got errors similar to
> > > > 400ext_2x2.out
> > > > (NOT-SUPPORTED in file event/pmix_event_registration.c at line 101).
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > All the best,
> > > > --
> > > > Passant A. Hafez | HPC Applications Specialist
> > > > KAUST Supercomputing Core Laboratory (KSL)
> > > > King Abdullah University of Science and Technology
> > > > Building 1, Al-Khawarizmi, Room 0123
> > > > Mobile : +966 (0) 55-247-9568
> > > > Mobile : +20 (0) 106-146-9644
> > > > Office  : +966 (0) 12-808-0367
> > > >
> > > > ________________________________________
> > > > From: users <users-boun...@lists.open-mpi.org> on behalf of Gilles 
> > > > Gouaillardet <gilles.gouaillar...@gmail.com>
> > > > Sent: Saturday, February 23, 2019 5:17 PM
> > > > To: Open MPI Users
> > > > Subject: Re: [OMPI users] Building PMIx and Slurm support
> > > >
> > > > Hi,
> > > >
> > > > PMIx has cross-version compatibility, so as long as the PMIx library
> > > > used by SLURM is compatible with the one (internal or external) used
> > > > by Open MPI, you should be fine.
> > > > If you want to minimize the risk of cross-version incompatibility,
> > > > then I encourage you to build Open MPI with the same (and hence
> > > > external) PMIx that was used to build SLURM.
> > > >
> > > > Can you tell us a bit more than "it didn't work"?
> > > > (Open MPI version, PMIx version used by SLURM, PMIx version used by
> > > > Open MPI, error message, ...)
> > > >
> > > > Cheers,
> > > >
> > > > Gilles
> > > >
> > > > On Sat, Feb 23, 2019 at 9:46 PM Passant A. Hafez
> > > > <passant.ha...@kaust.edu.sa> wrote:
> > > > >
> > > > >
> > > > > Good day everyone,
> > > > >
> > > > > I've been trying to build and use the PMIx support for Open MPI, but I
> > > > > have tried many things (which I can list if needed) with no luck.
> > > > > I was able to test the PMIx client, but when I used OMPI with
> > > > > srun --mpi=pmix it didn't work.
> > > > >
> > > > > So if you could please advise me on which versions of PMIx and Open
> > > > > MPI should work well with Slurm 18.08, it'd be great.
> > > > >
> > > > > Also, what is the difference between using internal vs external PMIx 
> > > > > installations?
> > > > >
> > > > >
> > > > >
> > > > > All the best,
> > > > >
> > > > > --
> > > > >
> > > > > Passant A. Hafez | HPC Applications Specialist
> > > > > KAUST Supercomputing Core Laboratory (KSL)
> > > > > King Abdullah University of Science and Technology
> > > > > Building 1, Al-Khawarizmi, Room 0123
> > > > > Mobile : +966 (0) 55-247-9568
> > > > > Mobile : +20 (0) 106-146-9644
> > > > > Office  : +966 (0) 12-808-0367