Hmm, trying to keep this all straight. It turns out I had the module loaded for the nv-hpc
compilers, version 21.7, instead of 21.9, in what I thought was a successful build.
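One way to catch a stale module before configuring is to compare the compiler found on PATH with the SDK version the build is meant to use. The sketch below assumes the install layout that shows up in the ompi_info output in this thread (.../Linux_x86_64/<version>/compilers/bin/nvc). The function names sdk_version_from_path and check_compiler are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Print the SDK version component of an NV-HPC compiler path, e.g.
#   .../Linux_x86_64/21.7/compilers/bin/nvc  ->  21.7
# Prints nothing if the path does not follow that (assumed) layout.
sdk_version_from_path() {
    printf '%s\n' "$1" |
        sed -n 's|.*/Linux_x86_64/\([^/]*\)/compilers/bin/.*|\1|p'
}

# Compare the nvc found on PATH with the version we intend to build with.
# $1: wanted SDK version, e.g. 21.9
check_compiler() {
    want=$1
    path=$(command -v nvc) || { echo "nvc not on PATH"; return 1; }
    have=$(sdk_version_from_path "$path")
    if [ "$have" = "$want" ]; then
        echo "ok: nvc is $have"
    else
        echo "mismatch: wanted $want, PATH has ${have:-unknown} ($path)"
        return 1
    fi
}
```

Running `check_compiler 21.9` just before configure would then report a mismatch if the 21.7 module were still loaded. If nvc is reached through a wrapper or symlink outside that layout, the version shows up as unknown and `nvc --version` is the better check.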
The nuance is that with the version 21.9 compiler suite, adding this to
configure,
--with-platform=/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/contrib/platform/mellanox/optimized
brings back an error in avx.
CCLD mca_op_avx.la
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):(.data+0x0): multiple
definition of `ompi_op_avx_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.data+0x0):
first defined here
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o): In function
`ompi_op_avx_2buff_min_uint16_t_avx2':
/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651: multiple
definition of `ompi_op_avx_3buff_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
first defined here
make[2]: *** [mca_op_avx.la] Error 2
make[2]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mca/op/avx'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi'
make: *** [all-recursive] Error 1
Similar to the error without --enable-mca-no-build=op-avx, but it appears to be referencing a different
line in the source file.
CCLD mca_op_avx.la
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):(.data+0x0): multiple
definition of `ompi_op_avx_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.data+0x0):
first defined here
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o): In function
`ompi_op_avx_2buff_min_uint16_t_avx2':
/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:54: multiple
definition of `ompi_op_avx_3buff_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:54:
first defined here
make[2]: *** [mca_op_avx.la] Error 2
make[2]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9-clean/ompi/mca/op/avx'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9-clean/ompi'
make: *** [all-recursive] Error 1
For reference, ompi_info from the two successful builds.
NV-HPC Version 21.7
+++++++++++++++++++
ompi_info
Package: Open MPI muno@loki.local Distribution
Open MPI: 4.1.1
Open MPI repo revision: v4.1.1
Open MPI release date: Apr 24, 2021
Open RTE: 4.1.1
Open RTE repo revision: v4.1.1
Open RTE release date: Apr 24, 2021
OPAL: 4.1.1
OPAL repo revision: v4.1.1
OPAL release date: Apr 24, 2021
MPI API: 3.1.0
Ident string: 4.1.1
Prefix: /stage/opt/OpenMPI/ROME/4.1.1/NV-HPC/21.7
Configured architecture: x86_64-pc-linux-gnu
Configure host: loki.local
Configured by: muno
Configured on: Thu Sep 30 18:12:10 UTC 2021
Configure host: loki.local
Configure command line: 'CC=nvc' 'CXX=nvc++' 'FC=nvfortran' 'FCFLAGS=-fPIC'
'--enable-mca-no-build=op-avx'
'--prefix=/stage/opt/OpenMPI/ROME/4.1.1/NV-HPC/21.7'
'--with-libevent=internal'
'--enable-mpi1-compatibility' '--without-xpmem'
'--with-pmi' '--enable-mpi-cxx'
'--with-hwloc=/stage/opt/HWLOC/2.5.0'
'--with-hcoll=/opt/mellanox/hcoll'
'--with-knem=/opt/knem-1.1.4.90mlnx1'
'--with-cuda=/stage/opt/NV_hpc_sdk/Linux_x86_64/21.9/cuda'
'--with-platform=/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/contrib/platform/mellanox/optimized'
Built by: muno
Built on: Thu Sep 30 18:28:56 UTC 2021
Built host: loki.local
C bindings: yes
C++ bindings: yes
Fort mpif.h: yes (all)
Fort use mpi: yes (full: ignore TKR)
Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
limitations in the nvfortran compiler and/or Open
MPI, does not support the following: array
subsections, direct passthru (where possible) to
underlying Open MPI's C functionality
Fort mpi_f08 subarrays: no
Java bindings: no
Wrapper compiler rpath: runpath
C compiler: nvc
C compiler absolute:
/stage/opt/NV_hpc_sdk/Linux_x86_64/21.7/compilers/bin/nvc
C compiler family name: PGI
C compiler version: 21.7-0
C++ compiler: nvc++
C++ compiler absolute:
/stage/opt/NV_hpc_sdk/Linux_x86_64/21.7/compilers/bin/nvc++
Fort compiler: nvfortran
Fort compiler abs:
/stage/opt/NV_hpc_sdk/Linux_x86_64/21.7/compilers/bin/nvfortran
NV-HPC Version 21.9
+++++++++++++++++++
ompi_info
Package: Open MPI muno@loki.local Distribution
Open MPI: 4.1.1
Open MPI repo revision: v4.1.1
Open MPI release date: Apr 24, 2021
Open RTE: 4.1.1
Open RTE repo revision: v4.1.1
Open RTE release date: Apr 24, 2021
OPAL: 4.1.1
OPAL repo revision: v4.1.1
OPAL release date: Apr 24, 2021
MPI API: 3.1.0
Ident string: 4.1.1
Prefix: /stage/opt/OpenMPI/ROME/4.1.1/NV-HPC/21.9
Configured architecture: x86_64-pc-linux-gnu
Configure host: loki.local
Configured by: muno
Configured on: Thu Sep 30 17:59:36 UTC 2021
Configure host: loki.local
Configure command line: 'CC=nvc' 'CXX=nvc++' 'FC=nvfortran' 'FCFLAGS=-fPIC'
'--enable-mca-no-build=op-avx'
'--prefix=/stage/opt/OpenMPI/ROME/4.1.1/NV-HPC/21.9'
'--with-libevent=internal'
'--enable-mpi1-compatibility' '--without-xpmem'
'--with-pmi' '--enable-mpi-cxx'
'--with-hwloc=/stage/opt/HWLOC/2.5.0'
'--with-hcoll=/opt/mellanox/hcoll'
'--with-knem=/opt/knem-1.1.4.90mlnx1'
'--with-cuda=/stage/opt/NV_hpc_sdk/Linux_x86_64/21.7/cuda'
Built by: muno
Built on: Thu Sep 30 18:17:02 UTC 2021
Built host: loki.local
C bindings: yes
C++ bindings: yes
Fort mpif.h: yes (all)
Fort use mpi: yes (full: ignore TKR)
Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
limitations in the nvfortran compiler and/or Open
MPI, does not support the following: array
subsections, direct passthru (where possible) to
underlying Open MPI's C functionality
Fort mpi_f08 subarrays: no
Java bindings: no
Wrapper compiler rpath: runpath
C compiler: nvc
C compiler absolute:
/stage/opt/NV_hpc_sdk/Linux_x86_64/21.9/compilers/bin/nvc
C compiler family name: PGI
C compiler version: 21.9-0
C++ compiler: nvc++
C++ compiler absolute:
/stage/opt/NV_hpc_sdk/Linux_x86_64/21.9/compilers/bin/nvc++
Fort compiler: nvfortran
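The two listings above differ in more than the compiler version: the 21.7 build lists --with-platform and points --with-cuda at the 21.9 SDK, while the 21.9 build has no --with-platform and points --with-cuda at the 21.7 SDK. Whether that is intentional is not stated. A minimal sketch for spotting such differences mechanically, assuming each ompi_info output has been saved to a file and that every configure argument appears single-quoted on one line (the helper names are hypothetical):

```shell
#!/bin/sh
# Print the single-quoted arguments found on stdin, one per line, sorted.
# LC_ALL=C keeps the sort order consistent with comm below.
configure_args() {
    grep -o "'[^']*'" | tr -d "'" | LC_ALL=C sort
}

# Show arguments that appear in only one of two saved ompi_info outputs.
# Column 1: only in the first file. Column 2 (tab-indented): only in the second.
diff_configure_args() {
    a=$(mktemp) && b=$(mktemp) || return 1
    configure_args < "$1" > "$a"
    configure_args < "$2" > "$b"
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}
```

With the outputs saved as, say, info-21.7.txt and info-21.9.txt (file names are placeholders), `diff_configure_args info-21.7.txt info-21.9.txt` lists only the arguments that differ, which is easier to read than comparing the wrapped command lines by eye.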
On 9/30/21 12:18 PM, Ray Muno via users wrote:
OK, starting clean.
OS CentOS 7.9 (7.9.2009)
mlnxofed 5.4-1.0.3.0
UCX 1.11.0 (from mlnxofed)
hcoll-4.7.3199 (from mlnxofed)
knem-1.1.4.90 (from mlnxofed)
nVidia HPC-SDK 21.9
OpenMPI 4.1.1
HWLOC 2.5.0
Straight configure
configure CC=nvc CXX=nvc++ FC=nvfortran
dies in
FCLD libmpi_usempif08.la
/usr/bin/ld: .libs/comm_spawn_multiple_f08.o: relocation R_X86_64_32S against `.rodata' can not be
used when making a shared object; recompile with -fPIC
configure CC=nvc CXX=nvc++ FC=nvfortran FCFLAGS='-fPIC'
Fixes that, dies in
CCLD mca_op_avx.la
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):(.data+0x0): multiple
definition of `ompi_op_avx_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.data+0x0):
first defined here
configure CC=nvc CXX=nvc++ FC=nvfortran FCFLAGS=-fPIC
--enable-mca-no-build=op-avx
succeeds
And, working up to what I really want, I can build (somewhat emulating the
HPC-X 2.9.0 build)
/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/configure CC=nvc CXX=nvc++ FC=nvfortran FCFLAGS=-fPIC
--enable-mca-no-build=op-avx --prefix=/stage/opt/OpenMPI/ROME/4.1.1/NV-HPC/21.9
--with-libevent=internal --enable-mpi1-compatibility --without-xpmem --with-pmi --enable-mpi-cxx
--with-hwloc=/stage/opt/HWLOC/2.5.0 --with-hcoll=/opt/mellanox/hcoll
--with-knem=/opt/knem-1.1.4.90mlnx1 --with-cuda=/stage/opt/NV_hpc_sdk/Linux_x86_64/21.9/cuda
adding
--with-platform=/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/contrib/platform/mellanox/optimized
brings back
/usr/bin/ld: final link failed: Bad value
make[2]: *** [libmpi_usempif08.la] Error 2
Apparently because the platform file overrides the FCFLAGS setting given on the command line.
Editing the file to add -fPIC to FCFLAGS took care of that, as did FC='nvfortran -fPIC' (which is
kludgey).
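Before adding --with-platform, it can help to see which settings the platform file assigns itself, since those are the ones that may replace command-line values. The sketch below assumes the file consists of shell-style NAME=value lines, one per line; platform_settings is a hypothetical helper name. The same mechanism might also explain why the op-avx error returns with --with-platform, if the file sets enable_mca_no_build on its own, but that is a guess the thread does not confirm.

```shell
#!/bin/sh
# Print what a platform file assigns to each named variable,
# or note that the variable is not assigned there.
# $1: platform file; remaining arguments: variable names to look up.
platform_settings() {
    file=$1
    shift
    for name in "$@"; do
        grep -E "^[[:space:]]*${name}=" "$file" ||
            echo "${name}: not set in ${file##*/}"
    done
}
```

For the file used here, that would be something like `platform_settings /project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/contrib/platform/mellanox/optimized FCFLAGS enable_mca_no_build`. Any variable it prints an assignment for is a candidate for being overridden.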
-Ray Muno
On 9/30/21 8:13 AM, Gilles Gouaillardet via users wrote:
Ray,
there is a typo, the configure option is
--enable-mca-no-build=op-avx
Cheers,
Gilles
----- Original Message -----
Added --enable-mca-no-build=op-avx to the configure line. Still dies in
the same place.
history | grep config
CCLD mca_op_avx.la
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):(.data+0x0):
multiple
definition of `ompi_op_avx_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.data+0x0): first
defined here
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
In function
`ompi_op_avx_2buff_min_uint16_t_avx2':
/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
multiple
definition of `ompi_op_avx_3buff_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
first defined here
make[2]: *** [mca_op_avx.la] Error 2
make[2]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mca/op/avx'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi'
make: *** [all-recursive] Error 1
On 9/30/21 5:54 AM, Carl Ponder wrote:
For now, you can suppress this error building OpenMPI 4.1.1
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):(.data+0x0):
multiple definition of `ompi_op_avx_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.data+0x0):
first
defined here
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o): In
function
`ompi_op_avx_2buff_min_uint16_t_avx2':
/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
multiple definition of `ompi_op_avx_3buff_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
first defined here
with the NVHPC/PGI 21.9 compiler by using the setting
configure --enable-mca-no-build=op-avx ...
We're still looking at the cause here. I don't have any advice about the
problem with 21.7.
----------------------------------------------------------------------------------------------------
Subject: Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk,
build hints?
Date: Wed, 29 Sep 2021 12:25:43 -0500
From: Ray Muno via users <users@lists.open-mpi.org>
Reply-To: Open MPI Users <users@lists.open-mpi.org>
To: users@lists.open-mpi.org
CC: Ray Muno <m...@aem.umn.edu>
Tried this
configure CC='nvc -fPIC' CXX='nvc++ -fPIC' FC='nvfortran -fPIC'
Configure completes. It compiles quite a way through, then dies in a different place. It does,
however, get past the first error with libmpi_usempif08.la
FCLD libmpi_usempif08.la
make[2]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/use-mpi-f08'
Making all in mpi/fortran/mpiext-use-mpi-f08
make[2]: Entering directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/mpiext-use-mpi-f08'
PPFC mpi-f08-ext-module.lo
FCLD libforce_usempif08_module_to_be_built.la
make[2]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mpi/fortran/mpiext-use-mpi-f08'
Dies here now.
CCLD liblocal_ops_avx512.la
CCLD mca_op_avx.la
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):(.data+0x0):
multiple
definition of `ompi_op_avx_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):(.data+0x0):
first
defined here
./.libs/liblocal_ops_avx512.a(liblocal_ops_avx512_la-op_avx_functions.o):
In function
`ompi_op_avx_2buff_min_uint16_t_avx2':
/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
multiple
definition of `ompi_op_avx_3buff_functions_avx2'
./.libs/liblocal_ops_avx2.a(liblocal_ops_avx2_la-op_avx_functions.o):/project/muno/OpenMPI/BUILD/SRC/openmpi-4.1.1/ompi/mca/op/avx/op_avx_functions.c:651:
first defined here
make[2]: *** [mca_op_avx.la] Error 2
make[2]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi/mca/op/avx'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory
`/project/muno/OpenMPI/BUILD/4.1.1/ROME/NV-HPC/21.9/ompi'
make: *** [all-recursive] Error 1
On 9/29/21 11:42 AM, Bennet Fauber via users wrote:
Ray,
If all the errors about not being compiled with -fPIC are still appearing, there may be a bug that
is preventing the option from getting through to the compiler(s). It might be worth looking through
the logs to see the full compile command for one or more of them to see whether that is true. Say,
libs/comm_spawn_multiple_f08.o, for example?
If -fPIC is missing, you may be able to recompile that manually with -fPIC in place, then remake
and see whether that also makes the link error go away. That would be a good start.
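The check suggested above can be scripted once the full compile commands are visible. The abbreviated FCLD/CCLD lines in this thread suggest the build uses automake's silent rules, in which case `make V=1` prints the full commands. A minimal sketch, assuming the output was captured to a log file and that the compile line contains both the object's name and ` -c ` (fpic_status is a hypothetical helper name):

```shell
#!/bin/sh
# Report whether the compile command for an object in a verbose
# build log carries -fPIC.
# $1: log file, e.g. captured with: make V=1 2>&1 | tee make.log
# $2: fixed string identifying the object, e.g. comm_spawn_multiple_f08
fpic_status() {
    # First line that mentions the object and looks like a compile step.
    cmd=$(grep -F -- "$2" "$1" | grep -- ' -c ' | head -n 1)
    if [ -z "$cmd" ]; then
        echo "no compile command found for $2"
        return 2
    fi
    case " $cmd " in
        *" -fPIC "*) echo "-fPIC present" ;;
        *) echo "-fPIC missing"; return 1 ;;
    esac
}
```

For example, `fpic_status make.log comm_spawn_multiple_f08` inspects only the first matching compile line. If a file is compiled more than once, the remaining matches are worth a look by hand.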
Hope this helps, -- bennet
On Wed, Sep 29, 2021 at 12:29 PM Ray Muno via users <users@lists.open-mpi.org> wrote:
I did try that and it fails at the same place.
Which version of the nVidia HPC-SDK are you using? I am using 21.7. I see there is an upgrade to
21.9, which came out since I installed. I have that installed and will try to see if they changed
anything. Not much in the release notes to indicate any major changes.
-Ray Muno
On 9/29/21 10:54 AM, Jing Gong wrote:
> Hi,
>
>
> Before the Nvidia people look into the details, you can probably try adding the flag "-fPIC" to the
> nvhpc compiler, like CC="nvc -fPIC", which at least worked for me.
>
>
>
> /Jing
>
>
----------------------------------------------------------------------------------------------------
> *From:* users <users-boun...@lists.open-mpi.org> on behalf of Ray Muno via users
> <users@lists.open-mpi.org>
> *Sent:* Wednesday, September 29, 2021 17:22
> *To:* Open MPI User's List
> *Cc:* Ray Muno
> *Subject:* Re: [OMPI users] OpenMPI 4.1.1, CentOS 7.9, nVidia HPC-SDk, build hints?
> Thanks, I looked through previous emails here in the user list. I guess I need to subscribe to the
> Developers list.
>
> -Ray Muno
>
> On 9/29/21 9:58 AM, Jeff Squyres (jsquyres) wrote:
>> Ray --
>>
>> Looks like this is a dup of https://github.com/open-mpi/ompi/issues/8919.
>>
>>
>
--
Ray Muno
IT Systems Administrator
e-mail: m...@umn.edu
University of Minnesota
Aerospace Engineering and Mechanics
--
Ray Muno
IT Systems Administrator
e-mail: m...@umn.edu
Phone: (612) 625-9531
University of Minnesota
Aerospace Engineering and Mechanics
110 Union St. S.E.
Minneapolis, MN 55455
-- Ray Muno
Computer Systems Administrator
e-mail: m...@aem.umn.edu
Phone: (612) 625-9531
University of Minnesota
Aerospace Engineering and Mechanics
110 Union St. S.E.
Minneapolis, MN 55455