Gilles,

I’ll double-check, but the Intel runtime is installed on all machines in the 
fabric.

-----
Michael Heinz
michael.william.he...@cornelisnetworks.com

On Apr 7, 2021, at 2:42 AM, Gilles Gouaillardet via users 
<users@lists.open-mpi.org> wrote:

Michael,

orted is able to resolve its dependencies on the Intel runtime on the
host where you sourced the environment.
However, it is unlikely to be able to do so on a remote host.
For example,
ssh ... ldd `which orted`
will likely fail.
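
To make that check easier to repeat across hosts, here is a minimal sketch. The helper name missing_libs is hypothetical, and the host names and orted path in the commented usage are simply taken from the original report; it only filters ldd output for entries the loader could not resolve.

```shell
# Read ldd output on stdin and print the name of every library
# that the dynamic loader reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical usage (hosts and path copied from the report, adjust as needed):
# for host in awbp026 awbp027 awbp028; do
#     echo "== $host"
#     ssh "$host" ldd /usr/mpi/icc/openmpi-icc/bin/orted | missing_libs
# done
```

If the loop prints libimf.so (or any other Intel library) for a host, then the non-interactive ssh environment on that host does not include the Intel runtime directory, even if the runtime itself is installed there.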

An option is to use -rpath (and add the path to the Intel runtime).
IIRC, there is also an option in the Intel compiler to statically link
to the runtime.
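
A rough sketch of the -rpath route, assuming the Intel library directory is the one shown in the ldd output of the original report, and that the compilers are invoked as icc and icpc (both are assumptions, not something confirmed in this thread). It only prints the configure line so it can be reviewed before being run.

```shell
# Directory holding libimf.so and the other Intel runtime libraries, copied
# from the ldd output in the original report; adjust for your installation.
INTEL_LIB=/opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin

# Linker flag that records this directory in the run-time search path
# of everything linked during the build.
RPATH_LDFLAGS="-Wl,-rpath,${INTEL_LIB}"

# Print the configure line instead of running it.
# The compiler names and the prefix are assumptions based on the report.
echo "./configure CC=icc CXX=icpc --prefix=/usr/mpi/icc/openmpi-icc LDFLAGS=\"${RPATH_LDFLAGS}\""
```

After rebuilding and reinstalling, readelf -d /usr/mpi/icc/openmpi-icc/bin/orted should show an RPATH or RUNPATH entry containing that directory. For the static alternative, the Intel compiler option is -static-intel; whether passing it through LDFLAGS is enough for the whole Open MPI build is something to verify rather than assume.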

Cheers,

Gilles

On Wed, Apr 7, 2021 at 9:00 AM Heinz, Michael William via users
<users@lists.open-mpi.org> wrote:

I’m having a heck of a time building OMPI with Intel C. Compilation goes fine, 
installation goes fine, compiling test apps (the OSU benchmarks) goes fine…



but when I go to actually run an MPI app I get:



[awbp025:~/work/osu-icc](N/A)$ /usr/mpi/icc/openmpi-icc/bin/mpirun -np 2 -H awbp025,awbp026,awbp027,awbp028 -x FI_PROVIDER=opa1x -x LD_LIBRARY_PATH=/usr/mpi/icc/openmpi-icc/lib64:/lib hostname

/usr/mpi/icc/openmpi-icc/bin/orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory

/usr/mpi/icc/openmpi-icc/bin/orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory



Looking at orted, it does seem like the binary is linking correctly:



[awbp025:~/work/osu-icc](N/A)$ /usr/mpi/icc/openmpi-icc/bin/orted

[awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ess_env_module.c at line 135
[awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Bad parameter in file util/session_dir.c at line 107
[awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Bad parameter in file util/session_dir.c at line 346
[awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Bad parameter in file base/ess_base_std_orted.c at line 264
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

 orte_session_dir failed
 --> Returned value Bad parameter (-5) instead of ORTE_SUCCESS
--------------------------------------------------------------------------



and…



[awbp025:~/work/osu-icc](N/A)$ ldd /usr/mpi/icc/openmpi-icc/bin/orted
       linux-vdso.so.1 (0x00007fffc2ebf000)
       libopen-rte.so.40 => /usr/mpi/icc/openmpi-icc/lib/libopen-rte.so.40 (0x00007fdaa6404000)
       libopen-pal.so.40 => /usr/mpi/icc/openmpi-icc/lib/libopen-pal.so.40 (0x00007fdaa60bd000)
       libopen-orted-mpir.so => /usr/mpi/icc/openmpi-icc/lib/libopen-orted-mpir.so (0x00007fdaa5ebb000)
       libm.so.6 => /lib64/libm.so.6 (0x00007fdaa5b39000)
       librt.so.1 => /lib64/librt.so.1 (0x00007fdaa5931000)
       libutil.so.1 => /lib64/libutil.so.1 (0x00007fdaa572d000)
       libz.so.1 => /lib64/libz.so.1 (0x00007fdaa5516000)
       libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fdaa52fe000)
       libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdaa50de000)
       libc.so.6 => /lib64/libc.so.6 (0x00007fdaa4d1b000)
       libdl.so.2 => /lib64/libdl.so.2 (0x00007fdaa4b17000)
       libimf.so => /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libimf.so (0x00007fdaa4494000)
       libsvml.so => /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libsvml.so (0x00007fdaa29c4000)
       libirng.so => /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libirng.so (0x00007fdaa2659000)
       libintlc.so.5 => /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007fdaa23e1000)
       /lib64/ld-linux-x86-64.so.2 (0x00007fdaa66d6000)



Can anyone suggest what I’m forgetting to do?



---

Michael Heinz
Fabric Software Engineer, Cornelis Networks


