If you don't have ibv_devinfo installed on your compute nodes, then you likely 
don't have the verbs package installed there at all.  That's why you're getting 
errors about not finding libibverbs.so.


- It sounds like Open MPI was able to find libibverbs.so when it was built.  So 
whatever node you were on when you configured/compiled/installed Open MPI, that 
node had libibverbs.so (and friends) installed properly, Open MPI found them 
during configure/make, and therefore it built/installed support for verbs.

- But then you're running that installed Open MPI on nodes where libibverbs.so 
potentially is not available (e.g., that package was not installed), so Open 
MPI fails to load the verbs-based plugins (because they need libibverbs.so), 
and therefore Open MPI emits warnings about that.
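
If you want to check this yourself, here is a rough sketch of a script you could run inside a short batch job (as John suggested, so that it sees the same environment as your real job).  It runs ldd on the verbs-based plugins and prints only the libraries that ldd reports as "not found".  Assumptions on my part: OMPI_PREFIX and missing_libs are just names I made up for this sketch, the prefix is the one from your ldd output, and the plugins are assumed to live in the default lib/openmpi directory under that prefix, with file names matching the component names in your warnings.

```shell
#!/bin/bash
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Assumed install prefix (taken from the ldd output in this thread);
# override it by setting OMPI_PREFIX in the environment.
OMPI_PREFIX="${OMPI_PREFIX:-/data/users/xx/openmpi}"

# Assumed plugin locations: default lib/openmpi layout, file names derived
# from the component names in the warnings (mca_btl_openib, mca_oob_ud).
for plugin in "$OMPI_PREFIX/lib/openmpi/mca_btl_openib.so" \
              "$OMPI_PREFIX/lib/openmpi/mca_oob_ud.so"; do
    if [ -e "$plugin" ]; then
        echo "== $(hostname): $plugin"
        ldd "$plugin" | missing_libs
    fi
done
```

This only checks the node the script happens to run on.  If it prints libibverbs.so.1 under either plugin, that node does not have the library anywhere the dynamic linker looks.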

The same may well be true for the crypto libraries.
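
A quick way to see whether a node knows about those libraries at all is to look in the dynamic linker's cache.  The sketch below is only illustrative: soname_listed is a name I made up, the library names are the ones mentioned in this thread, and it assumes ldconfig is in your PATH on the compute nodes (on some systems it lives in /sbin).

```shell
#!/bin/bash
# Succeed if the soname given as $1 is listed in `ldconfig -p` output
# read from stdin (the soname is the first field of each entry).
soname_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Library names taken from the warnings and symlinks mentioned in this thread.
for soname in libcrypto.so.0.9.8 libssl.so.0.9.8 libibverbs.so.1; do
    if ldconfig -p 2>/dev/null | soname_listed "$soname"; then
        echo "$soname: found in linker cache"
    else
        echo "$soname: NOT in linker cache"
    fi
done
```

Note that the linker cache does not know about directories that are only reachable through LD_LIBRARY_PATH, so a "NOT in linker cache" result is a hint rather than proof; running ldd on the component itself, in the job's environment, is the more direct check.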

(This is a more expanded version of what I said in 
https://www.mail-archive.com/users@lists.open-mpi.org/msg32727.html.)

> On Oct 10, 2018, at 5:02 PM, Castellana Michele <michele.castell...@curie.fr> 
> wrote:
> Dear John, 
> I see, thank you for your reply. Unfortunately the cluster support is of poor 
> quality, and it would take a while to get this information from them. Is 
> there any way in which I can check this by myself? Also, it looks like 
> ibv_devinfo does not exist on the cluster
> $ ibv_devinfo
> -bash: ibv_devinfo: command not found
> Best,
> Michele
>> On Oct 9, 2018, at 5:53 PM, John Hearns <hear...@googlemail.com> wrote:
>> Michele, as others have said, libibverbs.so.1 is not in your library path.
>> Can you ask the person who manages your cluster where libibverbs is
>> located on the compute nodes?
>> Also try to run    ibv_devinfo
>> On Tue, 9 Oct 2018 at 16:03, Castellana Michele
>> <michele.castell...@curie.fr> wrote:
>>> Dear John,
>>> Thank you for your reply. Here is the output of ldd
>>> $ ldd ./code.io
>>> linux-vdso.so.1 =>  (0x00007ffcc759f000)
>>> liblapack.so.3 => /usr/lib64/liblapack.so.3 (0x00007fbc1c613000)
>>> libgsl.so.0 => /usr/lib64/libgsl.so.0 (0x00007fbc1c1ea000)
>>> libgslcblas.so.0 => /usr/lib64/libgslcblas.so.0 (0x00007fbc1bfad000)
>>> libmpi.so.40 => /data/users/xx/openmpi/lib/libmpi.so.40 (0x00007fbc1bcad000)
>>> libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007fbc1b9a6000)
>>> libm.so.6 => /usr/lib64/libm.so.6 (0x00007fbc1b6a4000)
>>> libgcc_s.so.1 => /usr/lib64/libgcc_s.so.1 (0x00007fbc1b48e000)
>>> libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007fbc1b272000)
>>> libc.so.6 => /usr/lib64/libc.so.6 (0x00007fbc1aea5000)
>>> libblas.so.3 => /usr/lib64/libblas.so.3 (0x00007fbc1ac4c000)
>>> libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007fbc1a92a000)
>>> libsatlas.so.3 => /usr/lib64/atlas/libsatlas.so.3 (0x00007fbc19cdd000)
>>> libopen-rte.so.40 => /data/users/xx/openmpi/lib/libopen-rte.so.40 (0x00007fbc19a2d000)
>>> libopen-pal.so.40 => /data/users/xx/openmpi/lib/libopen-pal.so.40 (0x00007fbc19733000)
>>> libdl.so.2 => /usr/lib64/libdl.so.2 (0x00007fbc1952f000)
>>> librt.so.1 => /usr/lib64/librt.so.1 (0x00007fbc19327000)
>>> libutil.so.1 => /usr/lib64/libutil.so.1 (0x00007fbc19124000)
>>> libz.so.1 => /usr/lib64/libz.so.1 (0x00007fbc18f0e000)
>>> /lib64/ld-linux-x86-64.so.2 (0x00007fbc1cd70000)
>>> libquadmath.so.0 => /usr/lib64/libquadmath.so.0 (0x00007fbc18cd2000)
>>> and the one for the PBS version
>>> $   qstat --version
>>> Version: 6.1.2
>>> Commit: 661e092552de43a785c15d39a3634a541d86898e
>>> After I created the symbolic links libcrypto.so.0.9.8 and libssl.so.0.9.8, 
>>> I still have one error message left from MPI:
>>> mca_base_component_repository_open: unable to open mca_btl_openib: 
>>> libibverbs.so.1: cannot open shared object file: No such file or directory 
>>> (ignored)
>>> Please let me know if you have any suggestions.
>>> Best,
>>> On Oct 4, 2018, at 3:12 PM, John Hearns via users 
>>> <users@lists.open-mpi.org> wrote:
>>> Michele, the command is   ldd ./code.io
>>> I just Googled - ldd  means List dynamic Dependencies
>>> To find out the PBS batch system type - that is a good question!
>>> Try this:     qstat --version
>>> On Thu, 4 Oct 2018 at 10:12, Castellana Michele
>>> <michele.castell...@curie.fr> wrote:
>>> Dear John,
>>> Thank you for your reply. I have tried
>>> ldd mpirun ./code.o
>>> but I get an error message; I do not know the proper syntax for the ldd 
>>> command. Here is the information about the Linux version:
>>> $ cat /etc/os-release
>>> NAME="CentOS Linux"
>>> VERSION="7 (Core)"
>>> ID="centos"
>>> ID_LIKE="rhel fedora"
>>> VERSION_ID="7"
>>> PRETTY_NAME="CentOS Linux 7 (Core)"
>>> ANSI_COLOR="0;31"
>>> CPE_NAME="cpe:/o:centos:centos:7"
>>> HOME_URL="https://www.centos.org/"
>>> BUG_REPORT_URL="https://bugs.centos.org/"
>>> Could you please tell me how to check whether the batch system is PBSPro 
>>> or OpenPBS?
>>> Best,
>>> On Oct 4, 2018, at 10:30 AM, John Hearns via users 
>>> <users@lists.open-mpi.org> wrote:
>>> Michele, one tip: log into a compute node using ssh, as your own 
>>> username.
>>> If you use the Modules environment, then load the modules you use in
>>> the job script,
>>> then use the ldd utility to check whether all the libraries needed by
>>> the code.io executable can be loaded.
>>> Actually, you would do better to submit a short batch job which does not
>>> use mpirun but uses ldd.
>>> A proper batch job will duplicate the environment you wish to run in.
>>>  ldd ./code.io
>>> By the way, is the batch system PBSPro or OpenPBS?  Version 6 seems a bit 
>>> old.
>>> Can you say what version of Redhat or CentOS this cluster is installed with?
>>> On Thu, 4 Oct 2018 at 00:02, Castellana Michele
>>> <michele.castell...@curie.fr> wrote:
>>> I fixed it, the correct file was in /lib64, not in /lib.
>>> Thank you for your help.
>>> On Oct 3, 2018, at 11:30 PM, Castellana Michele 
>>> <michele.castell...@curie.fr> wrote:
>>> Thank you, I found some libcrypto files in /usr/lib indeed:
>>> $ ls libcry*
>>> libcrypt-2.17.so  libcrypto.so.10  libcrypto.so.1.0.2k  libcrypt.so.1
>>> but I could not find libcrypto.so.0.9.8. Here they suggest creating a 
>>> symbolic link, but if I do I still get an error from MPI. Is there another 
>>> way around this?
>>> Best,
>>> On Oct 3, 2018, at 11:00 PM, Jeff Squyres (jsquyres) via users 
>>> <users@lists.open-mpi.org> wrote:
>>> It's probably in your Linux distro somewhere -- I'd guess you're missing a 
>>> package (e.g., an RPM or a deb) out on your compute nodes...?
>>> On Oct 3, 2018, at 4:24 PM, Castellana Michele 
>>> <michele.castell...@curie.fr> wrote:
>>> Dear Ralph,
>>> Thank you for your reply. Do you know where I could find libcrypto.so.0.9.8 
>>> ?
>>> Best,
>>> On Oct 3, 2018, at 9:41 PM, Ralph H Castain <r...@open-mpi.org> wrote:
>>> Actually, I see that you do have the tm components built, but they cannot 
>>> be loaded because you are missing libcrypto from your LD_LIBRARY_PATH
>>> On Oct 3, 2018, at 12:33 PM, Ralph H Castain <r...@open-mpi.org> wrote:
>>> Did you configure OMPI --with-tm=<path-to-PBS-libs>? It looks like we didn’t 
>>> build PBS support and so we only see one node with a single slot allocated 
>>> to it.
>>> On Oct 3, 2018, at 12:02 PM, Castellana Michele 
>>> <michele.castell...@curie.fr> wrote:
>>> Dear all,
>>> I am having trouble running an MPI code across multiple cores on a new 
>>> computer cluster, which uses PBS. Here is a minimal example, where I want 
>>> to run two MPI processes, each on  a different node. The PBS script is
>>> #!/bin/bash
>>> #PBS -l walltime=00:01:00
>>> #PBS -l mem=1gb
>>> #PBS -l nodes=2:ppn=1
>>> #PBS -q batch
>>> #PBS -N test
>>> mpirun -np 2 ./code.o
>>> and when I submit it with
>>> $ qsub script.sh
>>> I get the following message in the PBS error file
>>> $ cat test.e1234
>>> [shbli040:08879] mca_base_component_repository_open: unable to open 
>>> mca_plm_tm: libcrypto.so.0.9.8: cannot open shared object file: No such 
>>> file or directory (ignored)
>>> [shbli040:08879] mca_base_component_repository_open: unable to open 
>>> mca_oob_ud: libibverbs.so.1: cannot open shared object file: No such file 
>>> or directory (ignored)
>>> [shbli040:08879] mca_base_component_repository_open: unable to open 
>>> mca_ras_tm: libcrypto.so.0.9.8: cannot open shared object file: No such 
>>> file or directory (ignored)
>>> --------------------------------------------------------------------------
>>> There are not enough slots available in the system to satisfy the 2 slots
>>> that were requested by the application:
>>> ./code.o
>>> Either request fewer slots for your application, or make more slots 
>>> available
>>> for use.
>>> --------------------------------------------------------------------------
>>> The PBS version is
>>> $ qstat --version
>>> Version: 6.1.2
>>> and here is some additional information on the MPI version
>>> $ mpicc -v
>>> Using built-in specs.
>>> COLLECT_GCC=/bin/gcc
>>> COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
>>> Target: x86_64-redhat-linux
>>> […]
>>> Thread model: posix
>>> gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC)
>>> Do you guys know what may be the issue here?
>>> Thank you
>>> Best,
>>> _______________________________________________
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://lists.open-mpi.org/mailman/listinfo/users
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com

Jeff Squyres
