This email starts out talking about version 1.10.7 to give a complete picture. I tested 2.1.3 as well; it also exhibits this issue, although to a lesser extent, and that is the release I am asking for help on.
I was compiling the OpenMPI 1.10.7 shipped with NixOS against a newer libibverbs with a large set of drivers, and I get some strange errors when running ompi_info (I've replaced the common prefix /nix/store/9zm0pqsh67fw0xi5cpnybnd7hgzryffs-openmpi-1.10.7 with ...):

[mon241:04077] mca: base: component_find: unable to open .../lib/openmpi/mca_btl_openib: .../lib/openmpi/mca_btl_openib.so: undefined symbol: mca_mpool_grdma_evict (ignored)
[mon241:04077] mca: base: component_find: unable to open .../lib/openmpi/mca_fcoll_individual: .../lib/openmpi/mca_fcoll_individual.so: undefined symbol: mca_io_ompio_file_write (ignored)
[mon241:04077] mca: base: component_find: unable to open .../lib/openmpi/mca_fcoll_ylib: .../lib/openmpi/mca_fcoll_ylib.so: undefined symbol: ompi_io_ompio_scatter_data (ignored)
[mon241:04077] mca: base: component_find: unable to open .../lib/openmpi/mca_fcoll_dynamic: .../lib/openmpi/mca_fcoll_dynamic.so: undefined symbol: ompi_io_ompio_allgatherv_array (ignored)
[mon241:04077] mca: base: component_find: unable to open .../lib/openmpi/mca_fcoll_two_phase: .../lib/openmpi/mca_fcoll_two_phase.so: undefined symbol: ompi_io_ompio_set_aggregator_props (ignored)
[mon241:04077] mca: base: component_find: unable to open .../lib/openmpi/mca_fcoll_static: .../lib/openmpi/mca_fcoll_static.so: undefined symbol: ompi_io_ompio_allgather_array (ignored)
                 Package: Open MPI nixbld@ Distribution
                Open MPI: 1.10.7
  Open MPI repo revision: v1.10.6-48-g5e373bf
   Open MPI release date: May 16, 2017
                Open RTE: 1.10.7
  Open RTE repo revision: v1.10.6-48-g5e373bf
   Open RTE release date: May 16, 2017
                    OPAL: 1.10.7
      OPAL repo revision: v1.10.6-48-g5e373bf
       OPAL release date: May 16, 2017
...
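The load failures above all follow the same pattern, so the (component, missing symbol) pairs can be pulled out mechanically before chasing each one down. A minimal sketch in Python, assuming the message layout is exactly as shown in this email; the function name missing_symbols is made up for illustration and is not part of Open MPI:

```python
import re

# Matches the "unable to open <component>: <library>: undefined symbol: <symbol>"
# part of the message, which is common to the 1.10.x and 2.1.x wording.
_LOAD_FAILURE = re.compile(
    r"unable to open (?P<component>\S+): "
    r"(?P<library>\S+): undefined symbol: (?P<symbol>\S+)"
)


def missing_symbols(log_text):
    """Return a list of (component, symbol) pairs, one per load failure.

    The component is reduced to its base name, so a path such as
    .../lib/openmpi/mca_btl_openib becomes mca_btl_openib.
    """
    pairs = []
    for match in _LOAD_FAILURE.finditer(log_text):
        component = match.group("component").rsplit("/", 1)[-1]
        pairs.append((component, match.group("symbol")))
    return pairs
```

Feeding it the captured stderr of ompi_info gives one pair per failing component, which is a convenient starting list for working out which library is expected to provide each symbol.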
I dug into the first of these (figured out what library provided it, looked at the declared dependencies, poked around in the automake file), and, as far as I could determine, mca_btl_openib.so simply isn't linked in a way that lists mca_mpool_grdma.so (which provides the symbol) as a dependency.

Seeing as 1.10.7 is no longer supported, I figured I would try 2.1.3 in case this has been fixed. I compiled it as well, and it seems all but the mca_fcoll_individual one have been resolved (I've replaced /nix/store/4kh0zbn8pmdqhvwagicswg70rwnpm570-openmpi-2.1.3 with ...):

[mon241:05544] mca_base_component_repository_open: unable to open mca_fcoll_individual: .../lib/openmpi/mca_fcoll_individual.so: undefined symbol: ompio_io_ompio_file_read (ignored)
                 Package: Open MPI nixbld@ Distribution
                Open MPI: 2.1.3
  Open MPI repo revision: v2.1.2-129-gcfd8f3f
   Open MPI release date: Mar 13, 2018
                Open RTE: 2.1.3
  Open RTE repo revision: v2.1.2-129-gcfd8f3f
   Open RTE release date: Mar 13, 2018
                    OPAL: 2.1.3
      OPAL repo revision: v2.1.2-129-gcfd8f3f
       OPAL release date: Mar 13, 2018
...

Again I was able to find this symbol in the mca_io_ompio.so library. I looked through the source again, and it seems pretty clear that the function is indeed called, but the component isn't linked in a way that lists mca_io_ompio.so as a dependency.

Looking through the various shared libraries in the .../lib/openmpi directory, though, it seems none of them have dependencies on each other. How is this supposed to work? Is the component loader just supposed to load everything so that all symbols get resolved? If so, is the message I'm seeing above actually an error?

Any insight would be appreciated.

Thanks! -Tyson

PS: Please note that the openmpi code was compiled without any patches and without any special configure flags other than --prefix=.... (NixOS also adds --disable-static and --disable-dependency-tracking by default, but I removed those, and it didn't make a difference.)
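For anyone wanting to repeat the dependency check described above, the libraries a component declares as needed can be listed from its dynamic section. A rough sketch in Python, assuming binutils' readelf is available on PATH; parse_needed and needed_libraries are illustrative names, not anything shipped with Open MPI:

```python
import re
import subprocess

# A NEEDED entry in `readelf -d` output looks like:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library: \[(?P<name>[^\]]+)\]")


def parse_needed(readelf_output):
    """Extract the declared dependency names from `readelf -d` text."""
    return [m.group("name") for m in _NEEDED.finditer(readelf_output)]


def needed_libraries(path):
    """Run `readelf -d` on a shared object and return its NEEDED entries."""
    result = subprocess.run(
        ["readelf", "-d", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_needed(result.stdout)
```

Calling needed_libraries on each .so file in the .../lib/openmpi directory and checking whether any mca_*.so name appears in the results is one way to confirm the observation that the components do not list each other as dependencies.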
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel