This email starts out talking about version 1.10.7 to give a complete
picture.  I tested 2.1.3 as well; it also exhibits this issue,
although to a lesser extent, and that is the release I am asking for
help on.

I compiled the OpenMPI 1.10.7 shipped with NixOS against a newer
libibverbs with a large set of drivers and got some strange errors
when running ompi_info (I've replaced the common prefix
/nix/store/9zm0pqsh67fw0xi5cpnybnd7hgzryffs-openmpi-1.10.7 with ...):

[mon241:04077] mca: base: component_find: unable to open
.../lib/openmpi/mca_btl_openib: .../lib/openmpi/mca_btl_openib.so:
undefined symbol: mca_mpool_grdma_evict (ignored)
[mon241:04077] mca: base: component_find: unable to open
.../lib/openmpi/mca_fcoll_individual:
.../lib/openmpi/mca_fcoll_individual.so: undefined symbol:
mca_io_ompio_file_write (ignored)
[mon241:04077] mca: base: component_find: unable to open
.../lib/openmpi/mca_fcoll_ylib: .../lib/openmpi/mca_fcoll_ylib.so:
undefined symbol: ompi_io_ompio_scatter_data (ignored)
[mon241:04077] mca: base: component_find: unable to open
.../lib/openmpi/mca_fcoll_dynamic:
.../lib/openmpi/mca_fcoll_dynamic.so: undefined symbol:
ompi_io_ompio_allgatherv_array (ignored)
[mon241:04077] mca: base: component_find: unable to open
.../lib/openmpi/mca_fcoll_two_phase:
.../lib/openmpi/mca_fcoll_two_phase.so: undefined symbol:
ompi_io_ompio_set_aggregator_props (ignored)
[mon241:04077] mca: base: component_find: unable to open
.../lib/openmpi/mca_fcoll_static: .../lib/openmpi/mca_fcoll_static.so:
undefined symbol: ompi_io_ompio_allgather_array (ignored)
                 Package: Open MPI nixbld@ Distribution
               Open MPI: 1.10.7
 Open MPI repo revision: v1.10.6-48-g5e373bf
  Open MPI release date: May 16, 2017
               Open RTE: 1.10.7
 Open RTE repo revision: v1.10.6-48-g5e373bf
  Open RTE release date: May 16, 2017
                   OPAL: 1.10.7
     OPAL repo revision: v1.10.6-48-g5e373bf
      OPAL release date: May 16, 2017
...

I dug into the first of these (figured out which library provides the
symbol, looked at the declared dependencies, poked around in the
automake file), and, as far as I could determine, mca_btl_openib.so
simply isn't linked so as to list mca_mpool_grdma.so (which provides
the symbol) as a dependency.
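
For anyone wanting to repeat the "which library provides it" check,
here is a rough Python sketch of the idea.  It assumes binutils' nm is
on the PATH and that `nm -D` prints the usual "<address> <type> <name>"
lines; the function names (parse_nm, providers, scan) are made up for
this illustration and are not part of Open MPI.

```python
import subprocess
from pathlib import Path


def parse_nm(text):
    """Split `nm -D` output into (defined, undefined) symbol-name sets."""
    defined, undefined = set(), set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:      # "<address> <type> <name>"
            kind, name = fields[1], fields[2]
        elif len(fields) == 2:    # "<type> <name>" (no address printed)
            kind, name = fields
        else:
            continue              # blank lines, file headers, etc.
        name = name.split("@", 1)[0]  # drop any symbol-version suffix
        if kind == "U":
            undefined.add(name)
        elif kind not in ("w", "v"):  # skip weak undefined references
            defined.add(name)
    return defined, undefined


def providers(symbol, tables):
    """Return the libraries in `tables` that define `symbol`."""
    return sorted(lib for lib, (defined, _) in tables.items()
                  if symbol in defined)


def scan(directory):
    """Run `nm -D` on every .so in `directory` (assumes nm is installed)."""
    tables = {}
    for lib in sorted(Path(directory).glob("*.so")):
        result = subprocess.run(["nm", "-D", str(lib)],
                                capture_output=True, text=True, check=True)
        tables[lib.name] = parse_nm(result.stdout)
    return tables
```

Calling providers("mca_mpool_grdma_evict", scan(".../lib/openmpi"))
with the real install prefix filled in should list the component(s)
that export the symbol.  The declared dependencies themselves can be
inspected separately, for example with `readelf -d` and looking at the
NEEDED entries.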

Seeing as 1.10.7 is no longer supported, I figured I would try 2.1.3
in case this has been fixed.  I compiled it as well, and it seems
all but the mca_fcoll_individual one have been resolved (I've replaced
/nix/store/4kh0zbn8pmdqhvwagicswg70rwnpm570-openmpi-2.1.3 with ...):

[mon241:05544] mca_base_component_repository_open: unable to open
mca_fcoll_individual: .../lib/openmpi/mca_fcoll_individual.so:
undefined symbol: ompio_io_ompio_file_read (ignored)
                 Package: Open MPI nixbld@ Distribution
               Open MPI: 2.1.3
 Open MPI repo revision: v2.1.2-129-gcfd8f3f
  Open MPI release date: Mar 13, 2018
               Open RTE: 2.1.3
 Open RTE repo revision: v2.1.2-129-gcfd8f3f
  Open RTE release date: Mar 13, 2018
                   OPAL: 2.1.3
     OPAL repo revision: v2.1.2-129-gcfd8f3f
      OPAL release date: Mar 13, 2018
...

Again I was able to find this symbol in the mca_io_ompio.so library.
I looked through the source again, and it seems pretty clear that the
function is indeed called, but the library isn't linked so as to list
mca_io_ompio.so as a dependency.

Looking through the various shared libraries in the .../lib/openmpi
directory, though, it seems none of them have dependencies on each
other.  How is this supposed to work?  Is the component library just
supposed to load everything so all symbols get resolved?  Is the
message I'm seeing above actually an error, then?

Any insight would be appreciated.

Thanks!  -Tyson

PS:  Please note that the openmpi code was compiled without any
patches and without any special configure flags other than
--prefix=.... (NixOS also adds --disable-static and
--disable-dependency-tracking by default, but I removed those and it
didn't make a difference.)
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel
