Paul, generally speaking, when using the Mellanox libraries (mxm, hcoll, fca), they must be accessible, either via LD_LIBRARY_PATH or via ld.so.conf.
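As a rough sketch of the LD_LIBRARY_PATH route, something like the following could be sourced before running configure. The install prefixes are the ones from the configure options quoted below; `prepend_lib_dir` is a hypothetical helper written for this illustration, not part of Open MPI or of the Mellanox packages. The directory that holds libibnetdisc.so.5 is not known from this thread, so it would have to be located on the cluster and added the same way.

```shell
#!/bin/sh
# Sketch only: make the Mellanox shared libraries visible to the linker/loader.
# prepend_lib_dir is a hypothetical helper, not an Open MPI or Mellanox tool.

# Prepend directory $1 to LD_LIBRARY_PATH unless it is already listed.
prepend_lib_dir() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Prefixes taken from the quoted configure options; the lib/lib64 layout
# is an assumption, so only directories that actually exist are added.
for prefix in /opt/mellanox/mxm /opt/mellanox/hcoll /opt/mellanox/fca; do
    for sub in lib lib64; do
        if [ -d "$prefix/$sub" ]; then
            prepend_lib_dir "$prefix/$sub"
        fi
    done
done

# Show any dependencies of libhcoll.so that still cannot be resolved.
if [ -f /opt/mellanox/hcoll/lib/libhcoll.so ]; then
    ldd /opt/mellanox/hcoll/lib/libhcoll.so | grep 'not found' \
        || echo "libhcoll.so: all dependencies resolved"
fi
```

If ldd still reports a library as "not found", its directory needs to be added as well. The ld.so.conf alternative would be to list the same directories in a file under /etc/ld.so.conf.d and run ldconfig as root, which requires administrator access on the cluster.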
I do not know the configuration of this cluster, but you might have to use the Mellanox libraries that could be in a non-standard location.

Cheers,

Gilles

On Wednesday, December 30, 2015, Paul Hargrove <phhargr...@lbl.gov> wrote:

> I have tried 1.8.8, 10.0.2rc3 and master with the following configure
> options (and --prefix)
>
> --enable-debug --with-verbs --enable-openib-connectx-xrc
> --with-mxm=/opt/mellanox/mxm --with-fca=/opt/mellanox/fca
> --with-hcoll=/opt/mellanox/hcoll
>
> In all three cases the configure output contains one of the following:
>
> --- MCA component coll:hcoll (m4 configuration macro)
> checking for MCA component coll:hcoll compile mode... dso
> checking --with-hcoll value... sanity check ok (/opt/mellanox/hcoll)
> checking hcoll_api.h usability... yes
> checking hcoll_api.h presence... yes
> checking for hcoll_api.h... yes
> checking for library containing hcoll_get_version... no
> configure: error: HCOLL support requested but not found. Aborting
>
>
> OR
>
> --- MCA component coll:hcoll (m4 configuration macro)
> checking for MCA component coll:hcoll compile mode... dso
> checking hcoll/api/hcoll_api.h usability... yes
> checking hcoll/api/hcoll_api.h presence... yes
> checking for hcoll/api/hcoll_api.h... yes
> looking for library in lib
> checking for library containing hcoll_get_version... no
> looking for library in lib64
> checking for library containing hcoll_get_version... no
> configure: error: HCOLL support requested but not found. Aborting
>
>
> Where the first output is seen with v1.8 and the second with v1.10 and
> master.
>
> The contents of config.log (shown for master, below) indicates that the
> test for hcoll_get_version has failed *not* due to lack of that symbol, but
> rather due to some unsatisfied shared library dependency:
>
> configure:241636: gcc -std=gnu99 -o conftest -g -finline-functions
> -fno-strict-aliasing -pthread
> -I/hpc/home/USERS/phhargrove/SCRATCH/OMPI/openmpi-master-linux-x86_64-mxm/openmpi-dev-3300-gb7b4231/opal/mca/hwloc/hwloc1111/hwloc/include
> -I/hpc/home/USERS/phhargrove/SCRATCH/OMPI/openmpi-master-linux-x86_64-mxm/BLD/opal/mca/hwloc/hwloc1111/hwloc/include
> -I/hpc/home/USERS/phhargrove/SCRATCH/OMPI/openmpi-master-linux-x86_64-mxm/openmpi-dev-3300-gb7b4231/opal/mca/event/libevent2022/libevent
> -I/hpc/home/USERS/phhargrove/SCRATCH/OMPI/openmpi-master-linux-x86_64-mxm/openmpi-dev-3300-gb7b4231/opal/mca/event/libevent2022/libevent/include
> -I/hpc/home/USERS/phhargrove/SCRATCH/OMPI/openmpi-master-linux-x86_64-mxm/BLD/opal/mca/event/libevent2022/libevent/include
> -I/opt/mellanox/hcoll/include -L/opt/mellanox/hcoll/lib conftest.c
> -lhcoll -lrt -lm -lutil -lrt -lm -lutil >&5
> /usr/bin/ld: warning: libibnetdisc.so.5, needed by
> /opt/mellanox/hcoll/lib/libhcoll.so, not found (try using -rpath or
> -rpath-link)
> /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to
> `ibnd_iter_nodes_type@IBNETDISC_1.0'
> /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to
> `ibnd_destroy_fabric@IBNETDISC_1.0'
> /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to
> `ibnd_load_fabric@IBNETDISC_1.0'
> /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to
> `ibnd_iter_nodes@IBNETDISC_1.0'
> collect2: ld returned 1 exit status
>
>
> This is on the "mir13" head node of the Mellanox DMZ cluster.
> So, Mellanox should be able to either tell me what I have done wrong, or
> else reproduce for themselves.
>
> -Paul
>
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department               Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900