Coolio. Pak - go ahead and commit if you haven't already done so.

-jms
Sent from my PDA. No type good.

-----Original Message-----
From: Brian Barrett [mailto:brbar...@open-mpi.org]
Sent: Sunday, May 04, 2008 02:14 PM Eastern Standard Time
To: Open MPI Developers
Subject: Re: [OMPI devel] undefined references for rdma_get_peer_addr & rdma_get_local_addr

I think I might see the issue. Jeff, I'm assuming you're using a developer build of Open MPI with GNU, Intel, or Pathscale compilers, right? At least someone below was using PGI.

The first three compilers on a developer build have the magic pixie dust arguments added that make calling an undeclared function an error. PGI, Sun Workshop, and non-developer builds don't have that pixie dust. So it's not an error to call an undeclared function in those cases, and AC_COMPILE_IFELSE won't error out.

AC_LINK_IFELSE should always be used to check for functions for precisely that reason.

Brian

On May 4, 2008, at 11:41 AM, Jeff Squyres (jsquyres) wrote:

> As Steve mentioned, it's inline. But I don't understand how that
> would even compile if it's not in rdma_cma.h. Iflink will catch it,
> but I'm still a little uneasy not understanding why it passes the
> compile...
>
> -jms
> Sent from my PDA. No type good.
>
> -----Original Message-----
> From: Pak Lui [mailto:pak....@sun.com]
> Sent: Sunday, May 04, 2008 11:44 AM Eastern Standard Time
> To: Open MPI Developers
> Subject: Re: [OMPI devel] undefined references for rdma_get_peer_addr & rdma_get_local_addr
>
> Jeff Squyres wrote:
> > Jon / Steve -- can you comment?
> >
> > I tested with OFED 1.2.5 (which is what I assume you meant) and got:
> >
> > checking for rdma_get_peer_addr... no
> >
> > Because that function is not defined in OFED 1.2.5. Running with OFED
> > 1.3 (where the function does exist), I get:
> >
> > checking for rdma_get_peer_addr... yes
>
> For me it seems to be running with 1.2.5.
>
> login3% /opt/ofed/bin/ofed_info | head -1
> OFED-1.2.5.5
>
> No rdma_get_peer_addr or rdma_get_local_addr in these .so's; presumably
> that is where they would come from.
>
> login3% ls librdmacm.so*
> librdmacm.so librdmacm.so.1 librdmacm.so.1.0.0 librdmacm.so.1.0.2
>
> login3% nm librdmacm.so* | grep rdma_get_
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
> 0000000000003470 T rdma_get_cm_event
> 0000000000001a20 T rdma_get_devices
>
> And I don't see rdma_get_peer_addr appear in
> /opt/ofed/include/rdma/rdma_cma.h either, so I don't know how it actually
> knows about the interface (and it's not inline there).
>
> >
> > Outside of all the configure complexity, can you write a simple
> > program that calls that function and have it compile and link properly?
>
> These are the references to rdma_get_peer_addr from the config.log:
> 47858 configure:120941: checking for rdma_get_peer_addr
> 47859 configure:120966: pgcc -c -g -D_REENTRANT -I/opt/ofed/include conftest.c >&5
> 47860 PGC-W-0155-Pointer value created from a nonlong integral type (conftest.c: 412)
> 47861 PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> 47862 configure:120972: $? = 0
> 47863 configure:120987: result: yes
> ...
> 48355 configure:123600: checking for rdma_get_peer_addr
> 48356 configure:123625: pgcc -c -g -D_REENTRANT -I/opt/ofed/include conftest.c >&5
> 48357 PGC-W-0155-Pointer value created from a nonlong integral type (conftest.c: 423)
> 48358 PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> 48359 configure:123631: $? = 0
> 48360 configure:123646: result: yes
>
> Here's my program, not sure if it's doing it correctly. I am no m4
> expert, so how do I run the ompi_check_openib.m4 independently and see
> the conftest.c?
>
> login3% cat mytest.c
> #include "rdma/rdma_cma.h"
> int main (void) {
>     void *ret = (void*) rdma_get_peer_addr((struct rdma_cm_id*)0);
>     return 0;
> }
>
> It gives me a warning if I just try to create an object, which is what I
> see in the config.log.
>
> login3% pgcc -c -g -D_REENTRANT -I/opt/ofed/include mytest.c
> PGC-W-0155-Pointer value created from a nonlong integral type (mytest.c: 3)
> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> login3% echo $?
> 0
>
> But trying to create an executable gives me the error.
>
> login3% pgcc -g -D_REENTRANT -I/opt/ofed/include mytest.c -o mytest
> PGC-W-0155-Pointer value created from a nonlong integral type (mytest.c: 3)
> PGC/x86-64 Linux 7.1-2: compilation completed with warnings
> /tmp/pgccjF6BryhFmWS.o: In function `main':
> /share/home/00951/paklui/ompi-trunk5/config-data1-debug/mytest.c:3:
> undefined reference to `rdma_get_peer_addr'
>
> Hmm, any clues, comments?
>
> >
> > I suppose we could change the AC_COMPILE_IFELSE in
> > config/ompi_check_openib.m4 to OMPI_LINK_IFELSE, but I'm a little confused as
> > to why it would compile successfully if the symbol rdma_get_peer_addr
> > is not declared anywhere (which it shouldn't be in OFED 1.2 or 1.2.5,
> > AFAIK)...
> >
> > On May 3, 2008, at 10:56 AM, Pak Lui wrote:
> >
> >> Sure Jeff, see attached.
> >>
> >> Jeff Squyres wrote:
> >>> (moving to devel so that others are aware)
> >>> Crud. Can you send me your config.log? I don't know why it's able
> >>> to find rdma_get_peer_addr() in configure, but then later not able
> >>> to find it during the build - I'd like to see what happened
> >>> during configure.
> >>> On May 2, 2008, at 7:09 PM, Pak Lui wrote:
> >>>> Hi Jeff,
> >>>>
> >>>> It seems that the cpc3 merge causes my Ranger build to break. I
> >>>> believe it is using OFED 1.2 but I don't know how to check. It
> >>>> passes the ompi_check_openib.m4 that you added in for the
> >>>> rdma_get_peer_addr.
> >>>> Is there a missing #include for openib/ofed related somewhere?
> >>>>
> >>>> 1236 checking rdma/rdma_cma.h usability... yes
> >>>> 1237 checking rdma/rdma_cma.h presence... yes
> >>>> 1238 checking for rdma/rdma_cma.h... yes
> >>>> 1239 checking for rdma_create_id in -lrdmacm... yes
> >>>> 1240 checking for rdma_get_peer_addr... yes
> >>>>
> >>>> pgCC -DHAVE_CONFIG_H -I. -I../../../../ompi/tools/ompi_info
> >>>> -I../../../opal/include -I../../../orte/include -I../../../ompi/include
> >>>> -I../../../opal/mca/paffinity/linux/plpa/src/libplpa
> >>>> -DOMPI_CONFIGURE_USER="\"paklui\""
> >>>> -DOMPI_CONFIGURE_HOST="\"login4.ranger.tacc.utexas.edu\""
> >>>> -DOMPI_CONFIGURE_DATE="\"Fri May 2 17:07:01 CDT 2008\""
> >>>> -DOMPI_BUILD_USER="\"$USER\"" -DOMPI_BUILD_HOST="\"`hostname`\""
> >>>> -DOMPI_BUILD_DATE="\"`date`\"" -DOMPI_BUILD_CFLAGS="\"-O -DNDEBUG \""
> >>>> -DOMPI_BUILD_CPPFLAGS="\"-I../../../.. -I../../.. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -D_REENTRANT\""
> >>>> -DOMPI_BUILD_CXXFLAGS="\"-O -DNDEBUG \""
> >>>> -DOMPI_BUILD_CXXCPPFLAGS="\"-I../../../.. -I../../.. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -D_REENTRANT\""
> >>>> -DOMPI_BUILD_FFLAGS="\"\"" -DOMPI_BUILD_FCFLAGS="\"\""
> >>>> -DOMPI_BUILD_LDFLAGS="\" \""
> >>>> -DOMPI_BUILD_LIBS="\"-lnsl -lutil -lpthread\""
> >>>> -DOMPI_CC_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgcc\""
> >>>> -DOMPI_CXX_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgCC\""
> >>>> -DOMPI_F77_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgf77\""
> >>>> -DOMPI_F90_ABSOLUTE="\"/opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgf95\""
> >>>> -DOMPI_F90_BUILD_SIZE="\"small\"" -I../../../.. -I../../..
> >>>> -I../../../../opal/include -I../../../../orte/include
> >>>> -I../../../../ompi/include -D_REENTRANT -O -DNDEBUG -c -o version.o
> >>>> ../../../../ompi/tools/ompi_info/version.cc
> >>>> /bin/sh ../../../libtool --tag=CXX --mode=link pgCC -O -DNDEBUG
> >>>> -o ompi_info components.o ompi_info.o output.o param.o
> >>>> version.o ../../../ompi/libmpi.la -lnsl -lutil -lpthread
> >>>> libtool: link: pgCC -O -DNDEBUG -o .libs/ompi_info components.o
> >>>> ompi_info.o output.o param.o version.o ../../../ompi/.libs/libmpi.so
> >>>> -L/opt/ofed/lib64 -libcm -lrdmacm -libverbs -lrt
> >>>> /share/home/00951/paklui/ompi-trunk5/config-data1/orte/.libs/libopen-rte.so
> >>>> /share/home/00951/paklui/ompi-trunk5/config-data1/opal/.libs/libopen-pal.so
> >>>> -lnuma -ldl -lnsl -lutil -lpthread -Wl,--rpath
> >>>> -Wl,/share/home/00951/paklui/ompi-trunk5/shared-install1/lib
> >>>>
> >>>> [1] Exit 2 make install >& make.install.log.0
> >>>> ../../../ompi/.libs/libmpi.so: undefined reference to `rdma_get_peer_addr'
> >>>> ../../../ompi/.libs/libmpi.so: undefined reference to `rdma_get_local_addr'
> >>>> make[2]: *** [ompi_info] Error 2
> >>>> make[2]: Leaving directory `/share/home/00951/paklui/ompi-trunk5/config-data1/ompi/tools/ompi_info'
> >>>> make[1]: *** [install-recursive] Error 1
> >>>> make[1]: Leaving directory `/share/home/00951/paklui/ompi-trunk5/config-data1/ompi'
> >>>> make: *** [install-recursive] Error 1
> >>>>
> >>>> --
> >>>>
> >>>> - Pak Lui
> >>>> pak....@sun.com
> >>
> >> --
> >>
> >> - Pak Lui
> >> pak....@sun.com
> >> <config.log.bz2><mime-attachment.txt>
> >
> --
>
> - Pak Lui
> pak....@sun.com
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/
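
Brian's point, that a function check should link the test program rather than only compile it, could look roughly like the Autoconf fragment below. This is only a sketch under stated assumptions, not the change that was actually committed: the shell variable names (ompi_check_save_LIBS, have_rdma_get_peer_addr) are hypothetical, and appending -lrdmacm to LIBS by hand is an assumption about how the surrounding macro sets up its link line. The test body reuses the call from Pak's mytest.c.

```m4
dnl Sketch only: variable names are illustrative, not taken from
dnl config/ompi_check_openib.m4.
ompi_check_save_LIBS="$LIBS"
dnl Assumption: librdmacm is not already in LIBS at this point.
LIBS="$LIBS -lrdmacm"

AC_MSG_CHECKING([for rdma_get_peer_addr])
dnl AC_LINK_IFELSE compiles AND links the test program, so a compiler that
dnl silently accepts a call to an undeclared function still fails here
dnl when the symbol is missing from the library.
AC_LINK_IFELSE(
    [AC_LANG_PROGRAM([[#include <rdma/rdma_cma.h>]],
                     [[void *ret = (void *) rdma_get_peer_addr((struct rdma_cm_id *) 0);]])],
    [AC_MSG_RESULT([yes])
     have_rdma_get_peer_addr=1],
    [AC_MSG_RESULT([no])
     have_rdma_get_peer_addr=0])

dnl Restore LIBS so the probe does not leak into later checks.
LIBS="$ompi_check_save_LIBS"
```

With OFED 1.2.5 and pgcc, a check of this shape would be expected to report "no", matching the undefined reference Pak saw when linking mytest.c by hand, whereas the compile-only check reported "yes" because the compile step exited with status 0.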