Paul,
I missed a step indeed:
opal is required by rte, which is in turn required by mpi, so each layer's
wrapper libs must also carry those of the layer below.
The attached patch does the job (tested on a Solaris 10/x86_64 VM with
GNU compilers).
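
To make the chaining concrete, here is a rough shell sketch of what the
patch is meant to achieve. It is not the configure code itself:
append_uniq is a simplified stand-in for OPAL_FLAGS_APPEND_UNIQ (the real
m4 macro may treat some flags differently), all values are made up, and
the $with_wrapper_libs step is left out.

```shell
# Simplified stand-in for OPAL_FLAGS_APPEND_UNIQ (assumption, see above).
# Prints $1 followed by each word of $2 that is not already present.
append_uniq() {
    result=$1
    for flag in $2; do
        case " $result " in
            *" $flag "*) ;;                          # already there: skip
            *) result="${result:+$result }$flag" ;;  # new: append
        esac
    done
    printf '%s\n' "$result"
}

# Hypothetical values, for illustration only.
LIBS="-lm -ldl -lutil"          # what configure accumulated in $LIBS
wrapper_extra_libs=""           # nothing extra in this example
opal_mca_libs="-lm -ldl"        # stand-in for $opal_mca_wrapper_extra_libs
orte_mca_libs="-ltorque"        # stand-in for $orte_mca_wrapper_extra_libs
ompi_mca_libs="-lrt"            # stand-in for $ompi_mca_wrapper_extra_libs

# Each layer starts from its own MCA libs and inherits the layer below.
opal_libs=$(append_uniq "$opal_mca_libs" "$wrapper_extra_libs $LIBS")
orte_libs=$(append_uniq "$orte_mca_libs" "$wrapper_extra_libs $opal_libs")
ompi_libs=$(append_uniq "$ompi_mca_libs" "$wrapper_extra_libs $orte_libs")

printf 'OPAL: %s\nORTE: %s\nOMPI: %s\n' "$opal_libs" "$orte_libs" "$ompi_libs"
```

With these sample values -lutil reaches all three layers, whether or not
an upper layer happens to pull it in on its own.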
Cheers,
Gilles
On 2014/08/06 4:40, Paul Hargrove wrote:
> Gilles,
>
> I have not tested your patch.
> I've only read it.
>
> It looks like it could work, except that libopen-rte.a depends on libsocket
> and libnsl on Solaris.
> So, one probably needs to add $LIBS to the ORTE wrapper libs as well.
>
> Additionally, if your approach is the correct one, then I think one can fold:
>
> OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS], [$wrapper_extra_libs])
> OPAL_WRAPPER_EXTRA_LIBS="$OPAL_WRAPPER_EXTRA_LIBS $with_wrapper_libs"
> + OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS], [$LIBS])
> + OPAL_WRAPPER_EXTRA_LIBS="$OPAL_WRAPPER_EXTRA_LIBS $with_wrapper_libs"
>
> into just
>
> - OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS], [$wrapper_extra_libs])
> + OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS], [$wrapper_extra_libs $LIBS])
>
> which merges two calls to OPAL_FLAGS_APPEND_UNIQ and avoids double-adding
> of the user's $with_wrapper_libs.
> And of course the same 1-line change would apply for the OMPI and
> eventually ORTE variables too.
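
For illustration, a rough shell sketch of the difference between the two
forms. append_uniq is only a simplified stand-in for
OPAL_FLAGS_APPEND_UNIQ (the real m4 macro may behave differently), the
starting value is left empty for brevity, and -lfoo is a made-up user
library.

```shell
# Simplified stand-in for OPAL_FLAGS_APPEND_UNIQ (assumption, see above).
# Prints $1 followed by each word of $2 that is not already present.
append_uniq() {
    result=$1
    for flag in $2; do
        case " $result " in
            *" $flag "*) ;;                          # already there: skip
            *) result="${result:+$result }$flag" ;;  # new: append
        esac
    done
    printf '%s\n' "$result"
}

# Hypothetical sample values, for illustration only.
wrapper_extra_libs="-lm -ldl"
LIBS="-ldl -lutil"
with_wrapper_libs="-lfoo"

# Two-step form: append, add the user's libs, append $LIBS, add them again.
two_step=$(append_uniq "" "$wrapper_extra_libs")
two_step="$two_step $with_wrapper_libs"
two_step=$(append_uniq "$two_step" "$LIBS")
two_step="$two_step $with_wrapper_libs"

# Folded form: one append of both lists, then the user's libs once.
folded=$(append_uniq "" "$wrapper_extra_libs $LIBS")
folded="$folded $with_wrapper_libs"

printf 'two-step: %s\n' "$two_step"
printf 'folded:   %s\n' "$folded"
```

In the two-step form the user's library ends up on the line twice, because
the plain string concatenation is not deduplicated; the folded form lists
it once.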
>
> I'd like to wait until Jeff has had a chance to look this over before I
> devote time to testing.
> Since I've already determined that 1.6.5 did not have the problem while
> 1.7.x does, it is possible that some smaller change would restore
> whatever was lost between the v1.6 and v1.7 branches.
>
> -Paul
>
>
> On Tue, Aug 5, 2014 at 1:33 AM, Gilles Gouaillardet <
> [email protected]> wrote:
>
>> Here is a patch that has been minimally tested.
>>
>> This is likely overkill (at least when dynamic libraries can be used),
>> but it does the job so far ...
>>
>> Cheers,
>>
>> Gilles
>>
>> On 2014/08/05 16:56, Gilles Gouaillardet wrote:
>>
>> From libopen-pal.la:
>> dependency_libs=' -lrdmacm -libverbs -lscif -lnuma -ldl -lrt -lnsl -lutil -lm'
>>
>>
>> I confirm that mpicc fails to link,
>>
>> but FWIW, linking via libtool does work (!)
>>
>> Could the bug come from the mpicc (and other) wrappers?
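
One likely reason the libtool link succeeds is that libtool reads the
dependency_libs entry of the .la file, which the wrappers do not consult.
Below is a small sketch for pulling that entry out of a .la file. The
helper name la_dependency_libs is made up, and the demo writes an
abridged stand-in file (using the value quoted above) rather than reading
a real install.

```shell
# Print the dependency_libs value recorded in a libtool .la file ($1).
la_dependency_libs() {
    sed -n "s/^dependency_libs='\(.*\)'\$/\1/p" "$1"
}

# Demo on an abridged stand-in for libopen-pal.la; a real .la file has
# many more fields.
demo_la=$(mktemp)
cat > "$demo_la" <<'EOF'
# libopen-pal.la - a libtool library file (abridged stand-in)
dependency_libs=' -lrdmacm -libverbs -lscif -lnuma -ldl -lrt -lnsl -lutil -lm'
EOF

deps=$(la_dependency_libs "$demo_la")
rm -f "$demo_la"

printf 'dependency_libs:%s\n' "$deps"
```

Comparing this list with the -l flags printed by "mpicc -show" is a quick
way to spot libraries the wrapper leaves out, such as -lrt and -lutil
here.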
>>
>> Gilles
>>
>> $ gcc -g -O0 -o hw /csc/home1/gouaillardet/hw.c
>> -I/tmp/install/ompi.noromio/include -pthread -L/usr/lib64 -Wl,-rpath
>> -Wl,/usr/lib64 -Wl,-rpath -Wl,/tmp/install/ompi.noromio/lib
>> -Wl,--enable-new-dtags -L/tmp/install/ompi.noromio/lib -lmpi -lopen-rte
>> -lopen-pal -lm -lnuma -libverbs -lscif -lrdmacm -ldl -llustreapi
>>
>> $ /tmp/install/ompi.noromio/bin/mpicc -g -O0 -o hw -show ~/hw.c
>> gcc -g -O0 -o hw /csc/home1/gouaillardet/hw.c
>> -I/tmp/install/ompi.noromio/include -pthread -L/usr/lib64 -Wl,-rpath
>> -Wl,/usr/lib64 -Wl,-rpath -Wl,/tmp/install/ompi.noromio/lib
>> -Wl,--enable-new-dtags -L/tmp/install/ompi.noromio/lib -lmpi -lopen-rte
>> -lopen-pal -lm -lnuma -libverbs -lscif -lrdmacm -ldl -llustreapi
>> [gouaillardet@soleil build]$ /tmp/install/ompi.noromio/bin/mpicc -g -O0
>> -o hw ~/hw.c
>> /tmp/install/ompi.noromio/lib/libmpi.a(fbtl_posix_ipwritev.o): In
>> function `mca_fbtl_posix_ipwritev':
>> fbtl_posix_ipwritev.c:(.text+0x17b): undefined reference to `aio_write'
>> fbtl_posix_ipwritev.c:(.text+0x237): undefined reference to `aio_write'
>> fbtl_posix_ipwritev.c:(.text+0x3f4): undefined reference to `aio_write'
>> fbtl_posix_ipwritev.c:(.text+0x48e): undefined reference to `aio_write'
>> /tmp/install/ompi.noromio/lib/libopen-pal.a(opal_pty.o): In function
>> `opal_openpty':
>> opal_pty.c:(.text+0x1): undefined reference to `openpty'
>> /tmp/install/ompi.noromio/lib/libopen-pal.a(event.o): In function
>> `event_add_internal':
>> event.c:(.text+0x288d): undefined reference to `clock_gettime'
>>
>> $ /bin/sh ./static/libtool --silent --tag=CC --mode=compile gcc
>> -std=gnu99 -I/tmp/install/ompi.noromio/include -c ~/hw.c
>> $ /bin/sh ./static/libtool --silent --tag=CC --mode=link gcc
>> -std=gnu99 -o hw hw.o -L/tmp/install/ompi.noromio/lib -lmpi
>> $ ldd hw
>> linux-vdso.so.1 => (0x00007fff7530d000)
>> librdmacm.so.1 => /usr/lib64/librdmacm.so.1 (0x00007f0ed541e000)
>> libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00007f0ed5210000)
>> libscif.so.0 => /usr/lib64/libscif.so.0 (0x0000003b9c600000)
>> libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x0000003ba5600000)
>> libdl.so.2 => /lib64/libdl.so.2 (0x0000003b9be00000)
>> librt.so.1 => /lib64/librt.so.1 (0x0000003b9ca00000)
>> libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003bae200000)
>> libutil.so.1 => /lib64/libutil.so.1 (0x0000003bac600000)
>> libm.so.6 => /lib64/libm.so.6 (0x0000003b9ba00000)
>> libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003b9c200000)
>> libc.so.6 => /lib64/libc.so.6 (0x0000003b9b600000)
>> /lib64/ld-linux-x86-64.so.2 (0x0000003b9b200000)
>>
>>
>>
>>
>> On 2014/08/05 7:56, Ralph Castain wrote:
>>
>> My thought was to post initially as a blocker, pending a discussion with
>> Jeff at tomorrow's telecon. If he thinks this is something we can fix in
>> some central point (thus catching it everywhere), then it could be quick and
>> worth doing. However, I'm skeptical as I tried to do that in the most
>> obvious place, and it failed (could be operator error).
>>
>> Will let you know tomorrow. Truly appreciate your digging on this!
>> Ralph
>>
>> On Aug 4, 2014, at 3:50 PM, Paul Hargrove <[email protected]> wrote:
>>
>>
>> Ralph and Jeff,
>>
>> I've been digging and find the problem is wider than just the one library
>> and has manifestations specific to FreeBSD, NetBSD and Solaris. I am adding
>> new info to the ticket as I unearth it.
>>
>> Additionally, it appears this existed in 1.8, 1.8.1 and in the 1.7 series as
>> well.
>> So, I would suggest this NOT be a blocker for a 1.8.2 release.
>>
>> Of course I am willing to provide testing if you still want to push for a
>> quick resolution.
>>
>> -Paul
>>
>>
>> On Mon, Aug 4, 2014 at 1:27 PM, Ralph Castain <[email protected]> wrote:
>> Okay, I filed a blocker on this for 1.8.2 and assigned it to Jeff. I took a
>> crack at fixing it, but came up short :-(
>>
>>
>> On Aug 3, 2014, at 10:46 PM, Paul Hargrove <[email protected]> wrote:
>>
>>
>> I've identified the difference between the platform that does link libutil
>> and the one that does not.
>>
>> 1) libutil is linked (as an OMPI dependency) only on the working system:
>>
>> Working system:
>> $ grep 'checking for .* LIBS' configure.out
>> checking for OPAL LIBS... -lm -lpciaccess -ldl
>> checking for ORTE LIBS... -lm -lpciaccess -ldl -ltorque
>> checking for OMPI LIBS... -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil
>>
>> NON-working system:
>> $ grep 'checking for .* LIBS' configure.out
>> checking for OPAL LIBS... -lm -ldl
>> checking for ORTE LIBS... -lm -ldl -ltorque
>> checking for OMPI LIBS... -lm -ldl -ltorque
>>
>> So, the working system that does link libutil is doing so as an OMPI
>> dependency.
>> However, it is also needed for opal (the only caller of openpty is
>> opal/util/opal_pty.c).
>>
>> 2) Only the working system is building ROMIO:
>>
>> Comparing the 'checking if * can compile' lines of configure output shows
>> only ONE difference:
>>
>> checking if MCA component fs:ufs can compile... yes
>> checking if MCA component fs:pvfs2 can compile... no
>> checking if MCA component io:ompio can compile... yes
>> -checking if MCA component io:romio can compile... no
>> +checking if MCA component io:romio can compile... yes
>> checking if MCA component mpool:grdma can compile... yes
>> checking if MCA component mpool:sm can compile... yes
>> checking if MCA component mpool:udreg can compile... no
>>
>> So, it appears that *if* ROMIO is configured in, then "-lutil" gets added to
>> OMPI_WRAPPER_EXTRA_LIBS.
>> This masks the fact that it is missing from OPAL_WRAPPER_EXTRA_LIBS.
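
The masking can be sketched in a few lines of shell, using the "checking
for ... LIBS" results quoted above. The helper name missing_libs is made
up for this example, and the comparison is a plain word match, not what
configure or the wrappers actually do.

```shell
# Print every flag from $1 (needed) that is absent from $2 (provided).
missing_libs() {
    missing=""
    for flag in $1; do
        case " $2 " in
            *" $flag "*) ;;                             # provided: fine
            *) missing="${missing:+$missing }$flag" ;;  # not provided
        esac
    done
    printf '%s\n' "$missing"
}

# What libopen-pal.a needs for openpty, per the configure check.
opal_needs="-lutil"

# LIBS values reported by configure on the two systems.
opal_libs_romio="-lm -lpciaccess -ldl"
ompi_libs_romio="-lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil"
ompi_libs_noromio="-lm -ldl -ltorque"

printf 'OPAL libs, with ROMIO:    missing [%s]\n' \
    "$(missing_libs "$opal_needs" "$opal_libs_romio")"
printf 'OMPI libs, with ROMIO:    missing [%s]\n' \
    "$(missing_libs "$opal_needs" "$ompi_libs_romio")"
printf 'OMPI libs, without ROMIO: missing [%s]\n' \
    "$(missing_libs "$opal_needs" "$ompi_libs_noromio")"
```

Only the OMPI list built with ROMIO covers -lutil; the OPAL list lacks it
on both systems, which is why the failure only shows up once ROMIO is
disabled.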
>>
>>
>> I have confirmed that I can reproduce the static linking failure by adding
>> --disable-io-romio to the configure options of the system that worked
>> previously.
>>
>> So, I am updating my report (and the email subject line) to:
>> Static linking fails on Linux when not building ROMIO
>>
>> -Paul
>>
>>
>>
>> On Sun, Aug 3, 2014 at 6:22 PM, Paul Hargrove <[email protected]> wrote:
>> Hmm,
>>
>> On a different Linux/x86-64 host things work as expected with '-lutil'
>> linked explicitly:
>>
>> $ ./INST/bin/mpicc -showme BLD/examples/hello_c.c
>> pgcc BLD/examples/hello_c.c
>> -I/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/include
>> -L/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib
>> -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath
>> -Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath -Wl,/opt/torque/4.2.7.h1/lib
>> -Wl,-rpath
>> -Wl,/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
>>
>> -L/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
>> -lmpi -lopen-rte -lopen-pal -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil
>>
>> Searching for relevant differences now...
>>
>> -Paul
>>
>>
>> On Sun, Aug 3, 2014 at 4:58 PM, Paul Hargrove <[email protected]> wrote:
>>
>> I've configured the 1.8.2rc3 tarball with "--enable-static --disable-shared"
>> on a fairly standard Linux/x86-64 platform. While there are no problems on
>> the same platform w/o these configure flags, with them I cannot link any
>> application codes.
>>
>> $ mpicc -g hello_c.c -o hello_c
>> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib/libopen-pal.a(opal_pty.o):
>> In function `opal_openpty':
>> opal_pty.c:(.text+0x1): undefined reference to `openpty'
>>
>> I checked "man openpty" and the manpage says to link with '-lutil'.
>> The '-showme' does not show libutil:
>>
>> $ mpicc -showme hello_c.c
>> gcc hello_c.c
>> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/include
>> -pthread -L/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
>> -Wl,/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
>> -Wl,--enable-new-dtags
>> -L/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
>> -lmpi -lopen-rte -lopen-pal -lm -ldl -ltorque -libverbs -lrdmacm
>>
>>
>> It looks like configure is doing the right thing on some level, but failing
>> to add '-lutil' to the appropriate list of libs (OPAL_WRAPPER_EXTRA_LIBS?):
>>
>> ============================================================================
>> == Library and Function tests
>> ============================================================================
>> checking if we need -lutil for openpty... yes
>> checking for openpty... yes
>>
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove [email protected]
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>> _______________________________________________
>> devel mailing list
>> [email protected]
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15492.php
Index: config/opal_setup_wrappers.m4
===================================================================
--- config/opal_setup_wrappers.m4 (revision 32438)
+++ config/opal_setup_wrappers.m4 (working copy)
@@ -12,6 +12,8 @@
dnl All rights reserved.
dnl Copyright (c) 2006-2010 Oracle and/or its affiliates. All rights reserved.
dnl Copyright (c) 2009-2013 Cisco Systems, Inc. All rights reserved.
+dnl Copyright (c) 2014 Research Organization for Information Science
+dnl and Technology (RIST). All rights reserved.
dnl $COPYRIGHT$
dnl
dnl Additional copyrights may follow
@@ -289,7 +291,7 @@
# asked for, as they know better than us.
AC_MSG_CHECKING([for OPAL LIBS])
OPAL_WRAPPER_EXTRA_LIBS="$opal_mca_wrapper_extra_libs"
- OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS], [$wrapper_extra_libs])
+ OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS], [$wrapper_extra_libs $LIBS])
OPAL_WRAPPER_EXTRA_LIBS="$OPAL_WRAPPER_EXTRA_LIBS $with_wrapper_libs"
AC_SUBST([OPAL_WRAPPER_EXTRA_LIBS])
AC_MSG_RESULT([$OPAL_WRAPPER_EXTRA_LIBS])
@@ -322,7 +324,7 @@
AC_MSG_CHECKING([for ORTE LIBS])
ORTE_WRAPPER_EXTRA_LIBS="$orte_mca_wrapper_extra_libs"
- OPAL_FLAGS_APPEND_UNIQ([ORTE_WRAPPER_EXTRA_LIBS], [$wrapper_extra_libs])
+ OPAL_FLAGS_APPEND_UNIQ([ORTE_WRAPPER_EXTRA_LIBS], [$wrapper_extra_libs $OPAL_WRAPPER_EXTRA_LIBS])
ORTE_WRAPPER_EXTRA_LIBS="$ORTE_WRAPPER_EXTRA_LIBS $with_wrapper_libs"
AC_SUBST([ORTE_WRAPPER_EXTRA_LIBS])
AC_MSG_RESULT([$ORTE_WRAPPER_EXTRA_LIBS])
@@ -390,7 +392,7 @@
AC_MSG_CHECKING([for OMPI LIBS])
OMPI_WRAPPER_EXTRA_LIBS="$ompi_mca_wrapper_extra_libs"
- OPAL_FLAGS_APPEND_UNIQ([OMPI_WRAPPER_EXTRA_LIBS], [$wrapper_extra_libs])
+ OPAL_FLAGS_APPEND_UNIQ([OMPI_WRAPPER_EXTRA_LIBS], [$wrapper_extra_libs $ORTE_WRAPPER_EXTRA_LIBS])
OMPI_WRAPPER_EXTRA_LIBS="$OMPI_WRAPPER_EXTRA_LIBS $with_wrapper_libs"
AC_SUBST([OMPI_WRAPPER_EXTRA_LIBS])
AC_MSG_RESULT([$OMPI_WRAPPER_EXTRA_LIBS])