Re: [OMPI devel] trunk warnings on x86
Paul,

I confirm the ampersand was missing, and this was a bug
/* a similar bug was fixed by Ralph in r32357 */

I committed r32408 in order to fix these three bugs.

I also took the liberty of replacing the OMPI_CAST_RTE_NAME macro with an
inline function (in debug mode only), so that the compiler emits a warning
on both 32-bit and 64-bit architectures in this case:

#if OPAL_ENABLE_DEBUG
static inline orte_process_name_t * OMPI_CAST_RTE_NAME(opal_process_name_t * name);
#else
#define OMPI_CAST_RTE_NAME(a) ((orte_process_name_t*)(a))
#endif

Cheers,

Gilles

On 2014/08/03 14:49, Gilles GOUAILLARDET wrote:
> Paul,
>
> imho, the root cause is a missing ampersand.
>
> I will double check this from tomorrow only
>
> Cheers,
>
> Gilles
>
> Ralph Castain wrote:
>> Arg - that raises an interesting point. This is a pointer to a 64-bit
>> number. Will uintptr_t resolve that problem on such platforms?
>>
>>
>> On Aug 2, 2014, at 8:12 PM, Paul Hargrove wrote:
>>
>>
>> Looks like on a 32-bit platform a (uintptr_t) cast is desired in the
>> OMPI_CAST_RTE_NAME() macro.
>>
>>
>> Warnings from current trunk tarball attributable to the missing cast include:
>>
>>
>> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:89:
>> warning: cast to pointer from integer of different size
>>
>> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:97:
>> warning: cast to pointer from integer of different size
>>
>> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/mca/pml/bfo/pml_bfo_failover.c:1417:
>> warning: cast to pointer from integer of different size
>>
>>
>> -Paul
>>
>>
>> --
>>
>> Paul H. Hargrove phhargr...@lbl.gov
>>
>> Future Technologies Group
>>
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>>
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15481.php
>>
>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15484.php
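Editor's sketch of why the inline-function form catches a missing ampersand where the macro does not. This is only an illustration under stated assumptions: the two typedefs below are hypothetical stand-ins (a 64-bit integer name and a two-field struct), not the real Open MPI definitions, and the names CAST_RTE_NAME_MACRO, cast_rte_name and casts_agree are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real Open MPI types (assumption). */
typedef uint64_t opal_process_name_t;
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} orte_process_name_t;

/* Macro form: an explicit cast accepts any scalar operand. Passing the
 * 64-bit value itself (missing '&') still compiles; a size-mismatch
 * warning only shows up where pointers are narrower than 64 bits. */
#define CAST_RTE_NAME_MACRO(a) ((orte_process_name_t *)(a))

/* Inline-function form: the argument must convert to
 * opal_process_name_t *, so passing the value instead of its address
 * is diagnosed regardless of pointer width. */
static inline orte_process_name_t *cast_rte_name(opal_process_name_t *name)
{
    return (orte_process_name_t *)name;
}

/* Returns 1 when both forms yield the address of the original object.
 * Only pointer values are compared; nothing is read through the cast. */
static int casts_agree(void)
{
    opal_process_name_t name = 0;

    assert(sizeof(opal_process_name_t) == sizeof(orte_process_name_t));

    orte_process_name_t *via_macro = CAST_RTE_NAME_MACRO(&name);
    orte_process_name_t *via_func  = cast_rte_name(&name);

    /* CAST_RTE_NAME_MACRO(name) would be accepted silently on a 64-bit
     * build; cast_rte_name(name) would draw a diagnostic everywhere. */
    return (void *)via_macro == (void *)&name && via_func == via_macro;
}
```

With the correct `&name` argument both forms behave identically, which is why the function is only needed in debug builds: its value is the compile-time diagnostic, not any run-time difference.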
Re: [OMPI devel] [1.8.2rc3] static linking fails on linux (openpty undefined)
Hmm,

On a different Linux/x86-64 host things work as expected, with '-lutil'
linked explicitly:

$ ./INST/bin/mpicc -showme BLD/examples/hello_c.c
pgcc BLD/examples/hello_c.c
-I/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/include
-L/opt/torque/4.2.7.h1/lib -Wl,-rpath
-Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath
-Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath
-Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath
-Wl,/opt/torque/4.2.7.h1/lib -Wl,-rpath
-Wl,/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
-L/scratch/scratchdirs/hargrove/OMPI/openmpi-1.8.2rc3-linux-x86_64-pgi-14.1/INST/lib
-lmpi -lopen-rte -lopen-pal -lm -lpciaccess -ldl -ltorque -lrt -lnsl -lutil

Searching for relevant differences now...

-Paul

On Sun, Aug 3, 2014 at 4:58 PM, Paul Hargrove wrote:

>
> I've configured the 1.8.2rc3 tarball with "--enable-static
> --disable-shared" on a fairly standard Linux/x86-64 platform. While there
> are no problems on the same platform w/o these configure flags, with them I
> cannot link any application codes.
>
> $ mpicc -g hello_c.c -o hello_c
> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib/libopen-pal.a(opal_pty.o):
> In function `opal_openpty':
> opal_pty.c:(.text+0x1): undefined reference to `openpty'
>
> I checked "man openpty" and the manpage says to link with '-lutil'.
> The '-showme' does not show libutil:
>
> $ mpicc -showme hello_c.c
> gcc hello_c.c
> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/include
> -pthread -L/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
> -Wl,/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
> -Wl,--enable-new-dtags
> -L/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
> -lmpi -lopen-rte -lopen-pal -lm -ldl -ltorque -libverbs -lrdmacm
>
>
> It looks like configure is doing the right thing on some level, but
> failing to add '-lutil' to the appropriate list of libs
> (OPAL_WRAPPER_EXTRA_LIBS?):
>
>
> == Library and Function tests
>
> checking if we need -lutil for openpty... yes
> checking for openpty... yes
>
>
> -Paul
>
> --
> Paul H. Hargrove phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>

--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
[OMPI devel] [1.8.2rc3] static linking fails on linux (openpty undefined)
I've configured the 1.8.2rc3 tarball with "--enable-static
--disable-shared" on a fairly standard Linux/x86-64 platform. While there
are no problems on the same platform w/o these configure flags, with them I
cannot link any application codes.

$ mpicc -g hello_c.c -o hello_c
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib/libopen-pal.a(opal_pty.o):
In function `opal_openpty':
opal_pty.c:(.text+0x1): undefined reference to `openpty'

I checked "man openpty" and the manpage says to link with '-lutil'.
The '-showme' does not show libutil:

$ mpicc -showme hello_c.c
gcc hello_c.c
-I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/include
-pthread -L/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
-Wl,--enable-new-dtags
-L/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.8.2rc3-linux-x86_64-static/INST/lib
-lmpi -lopen-rte -lopen-pal -lm -ldl -ltorque -libverbs -lrdmacm

It looks like configure is doing the right thing on some level, but
failing to add '-lutil' to the appropriate list of libs
(OPAL_WRAPPER_EXTRA_LIBS?):

== Library and Function tests

checking if we need -lutil for openpty... yes
checking for openpty... yes

-Paul

--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
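Editor's sketch of a minimal reproducer for the unresolved symbol, independent of Open MPI. It assumes a Linux/glibc host, where openpty() is declared in <pty.h>; the helper name try_openpty is invented for the example. On glibc versions that keep openpty() in libutil, a static link of this file fails with the same "undefined reference to `openpty'" unless -lutil is on the link line; newer glibc releases may resolve it from libc itself, so treat the need for -lutil as host-dependent.

```c
#include <assert.h>
#include <pty.h>     /* openpty() on Linux/glibc */
#include <stddef.h>  /* NULL */
#include <unistd.h>  /* close() */

/* Returns 0 if a pseudo-terminal pair could be opened (and closes it
 * again), or -1 if openpty() reported failure, e.g. because the host
 * has no usable pty devices. */
static int try_openpty(void)
{
    int master_fd = -1;
    int slave_fd = -1;

    if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0) {
        return -1;
    }

    /* On success both descriptors are valid. */
    assert(master_fd >= 0 && slave_fd >= 0);

    close(master_fd);
    close(slave_fd);
    return 0;
}
```

Calling try_openpty() from a small main() and building it once with and once without -lutil (both with -static) is a quick way to check whether a given host needs the extra library, which is the piece of information the wrapper compiler's '-showme' output above is missing.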
Re: [OMPI devel] [1.8.2rc3] another openib bug (#4377)
On Sun, Aug 3, 2014 at 12:49 PM, Paul Hargrove wrote:

> BTW:
> Even with the "ignore_device=1" problem fixed, I can't get btl:openib
> running on x86.
> So, there may be additional reports in the next few hours.
>

That turned out to be the already-known issue in 1.8.2rc3 that has since
been fixed.
So, with manual application of r32395 plus the patch for ticket #4377, I can
run btl:openib on x86+tavor.

-Paul

--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
[OMPI devel] [1.8.2rc3] another openib bug (#4377)
I have a pair of x86/linux (32 bit) hosts connected by Mellanox Tavor HCAs.
I have no idea if (or why) this has only appeared on this system, but I
find that btl:openib thinks the INI file says to ignore these HCAs.
See the 4th line below:

[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_ip.c:364:add_rdma_addr] Adding addr 172.18.0.105 (0x690012ac) subnet 0xac12 as mthca0:1
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_ini.c:170:ompi_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 23108
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_ini.c:189:ompi_btl_openib_ini_query] Found corresponding INI values: Mellanox Tavor Infinihost
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_component.c:1541:init_one_device] device mthca0 skipped; ignore_device=1
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_component.c:988:device_destruct] Failed to release mpool
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/btl_openib_component.c:1020:device_destruct] Failed to destroy device resources
[pcp-j-5][[27705,1],0][/home/pcp1/phargrov/OMPI/openmpi-1.8.2rc3-linux-x86-mx/openmpi-1.8.2rc3/ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:1981:rdmacm_component_finalize] rdmacm_component_finalize

Turns out this is known, and has been entered as trac ticket #4377,
currently assigned to miked.
Applying the 2-line patch attached to the ticket fixes the ignore_device=1
problem for me.

Mike,
Please apply that patch to trunk and CMR for 1.8.2

BTW:
Even with the "ignore_device=1" problem fixed, I can't get btl:openib
running on x86.
So, there may be additional reports in the next few hours.

-Paul

--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
Re: [OMPI devel] trunk warnings on x86
Paul,

imho, the root cause is a missing ampersand.

I will not be able to double-check this until tomorrow.

Cheers,

Gilles

Ralph Castain wrote:
>Arg - that raises an interesting point. This is a pointer to a 64-bit number.
>Will uintptr_t resolve that problem on such platforms?
>
>
>On Aug 2, 2014, at 8:12 PM, Paul Hargrove wrote:
>
>
>Looks like on a 32-bit platform a (uintptr_t) cast is desired in the
>OMPI_CAST_RTE_NAME() macro.
>
>
>Warnings from current trunk tarball attributable to the missing cast include:
>
>
>/home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:89:
> warning: cast to pointer from integer of different size
>
>/home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:97:
> warning: cast to pointer from integer of different size
>
>/home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/mca/pml/bfo/pml_bfo_failover.c:1417:
> warning: cast to pointer from integer of different size
>
>
>-Paul
>
>
>--
>
>Paul H. Hargrove phhargr...@lbl.gov
>
>Future Technologies Group
>
>Computer and Data Sciences Department Tel: +1-510-495-2352
>
>Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>
>___
>devel mailing list
>de...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>Link to this post:
>http://www.open-mpi.org/community/lists/devel/2014/08/15481.php
>
>
Re: [OMPI devel] trunk warnings on x86
Whether just adding a (uintptr_t) cast is sufficient or not depends on the
usage, and I don't pretend to have looked much deeper than seeing that this
macro is common to the line numbers in the warnings I quoted.

If the intent is to uniformly store a pointer then a (uintptr_t *) cast may
be appropriate, though that would use the most-significant 32 bits on ppc32
and the least-significant 32 bits on x86.

Again, the appropriate form for the macro depends on how the field is used.

-Paul

On Sat, Aug 2, 2014 at 9:14 PM, Ralph Castain wrote:

> Arg - that raises an interesting point. This is a pointer to a 64-bit
> number. Will uintptr_t resolve that problem on such platforms?
>
> On Aug 2, 2014, at 8:12 PM, Paul Hargrove wrote:
>
> Looks like on a 32-bit platform a (uintptr_t) cast is desired in the
> OMPI_CAST_RTE_NAME() macro.
>
> Warnings from current trunk tarball attributable to the missing cast
> include:
>
> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:89:
> warning: cast to pointer from integer of different size
> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:97:
> warning: cast to pointer from integer of different size
> /home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/mca/pml/bfo/pml_bfo_failover.c:1417:
> warning: cast to pointer from integer of different size
>
> -Paul
>
> --
> Paul H. Hargrove phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15481.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15482.php
>

--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
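Editor's sketch of the byte-order point made in this reply: a 32-bit-wide load from the start of a 64-bit field sees a different half of the value depending on the host's endianness. The function names first_word and host_is_little_endian are invented for the example, and memcpy is used instead of a pointer cast so the illustration itself stays within defined behaviour; it is not taken from the Open MPI sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copies the first 4 bytes of a 64-bit field, i.e. the bytes that a
 * 32-bit uintptr_t-sized load at the field's address would fetch. */
static uint32_t first_word(uint64_t field)
{
    uint32_t word;

    assert(sizeof word < sizeof field);
    memcpy(&word, &field, sizeof word);
    return word;
}

/* Returns 1 on little-endian hosts (such as x86) and 0 on big-endian
 * hosts (such as ppc32). Mixed-endian layouts are not considered. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t lowest_address_byte;

    memcpy(&lowest_address_byte, &probe, 1);
    return lowest_address_byte == 1;
}
```

For the value 0x1122334455667788, first_word() yields the low half (0x55667788) on a little-endian host and the high half (0x11223344) on a big-endian one, so code that stores a 32-bit pointer through such a cast and later reads the full 64-bit field would see platform-dependent results.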
[OMPI devel] trunk warnings on x86
Looks like on a 32-bit platform a (uintptr_t) cast is desired in the
OMPI_CAST_RTE_NAME() macro.

Warnings from current trunk tarball attributable to the missing cast
include:

/home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:89:
warning: cast to pointer from integer of different size
/home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/runtime/ompi_mpi_abort.c:97:
warning: cast to pointer from integer of different size
/home/pcp1/phargrov/OMPI/openmpi-trunk-linux-x86-gcc/openmpi-1.9a1r32406/ompi/mca/pml/bfo/pml_bfo_failover.c:1417:
warning: cast to pointer from integer of different size

-Paul

--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900