Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Adrian Reber
I have reported the same error a few days ago and submitted it now as a github issue: https://github.com/open-mpi/ompi/issues/371 On Mon, Feb 02, 2015 at 12:36:54PM +1100, Christopher Samuel wrote: > On 31/01/15 10:51, Jeff Squyres (jsquyres) wrote: > > > New tarball posted (same location). Now

Re: [OMPI devel] btl_openib.c:1200: mca_btl_openib_alloc: Assertion `qp != 255' failed

2015-02-02 Thread Adrian Reber
https://github.com/open-mpi/ompi/issues/372 On Sat, Jan 31, 2015 at 01:38:54PM +, Jeff Squyres (jsquyres) wrote: > Adrian -- > > Can you file this as a Github issue? Thanks. > > > > On Jan 17, 2015, at 12:58 PM, Adrian Reber wrote: > > > > This time my bug report is not PSM related: > >

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Jeff Squyres (jsquyres)
Ah -- the point being that this is not an issue related to the libltdl work. > On Feb 2, 2015, at 2:51 AM, Adrian Reber wrote: > > I have reported the same error a few days ago and submitted it now as a > github issue: https://github.com/open-mpi/ompi/issues/371 > > On Mon, Feb 02, 2015 at 12:

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Chris Samuel
On Mon, 2 Feb 2015 11:35:40 AM Jeff Squyres wrote: > Ah -- the point being that this is not an issue related to the libltdl work. Sorry - I saw the request to test the tarball and tried it out, missed the significance of the subject. :-/ -- Christopher Samuel Senior Systems Administrat

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Ralph Castain
Returning to the libltdl question: I think we may have a problem here. If we remove libltdl and default to disable-dlopen, then the user will - without warning - slurp all components that are able to build into libompi. This includes everything they specified, BUT because of our "build if you can"

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Ralph Castain
Hi Chris Just out of curiosity: I see you are reporting about a build on the headnode of a BG cluster. We've never ported OMPI to BG - are you using it on such a system? Or were you just test building the code on a convenient server? Ralph On Mon, Feb 2, 2015 at 3:52 AM, Chris Samuel wrote: >

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Jeff Squyres (jsquyres)
Uuuurggghhh. More below. > On Feb 2, 2015, at 1:04 PM, Ralph Castain wrote: > > Returning to the libltdl question: I think we may have a problem here. If we > remove libltdl and default to disable-dlopen, then the user will - without > warning - slurp all components that are able to build in

Re: [OMPI devel] HELP in OpenMPI - for PH.D research

2015-02-02 Thread Jeff Squyres (jsquyres)
On Jan 25, 2015, at 1:06 PM, Cyrille DIBAMOU MBEUYO wrote: > > Good afternoon development team, > > I have a small problem in OpenMPI to achieve my Ph.D research > > My problem is that : > > while saving the context.PID of a process running on a node with BLCR > through OpenMPI on the checkpoi

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Jeff Squyres (jsquyres)
Re-adding devel, since Paul sent me the logs off-list. (per Ralph's comment, we may or may not stick with this don't-build-libltdl philosophy, but I'd still like to run this issue down) Howard: see Paul's notes below. It's on the Hopper system at NERSC. Do you have any Cray insight here? (see

[OMPI devel] confusing output when no c++ compiler

2015-02-02 Thread Paul Hargrove
The output below occurred testing Jeff's no-embedded-libltdl tarball, but I am assuming it is quite likely the same is true on the trunk. The "issue" is that I am told by configure that "C and C++ compilers are not link compatible". However, it appears I just don't have a C++ compiler at all!! I am

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
Jeff and Howard, Just a couple minor points: 1. In case one has lost track, the reason the behavior described by Jeff is erroneous is that /usr/lib contains 32-bit libs (and target is 64-bit). Therefore libtool should have replaced -lltdl with /usr/lib64/libltdl.so (if at all). 2a. Jeff does r

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Jeff Squyres (jsquyres)
Ralph and I just chatted about this on the phone. IANAL, but after talking through the license stuff, we think there will be new license issues caused by --disable-dlopen behavior. It feels like there's a lot of unexpected issues coming up with (more-or-less) causing (most?) people to build wit

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Jeff Squyres (jsquyres)
On Feb 2, 2015, at 5:24 PM, Jeff Squyres (jsquyres) wrote: > > IANAL, but after talking through the license stuff, we think there will be > new license issues caused by --disable-dlopen behavior. ARRGH -- that should have been: ...we think there will be ***NO*** new license issues caused by -

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Christopher Samuel
On 03/02/15 05:09, Ralph Castain wrote: > Just out of curiosity: I see you are reporting about a build on the > headnode of a BG cluster. We've never ported OMPI to BG - are you using > it on such a system? Or were you just test building the code on a > convenient server? Just a convenient server

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
Jeff, Looks like you didn't hit all the un-guarded references to lt_dladvise. Specifically you missed a struct decl: /[]/openmpi-libltdl-linux-x86_64-gcc/openmpi-gitclone/opal/util/lt_interface.c:25:8: error: unknown type name 'lt_dladvise' -Paul On Sat, Jan 31, 2015 at 4:44 AM, Jeff Squyr
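For readers following along, the kind of guard Paul is describing can be sketched as below. This is only an illustration, not the actual contents of lt_interface.c: the macro name `HAVE_LT_DLADVISE`, the struct `dl_interface`, and the helper `advise_support()` are all hypothetical names made up for this sketch. The idea is that every use of the `lt_dladvise` type, including a member declared inside a struct, sits behind the same feature test, so a build without that type still compiles.

```c
#include <stdio.h>

/* Hypothetical feature macro: a configure script would normally define it
 * to 1 when ltdl.h provides lt_dladvise. It defaults to 0 here so the
 * sketch compiles on a system without libltdl. */
#ifndef HAVE_LT_DLADVISE
#define HAVE_LT_DLADVISE 0
#endif

#if HAVE_LT_DLADVISE
#include <ltdl.h>
#endif

/* Hypothetical interface struct: the lt_dladvise member is guarded too,
 * which is the kind of declaration that is easy to miss. */
struct dl_interface {
    int initialized;
#if HAVE_LT_DLADVISE
    lt_dladvise advise;
#endif
};

/* Reports which branch was compiled in. */
static const char *advise_support(void)
{
#if HAVE_LT_DLADVISE
    return "lt_dladvise available";
#else
    return "lt_dladvise not available";
#endif
}
```

With the macro left at its default of 0, `advise_support()` returns the "not available" string and the struct has no `advise` member, so nothing in the file names the missing type.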

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
On Mon, Feb 2, 2015 at 1:58 PM, Paul Hargrove wrote: > 2b. I am retrying now with all of Cray's environment modules unloaded > except the one for the PGI compiler. Nathan had suggested something like > this to me in the past, but I've never had issues with the default > environment. I will rep

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Jeff Squyres (jsquyres)
I had fixed it in my local tree but not yet pushed to my github branch; I was waiting to see what happened w.r.t. your failure on the NERSC machine. I pushed the fix up to my branch now; do you want a new tarball? > On Feb 2, 2015, at 5:56 PM, Paul Hargrove wrote: > > Jeff, > > Looks like yo

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
Jeff, If you are still chasing the goal of getting this branch to "just work", then I am willing to keep testing. Let me know when a new tarball is ready and I'll give it a run on all of my systems. -Paul On Mon, Feb 2, 2015 at 4:15 PM, Jeff Squyres (jsquyres) wrote: > I had fixed it in my lo

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Jeff Squyres (jsquyres)
Paul -- If you've got the cycles and it's easy, release the hounds on the tarball that I just uploaded to: http://www.open-mpi.org/~jsquyres/unofficial/ Thanks! > On Feb 2, 2015, at 7:19 PM, Paul Hargrove wrote: > > Jeff, > > If you are still chasing the goal of getting this branch to

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
Jeff, Having already pointed my script at your tarball's URL, typing "./test-ompi" releases about 60 "hounds". I get an email for each system as its tests complete, and Gmail filters tag only the ones where one or more configurations failed. So, the overhead for me is pretty small as long as th

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
On Mon, Feb 2, 2015 at 4:13 PM, Paul Hargrove wrote: > HOWEVER - switching from PGI to GNU compilers made the problem go away. > So, I suspect it may be an issue with the installation/configuration of > the PGI compilers. > I've reproduced the problem on a non-Cray system with four different in

[OMPI devel] Build failure on OpenBSD (deja vu)

2015-02-02 Thread Paul Hargrove
The following comes from testing Jeff's no-embedded-libltdl work, but I suspect the same is true on tru^H^H^Hmaster. The output below, from "make V=1" shows a link failure from trying to use arc4random_addrandom(), which was removed on OpenBSD in late 2013. The part that bugs me is that I thought
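One common way to avoid this class of link failure is to call the function only when a configure-time check found it. The sketch below assumes a hypothetical macro `HAVE_ARC4RANDOM_ADDRANDOM` and a hypothetical wrapper `add_entropy()`; it is not the libevent code in question, just an illustration of the pattern.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical feature macro: a configure check would define it to 1 only
 * where the C library still exports arc4random_addrandom(). */
#ifndef HAVE_ARC4RANDOM_ADDRANDOM
#define HAVE_ARC4RANDOM_ADDRANDOM 0
#endif

/* Hypothetical wrapper: returns 1 if the bytes were handed to the system
 * generator, 0 if the platform offers no way to add entropy. */
static int add_entropy(const unsigned char *buf, size_t len)
{
#if HAVE_ARC4RANDOM_ADDRANDOM
    arc4random_addrandom((unsigned char *)buf, (int)len);
    return 1;
#else
    (void)buf;  /* unused on this platform */
    (void)len;
    return 0;
#endif
}
```

Callers go through the wrapper and never reference the removed symbol directly, so the link step no longer depends on it.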

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-02 Thread Paul Hargrove
On Mon, Feb 2, 2015 at 5:22 PM, Paul Hargrove wrote: > So, the overhead for me is pretty small as long as the number of failures > is kept low. I jinxed it!!! I have, I believe, about 7 different failures now on various systems. All of those appear UNRELATED to the libltdl changes. I went ahe

[OMPI devel] When libltdl is not your friend

2015-02-02 Thread Paul Hargrove
Below is one example of what happens when you assume that you can trust the libltdl installed at an otherwise very well maintained national center. I think this is another "vote" for continuing to embed (a working) libltdl. -Paul $ mpirun -mca btl sm,self -np 2 examples/ring_c libibverbs: Warning:

Re: [OMPI devel] When libltdl is not your friend

2015-02-02 Thread Howard Pritchard
Hi Paul, Thanks for checking in depth into this. Just to help in determining how to proceed, which national center is this? Howard 2015-02-02 19:35 GMT-07:00 Paul Hargrove : > Below is one example of what happens when you assume that you can trust > the libltdl installed an otherwise very wel

Re: [OMPI devel] When libltdl is not your friend

2015-02-02 Thread Paul Hargrove
Howard, This was seen on NERSC's Carver. -Paul On Mon, Feb 2, 2015 at 6:49 PM, Howard Pritchard wrote: > Hi Paul, > > Thanks for checking in depth into this. Just to help in determining how > to proceed, which national center is this? > > Howard > > > 2015-02-02 19:35 GMT-07:00 Paul Hargrove

[OMPI devel] Master build failure on Mac OS 10.8 with --enable-static/--disable-shared

2015-02-02 Thread Paul Hargrove
I have a Mac OSX 10.8 system, where cc is clang. I have no problems with a default build from the current master tarball. However, a static-only build leads to a link failure on opal_wrapper. Configured with --prefix=... --enable-debug CC=cc CXX=c++ --enable-static --disable-shared Failing port

[OMPI devel] Master failure building oshmem java examples

2015-02-02 Thread Paul Hargrove
On a system on which 1.8.4rc5 passed all my tests, I see the following running "make" in the examples directory: [...] make[2]: Leaving directory `/brashear/hargrove/OMPI/openmpi-master-linux-x86_64-java/BLD/examples' make[2]: Entering directory `/brashear/hargrove/OMPI/openmpi-master-linux-x86_64

Re: [OMPI devel] When libltdl is not your friend

2015-02-02 Thread Paul Hargrove
It looks like I was too quick to blame libltdl. A build of the current 'master' tarball on the same system with identical configure arguments fails as seen below. While the failure is not identical, it is also an out-of-memory error. I am currently assuming that an rlimit has been lowered on this sy

[OMPI devel] Master assert failure on Linux/PPC64

2015-02-02 Thread Paul Hargrove
On a Linux/PPC64 system I see the failure below from a build of the current master tarball. This build was configured with --prefix=... --enable-debug \ CFLAGS=-m64 --with-wrapper-cflags=-m64 \ CXXFLAGS=-m64 --with-wrapper-cxxflags=-m64 \ FCFLAGS=-m64 --with-wrapper-fcflags=-m64 I am not

[OMPI devel] Master build failure w/ Solaris Studio 12.3 on Linux/x86-64

2015-02-02 Thread Paul Hargrove
On a Linux/x86-64 system I am using the Solaris Studio 12.3 compilers. I have configured the current master tarball as follows: --prefix=... --enable-debug \ CC=cc CXX=CC FC=f90 \ CXXFLAGS='-L/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -library=stlport4' \ --with-wrapper-cxxflags='-L/

Re: [OMPI devel] Build failure on OpenBSD (deja vu)

2015-02-02 Thread Ralph Castain
I see what happened - we upgraded libevent not that long ago, and I tried to catch all the OMPI-committed changes to it. However, I appear to have missed this one. I'll fix it now. Sorry about that... Ralph On Mon, Feb 2, 2015 at 6:11 PM, Paul Hargrove wrote: > The following comes from testing

Re: [OMPI devel] Master failure building oshmem java examples

2015-02-02 Thread Ralph Castain
Sigh...someone forgot to add those examples to the tarball. Fixing now. On Mon, Feb 2, 2015 at 7:15 PM, Paul Hargrove wrote: > On a system on which 1.8.4rc5 passed all my tests, I see the following > running "make" in the examples directory: > > [...] > make[2]: Leaving directory > `/brashear/h