Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-23 Thread Paul Hargrove
Gilles, This is on Mellanox's own system where /opt/mellanox/hcoll was updated Aug 2. This problem also did not occur unless I build libmpi statically. A run of "mpirun -mca coll ^ml -np 2 examples/ring_c" still crashes. So, I really don't know if this is the same issue, but suspect that it is not

Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-23 Thread Gilles Gouaillardet
Paul, if ompi is built statically or with --disable-dlopen, I do not think --mca coll ^ml can prevent the crash (assuming this is the same issue we discussed before). note if you build dynamically and without --disable-dlopen, it might or might not crash, depending on how modules are enumerated, a

Re: [OMPI devel] v1.10.0rc4

2015-08-23 Thread Ralph Castain
Okay, I found the missing flags and added them. Please try rc5 > On Aug 22, 2015, at 5:25 PM, Paul Hargrove wrote: > > FWIW: master is OK with identical configure arguments. > -Paul > > On Sat, Aug 22, 2015 at 4:33 PM, Paul Hargrove > wrote: > Oops! I spoke too soo

Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-23 Thread Ralph Castain
I think that’s true - this looks like the hcoll symbol issue. I’d suggest configuring with --enable-mca-no-build=coll-ml to resolve the problem in static builds, or follow Gilles' suggestion about .ompi_ignore > On Aug 22, 2015, at 10:14 PM, Gilles Gouaillardet > mailto:gilles.gouaillar...@gma

Re: [OMPI devel] 1.10.0rc4 hcoll problem compiled statically

2015-08-23 Thread Paul Hargrove
Ralph, Indeed, configuration with --enable-mca-no-build=coll-ml resolved my problem. So, this *is* the same problem that was already known. Sorry for the false alarm. -Paul On Sun, Aug 23, 2015 at 9:43 AM, Ralph Castain wrote: > I think that’s true - this looks like the hcoll symbol issue. I’d s
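
For readers following the thread, the workaround confirmed above can be sketched as a configure invocation. This is only an illustration, not the exact command used in the thread: the install prefix and the --enable-static/--disable-shared pair are assumptions (one common way to get a static libmpi), and only --enable-mca-no-build=coll-ml comes from the messages themselves. The script prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical install prefix (not from the thread); adjust for your system.
PREFIX="${PREFIX:-$HOME/ompi-1.10-static}"

# --enable-static/--disable-shared are assumed here as a way to build libmpi statically.
# --enable-mca-no-build=coll-ml is the option named in the thread: it keeps the
# coll/ml component out of the build entirely, instead of excluding it at run time.
CONFIGURE_ARGS="--prefix=$PREFIX --enable-static --disable-shared --enable-mca-no-build=coll-ml"

# Print the command instead of executing it, so the sketch is safe to paste anywhere.
echo "./configure $CONFIGURE_ARGS"
```

The point of the build-time exclusion is the one Gilles makes earlier in the thread: in a static (or --disable-dlopen) build, a run-time "--mca coll ^ml" is not expected to prevent the crash.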

Re: [OMPI devel] v1.10.0rc4

2015-08-23 Thread Paul Hargrove
Ralph, The configuration that showed this problem with rc4 ran fine with rc5. -Paul On Sun, Aug 23, 2015 at 9:41 AM, Ralph Castain wrote: > Okay, I found the missing flags and added them. Please try rc5 > > > On Aug 22, 2015, at 5:25 PM, Paul Hargrove wrote: > > FWIW: master is OK with identica

Re: [OMPI devel] v1.10.0rc4

2015-08-23 Thread Paul Hargrove
The tests of RC4 completed (even qemu-emulated ARM and MIPS) with no issues other than the two I reported previously. One was a relatively simple matter that Ralph has fixed in RC5. The other was a new aspect of a known issue, for which Gilles and Ralph gave me multiple options to work around. Tha

[OMPI devel] 1.10.0rc5 - mx problem when compiled statically

2015-08-23 Thread Paul Hargrove
I regret to say that in my endless search for perfection (which is a journey, not a destination) I believe I found another issue: I had mx2g sources sitting around, which I compiled and installed on two systems (x86 and x86-64). These provide only compile/link tests, since my systems lack the hard

Re: [OMPI devel] 1.10.0rc5 - mx problem when compiled statically

2015-08-23 Thread Ralph Castain
Rather than generating another rc right away, could you please apply the following patch and see if it fixes the problem? diff --git a/ompi/debuggers/Makefile.am b/ompi/debuggers/Makefile.am index 3e48af8..343a0c4 100644 --- a/ompi/debuggers/Makefile.am +++ b/ompi/debuggers/Makefile.am @@ -10,6 +

[OMPI devel] Fwd: PMIx 2.0 API thoughts

2015-08-23 Thread Ralph Castain
FYI - for those not on the PMIx mailing list, we would welcome your input Ralph > Begin forwarded message: > > From: Ralph Castain > Subject: PMIx 2.0 API thoughts > Date: August 22, 2015 at 8:45:50 AM PDT > To: pmix-de...@open-mpi.org > > Hi folks > > At the last PMIx telecon, people asked a

Re: [OMPI devel] 1.10.0rc5 - mx problem when compiled statically

2015-08-23 Thread Paul Hargrove
Ralph, I will try the requested change and let you know. -Paul On Sun, Aug 23, 2015 at 7:27 PM, Ralph Castain wrote: > Rather than generating another rc right away, could you please apply the > following patch and see if it fixes the problem? > > *diff --git a/ompi/debuggers/Makefile.am b/ompi

Re: [OMPI devel] 1.10.0rc5 - mx problem when compiled statically

2015-08-23 Thread Paul Hargrove
Ralph, Sorry, but no change - still failing to load libmyriexpress.so and still no rpath at link: /bin/sh ../../libtool --tag=CC --mode=link gcc -std=gnu99 -fno-strict-aliasing -pthread -g -o dlopen_test dlopen_test.o ../../ompi/libmpi.la ../../opal/libopen-pal.la -lrt -lm -lutil libtool:

Re: [OMPI devel] 1.10.0rc5 - mx problem when compiled statically

2015-08-23 Thread Ralph Castain
Hmm….I’ll bet this is the correct change, then: diff --git a/ompi/debuggers/Makefile.am b/ompi/debuggers/Makefile.am index 3e48af8..93a3046 100644 --- a/ompi/debuggers/Makefile.am +++ b/ompi/debuggers/Makefile.am @@ -10,6 +10,7 @@ # Copyright (c) 2004-2005 The Regents of the University of Califor