Re: [OMPI devel] 1.7.4rc2r30148 run failure NetBSD6-x86

2014-01-08 Thread Ralph Castain
Hmmm...looks to me like the code should protect against this - unless the system isn't correctly reporting binding support. Could you run this with "-mca ess_base_verbose 10"? This will output the topology we found, including the binding support (which isn't in the usual output). On Jan 8,

Re: [OMPI devel] 1.7.4rc2r30148 - crash in MPI_Init on Linux/x86

2014-01-08 Thread Paul Hargrove
Only takes <30 seconds of typing to start the test and I get email when it is done. Typing these emails takes more of my time than the actual testing does. -Paul On Wed, Jan 8, 2014 at 8:35 PM, Ralph Castain wrote: > If you have the time, it might be worth nailing it down.

Re: [OMPI devel] 1.7.4rc2r30148 - out-of-date mpirun usage output

2014-01-08 Thread Ralph Castain
Updated (including the man page) - thanks! On Jan 8, 2014, at 3:48 PM, Paul Hargrove wrote: > Note the following still indicates that "--bind-to none" is the default: > > -bash-4.2$ mpirun --help | grep -A2 'bind-to ' >--bind-to Policy for binding processes [none

Re: [OMPI devel] 1.7.4rc2r30148 - crash in MPI_Init on Linux/x86

2014-01-08 Thread Ralph Castain
If you have the time, it might be worth nailing it down. However, I'm mindful of all the things you need to do, so please only if you have the time. Thanks Ralph On Jan 8, 2014, at 8:23 PM, Paul Hargrove wrote: > Ralph, > > Building with gcc-4.1.2 fixed the problem for

Re: [OMPI devel] 1.7.4rc2r30148 - crash in MPI_Init on Linux/x86

2014-01-08 Thread Paul Hargrove
Ralph, Building with gcc-4.1.2 fixed the problem for me. I also removed an old install of ompi-1.4 that was in LD_LIBRARY_PATH at build time and might have been a contributing factor. If I'd known earlier that it was there, I wouldn't have reported the problem without first removing it. I can

Re: [OMPI devel] 1.7.4rc2r30148 run failure NetBSD6-x86

2014-01-08 Thread Ralph Castain
Hmmm...I see the problem. Looks like binding isn't supported on that system for some reason, so we need to turn "off" our auto-binding when we hit that condition. I'll check to see why that isn't happening (was supposed to do so) On Jan 8, 2014, at 3:43 PM, Paul Hargrove

Re: [OMPI devel] trunk build failure on {Free,Net,Open}BSD

2014-01-08 Thread Ralph Castain
Actually, as I look at it, the logic escapes me anyway. Basically, you only have two options - use the vfs struct for Sun, and the fs struct for everything else. I'm not aware of any other choice, and indeed the list of all the systems for the latter is actually intended to amount to "anything

Re: [OMPI devel] 1.7.4rc2r30148 - implicit decl of bzero on Solaris

2014-01-08 Thread Ralph Castain
Thanks Paul - fixed in trunk and cmr'd to 1.7.4 On Jan 8, 2014, at 6:22 PM, Paul Hargrove wrote: > As of 1.7.4rc2r30148 there appears to be a missing "#include " in > bcol_basesmuma_smcm.c. Both the Solaris Studio compiler (output below) and > gcc agree on this point. >

[OMPI devel] 1.7.4rc2r30148 - crash in MPI_Init on Linux/x86

2014-01-08 Thread Paul Hargrove
I am still testing the current 1.7.4rc tarball on my various systems. The latest failure (shown below) is a SEGV somewhere below MPI_Init on an old, but otherwise fairly normal, Linux/x86 (32-bit) system. $ /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/bin/mpirun -np 1

[OMPI devel] 1.7.4rc2r30148 - implicit decl of bzero on Solaris

2014-01-08 Thread Paul Hargrove
As of 1.7.4rc2r30148 there appears to be a missing "#include " in bcol_basesmuma_smcm.c. Both the Solaris Studio compiler (output below) and gcc agree on this point. CC bcol_basesmuma_smcm.lo

[hwloc-devel] Create success (hwloc git dev-32-g3e6a7f2)

2014-01-08 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc dev-32-g3e6a7f2 Start time: Wed Jan 8 21:01:02 EST 2014 End time: Wed Jan 8 21:03:45 EST 2014 Your friendly daemon, Cyrador

[OMPI devel] trunk - ibverbs configure error on Solaris-11

2014-01-08 Thread Paul Hargrove
When trying to configure the OMPI trunk on a Solaris-11/x86-64 with --enable-openib, I see the following error not seen with the 1.7 branch: *** Compiler flags checking which of CFLAGS are ok for debugger modules... -DNDEBUG -m64 -mt checking for debugger extra CFLAGS... -g checking for the C

Re: [OMPI devel] 1.7.4rc2r30031 - FreeBSD mpirun warning

2014-01-08 Thread Ralph Castain
Yeah - it was only approved today for the move into 1.7.4 :-) Hopefully it will make it into tonight's tarball. On Jan 8, 2014, at 3:58 PM, Paul Hargrove wrote: > On Fri, Dec 20, 2013 at 4:16 PM, Ralph Castain wrote: > I'll silence it - thanks! > > Ralph, > >

Re: [OMPI devel] 1.7.4rc2r30031 - FreeBSD mpirun warning

2014-01-08 Thread Paul Hargrove
On Fri, Dec 20, 2013 at 4:16 PM, Ralph Castain wrote: > I'll silence it - thanks! > Ralph, As of the current tarball (1.7.4rc2r30148) this warning is still present. I have also now encountered the identical message on NetBSD-6/x86: -bash-4.2$ mpirun -np 1 --bind-to none

[OMPI devel] 1.7.4rc2r30148 - out-of-date mpirun usage output

2014-01-08 Thread Paul Hargrove
Note the following still indicates that "--bind-to none" is the default: -bash-4.2$ mpirun --help | grep -A2 'bind-to ' --bind-to Policy for binding processes [none (default) | hwthread | core | socket | numa | board] (supported

[OMPI devel] 1.7.4rc2r30148 run failure NetBSD6-x86

2014-01-08 Thread Paul Hargrove
While I have yet to get a working build on NetBSD for x86-64 h/w, I *have* successfully built Open MPI's current 1.7.4rc tarball on NetBSD-6 for x86. However, I can't *run* anything: Attempting the ring_c example on 2 cores: -bash-4.2$ mpirun -mca btl sm,self -np 2 examples/ring_c

[OMPI devel] MPI implementation requirements for low-layer network APIs

2014-01-08 Thread Jeff Squyres (jsquyres)
Short version: == Sean Hefty from Intel is soliciting feedback from the MPI community about what MPI needs from a new low-level networking API (that hopes to be the successor to libibverbs). He has asked me to gather this feedback and present it to the OpenFabrics "libfabric"

[OMPI devel] 1.7.4rc2r30148 - static link failure on NetBSD

2014-01-08 Thread Paul Hargrove
When I compile the current 1.7.4rc on NetBSD with no configure arguments, I still get the "make install" failure that I have detailed in previous emails. HOWEVER, if I configure with "--enable-static --disable-shared" then I get an earlier failure at build time (partial "make V=1" output shown

Re: [OMPI devel] RFC: OB1 optimizations

2014-01-08 Thread Nathan Hjelm
Yeah. It's hard to say what the results will look like on Haswell. I expect they should show some improvement from George's change but we won't know until I can get to a Haswell node. Hopefully one becomes available today. -Nathan On Wed, Jan 08, 2014 at 08:59:34AM -0800, Paul Hargrove wrote: >

Re: [OMPI devel] RFC: OB1 optimizations

2014-01-08 Thread Paul Hargrove
Nevermind, since Nathan just clarified that the results are not comparable. -Paul [Sent from my phone] On Jan 8, 2014 8:58 AM, "Paul Hargrove" wrote: > Interestingly enough the 4MB latency actually improved significantly > relative to the initial numbers. > > -Paul [Sent

Re: [OMPI devel] RFC: OB1 optimizations

2014-01-08 Thread Paul Hargrove
Interestingly enough the 4MB latency actually improved significantly relative to the initial numbers. -Paul [Sent from my phone] On Jan 8, 2014 8:50 AM, "George Bosilca" wrote: > These results are way worse than the ones you sent in your previous email. > What is the reason?

Re: [OMPI devel] RFC: OB1 optimizations

2014-01-08 Thread Nathan Hjelm
Sorry, should have said that this is a different cluster. These results were on Sandy Bridge and the others were on Haswell. Don't have mvapich on the Haswell cluster. Will check the current patch on Haswell later today. -Nathan On Wed, Jan 08, 2014 at 05:50:34PM +0100, George Bosilca wrote: >

Re: [OMPI devel] RFC: OB1 optimizations

2014-01-08 Thread George Bosilca
These results are way worse than the ones you sent in your previous email. What is the reason? George. On Jan 8, 2014, at 17:33, Nathan Hjelm wrote: > Ah, good catch. A new version is attached that should eliminate the race > window for the multi-threaded case. Performance

Re: [OMPI devel] RFC: OB1 optimizations

2014-01-08 Thread Nathan Hjelm
Ah, good catch. A new version is attached that should eliminate the race window for the multi-threaded case. Performance numbers are still looking really good. We beat mvapich2 in the small message ping-pong by a good margin. See the results below. The large message latency difference for large

Re: [OMPI devel] Missing --bycore option in Open MPI 1.7.?

2014-01-08 Thread Christoph Niethammer
Hi, Just found the following ticket which answers my question: https://svn.open-mpi.org/trac/ompi/ticket/4044 Sorry for the spam. :/ Regards Christoph ----- Original Message ----- From: "Christoph Niethammer" To: "Open MPI Developers" Sent: Wednesday,

[OMPI devel] Missing --bycore option in Open MPI 1.7.?

2014-01-08 Thread Christoph Niethammer
Hello Using Open MPI 1.7.3 I got the following error message when executing mpirun -np 16 --bycore /bin/hostname mpirun: Error: unknown option "--bycore" The option is documented in the man pages and with Open MPI 1.6.5 everything works fine. For --bysocket I get the same error but --bynode
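For anyone hitting the same error: the usual replacement for the 1.6-era placement flags in the 1.7 series is the --map-by option. The mapping below is a hedged sketch (the ./a.out invocations are hypothetical examples, not commands from the thread):

```shell
# Hedged sketch: 1.6-era placement options and their typical 1.7-series
# replacements via --map-by.
#   1.6.5:  mpirun -np 16 --bycore   ./a.out
#   1.7.x:  mpirun -np 16 --map-by core   ./a.out
#   1.6.5:  mpirun -np 16 --bysocket ./a.out
#   1.7.x:  mpirun -np 16 --map-by socket ./a.out
echo "--bycore => --map-by core"
echo "--bysocket => --map-by socket"
```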

Re: [hwloc-devel] hwloc with Xen system support - v2

2014-01-08 Thread Brice Goglin
On 07/01/2014 15:19, Andrew Cooper wrote: > On 07/01/14 11:54, Brice Goglin wrote: >> Can't check the code right now, but a couple questions below. I just checked the code. My only small complaint is that we always get this error message in the terminal, even when not enabling Xen: xc:

Re: [hwloc-devel] cython re-write of python-hwloc

2014-01-08 Thread Brice Goglin
Thanks! Brice On 07/01/2014 21:32, Guy Streeter wrote: > Partly to prepare for the eventual switch to python3, and partly for the > better refcount handling etc., I have rewritten python-hwloc and the requisite > python-libnuma in Cython. > > The only drawback I've noticed with this change