Re: [OMPI devel] SM component init unload

2012-07-03 Thread Ralph Castain
Good catch, George - thanks for the detailed explanation. I think what happened here was that we changed the ordering in MPI_Init a while back - we had always had a rule about when MPI components could access remote proc info, but had grown lax about it, so at least some of the BTLs had to be fixed

Re: [OMPI devel] SM component init unload

2012-07-03 Thread George Bosilca
Juan, Something weird is going on there. The selection mechanism for the SM coll and SM BTL should be very similar. However, the SM BTL successfully selects itself while the SM coll fails to determine that all processes are local. In the coll SM the issue is that the remote procs do not have the

Re: [OMPI devel] SM component init unload

2012-07-03 Thread Ralph Castain
Okay, please try this again with r26739 or above. You can remove the rest of the "verbose" settings and the --display-map so we declutter the output. Please add "-mca orte_nidmap_verbose 20" to your cmd line. Thanks! Ralph On Tue, Jul 3, 2012 at 1:50 PM, Juan A. Rico wrote: > Here is the outpu

[OMPI devel] ibarrier failures on MTT

2012-07-03 Thread Eugene Loh
I'll look at this more, but for now I'll just note that the new ibarrier test is showing lots of failures on MTT (cisco and oracle).

Re: [OMPI devel] SM component init unload

2012-07-03 Thread Ralph Castain
Rats - no help there. I'll add some debug to the code base tonight that will tell us more about what's going on here. On Jul 3, 2012, at 1:50 PM, Juan A. Rico wrote: > Here is the output. > > [jarico@Metropolis-01 examples]$ > /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --bind-to

Re: [OMPI devel] SM component init unload

2012-07-03 Thread Juan A. Rico
Here is the output. [jarico@Metropolis-01 examples]$ /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --bind-to-core --bynode --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca mca_ba

Re: [OMPI devel] SM component init unload

2012-07-03 Thread Ralph Castain
Interesting - yes, coll sm doesn't think they are on the same node for some reason. Try adding -mca grpcomm_base_verbose 5 and let's see why On Jul 3, 2012, at 1:24 PM, Juan Antonio Rico Gallego wrote: > The code I run is a simple broadcast. > > When I do not specify components to run, the ou

Re: [OMPI devel] SM component init unload

2012-07-03 Thread Juan Antonio Rico Gallego
The code I run is a simple broadcast. When I do not specify components to run, the output is (more verbose): [jarico@Metropolis-01 examples]$ /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority 99 -mca hwlo

Re: [OMPI devel] SM component init unload

2012-07-03 Thread Jeff Squyres
The issue is that the "sm" coll component only implements a few of the MPI collective operations. It is usually mixed at run-time with other coll components to fill out the rest of the MPI collective operations. So what is happening is that OMPI is determining that it doesn't have implementati

Re: [OMPI devel] SM component init unload

2012-07-03 Thread Juan Antonio Rico Gallego
Output is: [Metropolis-01:15355] hwloc:base:get_topology [Metropolis-01:15355] hwloc:base: no cpus specified - using root available cpuset JOB MAP Data for node: Metropolis-01 Num procs: 2 Process OMPI jobid: [59809,1] App: 0 Pro

Re: [OMPI devel] SM component init unload

2012-07-03 Thread Ralph Castain
Sounds strange - the locality is definitely being set in the code. Can you run it with -mca hwloc_base_verbose 5 --display-map? Should tell us where it thinks things are running, and what locality it is recording. On Jul 3, 2012, at 11:54 AM, Juan Antonio Rico Gallego wrote: > Hello everyone.

[OMPI devel] SM component init unload

2012-07-03 Thread Juan Antonio Rico Gallego
Hello everyone. Maybe you can help me: I checked out a Subversion revision (r26725) from the developers' trunk. I configured with: ../../onecopy/ompi-trunk/configure --prefix=/home/jarico/shared/packages/openmpi-cas-dbg --disable-shared --enable-static --enable-debug --enable-mem-profile --enable-mem-debug CFLAG

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26707 - in trunk/ompi: config mca/btl/ofud mca/btl/openib mca/common/ofacm mca/common/ofautils mca/dpm

2012-07-03 Thread Jeff Squyres
On Jul 2, 2012, at 6:09 PM, Steve Wise wrote: >> Can you extend this new stuff to support RDMACM, including the warp-needed >> connector-sends-first stuff? > > I have no time right now. I could test something perhaps if someone can do > the initial pull of the rdma cpc code into the ofacm...

Re: [OMPI devel] openib btl and cq overflows

2012-07-03 Thread Jeff Squyres
We talked about this on the weekly call today. Conclusions: 1. Looks like we just goofed on the CQ default size values. Doh! 2. There does not appear to be any reason we're not using the device CQ max size by default. Ticket #3152 changes the trunk to do this (and we'll CMR to v1.6 and v1.7).

[OMPI devel] Fwd: EuroMPI 2012 Call for participation

2012-07-03 Thread Jeff Squyres
Hope to see some of you in Austria! > EuroMPI 2012 CALL FOR PARTICIPATION > > EuroMPI 2012 - the prime meeting for researchers, developers, and > students in message-passing parallel computing with MPI (and related > paradigms) - calls for *active participation* in the conference which > will ta