Re: [OMPI devel] some info is not pushed into the dstore

2014-05-28 Thread Ralph Castain
On May 28, 2014, at 1:18 AM, Gilles Gouaillardet wrote:

> i finally got it :-)

Hooray! Thanks for digging deeper.

> /* i previously got it "almost" right ... */
>
> here is what happens on job 2 (with trunk) :
> MPI_Intercomm_create calls ompi_comm_get_rprocs that calls ompi_proc_unpack
>

Re: [OMPI devel] some info is not pushed into the dstore

2014-05-28 Thread Gilles Gouaillardet
i finally got it :-)

/* i previously got it "almost" right ... */

here is what happens on job 2 (with trunk) :
MPI_Intercomm_create calls ompi_comm_get_rprocs that calls ompi_proc_unpack
=> ompi_proc_unpack stores job 3 info into opal_dstore_peer
then ompi_comm_get_rprocs calls ompi_proc_set_loc

Re: [OMPI devel] some info is not pushed into the dstore

2014-05-27 Thread Ralph Castain
Hmmm...I did some digging, and as best I can tell, the root cause is that the second job ("b" in the test program) never actually calls connect_accept! It looks like a change in Intercomm_create may be causing it to not recognize the need to do so. Anyone confirm

Re: [OMPI devel] some info is not pushed into the dstore

2014-05-27 Thread Ralph Castain
Hi Gilles,

I concur on the typo and fixed it - thanks for catching it. I'll have to look into the problem you reported as it has been fixed in the past, and was working last I checked it. The info required for this 3-way connect/accept is supposed to be in the modex provided by the common commun

[OMPI devel] some info is not pushed into the dstore

2014-05-27 Thread Gilles Gouaillardet
Folks,

while debugging the dynamic/intercomm_create from the ibm test suite, i found something odd.

i ran *without* any batch manager on a VM (one socket and four cpus)
mpirun -np 1 ./dynamic/intercomm_create

it hangs by default
it works with --mca coll ^ml

basically :
- task 0 spawns task 1
-