Actually, we wouldn't have to modify the interface - just define a DB_RTE flag and OR it with the DB_INTERNAL/DB_EXTERNAL one. We'd need to modify the "fetch" routines so the flag can be passed in and we fetch the right things, but that's a simple change.
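A minimal sketch of the idea - every name below (db_entry_t, want_entry, the flag values, the key strings) is a hypothetical placeholder for illustration, not the actual ORTE db framework API:

-----
/* Hypothetical sketch only -- illustrative names, not the real db API. */
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

#define DB_INTERNAL 0x01   /* existing: keep in the local hash only */
#define DB_EXTERNAL 0x02   /* existing: push out (e.g. via PMI) */
#define DB_RTE      0x04   /* proposed: ORed in to mark an RTE key */

typedef struct {
    const char *key;
    uint8_t     flags;   /* DB_INTERNAL/DB_EXTERNAL, optionally | DB_RTE */
} db_entry_t;

/* fetch-side filter: keep or skip RTE keys based on what the caller wants */
static int want_entry(const db_entry_t *e, int rte_keys)
{
    return rte_keys ? (e->flags & DB_RTE) != 0 : (e->flags & DB_RTE) == 0;
}

int main(void)
{
    db_entry_t db[] = {
        { "locality",     DB_INTERNAL | DB_RTE },  /* RTE-specific key */
        { "btl.tcp.addr", DB_EXTERNAL },           /* non-RTE key */
    };
    /* "give me all the non-RTE keys" -- the hook discussed below */
    for (size_t i = 0; i < sizeof(db) / sizeof(db[0]); i++) {
        if (want_entry(&db[i], 0)) {
            printf("non-RTE key: %s\n", db[i].key);
        }
    }
    return 0;
}
-----

The point being that "store" just ORs in one more bit, and "fetch" grows a filter argument rather than a whole new interface.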
On Sep 18, 2013, at 10:12 AM, Ralph Castain <r...@open-mpi.org> wrote:

> I struggled with that myself when doing my earlier patch - part of the reason
> why I added the dpm API.
>
> I don't know how to update the locality without referencing RTE-specific
> keys, so maybe the best thing would be to provide some kind of hook into the
> db that says we want all the non-RTE keys? Would be simple to add that
> capability, though we'd have to modify the interface so we specify "RTE key"
> when doing the initial store.
>
> The "internal" flag is used to avoid re-sending data to the system under
> PMI. We "store" our data as "external" in the PMI components so the data
> gets pushed out, then fetch using PMI and store "internal" to put it in our
> internal hash (see the sketch at the end of this thread). So "internal"
> doesn't mean "non-RTE".
>
> On Sep 18, 2013, at 10:02 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
>> I hit send too early.
>>
>> Now that we move the entire "local" modex, is there any way to trim it
>> down, or to replace the entries that are no longer correct - like the
>> locality?
>>
>> George.
>>
>> On Sep 18, 2013, at 18:53, George Bosilca <bosi...@icl.utk.edu> wrote:
>>
>>> Regarding your comment on the bug trac, I noticed there is a DB_INTERNAL
>>> flag. While I see how to set it, I could not figure out any way to get
>>> it back.
>>>
>>> With the required modification of the DB API, can't we take advantage
>>> of it?
>>>
>>> George.
>>>
>>> On Sep 18, 2013, at 18:52, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>>> Thanks, George - much appreciated.
>>>>
>>>> On Sep 18, 2013, at 9:49 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>
>>>>> The test case was broken. I just pushed a fix.
>>>>>
>>>>> George.
>>>>>
>>>>> On Sep 18, 2013, at 16:49, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>
>>>>>> Hangs with any np > 1.
>>>>>>
>>>>>> However, I'm not sure if that's an issue with the test vs the
>>>>>> underlying implementation.
>>>>>>
>>>>>> On Sep 18, 2013, at 7:40 AM, "Jeff Squyres (jsquyres)"
>>>>>> <jsquy...@cisco.com> wrote:
>>>>>>
>>>>>>> Does it hang when you run with -np 4?
>>>>>>>
>>>>>>> Sent from my phone. No type good.
>>>>>>>
>>>>>>> On Sep 18, 2013, at 4:10 PM, "Ralph Castain" <r...@open-mpi.org> wrote:
>>>>>>>
>>>>>>>> Strange - it works fine for me on my Mac. However, I see one
>>>>>>>> difference - I only run it with np=1.
>>>>>>>>
>>>>>>>> On Sep 18, 2013, at 2:22 AM, Jeff Squyres (jsquyres)
>>>>>>>> <jsquy...@cisco.com> wrote:
>>>>>>>>
>>>>>>>>> On Sep 18, 2013, at 9:33 AM, George Bosilca <bosi...@icl.utk.edu>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> 1. sm doesn't work between spawned processes, so you must have
>>>>>>>>>> another network enabled.
>>>>>>>>>
>>>>>>>>> I know :-). I have tcp available as well (OMPI will abort if you
>>>>>>>>> only run with sm,self, because the comm_spawn will fail with
>>>>>>>>> unreachable errors -- I just tested/proved this to myself).
>>>>>>>>>
>>>>>>>>>> 2. Don't use the test case attached to my email; I left in an
>>>>>>>>>> xterm-based spawn and the debugging, so it can't work without
>>>>>>>>>> xterm support. Instead, use the test case from the trunk, the
>>>>>>>>>> one committed by Ralph.
>>>>>>>>>
>>>>>>>>> I didn't see any "xterm" strings in there, but ok. :-) I ran with
>>>>>>>>> orte/test/mpi/intercomm_create.c, and that hangs for me as well:
>>>>>>>>>
>>>>>>>>> -----
>>>>>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>>>>>>>>> ❯❯❯ mpirun -np 4 intercomm_create
>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 4]
>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 5]
>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 6]
>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 7]
>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 4]
>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 5]
>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 6]
>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 7]
>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>>>>>> [hang]
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>> Similarly, on my Mac, it hangs with no output:
>>>>>>>>>
>>>>>>>>> -----
>>>>>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>>>>>>>>> ❯❯❯ mpirun -np 4 intercomm_create
>>>>>>>>> [hang]
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>> George.
>>>>>>>>>>
>>>>>>>>>> On Sep 18, 2013, at 07:53, "Jeff Squyres (jsquyres)"
>>>>>>>>>> <jsquy...@cisco.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> George --
>>>>>>>>>>>
>>>>>>>>>>> When I build the SVN trunk (r29201) on 64-bit Linux, your
>>>>>>>>>>> attached test case hangs:
>>>>>>>>>>>
>>>>>>>>>>> -----
>>>>>>>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>>>>>>>>>>> ❯❯❯ mpirun -np 4 intercomm_create
>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 4]
>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 5]
>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 6]
>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 7]
>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 4]
>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 5]
>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 6]
>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 7]
>>>>>>>>>>> [hang]
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>> On my Mac, it hangs without printing anything:
>>>>>>>>>>>
>>>>>>>>>>> -----
>>>>>>>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>>>>>>>>>>> ❯❯❯ mpirun -np 4 intercomm_create
>>>>>>>>>>> [hang]
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>> On Sep 18, 2013, at 1:48 AM, George Bosilca <bosi...@icl.utk.edu>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Here is a quick (and definitely not the cleanest) patch that
>>>>>>>>>>>> addresses the MPI_Intercomm issue at the MPI level. It should
>>>>>>>>>>>> be applied after the removal of r29166.
>>>>>>>>>>>>
>>>>>>>>>>>> I also added the corrected test case, stressing the corner
>>>>>>>>>>>> cases by doing barriers at every inter-comm creation and doing
>>>>>>>>>>>> a clean disconnect.
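For reference, the "store external, fetch via PMI, re-store internal" flow Ralph describes above might look roughly like this. Everything here (db_store, pmi_kvs, local_hash, the flag values, the key string) is a hypothetical stand-in for illustration, not the actual OMPI db/PMI code:

-----
/* Hedged sketch of the flow described above -- illustrative names only. */
#include <stdio.h>

#define DB_INTERNAL 0x01  /* keep in our internal hash only */
#define DB_EXTERNAL 0x02  /* publish through PMI to the system */

static char pmi_kvs[64];    /* stands in for the PMI key-value space */
static char local_hash[64]; /* stands in for the internal hash */

static void db_store(const char *value, int flags)
{
    if (flags & DB_EXTERNAL) {
        /* "store external": push the data out via PMI */
        snprintf(pmi_kvs, sizeof(pmi_kvs), "%s", value);
    }
    if (flags & DB_INTERNAL) {
        /* "store internal": local hash only, nothing is re-sent */
        snprintf(local_hash, sizeof(local_hash), "%s", value);
    }
}

int main(void)
{
    /* startup: publish our data so other procs can see it */
    db_store("locality=SAME_NODE", DB_EXTERNAL);

    /* later: fetch through PMI, then store the copy "internal" so
     * future lookups hit the hash without re-sending to the system */
    db_store(pmi_kvs, DB_INTERNAL);

    printf("internal copy: %s\n", local_hash);
    return 0;
}
-----

In other words, "internal" marks where the data lives, not whether it is an RTE key - hence the separate DB_RTE bit sketched at the top of this message.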