Sounds like something has been broken; what Jeff describes is the intended behavior.
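
To make the intended behavior concrete, here is a toy C model of what Jeff describes below. It is only a sketch and not Open MPI code: toy_job_t, toy_init, toy_add_procs and toy_talk_to are made-up names, and the cutoff field merely stands in for mpi_add_procs_cutoff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Open MPI. */
#define TOY_MAX_PROCS 64

typedef struct {
    size_t world_size;            /* number of processes in COMM_WORLD */
    size_t cutoff;                /* stands in for mpi_add_procs_cutoff */
    bool   added[TOY_MAX_PROCS];  /* has add_procs seen this peer yet? */
    int    add_procs_calls;       /* how many times add_procs was invoked */
} toy_job_t;

/* Stand-in for a BTL add_procs callback: records the peers it was handed. */
void toy_add_procs(toy_job_t *job, const size_t *peers, size_t npeers)
{
    job->add_procs_calls++;
    for (size_t i = 0; i < npeers; i++) {
        assert(peers[i] < job->world_size);
        job->added[peers[i]] = true;
    }
}

/* Init time: if the world is smaller than the cutoff, fall back to the old
 * behavior and hand every process to add_procs in a single call. */
void toy_init(toy_job_t *job, size_t world_size, size_t cutoff)
{
    assert(world_size <= TOY_MAX_PROCS);
    job->world_size = world_size;
    job->cutoff = cutoff;
    job->add_procs_calls = 0;
    for (size_t i = 0; i < TOY_MAX_PROCS; i++) {
        job->added[i] = false;
    }

    if (world_size < cutoff) {
        size_t all[TOY_MAX_PROCS];
        for (size_t i = 0; i < world_size; i++) {
            all[i] = i;
        }
        toy_add_procs(job, all, world_size);
    }
}

/* First communication with a peer: add it on demand if it is not known yet. */
void toy_talk_to(toy_job_t *job, size_t peer)
{
    assert(peer < job->world_size);
    if (!job->added[peer]) {
        toy_add_procs(job, &peer, 1);
    }
}
```

In this model, a world of 4 with a cutoff of 1024 gets one add_procs call covering all four processes at init. With a cutoff of 0, nothing is added at init and each new peer triggers its own call the first time it is contacted.
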
> On May 16, 2016, at 8:00 AM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
>
> Jeff,
>
> this is not what I observed
> (tcp btl, 2 to 4 nodes with one task per node, cutoff=0)
> the add_procs of the tcp btl is invoked once with the 4 tasks.
> I checked the sources and found cutoff only controls if the modex is invoked
> once for all at init, or on demand.
>
> Cheers,
>
> Gilles
>
> On Monday, May 16, 2016, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
> wrote:
> We changed the way BTL add_procs is invoked on master and v2.x for
> scalability reasons.
>
> In short: add_procs is only invoked the first time you talk to a given peer.
> The cutoff switch is an override to that -- if the size of COMM_WORLD is less
> than the cutoff, we revert to the old behavior of calling add_procs for all
> procs.
>
> As for why one BTL would be chosen over another, be sure to look at not only
> the priority of the component/module, but also the exclusivity level. In
> short, only BTLs with the same exclusivity level will be considered (e.g.,
> this is how we exclude TCP when using HPC-class networks), and then the BTL
> modules with the highest priority will be used for a given peer.
>
> > On May 16, 2016, at 7:19 AM, Gilles Gouaillardet
> > <gilles.gouaillar...@gmail.com> wrote:
> >
> > it seems I misunderstood some things ...
> >
> > add_procs is always invoked, regardless of the cutoff value.
> > cutoff is used to retrieve processes info via the modex "on demand" vs at
> > init time.
> >
> > Please someone correct me and/or elaborate if needed
> >
> > Cheers,
> >
> > Gilles
> >
> > On Monday, May 16, 2016, Gilles Gouaillardet <gil...@rist.or.jp>
> > wrote:
> > i cannot reproduce this behavior.
> >
> > note mca_btl_tcp_add_procs is invoked once per tcp component (e.g. once per
> > physical NIC)
> >
> > so you might want to explicitly select one nic
> >
> > mpirun --mca btl_tcp_if_include xxx ...
> >
> > my printf output is the same regardless of the mpi_add_procs_cutoff value
> >
> > Cheers,
> >
> > Gilles
> > On 5/16/2016 12:22 AM, dpchoudh . wrote:
> >> Sorry, I accidentally pressed 'Send' before I was done writing the last
> >> mail. What I wanted to ask was what is the parameter mpi_add_procs_cutoff
> >> and why adding it seems to make a difference in the code path but not in
> >> the end result of the program? How would it help me debug my problem?
> >>
> >> Thank you
> >> Durga
> >>
> >> The surgeon general advises you to eat right, exercise regularly and quit
> >> ageing.
> >>
> >> On Sun, May 15, 2016 at 11:17 AM, dpchoudh . <dpcho...@gmail.com>
> >> wrote:
> >> Hello Gilles
> >>
> >> Setting -mca mpi_add_procs_cutoff 1024 indeed makes a difference to the
> >> output, as follows:
> >>
> >> With -mca mpi_add_procs_cutoff 1024:
> >> reachable = 0x1
> >> (Note that add_procs was called once and the value of 'reachable' is
> >> correct.)
> >>
> >> Without -mca mpi_add_procs_cutoff 1024:
> >> reachable = 0x0
> >> reachable = NULL
> >> reachable = NULL
> >> (Note that add_procs() was called three times and the value of
> >> 'reachable' seems wrong.)
> >>
> >> The program does run correctly in either case. The program listing is as
> >> below (note that I have removed output from the program itself in the
> >> above reporting.)
> >>
> >> The code that prints 'reachable' is as follows:
> >>
> >> if (reachable == NULL)
> >>     printf("reachable = NULL\n");
> >> else
> >> {
> >>     int i;
> >>     printf("reachable = ");
> >>     /* print each bitmap word in hex, to match the 0x prefix */
> >>     for (i = 0; i < reachable->array_size; i++)
> >>         printf("\t0x%llx", (unsigned long long) reachable->bitmap[i]);
> >>     printf("\n\n");
> >> }
> >> return OPAL_SUCCESS;
> >>
> >> And the code for the test program is as follows:
> >>
> >> #include <mpi.h>
> >> #include <stdio.h>
> >> #include <string.h>
> >> #include <stdlib.h>
> >>
> >> int main(int argc, char *argv[])
> >> {
> >>     int world_size, world_rank, name_len;
> >>     char hostname[MPI_MAX_PROCESSOR_NAME], buf[8];
> >>
> >>     MPI_Init(&argc, &argv);
> >>     MPI_Comm_size(MPI_COMM_WORLD, &world_size);
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
> >>     MPI_Get_processor_name(hostname, &name_len);
> >>     printf("Hello world from processor %s, rank %d out of %d processors\n",
> >>            hostname, world_rank, world_size);
> >>     if (world_rank == 1)
> >>     {
> >>         MPI_Recv(buf, 6, MPI_CHAR, 0, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
> >>         printf("%s received %s, rank %d\n", hostname, buf, world_rank);
> >>     }
> >>     else if (world_rank == 0)  /* only rank 0 sends; rank 1 posts one receive */
> >>     {
> >>         strcpy(buf, "haha!");
> >>         MPI_Send(buf, 6, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
> >>         printf("%s sent %s, rank %d\n", hostname, buf, world_rank);
> >>     }
> >>     MPI_Barrier(MPI_COMM_WORLD);
> >>     MPI_Finalize();
> >>     return 0;
> >> }
> >>
> >> The surgeon general advises you to eat right, exercise regularly and quit
> >> ageing.
> >>
> >> On Sun, May 15, 2016 at 10:49 AM, Gilles Gouaillardet
> >> <gilles.gouaillar...@gmail.com> wrote:
> >> At first glance, that seems a bit odd...
> >> are you sure you correctly print the reachable bitmap ?
> >> I would suggest you add some instrumentation to understand what happens
> >> (e.g., printf before opal_bitmap_set_bit() and other places that prevent
> >> this from happening)
> >>
> >> one more thing ...
> >> now, master default behavior is
> >> mpirun --mca mpi_add_procs_cutoff 0 ...
> >> you might want to try
> >> mpirun --mca mpi_add_procs_cutoff 1024 ...
> >> and see if things make more sense.
> >> if it helps, and iirc, there is a parameter so a btl can report it does
> >> not support cutoff.
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com>
> >> wrote:
> >> Hello Gilles
> >>
> >> Thanks for jumping in to help again. Actually, I had already tried some of
> >> your suggestions before asking for help.
> >>
> >> I have several interconnects that can run both openib and tcp BTL. To
> >> simplify things, I explicitly mentioned TCP:
> >>
> >> mpirun -np 2 -hostfile ~/hostfile -mca pml ob1 -mca btl self,tcp ./mpitest
> >>
> >> where mpitest is a small program that does MPI_Send()/MPI_Recv() on a
> >> small string, and then does an MPI_Barrier(). The program does work as
> >> expected.
> >>
> >> I put a printf on the last line of mca_btl_tcp_add_procs() to print the
> >> value of 'reachable'. What I saw was that the value was always 0 when it
> >> was invoked for Send()/Recv() and the pointer itself was NULL when invoked
> >> for Barrier()
> >>
> >> Next I looked at pml_ob1_add_procs(), where the call chain starts, and
> >> found that it initializes and passes an opal_bitmap_t reachable down the
> >> call chain, but the resulting value is not used later in the code (the
> >> memory is simply freed later).
> >>
> >> That, coupled with the fact that I am trying to imitate what the other BTL
> >> implementations are doing, yet in mca_bml_r2_endpoint_add_btl() my BTL is
> >> not being picked up, left me puzzled. Please note that the interconnect
> >> that I am developing for is on a different cluster (than where I ran the
> >> above test for TCP BTL.)
> >>
> >> Thanks again
> >> Durga
> >>
> >> The surgeon general advises you to eat right, exercise regularly and quit
> >> ageing.
> >>
> >> On Sun, May 15, 2016 at 10:20 AM, Gilles Gouaillardet
> >> <gilles.gouaillar...@gmail.com> wrote:
> >> did you check the add_procs callbacks ?
> >> (e.g. mca_btl_tcp_add_procs() for the tcp btl)
> >> this is where the reachable bitmap is set, and I guess this is what you
> >> are looking for.
> >>
> >> keep in mind that if several btl can be used, the one with the higher
> >> exclusivity is used
> >> (e.g. tcp is never used if openib is available)
> >> you can simply force your btl and self, and the ob1 pml, so you do not
> >> have to worry about other btl exclusivity.
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com>
> >> wrote:
> >> Hello all
> >>
> >> I have been struggling with this issue for a while and figured it might be
> >> a good idea to ask for help.
> >>
> >> Where (in the code path) is the connectivity map created?
> >>
> >> I can see that it is *used* in mca_bml_r2_endpoint_add_btl(), but
> >> obviously I am not setting it up right, because this routine is not
> >> finding the BTL corresponding to my interconnect.
> >>
> >> Thanks in advance
> >> Durga
> >>
> >> The surgeon general advises you to eat right, exercise regularly and quit
> >> ageing.
> >>
> >> _______________________________________________
> >> devel mailing list
> >> de...@open-mpi.org
> >> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> >> Link to this post:
> >> http://www.open-mpi.org/community/lists/devel/2016/05/18975.php
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/