Sounds like something has been broken - what Jeff describes is the intended 
behavior

> On May 16, 2016, at 8:00 AM, Gilles Gouaillardet 
> <gilles.gouaillar...@gmail.com> wrote:
> 
> Jeff,
> 
> this is not what I observed
> (tcp btl, 2 to 4 nodes with one task per node, cutoff=0)
> the add_procs of the tcp btl is invoked once with the 4 tasks.
> I checked the sources and found that cutoff only controls whether the modex 
> is performed once for all procs at init, or on demand.
> 
> Cheers,
> 
> Gilles
> 
> On Monday, May 16, 2016, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> We changed the way BTL add_procs is invoked on master and v2.x for 
> scalability reasons.
> 
> In short: add_procs is only invoked the first time you talk to a given peer.  
> The cutoff switch is an override to that -- if the size of COMM_WORLD is less 
> than the cutoff, we revert to the old behavior of calling add_procs for all 
> procs.
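A minimal sketch of the behavior described just above, as a toy model rather than Open MPI code. Every name here (toy_job_t, toy_init, toy_talk_to) is hypothetical, and the counter only tracks how many add_procs-style calls the model would make.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not Open MPI symbols. */
typedef struct {
    size_t world_size;       /* number of procs in COMM_WORLD */
    size_t cutoff;           /* stands in for mpi_add_procs_cutoff */
    bool  *added;            /* added[p] is true once peer p has been added */
    size_t add_procs_calls;  /* how many add_procs-style calls were made */
} toy_job_t;

/* Init time: below the cutoff, add every proc in one call (old behavior). */
static void toy_init(toy_job_t *job)
{
    assert(job != NULL && job->added != NULL);
    job->add_procs_calls = 0;
    for (size_t p = 0; p < job->world_size; p++) {
        job->added[p] = false;
    }
    if (job->world_size < job->cutoff) {
        for (size_t p = 0; p < job->world_size; p++) {
            job->added[p] = true;
        }
        job->add_procs_calls = 1;   /* one call covering all procs */
    }
}

/* First contact with a peer: add it on demand if init did not. */
static void toy_talk_to(toy_job_t *job, size_t peer)
{
    assert(peer < job->world_size);
    if (!job->added[peer]) {
        job->added[peer] = true;
        job->add_procs_calls++;     /* one call per newly contacted peer */
    }
}
```

In this model, 4 procs with a cutoff of 1024 give a single call at init, while a cutoff of 0 gives no call at init and one call per peer on first contact.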
> 
> As for why one BTL would be chosen over another, be sure to look at not only 
> the priority of the component/module, but also the exclusivity level.  In 
> short, only BTLs with the same exclusivity level will be considered (e.g., 
> this is how we exclude TCP when using HPC-class networks), and then the BTL 
> modules with the highest priority will be used for a given peer.
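The selection rule in the paragraph above could be sketched roughly as follows. This is a toy model under stated assumptions: toy_btl_t and toy_select_btl are hypothetical names, the numbers in the usage note are made up, and the sketch returns a single winner although several modules may be used for a peer in practice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BTL module; not Open MPI's real structure. */
typedef struct {
    const char *name;
    int exclusivity;   /* a higher level excludes all lower levels */
    int priority;      /* tie-breaker within one exclusivity level */
    int reachable;     /* nonzero if this BTL reported the peer reachable */
} toy_btl_t;

/* Return the chosen BTL for a peer, or NULL if none can reach it. */
static const toy_btl_t *toy_select_btl(const toy_btl_t *btls, size_t n)
{
    const toy_btl_t *best = NULL;
    assert(btls != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const toy_btl_t *b = &btls[i];
        if (!b->reachable) {
            continue;   /* a BTL that did not flag the peer is never picked */
        }
        if (best == NULL
            || b->exclusivity > best->exclusivity
            || (b->exclusivity == best->exclusivity
                && b->priority > best->priority)) {
            best = b;
        }
    }
    return best;
}
```

For example, given a reachable entry with exclusivity 100 and two reachable entries with exclusivity 1024, the function picks whichever of the latter two has the higher priority; an entry with an even higher exclusivity is ignored if its reachable flag is zero.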
> 
> 
> > On May 16, 2016, at 7:19 AM, Gilles Gouaillardet 
> > <gilles.gouaillar...@gmail.com> wrote:
> >
> > it seems I misunderstood some things ...
> >
> > add_procs is always invoked, regardless of the cutoff value.
> > cutoff controls whether process info is retrieved via the modex "on demand" 
> > or at init time.
> >
> > Could someone please correct me and/or elaborate if needed?
> >
> > Cheers,
> >
> > Gilles
> >
> > On Monday, May 16, 2016, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> > i cannot reproduce this behavior.
> >
> > note mca_btl_tcp_add_procs is invoked once per tcp component (e.g. once per 
> > physical NIC)
> >
> > so you might want to explicitly select one NIC
> >
> > mpirun --mca btl_tcp_if_include xxx ...
> >
> > my printf output is the same regardless of the mpi_add_procs_cutoff value
> >
> >
> > Cheers,
> >
> >
> > Gilles
> > On 5/16/2016 12:22 AM, dpchoudh . wrote:
> >> Sorry, I accidentally pressed 'Send' before I was done writing the last 
> >> mail. What I wanted to ask was: what is the parameter mpi_add_procs_cutoff, 
> >> and why does adding it seem to make a difference in the code path but not 
> >> in the end result of the program? How would it help me debug my problem?
> >>
> >> Thank you
> >> Durga
> >>
> >> The surgeon general advises you to eat right, exercise regularly and quit 
> >> ageing.
> >>
> >> On Sun, May 15, 2016 at 11:17 AM, dpchoudh . <dpcho...@gmail.com> wrote:
> >> Hello Gilles
> >>
> >> Setting -mca mpi_add_procs_cutoff 1024 indeed makes a difference to the 
> >> output, as follows:
> >>
> >> With -mca mpi_add_procs_cutoff 1024:
> >> reachable =     0x1
> >> (Note that add_procs() was called once and the value of 'reachable' is 
> >> correct.)
> >>
> >> Without -mca mpi_add_procs_cutoff 1024:
> >> reachable =     0x0
> >> reachable = NULL
> >> reachable = NULL
> >> (Note that add_procs() was called three times and the value of 
> >> 'reachable' seems wrong.)
> >>
> >> The program does run correctly in either case. The program listing is as 
> >> below (note that I have removed output from the program itself in the 
> >> above reporting.)
> >>
> >> The code that prints 'reachable' is as follows:
> >>
> >> if (reachable == NULL)
> >>     printf("reachable = NULL\n");
> >> else
> >> {
> >>     int i;
> >>     printf("reachable = ");
> >>     /* print each bitmap word in hex (the "0x" prefix needs %llx, not %llu) */
> >>     for (i = 0; i < reachable->array_size; i++)
> >>         printf("\t0x%llx", (unsigned long long) reachable->bitmap[i]);
> >>     printf("\n\n");
> >> }
> >> return OPAL_SUCCESS;
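For trying the printing logic outside the BTL, here is a self-contained sketch. It assumes a stand-in struct with only the two fields used in the snippet above (bitmap and array_size); toy_bitmap_t and toy_format_reachable are hypothetical names, not the real opal_bitmap_t API. It formats into a caller-supplied buffer and uses PRIx64 so the digits agree with the "0x" prefix.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the bitmap in the snippet above; only the two fields it uses. */
typedef struct {
    uint64_t *bitmap;   /* array of 64-bit words */
    int array_size;     /* number of words in bitmap */
} toy_bitmap_t;

/* Write "reachable = ..." into out; returns the number of characters written
 * (assuming out is large enough to hold the whole line). */
static int toy_format_reachable(const toy_bitmap_t *reachable,
                                char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);
    if (reachable == NULL) {
        return snprintf(out, out_len, "reachable = NULL\n");
    }
    int used = snprintf(out, out_len, "reachable =");
    for (int i = 0; i < reachable->array_size && (size_t) used < out_len; i++) {
        /* one hex word per entry, e.g. " 0x2" when only bit 1 is set */
        used += snprintf(out + used, out_len - (size_t) used,
                         " 0x%" PRIx64, reachable->bitmap[i]);
    }
    if ((size_t) used < out_len) {
        used += snprintf(out + used, out_len - (size_t) used, "\n");
    }
    return used;
}
```

Passing NULL yields the "reachable = NULL" line, mirroring the check in the quoted snippet; a single word with only bit 1 set is rendered as 0x2.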
> >>
> >> And the code for the test program is as follows:
> >>
> >> #include <mpi.h>
> >> #include <stdio.h>
> >> #include <string.h>
> >> #include <stdlib.h>
> >>
> >> int main(int argc, char *argv[])
> >> {
> >>     int world_size, world_rank, name_len;
> >>     char hostname[MPI_MAX_PROCESSOR_NAME], buf[8];
> >>
> >>     MPI_Init(&argc, &argv);
> >>     MPI_Comm_size(MPI_COMM_WORLD, &world_size);
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
> >>     MPI_Get_processor_name(hostname, &name_len);
> >>     printf("Hello world from processor %s, rank %d out of %d processors\n",
> >>            hostname, world_rank, world_size);
> >>     if (world_rank == 1)
> >>     {
> >>         /* rank 1 receives the 6-byte string (including the NUL) from rank 0 */
> >>         MPI_Recv(buf, 6, MPI_CHAR, 0, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
> >>         printf("%s received %s, rank %d\n", hostname, buf, world_rank);
> >>     }
> >>     else if (world_rank == 0)
> >>     {
> >>         /* only rank 0 sends, so the single receive on rank 1 is matched */
> >>         strcpy(buf, "haha!");
> >>         MPI_Send(buf, 6, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
> >>         printf("%s sent %s, rank %d\n", hostname, buf, world_rank);
> >>     }
> >>     MPI_Barrier(MPI_COMM_WORLD);
> >>     MPI_Finalize();
> >>     return 0;
> >> }
> >>
> >>
> >>
> >> On Sun, May 15, 2016 at 10:49 AM, Gilles Gouaillardet 
> >> <gilles.gouaillar...@gmail.com> wrote:
> >> At first glance, that seems a bit odd...
> >> are you sure you correctly print the reachable bitmap ?
> >> I would suggest you add some instrumentation to understand what happens
> >> (e.g., printf before opal_bitmap_set_bit() and other places that prevent 
> >> this from happening)
> >>
> >> one more thing ...
> >> now, master default behavior is
> >> mpirun --mca mpi_add_procs_cutoff 0 ...
> >> you might want to try
> >> mpirun --mca mpi_add_procs_cutoff 1024 ...
> >> and see if things make more sense.
> >> if it helps, and iirc, there is a parameter so a btl can report it does 
> >> not support cutoff.
> >>
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
> >> Hello Gilles
> >>
> >> Thanks for jumping in to help again. Actually, I had already tried some of 
> >> your suggestions before asking for help.
> >>
> >> I have several interconnects that can run both the openib and tcp BTLs. To 
> >> simplify things, I explicitly selected TCP:
> >>
> >> mpirun -np 2 -hostfile ~/hostfile -mca pml ob1 -mca btl self,tcp ./mpitest
> >>
> >> where mpitest is a small program that does MPI_Send()/MPI_Recv() on a 
> >> small string, and then does an MPI_Barrier(). The program does work as 
> >> expected.
> >>
> >> I put a printf on the last line of mca_btl_tcp_add_procs() to print the 
> >> value of 'reachable'. What I saw was that the value was always 0 when it 
> >> was invoked for Send()/Recv(), and the pointer itself was NULL when it was 
> >> invoked for Barrier().
> >>
> >> Next I looked at pml_ob1_add_procs(), where the call chain starts, and 
> >> found that it initializes and passes an opal_bitmap_t reachable down the 
> >> call chain, but the resulting value is not used later in the code (the 
> >> memory is simply freed later).
> >>
> >> That, coupled with the fact that I am trying to imitate what the other BTL 
> >> implementations are doing, yet in mca_bml_r2_endpoint_add_btl() my BTL is 
> >> not being picked up, left me puzzled. Please note that the interconnect 
> >> that I am developing for is on a different cluster (than the one where I 
> >> ran the above test for the TCP BTL).
> >>
> >> Thanks again
> >> Durga
> >>
> >> On Sun, May 15, 2016 at 10:20 AM, Gilles Gouaillardet 
> >> <gilles.gouaillar...@gmail.com> wrote:
> >> did you check the add_procs callbacks ?
> >> (e.g. mca_btl_tcp_add_procs() for the tcp btl)
> >> this is where the reachable bitmap is set, and I guess this is what you 
> >> are looking for.
> >>
> >> keep in mind that if several btl can be used, the one with the higher 
> >> exclusivity is used
> >> (e.g. tcp is never used if openib is available)
> >> you can simply force your btl and self, and the ob1 pml, so you do not 
> >> have to worry about other btl exclusivity.
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >>
> >> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
> >> Hello all
> >>
> >> I have been struggling with this issue for a while and figured it might be 
> >> a good idea to ask for help.
> >>
> >> Where (in the code path) is the connectivity map created?
> >>
> >> I can see that it is *used* in mca_bml_r2_endpoint_add_btl(), but 
> >> obviously I am not setting it up right, because this routine is not 
> >> finding the BTL corresponding to my interconnect.
> >>
> >> Thanks in advance
> >> Durga
> >>
> >> _______________________________________________
> >> devel mailing list
> >> de...@open-mpi.org
> >> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> >> Link to this post: 
> >> http://www.open-mpi.org/community/lists/devel/2016/05/18975.php
> >>
> >>
> >
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 