We changed the way BTL add_procs is invoked on master and v2.x for scalability 
reasons.

In short: add_procs is only invoked the first time you talk to a given peer.
The cutoff switch (mpi_add_procs_cutoff) is an override to that -- if the size
of MPI_COMM_WORLD is less than the cutoff, we revert to the old behavior of
calling add_procs for all procs.
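
To make the two modes concrete, here is a minimal C sketch of the behavior
described above. It is not Open MPI code: struct sketch_job, sketch_init(),
sketch_send(), and SKETCH_MAX_PEERS are hypothetical names invented for
illustration, and the real add_procs takes an array of procs rather than one
peer at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PEERS 64   /* illustrative limit, not an Open MPI constant */

/* Hypothetical bookkeeping for one job; not an Open MPI structure. */
struct sketch_job {
    size_t world_size;                /* size of MPI_COMM_WORLD */
    size_t cutoff;                    /* stands in for mpi_add_procs_cutoff */
    bool   added[SKETCH_MAX_PEERS];   /* has this peer been set up yet? */
    size_t peers_added;               /* number of peers set up so far */
};

/* Stand-in for the per-peer work add_procs would do; runs once per peer. */
static void sketch_add_proc(struct sketch_job *job, size_t peer)
{
    if (!job->added[peer]) {
        job->added[peer] = true;
        job->peers_added++;
    }
}

/* At init time: set up every peer only if the job is smaller than the cutoff. */
static void sketch_init(struct sketch_job *job, size_t world_size, size_t cutoff)
{
    assert(world_size <= SKETCH_MAX_PEERS);
    job->world_size = world_size;
    job->cutoff = cutoff;
    job->peers_added = 0;
    for (size_t i = 0; i < SKETCH_MAX_PEERS; i++) {
        job->added[i] = false;
    }
    if (world_size < cutoff) {
        /* Old behavior: add all procs up front. */
        for (size_t p = 0; p < world_size; p++) {
            sketch_add_proc(job, p);
        }
    }
}

/* First communication with a peer triggers its setup if it was deferred. */
static void sketch_send(struct sketch_job *job, size_t peer)
{
    assert(peer < job->world_size);
    sketch_add_proc(job, peer);
    /* The actual data transfer would happen here. */
}
```

With a cutoff of 0 no job is ever smaller than the cutoff, so in this sketch
every peer is set up lazily on first use; with a cutoff of 1024, a 2-process
job sets up both peers during init.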

As for why one BTL would be chosen over another, be sure to look at not only 
the priority of the component/module, but also the exclusivity level.  In 
short, only BTLs with the same exclusivity level will be considered (e.g., this 
is how we exclude TCP when using HPC-class networks), and then the BTL modules 
with the highest priority will be used for a given peer.
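
As a rough illustration of that two-step filter, here is a small C sketch. It
assumes, as the quoted thread below suggests, that the highest exclusivity
level wins and that priority only breaks ties within that level. The struct,
the function name, and all numeric values are hypothetical; the real selection
logic in the r2 BML is more involved.

```c
#include <stddef.h>

/* Hypothetical description of a candidate BTL; not an Open MPI structure. */
struct sketch_btl {
    const char *name;
    int exclusivity;   /* higher level excludes BTLs at lower levels */
    int priority;      /* breaks ties among BTLs at the same level */
};

/* Returns the index of the chosen candidate, or -1 if there are none.
 * A candidate replaces the current best if it has a higher exclusivity,
 * or the same exclusivity and a higher priority. */
static int sketch_select_btl(const struct sketch_btl *btls, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (best < 0 ||
            btls[i].exclusivity > btls[best].exclusivity ||
            (btls[i].exclusivity == btls[best].exclusivity &&
             btls[i].priority > btls[best].priority)) {
            best = (int) i;
        }
    }
    return best;
}
```

Given candidates { "tcp", 100, 50 }, { "fabric_a", 1024, 10 }, and
{ "fabric_b", 1024, 20 } (made-up values), the sketch picks fabric_b: tcp is
dropped because of its lower exclusivity even though its priority is the
highest of the three.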


> On May 16, 2016, at 7:19 AM, Gilles Gouaillardet 
> <gilles.gouaillar...@gmail.com> wrote:
> 
> it seems I misunderstood some things ...
> 
> add_procs is always invoked, regardless of the cutoff value.
> cutoff is used to retrieve processes info via the modex "on demand" vs at 
> init time.
> 
> Please someone correct me and/or elaborate if needed
> 
> Cheers,
> 
> Gilles
> 
> On Monday, May 16, 2016, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> i cannot reproduce this behavior.
> 
> note mca_btl_tcp_add_procs is invoked once per tcp component (e.g. once per 
> physical NIC)
> 
> so you might want to explicitly select one nic
> 
> mpirun --mca btl_tcp_if_include xxx ...
> 
> my printf output is the same regardless of the mpi_add_procs_cutoff value
> 
> 
> Cheers,
> 
> 
> Gilles
> On 5/16/2016 12:22 AM, dpchoudh . wrote:
>> Sorry, I accidentally pressed 'Send' before I was done writing the last 
>> mail. What I wanted to ask was what the parameter mpi_add_procs_cutoff is, 
>> and why adding it seems to make a difference in the code path but not in the 
>> end result of the program. How would it help me debug my problem?
>> 
>> Thank you
>> Durga
>> 
>> The surgeon general advises you to eat right, exercise regularly and quit 
>> ageing.
>> 
>> On Sun, May 15, 2016 at 11:17 AM, dpchoudh . <dpcho...@gmail.com> wrote:
>> Hello Gilles
>> 
>> Setting -mca mpi_add_procs_cutoff 1024 indeed makes a difference to the 
>> output, as follows:
>> 
>> With -mca mpi_add_procs_cutoff 1024:
>> reachable =     0x1
>> (Note that add_procs() was called once and the value of 'reachable' is 
>> correct.)
>> 
>> Without -mca mpi_add_procs_cutoff 1024:
>> reachable =     0x0
>> reachable = NULL
>> reachable = NULL
>> (Note that add_procs() was called three times and the value of 'reachable' 
>> seems wrong.)
>> 
>> The program does run correctly in either case. The program listing is as 
>> below (note that I have removed output from the program itself in the above 
>> reporting.)
>> 
>> The code that prints 'reachable' is as follows:
>> 
>> if (reachable == NULL)
>>     printf("reachable = NULL\n");
>> else
>> {
>>     int i;
>>     printf("reachable = ");
>>     /* bitmap words are 64-bit; print them as hex to match the 0x prefix */
>>     for (i = 0; i < reachable->array_size; i++)
>>         printf("\t0x%llx", (unsigned long long) reachable->bitmap[i]);
>>     printf("\n\n");
>> }
>> return OPAL_SUCCESS;
>> 
>> And the code for the test program is as follows:
>> 
>> #include <mpi.h>
>> #include <stdio.h>
>> #include <string.h>
>> #include <stdlib.h>
>> 
>> int main(int argc, char *argv[])
>> {
>>     int world_size, world_rank, name_len;
>>     char hostname[MPI_MAX_PROCESSOR_NAME], buf[8];
>> 
>>     MPI_Init(&argc, &argv);
>>     MPI_Comm_size(MPI_COMM_WORLD, &world_size);
>>     MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
>>     MPI_Get_processor_name(hostname, &name_len);
>>     printf("Hello world from processor %s, rank %d out of %d processors\n",
>>            hostname, world_rank, world_size);
>>     if (world_rank == 1)
>>     {
>>         MPI_Recv(buf, 6, MPI_CHAR, 0, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>         printf("%s received %s, rank %d\n", hostname, buf, world_rank);
>>     }
>>     else if (world_rank == 0)  /* only rank 0 sends, matching the single receive */
>>     {
>>         strcpy(buf, "haha!");
>>         MPI_Send(buf, 6, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
>>         printf("%s sent %s, rank %d\n", hostname, buf, world_rank);
>>     }
>>     MPI_Barrier(MPI_COMM_WORLD);
>>     MPI_Finalize();
>>     return 0;
>> }
>> 
>> 
>> 
>> 
>> On Sun, May 15, 2016 at 10:49 AM, Gilles Gouaillardet 
>> <gilles.gouaillar...@gmail.com> wrote:
>> At first glance, that seems a bit odd...
>> are you sure you correctly print the reachable bitmap ?
>> I would suggest you add some instrumentation to understand what happens
>> (e.g., printf before opal_bitmap_set_bit() and other places that prevent 
>> this from happening)
>> 
>> one more thing ...
>> now, master default behavior is
>> mpirun --mca mpi_add_procs_cutoff 0 ...
>> you might want to try
>> mpirun --mca mpi_add_procs_cutoff 1024 ...
>> and see if things make more sense.
>> if it helps, and iirc, there is a parameter so a btl can report it does not 
>> support cutoff.
>> 
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
>> Hello Gilles
>> 
>> Thanks for jumping in to help again. Actually, I had already tried some of 
>> your suggestions before asking for help.
>> 
>> I have several interconnects that can run both openib and tcp BTL. To 
>> simplify things, I explicitly mentioned TCP:
>> 
>> mpirun -np 2 -hostfile ~/hostfile -mca pml ob1 -mca btl self,tcp ./mpitest
>> 
>> where mpitest is a small program that does MPI_Send()/MPI_Recv() on a small 
>> string, and then does an MPI_Barrier(). The program does work as expected.
>> 
>> I put a printf on the last line of mca_btl_tcp_add_procs() to print the 
>> value of 'reachable'. What I saw was that the value was always 0 when it was 
>> invoked for Send()/Recv(), and the pointer itself was NULL when invoked for 
>> Barrier().
>> 
>> Next I looked at pml_ob1_add_procs(), where the call chain starts, and found 
>> that it initializes and passes an opal_bitmap_t reachable down the call 
>> chain, but the resulting value is not used later in the code (the memory is 
>> simply freed later).
>> 
>> That, coupled with the fact that I am trying to imitate what the other BTL 
>> implementations are doing, yet in mca_bml_r2_endpoint_add_btl() my BTL is 
>> not being picked up, left me puzzled. Please note that the interconnect that 
>> I am developing for is on a different cluster (than where I ran the above 
>> test for TCP BTL.)
>> 
>> Thanks again
>> Durga
>> 
>> 
>> On Sun, May 15, 2016 at 10:20 AM, Gilles Gouaillardet 
>> <gilles.gouaillar...@gmail.com> wrote:
>> did you check the add_procs callbacks ?
>> (e.g. mca_btl_tcp_add_procs() for the tcp btl)
>> this is where the reachable bitmap is set, and I guess this is what you are 
>> looking for.
>> 
>> keep in mind that if several btl can be used, the one with the higher 
>> exclusivity is used
>> (e.g. tcp is never used if openib is available)
>> you can simply force your btl and self, and the ob1 pml, so you do not have 
>> to worry about other btl exclusivity.
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> 
>> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
>> Hello all
>> 
>> I have been struggling with this issue for a while and figured it might be a 
>> good idea to ask for help.
>> 
>> Where (in the code path) is the connectivity map created?
>> 
>> I can see that it is *used* in mca_bml_r2_endpoint_add_btl(), but obviously 
>> I am not setting it up right, because this routine is not finding the BTL 
>> corresponding to my interconnect.
>> 
>> Thanks in advance
>> Durga
>> 
>> 
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/devel/2016/05/18975.php
>> 
>> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
