Sebastien,

Your analysis is correct when the checkpoint/restart approach maintained by ORNL is enabled. That is not the code path taken by "normal" MPI processes, which use the OB1 PML. In that generic case, the function mca_pml_ob1_iprobe, defined in the file ompi/mca/pml/ob1/pml_ob1_iprobe.c, is used.
george.

On Sep 27, 2011, at 14:36, Sébastien Boisvert wrote:

> Hello,
>
> As I understand it, when MPI_Iprobe is called, the code that runs is the
> function pointed to by the attribute
>
>     mca_pml_base_module_iprobe_fn_t pml_iprobe;
>
> in ompi/mca/pml/pml.h
>
> In the file ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c (Open-MPI 1.4.3),
> ompi_crcp_bkmrk_pml_iprobe calls drain_message_find_any.
>
> In drain_message_find_any (in ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c), there is
> a loop over all MPI ranks regardless of the peer parameter.
> For instance, with 256 peers, probing for peer 255 requires 256 iterations
> while probing for peer 0 requires 1 iteration.
>
> As I understand it, the linked list ompi_crcp_bkmrk_pml_peer_refs is
> populated with nprocs entries, where nprocs is presumably the number of MPI
> ranks in MPI_COMM_WORLD.
>
> If my understanding is right, here are some suggestions:
>
> 1. ompi_crcp_bkmrk_pml_peer_refs should be an array so that when peer is not
> MPI_ANY_SOURCE, MPI_Iprobe can return in constant time.
>
> 2. There should be some sort of round-robin mechanism for the case where the
> peer is MPI_ANY_SOURCE; otherwise lower ranks will be probed more often and
> higher ranks will suffer from starvation. This could be done by keeping a
> current position in the peer list (or array, see point 1). Instead of
> starting the loop at the first entry, the loop would start at the current
> position, and a maximum of nprocs iterations would take place.
>
> A code review is on my blog:
> http://dskernel.blogspot.com/2011/09/code-review-what-happens-in-open-mpis.html
>
> Sébastien
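
For illustration only, the two suggestions in the quoted message (an array indexed by rank, plus a round-robin cursor for the wildcard case) could be sketched roughly as below. This is not Open MPI code: every name here (sketch_peer_t, sketch_peer_table_t, sketch_probe, SKETCH_ANY_SOURCE, SKETCH_NO_MATCH, SKETCH_MAX_PEERS) is hypothetical, the per-peer record is reduced to a single counter, and the fixed-size table is an assumption made to keep the example short.

```c
#include <assert.h>

#define SKETCH_ANY_SOURCE (-1)   /* hypothetical stand-in for MPI_ANY_SOURCE */
#define SKETCH_NO_MATCH   (-2)   /* hypothetical "nothing pending" result */
#define SKETCH_MAX_PEERS  256    /* assumed upper bound, for the sketch only */

/* Hypothetical per-peer record; the real peer reference holds more state. */
typedef struct {
    int pending;                 /* drained messages waiting from this peer */
} sketch_peer_t;

typedef struct {
    sketch_peer_t peers[SKETCH_MAX_PEERS];  /* indexed directly by rank */
    int nprocs;                              /* number of ranks in use */
    int cursor;                              /* next rank to try for wildcard probes */
} sketch_peer_table_t;

static void sketch_table_init(sketch_peer_table_t *t, int nprocs)
{
    assert(nprocs > 0 && nprocs <= SKETCH_MAX_PEERS);
    t->nprocs = nprocs;
    t->cursor = 0;
    for (int i = 0; i < nprocs; ++i) {
        t->peers[i].pending = 0;
    }
}

/* Returns the rank that has a pending message, or SKETCH_NO_MATCH. */
static int sketch_probe(sketch_peer_table_t *t, int peer)
{
    if (peer != SKETCH_ANY_SOURCE) {
        /* Suggestion 1: a specific peer is a single array lookup. */
        if (peer < 0 || peer >= t->nprocs) {
            return SKETCH_NO_MATCH;
        }
        return (t->peers[peer].pending > 0) ? peer : SKETCH_NO_MATCH;
    }

    /* Suggestion 2: start at the cursor, visit at most nprocs entries. */
    for (int i = 0; i < t->nprocs; ++i) {
        int rank = (t->cursor + i) % t->nprocs;
        if (t->peers[rank].pending > 0) {
            t->cursor = (rank + 1) % t->nprocs;  /* resume after the match */
            return rank;
        }
    }
    return SKETCH_NO_MATCH;
}
```

With four ranks and messages pending from ranks 0 and 3, successive wildcard probes in this sketch return 0, then 3, then 0 again, so a high rank is not shadowed by a low one. The probe does not consume the message, which mirrors the non-destructive behaviour of MPI_Iprobe.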