Hi,

at least for the specific test program I used, the negative values for the peer attribute disappeared after George's modifications in revision 20844.
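For anyone still on a build older than revision 20844, a range check in the callback keeps a peer of -1 from being used as an array index. The following is only a minimal sketch: `comm_spec_stub_t` stands in for the single `peer` field of `peruse_comm_spec_t` so that the snippet compiles without peruse.h, and `count_peer_event` is a hypothetical helper name, not part of PERUSE or Open MPI.

```c
#include <stddef.h>

/* Stand-in for the one field of peruse_comm_spec_t used here (assumption:
 * a real tool would include peruse.h and use the real struct instead). */
typedef struct {
    int peer;   /* rank of the communication partner */
} comm_spec_stub_t;

/* Hypothetical helper: increment counts[peer] only when peer is a valid
 * rank other than our own. Returns 1 if the event was counted, 0 if it
 * was skipped. */
static int count_peer_event(const comm_spec_stub_t *spec, int my_rank,
                            int comm_size, unsigned long *counts)
{
    if (spec == NULL || counts == NULL) {
        return 0;
    }
    /* Reject out-of-range peers, e.g. the -1 seen before revision 20844. */
    if (spec->peer < 0 || spec->peer >= comm_size) {
        return 0;
    }
    /* Skip self-communication, as the quoted callback does. */
    if (spec->peer == my_rank) {
        return 0;
    }
    counts[spec->peer]++;
    return 1;
}
```

In a real PERUSE callback the helper would be called with the spec pointer passed to the callback, the rank and size of the communicator, and the counter array, and the callback would return MPI_SUCCESS either way.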
One remark: after installation, I had to remove the '#include "ompi_config.h"' line from the "include/peruse.h" header to get PERUSE applications to compile. Otherwise I got a missing header error message for ompi_config.h.

Regards,
Kiril

On Mon, 2009-03-23 at 16:34 -0400, George Bosilca wrote:
> You are absolutely right, the peer should never be set to -1 on any of
> the PERUSE callbacks. I checked the code this morning and figured out
> what was the problem. We report the peer and the tag attached to a
> request before setting the right values (some code moved around). I
> submitted a patch and created a "move request" to have this correction
> as soon as possible on one of our stable releases. The move request
> can be followed using our TRAC system and the following link
> (https://svn.open-mpi.org/trac/ompi/ticket/1845).
> If you want to play with this change please update your Open MPI
> installation to a nightly build or a fresh checkout from the SVN with
> at least revision 20844 (a nightly including this change will be
> posted on our website tomorrow morning).
>
> Thanks,
> george.
>
> On Mar 23, 2009, at 13:23 , Samuel K. Gutierrez wrote:
>
> > Hi Kiril,
> >
> > Appreciate the quick response.
> >
> >> Hi Samuel,
> >>
> >> On Sat, 21 Mar 2009 18:18:54 -0600 (MDT)
> >> "Samuel K. Gutierrez" <sam...@lanl.gov> wrote:
> >>> Hi All,
> >>>
> >>> I'm writing a simple profiling library which utilizes
> >>> PERUSE. My callback
> >>
> >> So am I :)
> >>
> >>> function counts communication events (see example code
> >>> below). I noticed
> >>> that in OMPI v1.3 spec->peer is sometimes a negative
> >>> value (OMPI v1.2.6
> >>> did not exhibit this behavior). I added some boundary
> >>> checks, but it
> >>> seems as if this is a bug? I hope I'm not missing
> >>> something...
> >>
> >> It took me quite some time to reproduce the error - I also
> >
> > Sorry about that - I should have provided more information.
> >
> >> got peer value "-1" for the Peruse peruse_comm_spec_t
> >> struct. I only managed to reproduce this with
> >> communication of a process with itself, which is an
> >> unusual scenario. Anyway, for all the tests I did, the
> >> error happened only when:
> >>
> >> -a process communicates with itself
> >> -the MPI receive call is made
> >> -the Peruse event "PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q" is
> >> triggered
> >
> > That's interesting... Nice work!
> >
> >>
> >>
> >> The file ompi/mca/pml/ob1/pml_ob1_recvreq.c seems to be
> >> the place where the above event is called with a wrong
> >> value of the peer attribute.
> >>
> >> I will let you know if I find something.
> >
> > I will also take a look.
> >
> >>
> >>
> >> Best regards,
> >> Kiril
> >>
> >>>
> >>> The peruse test provided in the OMPI v1.3 source
> >>> exhibits similar behavior:
> >>> mpirun -np 2 ./mpi_peruse | grep peer:-1
> >>>
> >>> int callback(peruse_event_h event_h, MPI_Aint unique_id,
> >>>              peruse_comm_spec_t *spec, void *param) {
> >>>     if (spec->peer == rank) {
> >>>         return MPI_SUCCESS;
> >>>     }
> >>>     rrCounts[spec->peer]++;
> >>>     return MPI_SUCCESS;
> >>> }
> >>>
> >>>
> >>> Any insight is greatly appreciated.
> >>>
> >>> Thanks,
> >>>
> >>> Samuel K. Gutierrez
> >>> _______________________________________________
> >>> devel mailing list
> >>> de...@open-mpi.org
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>
> >>
> >
> > Appreciate the help,
> >
> > Samuel K. Gutierrez
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel