You are absolutely right, the peer should never be set to -1 in any of
the PERUSE callbacks. I checked the code this morning and figured out
what the problem was. We reported the peer and the tag attached to a
request before setting the right values (some code had been moved
around). I submitted a patch and created a "move request" to get this
correction into one of our stable releases as soon as possible. The
move request can be followed in our TRAC system at the following link:
https://svn.open-mpi.org/trac/ompi/ticket/1845
If you want to try this change, please update your Open MPI
installation to a nightly build or a fresh checkout from SVN at
revision 20844 or later (a nightly build including this change will be
posted on our website tomorrow morning).
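
To illustrate the kind of ordering problem described above, here is a
toy sketch in C. It is not the actual code from pml_ob1_recvreq.c: the
toy_request_t type, the toy_* functions and the assumption that an
unset peer holds -1 are all hypothetical, chosen only to show why a
callback fired too early sees a stale value.

```c
#include <stdio.h>

/* Hypothetical stand-in for a receive request; NOT Open MPI's real type. */
typedef struct {
    int peer; /* source rank; assumed to hold -1 until the match is recorded */
    int tag;
} toy_request_t;

/* Hypothetical event callback signature used only by this sketch. */
typedef void (*toy_event_cb)(const toy_request_t *req, void *param);

/* Buggy order: the event is reported before peer/tag are filled in,
 * so the callback observes whatever the request held beforehand. */
void toy_match_buggy(toy_request_t *req, int src, int tag,
                     toy_event_cb cb, void *param)
{
    cb(req, param);
    req->peer = src;
    req->tag = tag;
}

/* Fixed order: peer/tag are set first, then the event is reported. */
void toy_match_fixed(toy_request_t *req, int src, int tag,
                     toy_event_cb cb, void *param)
{
    req->peer = src;
    req->tag = tag;
    cb(req, param);
}

/* Example callback: stores the peer it was shown into *param (an int). */
void toy_record_peer(const toy_request_t *req, void *param)
{
    *(int *)param = req->peer;
}
```

With a request initialized to { -1, -1 }, toy_match_buggy hands the
callback a peer of -1, while toy_match_fixed hands it the real source
rank.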
Thanks,
george.
On Mar 23, 2009, at 13:23 , Samuel K. Gutierrez wrote:
Hi Kiril,
Appreciate the quick response.
Hi Samuel,
On Sat, 21 Mar 2009 18:18:54 -0600 (MDT)
"Samuel K. Gutierrez" <sam...@lanl.gov> wrote:
Hi All,
I'm writing a simple profiling library which utilizes
PERUSE. My callback
So am I :)
function counts communication events (see example code
below). I noticed
that in OMPI v1.3 spec->peer is sometimes a negative
value (OMPI v1.2.6
did not exhibit this behavior). I added some boundary
checks, but it
seems as if this is a bug? I hope I'm not missing
something...
It took me quite some time to reproduce the error - I also
Sorry about that - I should have provided more information.
got peer value "-1" for the Peruse peruse_comm_spec_t
struct. I only managed to reproduce this with
communication of a process with itself, which is an
unusual scenario. Anyway, for all the tests I did, the
error happened only when:
- a process communicates with itself
- the MPI receive call is made
- the Peruse event "PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q" is
  triggered
That's interesting... Nice work!
The file ompi/mca/pml/ob1/pml_ob1_recvreq.c seems to be
the place where the above event is called with a wrong
value of the peer attribute.
I will let you know if I find something.
I will also take a look.
Best regards,
Kiril
The peruse test provided in the OMPI v1.3 source
exhibits similar behavior:
mpirun -np 2 ./mpi_peruse | grep peer:-1
int callback(peruse_event_h event_h, MPI_Aint unique_id,
             peruse_comm_spec_t *spec, void *param) {
    if (spec->peer == rank) {
        return MPI_SUCCESS;
    }
    rrCounts[spec->peer]++;
    return MPI_SUCCESS;
}
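
For anyone running against a build without the fix, the boundary check
mentioned above could look roughly like the sketch below. The helper
name count_peer_event and its parameters (counts, comm_size, my_rank)
are hypothetical and not part of PERUSE or Open MPI. It only shows one
way to keep a negative or out-of-range peer from indexing the counter
array.

```c
#include <stddef.h>

/* Hypothetical helper (not a PERUSE or Open MPI function).
 * Counts one event for `peer` only if it is a valid rank in
 * [0, comm_size) and is not our own rank.
 * Returns 1 if the event was counted, 0 if it was skipped. */
int count_peer_event(unsigned long *counts, int comm_size,
                     int my_rank, int peer)
{
    if (counts == NULL || peer < 0 || peer >= comm_size ||
        peer == my_rank) {
        return 0; /* skip: invalid peer (e.g. -1) or self */
    }
    counts[peer]++;
    return 1;
}
```

Inside the callback this would replace the direct increment, e.g.
count_peer_event(rrCounts, size, rank, spec->peer), where size is
assumed to hold the communicator size and rrCounts is assumed to have
at least size entries.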
Any insight is greatly appreciated.
Thanks,
Samuel K. Gutierrez
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
Appreciate the help,
Samuel K. Gutierrez