Hi Samuel,

On Sat, 21 Mar 2009 18:18:54 -0600 (MDT)
 "Samuel K. Gutierrez" <sam...@lanl.gov> wrote:
Hi All,

I'm writing a simple profiling library which utilizes PERUSE. My callback

So am I :)

function counts communication events (see example code below). I noticed that in OMPI v1.3 spec->peer is sometimes a negative value (OMPI v1.2.6 did not exhibit this behavior). I added some boundary checks, but it seems as if this is a bug? I hope I'm not missing something...

It took me quite some time to reproduce the error, but I also got a peer value of -1 in the Peruse peruse_comm_spec_t struct. I only managed to reproduce it when a process communicates with itself, which is an unusual scenario. In all the tests I ran, the error happened only when:

- a process communicates with itself
- the MPI receive call is made
- the Peruse event "PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q" is triggered


The file ompi/mca/pml/ob1/pml_ob1_recvreq.c seems to be the place where the above event is fired with the wrong value for the peer attribute.
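
Until the source of the wrong value is found, a callback can guard the array index itself. The following is only a minimal sketch under stated assumptions: comm_spec_stub_t is a stand-in for peruse_comm_spec_t that carries just the peer field, and peer_counts_t and record_peer are hypothetical names, not part of PERUSE or Open MPI. It compiles without MPI so the range check can be tried in isolation.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the single field of peruse_comm_spec_t used here.
 * Assumption: the real struct (from peruse.h) has more members. */
typedef struct {
    int peer;
} comm_spec_stub_t;

/* Hypothetical per-process tally of communication events. */
typedef struct {
    int   comm_size;     /* number of ranks in the communicator */
    long *per_peer;      /* comm_size counters, indexed by peer rank */
    long  invalid_peer;  /* events seen with an out-of-range peer, e.g. -1 */
} peer_counts_t;

/* Count one event. Returns 1 if it was counted for a remote peer,
 * 0 if it was skipped (self-communication or out-of-range peer). */
static int record_peer(peer_counts_t *counts, const comm_spec_stub_t *spec,
                       int my_rank)
{
    assert(counts != NULL && spec != NULL);

    /* Never index the array with a peer outside [0, comm_size). */
    if (spec->peer < 0 || spec->peer >= counts->comm_size) {
        counts->invalid_peer++;
        return 0;
    }
    /* Ignore events whose peer is this process. */
    if (spec->peer == my_rank) {
        return 0;
    }
    counts->per_peer[spec->peer]++;
    return 1;
}
```

In a real callback the same check would be applied to spec->peer before it is used as an index, and the invalid_peer counter shows how often the bad value turns up.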

I will let you know if I find something.


Best regards,
Kiril


The peruse test provided in the OMPI v1.3 source exhibits similar behavior:
mpirun -np 2 ./mpi_peruse | grep peer:-1

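int callback(peruse_event_h event_h, MPI_Aint unique_id,
             peruse_comm_spec_t *spec, void *param) {
   /* Skip events whose peer is this process. */
   if (spec->peer == rank) {
       return MPI_SUCCESS;
   }
   /* With OMPI v1.3 spec->peer can be negative here, which would
      index rrCounts out of bounds. */
   rrCounts[spec->peer]++;
   return MPI_SUCCESS;
}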


Any insight is greatly appreciated.

Thanks,

Samuel K. Gutierrez
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
