You are absolutely right: the peer should never be set to -1 in any of the PERUSE callbacks. I checked the code this morning and figured out what the problem was. We report the peer and the tag attached to a request before setting them to the right values (some code was moved around). I submitted a patch and created a "move request" to get this correction into one of our stable releases as soon as possible. The move request can be followed in our Trac system at the following link (https://svn.open-mpi.org/trac/ompi/ticket/1845). If you want to try this change, please update your Open MPI installation to a nightly build or a fresh SVN checkout at revision 20844 or later (a nightly build including this change will be posted on our website tomorrow morning).
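
To illustrate the ordering problem described above, here is a toy sketch. It is not the Open MPI source: every name in it (toy_request_t, toy_spec_t, toy_match_buggy, toy_match_fixed, toy_record) is hypothetical, and the -1 placeholder for a not-yet-matched request is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT Open MPI or PERUSE types. */
typedef struct { int peer; int tag; } toy_spec_t;     /* what a callback sees */
typedef struct { int peer; int tag; } toy_request_t;  /* a pending receive    */
typedef void (*toy_callback_t)(const toy_spec_t *spec);

/* Last values seen by the callback, kept so the effect can be inspected. */
int toy_last_peer;
int toy_last_tag;

void toy_record(const toy_spec_t *spec)
{
    toy_last_peer = spec->peer;
    toy_last_tag  = spec->tag;
}

/* Buggy order: the event is reported first, so the callback sees whatever
 * placeholder values the request held (assumed to be -1 here). */
void toy_match_buggy(toy_request_t *req, int src, int tag, toy_callback_t cb)
{
    toy_spec_t spec = { req->peer, req->tag };
    cb(&spec);
    req->peer = src;
    req->tag  = tag;
}

/* Fixed order: the matched source and tag are stored before the event is
 * reported, so the callback sees the real values. */
void toy_match_fixed(toy_request_t *req, int src, int tag, toy_callback_t cb)
{
    req->peer = src;
    req->tag  = tag;
    toy_spec_t spec = { req->peer, req->tag };
    cb(&spec);
}
```

With a request initialized to { -1, -1 }, the buggy variant hands the callback a peer of -1 even though the request ends up with the right source, which matches the symptom reported in this thread.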

  Thanks,
    george.

On Mar 23, 2009, at 13:23 , Samuel K. Gutierrez wrote:

Hi Kiril,

Appreciate the quick response.

Hi Samuel,

On Sat, 21 Mar 2009 18:18:54 -0600 (MDT)
 "Samuel K. Gutierrez" <sam...@lanl.gov> wrote:
Hi All,

I'm writing a simple profiling library which utilizes
PERUSE.  My callback

So am I :)

function counts communication events (see example code
below).  I noticed
that in OMPI v1.3 spec->peer is sometimes a negative
value (OMPI v1.2.6
did not exhibit this behavior).  I added some boundary
checks, but it
seems as if this is a bug?  I hope I'm not missing
something...

It took me quite some time to reproduce the error - I also

Sorry about that - I should have provided more information.

got peer value "-1" for the Peruse peruse_comm_spec_t
struct. I only managed to reproduce this with
communication of a process with itself, which is an
unusual scenario. Anyway, for all the tests I did, the
error happened only when:

- a process communicates with itself
- the MPI receive call is made
- the Peruse event "PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q" is
  triggered

That's interesting... Nice work!



The file ompi/mca/pml/ob1/pml_ob1_recvreq.c seems to be
the place where the above event is called with a wrong
value of the peer attribute.

I will let you know if I find something.

I will also take a look.



Best regards,
Kiril


The peruse test provided in the OMPI v1.3 source
exhibits similar behavior:
mpirun -np 2 ./mpi_peruse | grep peer:-1

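/* rank and rrCounts are assumed to be globals defined elsewhere (not shown). */
int callback(peruse_event_h event_h, MPI_Aint unique_id,
             peruse_comm_spec_t *spec, void *param) {
    /* Skip events for communication with ourselves. */
    if (spec->peer == rank) {
        return MPI_SUCCESS;
    }
    /* Count the event against the peer; a negative peer indexes out of bounds. */
    rrCounts[spec->peer]++;
    return MPI_SUCCESS;
}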
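
A bounds-checked variant of the counting logic above might look like the following. This is only a sketch under stated assumptions: mock_comm_spec_t is a hypothetical stand-in for peruse_comm_spec_t (only the peer field is modeled) so the snippet builds without MPI or PERUSE headers, and MOCK_COMM_SIZE, my_rank, rr_counts, bad_peer_events and count_event are illustrative names, not part of any library.

```c
#include <stddef.h>

/* Hypothetical stand-in for peruse_comm_spec_t: only the field used here. */
typedef struct {
    int peer;   /* rank of the communication partner reported by the event */
} mock_comm_spec_t;

#define MOCK_COMM_SIZE 4   /* assumed communicator size for this sketch */

int my_rank = 0;                          /* assumed rank of this process     */
unsigned long rr_counts[MOCK_COMM_SIZE];  /* per-peer event counters          */
unsigned long bad_peer_events;            /* events with an out-of-range peer */

/* Counts one event. Returns 0 in every case, mirroring a real callback
 * that would return MPI_SUCCESS. */
int count_event(const mock_comm_spec_t *spec)
{
    if (spec == NULL) {
        return 0;
    }
    /* Reject peers outside [0, MOCK_COMM_SIZE) instead of indexing with them. */
    if (spec->peer < 0 || spec->peer >= MOCK_COMM_SIZE) {
        bad_peer_events++;
        return 0;
    }
    /* Skip events for communication with ourselves. */
    if (spec->peer == my_rank) {
        return 0;
    }
    rr_counts[spec->peer]++;
    return 0;
}
```

Counting the rejected events separately, rather than dropping them silently, keeps a peer of -1 visible in the profile without corrupting the counter array.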


Any insight is greatly appreciated.

Thanks,

Samuel K. Gutierrez
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



Appreciate the help,

Samuel K. Gutierrez
