Hello,

I'm trying to debug a segfaulting application; the segfault does not
happen consistently, however, so my guess is that it is due to some
memory corruption problem which I'm trying to find.

I'm using code like this:

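  int flag;
  MPI_Status status;

  /* non-blocking check for a pending message from any rank, with any tag */
  MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &status);
  if(flag) {
    int size;
    /* size in bytes of the probed message */
    MPI_Get_count(&status, MPI_BYTE, &size);
    void* row = xmalloc(size);
    /* ... */
    MPI_Recv(row, size, MPI_BYTE,
             status.MPI_SOURCE, status.MPI_TAG, MPI_COMM_WORLD,
             &status);
    /* ... */
  }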

Question: is it possible that, in the time my program progresses from
MPI_Iprobe() to MPI_Recv(), another message arrives that matches the
MPI_Recv() but is not the one originally matched by MPI_Iprobe()
(e.g. a shorter one)?

In particular, could it be that the size of the message actually
received by MPI_Recv() does not match `size` (the variable)?

In case a shorter message (different from the one initially matched)
was received, can I get the actual message size via a new call to
MPI_Get_count(&mpi_recv_status ...)?

(My application sends variable-length messages from one rank to
another at quite a high rate, so such a mismatch could potentially
be deadly.)

Best regards,
Riccardo
