> On Apr 6, 2018, at 1:41 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> Noam,
>
> According to your stack trace the correct way to call the mca_pml_ob1_dump is
> with the communicator from the PMPI call. Thus, this call was successful:
>
> (gdb) call mca_pml_ob1_dump(0xed932d0, 1)
> $1 = 0
>
> I should have been more clear, the output is not on gdb but on the output
> stream of your application. If you run your application by hand with mpirun,
> the output should be on the terminal where you started mpirun. If you start
> your job with a batch schedule, the output should be in the output file
> associated with your job.
>

OK, that makes sense. Here’s what I get from the two relevant processes. compute-1-9 should be receiving, and compute-1-10 sending, I believe. Is it possible that the fact that all the send/recv pairs (nodes 1-3 in each set of 4 sending to node 0, which is receiving from each one in turn) are using the same tag (200) is confusing things?

[compute-1-9:29662] Communicator MPI COMMUNICATOR 5 SPLIT FROM 3 [0xeba14d0](5) rank 0 recv_seq 8855 num_procs 4 last_probed 0
[compute-1-9:29662] [Rank 1] expected_seq 175 ompi_proc 0xeb0ec50 send_seq 8941
[compute-1-9:29662] [Rank 2] expected_seq 127 ompi_proc 0xeb97200 send_seq 385
[compute-1-9:29662] unexpected frag
[compute-1-9:29662] hdr RNDV [ ] ctx 5 src 2 tag 200 seq 126 msg_length 86777600
[compute-1-9:29662] [Rank 3] expected_seq 8558 ompi_proc 0x2b8ee8000f90 send_seq 5
[compute-1-9:29662] unexpected frag
[compute-1-9:29662] hdr RNDV [ ] ctx 5 src 3 tag 200 seq 8557 msg_length 86777600

[compute-1-10:15673] Communicator MPI COMMUNICATOR 5 SPLIT FROM 3 [0xe9cc6a0](5) rank 1 recv_seq 9119 num_procs 4 last_probed 0
[compute-1-10:15673] [Rank 0] expected_seq 8942 ompi_proc 0xe8e1db0 send_seq 174
[compute-1-10:15673] [Rank 2] expected_seq 54 ompi_proc 0xe9d7940 send_seq 8561
[compute-1-10:15673] [Rank 3] expected_seq 126 ompi_proc 0xe9c20c0 send_seq 385

Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546
https://www.nrl.navy.mil
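
The dump above can be cross-checked mechanically rather than by eye. What follows is a minimal Python sketch that parses the pasted lines and compares what one rank reports as send_seq for a peer against what that peer reports as expected_seq for it. It rests on an assumption not confirmed anywhere in this thread: that send_seq is the last sequence number a rank assigned to a message for that peer, and that expected_seq is the next sequence number the receiver will process from that peer. The regular expressions are derived only from the lines shown here, and `parse_dump` and `in_flight` are hypothetical helper names, not part of Open MPI.

```python
import re

# Dump lines copied from the message above (rank 0 on compute-1-9, rank 1 on compute-1-10).
DUMP = """\
[compute-1-9:29662] Communicator MPI COMMUNICATOR 5 SPLIT FROM 3 [0xeba14d0](5) rank 0 recv_seq 8855 num_procs 4 last_probed 0
[compute-1-9:29662] [Rank 1] expected_seq 175 ompi_proc 0xeb0ec50 send_seq 8941
[compute-1-9:29662] [Rank 2] expected_seq 127 ompi_proc 0xeb97200 send_seq 385
[compute-1-9:29662] unexpected frag
[compute-1-9:29662] hdr RNDV [ ] ctx 5 src 2 tag 200 seq 126 msg_length 86777600
[compute-1-9:29662] [Rank 3] expected_seq 8558 ompi_proc 0x2b8ee8000f90 send_seq 5
[compute-1-9:29662] unexpected frag
[compute-1-9:29662] hdr RNDV [ ] ctx 5 src 3 tag 200 seq 8557 msg_length 86777600
[compute-1-10:15673] Communicator MPI COMMUNICATOR 5 SPLIT FROM 3 [0xe9cc6a0](5) rank 1 recv_seq 9119 num_procs 4 last_probed 0
[compute-1-10:15673] [Rank 0] expected_seq 8942 ompi_proc 0xe8e1db0 send_seq 174
[compute-1-10:15673] [Rank 2] expected_seq 54 ompi_proc 0xe9d7940 send_seq 8561
[compute-1-10:15673] [Rank 3] expected_seq 126 ompi_proc 0xe9c20c0 send_seq 385
"""

# Patterns inferred from the lines above only; other Open MPI versions may print differently.
LINE_RE = re.compile(r"^\[([^:\]]+):(\d+)\] (.*)$")
COMM_RE = re.compile(r"^Communicator .* rank (\d+) recv_seq \d+")
PEER_RE = re.compile(r"^\[Rank (\d+)\] expected_seq (\d+) ompi_proc \S+ send_seq (\d+)")
FRAG_RE = re.compile(r"^hdr (\w+) .*? src (\d+) tag (-?\d+) seq (\d+) msg_length (\d+)")


def parse_dump(text):
    """Return {rank: {"peers": {peer: (expected_seq, send_seq)}, "unexpected": [...]}}."""
    procs = {}
    rank_of = {}  # (host, pid) -> rank of the process that printed the line
    for line in text.splitlines():
        prefix = LINE_RE.match(line)
        if not prefix:
            continue
        host, pid, body = prefix.groups()
        m = COMM_RE.match(body)
        if m:
            rank_of[(host, pid)] = int(m.group(1))
            procs[int(m.group(1))] = {"peers": {}, "unexpected": []}
            continue
        if (host, pid) not in rank_of:
            continue  # no Communicator header seen yet for this process
        entry = procs[rank_of[(host, pid)]]
        m = PEER_RE.match(body)
        if m:
            entry["peers"][int(m.group(1))] = (int(m.group(2)), int(m.group(3)))
            continue
        m = FRAG_RE.match(body)
        if m:
            entry["unexpected"].append({
                "hdr": m.group(1), "src": int(m.group(2)), "tag": int(m.group(3)),
                "seq": int(m.group(4)), "msg_length": int(m.group(5)),
            })
    return procs


def in_flight(procs, sender, receiver):
    """Messages numbered by `sender` that `receiver` has not sequenced yet.

    ASSUMPTION: send_seq is the last number used, expected_seq is the next one awaited.
    """
    _, sent = procs[sender]["peers"][receiver]
    expected, _ = procs[receiver]["peers"][sender]
    return sent - (expected - 1)


procs = parse_dump(DUMP)
print("rank 1 -> rank 0 in flight:", in_flight(procs, 1, 0))
print("rank 0 -> rank 1 in flight:", in_flight(procs, 0, 1))
for frag in procs[0]["unexpected"]:
    print("rank 0 unexpected:", frag)
```

Under that assumed reading, the counters for the rank 0 / rank 1 pair agree in both directions, and the only queued items on rank 0 are the two RNDV fragments from ranks 2 and 3, both with tag 200. Dumps from ranks 2 and 3 would be needed to run the same check for those pairs.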