Hi,

A source of sudden deadlocks at larger scale can be a change of send behavior
from buffered to synchronous mode. You can test whether your application also
deadlocks at smaller scale by replacing all standard sends with synchronous
sends (e.g., add `#define MPI_Send MPI_Ssend` and `#define MPI_Isend MPI_Issend`
after the include of the MPI header).
An application with a correct communication pattern should run with synchronous
sends without deadlock.
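In a C code base that could look like the following minimal sketch (my
illustration; place it after the MPI include in each translation unit that
calls MPI):

```
/* Force synchronous sends for deadlock testing: the preprocessor rewrites
 * every subsequent MPI_Send/MPI_Isend call into its synchronous variant. */
#include <mpi.h>
#define MPI_Send  MPI_Ssend
#define MPI_Isend MPI_Issend
/* ... rest of the application code, unchanged ... */
```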
To check for other deadlock patterns in your application, you can use tools
like MUST [1] or TotalView.

Best
Joachim


[1] https://itc.rwth-aachen.de/must/
________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of George Bosilca via 
users <users@lists.open-mpi.org>
Sent: Sunday, September 11, 2022 10:40:42 PM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: George Bosilca <bosi...@icl.utk.edu>
Subject: Re: [OMPI users] Subcommunicator communications do not complete 
intermittently

Assuming a correct implementation, the described communication pattern should
work seamlessly.

Would it be possible to either share a reproducer or provide the execution
stacks by attaching a debugger to the deadlocked application, so we can see the
state of the different processes? I wonder whether all processes eventually
join the gather on comm_world, or whether some of them are stuck on some
orthogonal collective communication pattern.

George




On Fri, Sep 9, 2022, 21:24 Niranda Perera via users 
<users@lists.open-mpi.org> wrote:
Hi all,

I have the following use case. I have N MPI ranks in the global communicator,
and I split it into two: the first being rank 0, and the other being all ranks
from 1 to N-1.
Rank 0 acts as a master and ranks [1, N-1] act as workers. I use rank 0 to
broadcast (blocking) a set of values to ranks [1, N-1] over comm_world. Rank 0
then immediately calls a gather (blocking) over comm_world and busy-waits for
results. Once the broadcast is received by the workers, they call a method
foo(args, local_comm). Inside foo, workers communicate with each other using
the subcommunicator, and each produces N-1 results, which are sent to rank 0
as gather responses over comm_world. Inside foo there are multiple iterations,
collectives, send-receives, etc.
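
For concreteness, a minimal sketch of the pattern (the payload and the
MPI_Allreduce standing in for foo's internals are placeholders, not my actual
code):

```
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Split comm_world: color 0 for the master (rank 0), color 1 for workers. */
    MPI_Comm local_comm;
    MPI_Comm_split(MPI_COMM_WORLD, rank == 0 ? 0 : 1, rank, &local_comm);

    /* Master broadcasts a set of values (here a single int) over comm_world. */
    int value = 42;
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    int result = 0;
    if (rank != 0) {
        /* foo(args, local_comm): workers iterate and communicate over the
         * subcommunicator; a single reduction stands in for that work here. */
        MPI_Allreduce(&value, &result, 1, MPI_INT, MPI_SUM, local_comm);
    }

    /* Every rank (including the master) contributes to the gather on comm_world,
     * on which rank 0 busy-waits for the workers' results. */
    int *results = rank == 0 ? malloc(size * sizeof(int)) : NULL;
    MPI_Gather(&result, 1, MPI_INT, results, 1, MPI_INT, 0, MPI_COMM_WORLD);

    free(results);
    MPI_Comm_free(&local_comm);
    MPI_Finalize();
    return 0;
}
```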

This seems to work okay with smaller parallelism and smaller tasks inside foo.
But when the parallelism increases (e.g., 64 ... 512), only a single iteration
completes inside foo. Subsequent iterations seem to hang.

Is this an anti-pattern in MPI? Should I use MPI_Igather and MPI_Ibcast
instead of the blocking calls?

Any help is greatly appreciated.

--
Niranda Perera
https://niranda.dev/
@n1r44 <https://twitter.com/N1R44>
