[mailto:users-boun...@open-mpi.org] On Behalf
Of Blosch, Edwin L
Sent: Thursday, June 27, 2013 12:48 PM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Application hangs on mpi_waitall
Attached is the message list for rank 0 for the communication step that is
failing. There are about 160
mailto:us...@open-mpi.org>
Subject: Re: [OMPI users] Application hangs on mpi_waitall
It ran a bit longer but still deadlocked. All matching sends are posted
1:1 with posted recvs, so it is a delivery issue of some kind. I'm running a
debug-compiled version tonight to see what that might turn up.
> Date:
> To: Open MPI Users <us...@open-mpi.org>
> Subject: Re: [OMPI users] Application hangs on mpi_waitall
>
>
> Ed,
>
> I'm not sure, but there might be a case where the BTL is getting overwhelmed by
> the non-blocking operations while trying to set up the
Ed, how large are the messages that you are sending and receiving?
Rolf
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of Ed Blosch
Sent: Thursday, June 27, 2013 9:01 AM
To: us...@open-mpi.org
Subject: Re: [OMPI users] Application hangs on mpi_waitall
It ran
rs <us...@open-mpi.org>
Subject: Re: [OMPI users] Application hangs on mpi_waitall
Ed,
I'm not sure, but there might be a case where the BTL is getting overwhelmed by
the non-blocking operations while trying to set up the connection. There is a
simple test for this. Add an MPI_Alltoall with a reasonable size (100k) before
you start posting the non-blocking receives, and let's
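For reference, the warm-up suggested above might be sketched in C roughly as
follows. This is only an illustration under stated assumptions: "100k" is
read here as 100 KB per peer, and the names warmup_buffer_bytes,
warmup_alltoall and the USE_MPI macro are hypothetical, not taken from the
application or from Open MPI. The MPI part is compiled only when USE_MPI is
defined, so the size helper can be checked without an MPI installation.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef USE_MPI            /* hypothetical macro: define it when building with mpicc */
#include <mpi.h>
#endif

/* Total bytes needed for one all-to-all buffer (one slot per rank).
 * Returns 0 for invalid input or if the size would overflow, since
 * MPI_Alltoall takes its per-peer count as an int. */
size_t warmup_buffer_bytes(int nprocs, size_t per_peer_bytes)
{
    if (nprocs <= 0 || per_peer_bytes == 0 || per_peer_bytes > INT_MAX)
        return 0;
    if (per_peer_bytes > SIZE_MAX / (size_t)nprocs)
        return 0;
    return per_peer_bytes * (size_t)nprocs;
}

#ifdef USE_MPI
/* Exchange a dummy 100 KB block with every rank so that connections are
 * established before any non-blocking receives are posted. */
int warmup_alltoall(MPI_Comm comm)
{
    const size_t per_peer = 100 * 1024;   /* assumption: 100 KB per peer */
    int nprocs = 0;
    MPI_Comm_size(comm, &nprocs);

    size_t total = warmup_buffer_bytes(nprocs, per_peer);
    if (total == 0)
        return MPI_ERR_ARG;

    char *sendbuf = calloc(total, 1);
    char *recvbuf = malloc(total);
    int rc = MPI_ERR_NO_MEM;
    if (sendbuf && recvbuf)
        rc = MPI_Alltoall(sendbuf, (int)per_peer, MPI_BYTE,
                          recvbuf, (int)per_peer, MPI_BYTE, comm);
    free(sendbuf);
    free(recvbuf);
    return rc;
}
#endif
```

The call to warmup_alltoall(MPI_COMM_WORLD) would go once, right before the
first non-blocking receive is posted. If the hang disappears with the warm-up
in place, that would point toward connection setup rather than the message
matching itself.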
An update: I recoded the mpi_waitall as a loop over the requests with
mpi_test and a 30-second timeout. The timeout happens unpredictably,
sometimes after 10 minutes of run time, other times after 15 minutes, for
the exact same case.
After 30 seconds, I print out the status of all outstanding
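For readers who want to try the same diagnostic, the test-with-timeout loop
described above could look roughly like this in C. It is a sketch, not the
code from the application: poll_requests, test_fn, flag_array_test,
mpi_test_adapter and the USE_MPI macro are hypothetical names. The polling
logic takes a callback so it can be exercised without MPI; the adapter that
wraps MPI_Test is compiled only when USE_MPI is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#ifdef USE_MPI            /* hypothetical macro: define it when building with mpicc */
#include <mpi.h>
#endif

/* Returns nonzero if request i has completed; ctx is caller-defined. */
typedef int (*test_fn)(void *ctx, int i);

/* Poll n requests until all complete or timeout_sec has elapsed.
 * done[i] must start at 0 and is set to 1 once request i completes.
 * Returns the number of requests still outstanding (0 means success).
 * Note: this busy-polls, which is acceptable for a debugging aid. */
int poll_requests(test_fn test, void *ctx, int n, int *done, double timeout_sec)
{
    assert(test != NULL && done != NULL && n >= 0);
    int remaining = n;
    time_t start = time(NULL);

    while (remaining > 0) {
        for (int i = 0; i < n; i++) {
            if (!done[i] && test(ctx, i)) {
                done[i] = 1;
                remaining--;
            }
        }
        if (remaining > 0 && difftime(time(NULL), start) >= timeout_sec)
            break;        /* timed out: caller can report done[i] == 0 */
    }
    return remaining;
}

/* Stand-in tester for trying the loop without MPI:
 * ctx is an int array of completion flags. */
int flag_array_test(void *ctx, int i)
{
    const int *flags = (const int *)ctx;
    return flags[i] != 0;
}

#ifdef USE_MPI
/* Adapter for real use: ctx is an array of MPI_Request. */
int mpi_test_adapter(void *ctx, int i)
{
    MPI_Request *reqs = (MPI_Request *)ctx;
    int flag = 0;
    MPI_Test(&reqs[i], &flag, MPI_STATUS_IGNORE);
    return flag;
}
#endif
```

With MPI, the call would be something like
poll_requests(mpi_test_adapter, reqs, nreq, done, 30.0). If the return value
is nonzero, each index i with done[i] == 0 identifies a request that never
completed, which can be printed along with its peer rank and tag.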