Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread Blosch, Edwin L
Attached is the message list for rank 0 for the communication step that is failing. There are about 160

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread Blosch, Edwin L
It ran a bit longer but still deadlocked. All matching sends are posted 1:1 with posted recvs, so it is a delivery issue of some kind. I'm running a debug-compiled version tonight to see what that might turn up

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread George Bosilca
> Ed, I'm not sure, but there might be a case where the BTL is getting overwhelmed by the non-blocking operations while trying to set up the

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread Rolf vandeVaart
Ed, how large are the messages that you are sending and receiving? Rolf From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ed Blosch Sent: Thursday, June 27, 2013 9:01 AM To: us...@open-mpi.org Subject: Re: [OMPI users] Application hangs on mpi_waitall It ran

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread Ed Blosch
Ed, I'm not sure, but there might be a case where the BTL is getting overwhelmed by the non-blocking operations while trying to set up the connection. There is a simple test for this. Add an MPI_Alltoall with a rea

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-26 Thread George Bosilca
Ed, I'm not sure, but there might be a case where the BTL is getting overwhelmed by the non-blocking operations while trying to set up the connection. There is a simple test for this. Add an MPI_Alltoall with a reasonable size (100k) before you start posting the non-blocking receives, and let's

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-25 Thread eblosch
An update: I recoded the mpi_waitall as a loop over the requests with mpi_test and a 30-second timeout. The timeout happens unpredictably, sometimes after 10 minutes of run time, other times after 15 minutes, for the exact same case. After 30 seconds, I print out the status of all outstanding
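
The polling approach described in this message might look roughly like the sketch below. It is not the poster's code. To keep it compilable without an MPI installation, the completion check is a hypothetical callback (test_fn) standing in for MPI_Test, and the names wait_all_with_timeout, report_outstanding and demo_test are made up for illustration.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for MPI_Test: returns nonzero once request
 * `index` has completed. In a real MPI program this would wrap a call
 * to MPI_Test on the matching MPI_Request. */
typedef int (*test_fn)(int index, void *ctx);

/* Poll every outstanding request until all complete or timeout_sec
 * elapses. done[i] is set to 1 for each completed request.
 * Returns the number of requests still outstanding. */
int wait_all_with_timeout(int n, int *done, test_fn test, void *ctx,
                          double timeout_sec)
{
    time_t start = time(NULL);
    int remaining = n;

    for (int i = 0; i < n; i++)
        done[i] = 0;

    while (remaining > 0) {
        for (int i = 0; i < n; i++) {
            if (!done[i] && test(i, ctx)) {
                done[i] = 1;
                remaining--;
            }
        }
        /* time() has one-second resolution, which is enough for a
         * timeout measured in tens of seconds. */
        if (remaining > 0 && difftime(time(NULL), start) >= timeout_sec)
            break;
    }
    return remaining;
}

/* Print the index of every request that never completed. */
void report_outstanding(int n, const int *done)
{
    for (int i = 0; i < n; i++)
        if (!done[i])
            printf("request %d still outstanding\n", i);
}

/* Demo predicate (an assumption for illustration): every request
 * completes immediately except the one whose index is stored in ctx. */
int demo_test(int index, void *ctx)
{
    return index != *(int *)ctx;
}
```

Calling wait_all_with_timeout(n, done, demo_test, &stuck, 30.0) and then report_outstanding(n, done) mirrors the "wait 30 seconds, then print what is still pending" idea. A real version would also record the source, tag and size of each pending request so the stuck message can be identified.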

[OMPI users] Application hangs on mpi_waitall

2013-06-18 Thread Blosch, Edwin L
I'm running OpenMPI 1.6.4 and seeing a problem where mpi_waitall never returns. The case runs fine with MVAPICH. The logic associated with the communications has been extensively debugged in the past; we don't think it has errors. Each process posts non-blocking receives, non-blocking