Re: [OMPI users] 1.8.6 w/ CUDA 7.0 & GDR Huge Memory Leak

2015-06-30 Thread Steven Eliuk
Ramps up over time; we had a bunch of locked-up nodes over the weekend and have traced it back to this. Let me see if I can share more details. I will review with everyone tomorrow and get back to you. Rolf vandeVaart wrote: Hi Steven, Thanks for the report. Very

Re: [OMPI users] 1.8.6 w/ CUDA 7.0 & GDR Huge Memory Leak

2015-06-30 Thread Rolf vandeVaart
Hi Steven, Thanks for the report. Very little has changed between 1.8.5 and 1.8.6 within the CUDA-aware-specific code, so I am perplexed. It is also interesting that you do not see the issue with 1.8.5 and CUDA 7.0. You mentioned that it is hard to share the code on this, but maybe you could share

Re: [OMPI users] Running with native ugni on a Cray XC

2015-06-30 Thread Howard Pritchard
Hi Nick, No, you have to use mpirun in this case. You need to ask for a larger batch allocation than the initial mpirun requires. You do need to ask for a batch allocation, though. Also note that mpirun doesn't currently work with nativized Slurm; it's on my to-do list to fix. Howard -- sent

Re: [OMPI users] Allgather Implementation Details

2015-06-30 Thread George Bosilca
Saliya, On Tue, Jun 30, 2015 at 10:50 AM, Saliya Ekanayake wrote: > Hi, > > I am experiencing a bottleneck with the allgatherv routine in one of our > programs and wonder how it works internally. Could you please share some > details on this? > Open MPI has a tunable approach

[OMPI users] 1.8.6 w/ CUDA 7.0 & GDR Huge Memory Leak

2015-06-30 Thread Steven Eliuk
Hi All, It looks like we have found a large memory leak. It is very difficult to share code on this, but here are some details: 1.8.5 w/ CUDA 7.0 — no memory leak; 1.8.5 w/ CUDA 6.5 — no memory leak; 1.8.6 w/ CUDA 7.0 — large memory leak; MVAPICH2 2.1 GDR — no issue on
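No reproducer appears in the thread; a minimal sketch of the kind of CUDA-aware transfer loop that would expose a host-side leak like this might look as follows (buffer size, iteration count, and the even-rank-count pairing are illustrative assumptions, not details from the report):

/* Hypothetical reproducer sketch -- not the code from the thread.
 * Repeatedly exchanges device buffers through a CUDA-aware Open MPI
 * build; host RSS can be watched externally (e.g. with top) to see
 * whether it grows across iterations. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int nbytes = 1 << 20;          /* 1 MiB, arbitrary */
    void *dsend, *drecv;
    cudaMalloc(&dsend, nbytes);
    cudaMalloc(&drecv, nbytes);
    cudaMemset(dsend, 0, nbytes);

    int peer = rank ^ 1;                 /* assumes an even number of ranks */
    for (int i = 0; i < 100000; i++) {
        /* Device pointers handed straight to MPI: this exercises the
         * CUDA-aware (and, if enabled, GPUDirect RDMA) code path. */
        MPI_Sendrecv(dsend, nbytes, MPI_BYTE, peer, 0,
                     drecv, nbytes, MPI_BYTE, peer, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    cudaFree(dsend);
    cudaFree(drecv);
    MPI_Finalize();
    return 0;
}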

[OMPI users] Allgather Implementation Details

2015-06-30 Thread Saliya Ekanayake
Hi, I am experiencing a bottleneck with the allgatherv routine in one of our programs and wonder how it works internally. Could you please share some details on this? I found this paper [1] from Gropp discussing an efficient implementation. Is this similar to what we get in Open MPI? [1]
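For reference, MPI_Allgatherv gathers a variable-sized contribution from every rank and delivers the concatenation to all ranks. A minimal, self-contained usage sketch (the per-rank counts here are arbitrary, chosen only for illustration):

/* Each rank contributes (rank + 1) ints; every rank receives the
 * concatenation of all contributions. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int mycount = rank + 1;                   /* variable contribution */
    int *sendbuf = malloc(mycount * sizeof(int));
    for (int i = 0; i < mycount; i++)
        sendbuf[i] = rank;

    /* Every rank must know all counts and displacements. */
    int *counts = malloc(size * sizeof(int));
    int *displs = malloc(size * sizeof(int));
    int total = 0;
    for (int r = 0; r < size; r++) {
        counts[r] = r + 1;
        displs[r] = total;
        total += counts[r];
    }

    int *recvbuf = malloc(total * sizeof(int));
    MPI_Allgatherv(sendbuf, mycount, MPI_INT,
                   recvbuf, counts, displs, MPI_INT, MPI_COMM_WORLD);

    free(sendbuf); free(counts); free(displs); free(recvbuf);
    MPI_Finalize();
    return 0;
}

The "tunable approach" George mentions refers to Open MPI's tuned collective component, which selects among several internal algorithms at run time based on message and communicator size.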

Re: [OMPI users] Progress on target of MPI_Win_lock on Infiniband

2015-06-30 Thread Marc-Andre Hermanns
Hi Thomas, as far as I know, MPI does _not_ guarantee asynchronous progress (unlike OpenSHMEM), because it would require some implementations to start a progress thread. Jeff has a nice blog post regarding this: http://blogs.cisco.com/performance/mpi-progress I was surprised to see this behavior

Re: [OMPI users] Progress on target of MPI_Win_lock on Infiniband

2015-06-30 Thread Thomas Jahns
On 06/29/15 17:25, Nathan Hjelm wrote: This is not a configuration issue. On 1.8.x and master we use two-sided communication to emulate one-sided. Since we do not currently have async progress, this requires the target to call into MPI to progress RMA communication. This will change in 2.x. I
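The behavior Nathan describes can be made concrete with a small sketch (two ranks, a one-int window; the computation loop and barrier placement are illustrative assumptions, not code from the thread):

/* Passive-target RMA on a build without asynchronous progress. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int buf = 0;
    MPI_Win win;
    MPI_Win_create(&buf, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 0) {
        int one = 1;
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
        MPI_Put(&one, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
        /* With two-sided emulation and no progress thread, this
         * unlock may stall until rank 1 makes an MPI call. */
        MPI_Win_unlock(1, win);
    } else if (rank == 1) {
        /* Simulate pure computation: while busy here, the target
         * makes no MPI progress, so rank 0's RMA is not serviced. */
        volatile double x = 0.0;
        for (long i = 0; i < 100000000L; i++)
            x += 1.0;
    }

    MPI_Barrier(MPI_COMM_WORLD);  /* target enters MPI here */
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}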