It ramps up over time; we had a bunch of locked-up nodes over the weekend and have
traced it back to this.
Let me see if I can share more details.
I will review with everyone tomorrow and get back to you.
Rolf vandeVaart wrote:
Hi Steven,
Thanks for the report. Very little has changed between 1.8.5 and 1.8.6 within
the CUDA-aware specific code so I am perplexed. Also interesting that you do
not see the issue with 1.8.5 and CUDA 7.0.
You mentioned that it is hard to share the code on this but maybe you could
share
Hi Nick
No, you have to use mpirun in this case. You need to ask for a larger batch
allocation than the initial mpirun requires. You do need to ask for a batch
allocation, though. Also note that mpirun doesn't currently work with nativized
Slurm. It's on my todo list to fix.
Howard
Saliya,
On Tue, Jun 30, 2015 at 10:50 AM, Saliya Ekanayake
wrote:
> Hi,
>
> I am experiencing some bottleneck with allgatherv routine in one of our
> programs and wonder how it works internally. Could you please share some
> details on this?
>
Open MPI has a tunable approach
Hi All,
Looks like we have found a large memory leak.
It is very difficult to share code on this, but here are some details:
1.8.5 w/ CUDA 7.0: no memory leak
1.8.5 w/ CUDA 6.5: no memory leak
1.8.6 w/ CUDA 7.0: large memory leak
MVAPICH2 2.1 GDR: no issue on
Hi,
I am experiencing a bottleneck with the allgatherv routine in one of our
programs and wonder how it works internally. Could you please share some
details on this?
I found this [1] paper from Gropp discussing an efficient implementation.
Is this similar to what we get in Open MPI?
[1]
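For intuition, one algorithm commonly used for allgatherv in MPI libraries (and similar in spirit to the efficient implementations Gropp discusses) is the ring: in each of p-1 steps, every rank forwards the block it most recently received to its right neighbour. The sketch below is a plain-Python simulation only, not Open MPI code; all names are invented for illustration:

```python
# Illustrative simulation (no MPI): the ring algorithm often used for
# allgatherv. Each of p ranks contributes a block of different length;
# after p-1 steps every rank holds all blocks, and each rank sends and
# receives exactly one block per step.

def ring_allgatherv(blocks):
    """blocks[r] is rank r's contribution; returns each rank's gathered list."""
    p = len(blocks)
    # result[r][k] will hold rank k's block once it has reached rank r.
    result = [[None] * p for _ in range(p)]
    for r in range(p):
        result[r][r] = blocks[r]          # each rank starts with its own block
    for step in range(p - 1):
        # In step s, rank r forwards the block that originated at rank
        # (r - s) mod p to neighbour (r + 1) mod p. All sends happen
        # "concurrently", so compute them from the pre-step state first.
        incoming = []
        for r in range(p):
            origin = (r - step) % p       # block currently forwarded by rank r
            incoming.append(((r + 1) % p, origin, result[r][origin]))
        for dest, origin, data in incoming:
            result[dest][origin] = data
    # Flatten per rank, preserving rank order, as MPI_Allgatherv would.
    return [[x for blk in row for x in blk] for row in result]

contributions = [[0], [10, 11], [20], [30, 31, 32]]   # variable-length blocks
gathered = ring_allgatherv(contributions)
```

The ring keeps every link busy with equal-sized traffic per step, which is why libraries tend to prefer it for large messages; for small messages other algorithms (e.g. Bruck) need fewer rounds.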
Hi Thomas,
As far as I know, MPI does _not_ guarantee asynchronous progress
(unlike OpenSHMEM), because it would require some implementations to
start a progress thread.
Jeff has a nice blog post regarding this:
http://blogs.cisco.com/performance/mpi-progress
I was surprised to see this behavior
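The point of the blog post can be seen with a toy model (plain Python, not MPI; all names are invented for illustration): without a background progress thread, a nonblocking operation only advances when the application calls back into the library, e.g. via test or wait.

```python
# Toy model of a library WITHOUT asynchronous progress: a nonblocking
# "send" is split into chunks that advance only inside library calls
# (test). If the application never calls back in, nothing moves.

class FakeRequest:
    def __init__(self, nchunks):
        self.remaining = nchunks          # chunks still to transfer

    def test(self):
        """Each call into the library makes a bounded amount of progress."""
        if self.remaining > 0:
            self.remaining -= 1           # move one chunk per library call
        return self.remaining == 0        # True once the operation completes

def isend(nchunks):
    # Starting the operation transfers nothing by itself.
    return FakeRequest(nchunks)

req = isend(nchunks=3)
unfinished_tests = 0
while not req.test():                     # progress happens only inside test()
    unfinished_tests += 1
```

A library with true asynchronous progress would instead drain the chunks from a helper thread, so the first test() could already report completion.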
On 06/29/15 17:25, Nathan Hjelm wrote:
This is not a configuration issue. On 1.8.x and master we use two-sided
communication to emulate one-sided. Since we do not currently have
async progress, this requires the target to call into MPI to progress RMA
communication.
This will change in 2.x. I
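A toy model of what Nathan describes (plain Python; the class and method names are invented for illustration): when one-sided operations are emulated over two-sided messages, an origin's "put" merely queues a message, and the target's window memory changes only when the target itself calls into the library to progress.

```python
# Toy model: a one-sided "put" emulated with two-sided messages. The
# origin enqueues a request; the target's window is updated only when
# the target calls into the library (progress), mirroring the lack of
# asynchronous progress described above.
from collections import deque

class EmulatedWindow:
    def __init__(self, size):
        self.memory = [0] * size
        self.inbox = deque()              # pending two-sided messages

    def put(self, index, value):
        """Origin side: does NOT touch target memory directly."""
        self.inbox.append((index, value))

    def progress(self):
        """Target side: drain pending messages; the 'call into MPI'."""
        while self.inbox:
            index, value = self.inbox.popleft()
            self.memory[index] = value

win = EmulatedWindow(size=4)
win.put(2, 99)                            # origin issues a put
before = win.memory[2]                    # unchanged: target has not progressed
win.progress()                            # target calls into the library
after = win.memory[2]                     # now the put is visible
```

If the target is busy in a long compute loop and never calls progress(), the put stays queued indefinitely, which is exactly the stall being discussed.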