[OMPI devel] Exit status

2011-04-13 Thread Ralph Castain
I've run across an interesting issue for which I don't have a ready answer. If an MPI process aborts, we automatically abort the entire job. If an MPI process returns a non-zero exit status, indicating that there was something abnormal about its termination, we ignore it and let the job continue …
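For context, a minimal POSIX sketch (not Open MPI's actual launcher code) of the two cases being contrasted here: a child that is killed by a signal versus one that exits on its own but with a non-zero status.

    /* Illustrative POSIX sketch, not ORTE code: how a parent can tell a
     * signal-terminated child from one that exits with a non-zero status
     * when deciding whether to abort the whole job. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {
            exit(2);   /* hypothetical child: "normal" termination, status 2 */
        }

        int status;
        waitpid(pid, &status, 0);

        if (WIFSIGNALED(status)) {
            /* Aborted (e.g. SIGSEGV): today the whole job is torn down. */
            printf("child killed by signal %d\n", WTERMSIG(status));
        } else if (WIFEXITED(status) && WEXITSTATUS(status) != 0) {
            /* Exited on its own, but abnormally: currently ignored. */
            printf("child exited with status %d\n", WEXITSTATUS(status));
        } else {
            printf("child exited cleanly\n");
        }
        return 0;
    }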

Re: [OMPI devel] RFC: Add support to send/receive CUDA device memory directly

2011-04-13 Thread Ken Lloyd
George, Yes. GPUDirect eliminated an additional (host) memory buffering step between the HCA and the GPU that took CPU cycles. I was never very comfortable with the kernel patch necessary, nor the patched OFED required to make it all work. Having said that, it did provide a ~14% improvement in the …

Re: [OMPI devel] RFC: Add support to send/receive CUDA device memory directly

2011-04-13 Thread George Bosilca
On Apr 13, 2011, at 14:48, Rolf vandeVaart wrote: > This work does not depend on GPUDirect. It is making use of the fact that one can malloc memory, register it with IB, and register it with CUDA via the new 4.0 cuMemHostRegister API. Then one can copy device memory into this memory …
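A hedged sketch of the staging scheme described in the quoted text, assuming an existing ibv_pd (pd), an initialized CUDA driver context, and a device buffer d_src of len bytes; the helper name stage_buffer is invented for illustration and error handling is elided.

    /* Not the actual Open MPI patch; just the malloc / ibv_reg_mr /
     * cuMemHostRegister / copy sequence described above. */
    #include <stdlib.h>
    #include <cuda.h>                 /* CUDA driver API, 4.0+ */
    #include <infiniband/verbs.h>     /* libibverbs */

    void *stage_buffer(struct ibv_pd *pd, CUdeviceptr d_src, size_t len,
                       struct ibv_mr **mr_out)
    {
        /* 1. Plain host allocation. */
        void *host_buf = malloc(len);

        /* 2. Register it with the HCA so the IB stack can send from it. */
        *mr_out = ibv_reg_mr(pd, host_buf, len, IBV_ACCESS_LOCAL_WRITE);

        /* 3. Register the same pages with CUDA (new in CUDA 4.0). */
        cuMemHostRegister(host_buf, len, 0);

        /* 4. Copy the device memory into this doubly-registered buffer,
         *    which can then be handed to the normal send path. */
        cuMemcpyDtoH(host_buf, d_src, len);
        return host_buf;
    }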

Re: [OMPI devel] RFC: Add support to send/receive CUDA device memory directly

2011-04-13 Thread Rolf vandeVaart
[Answering both questions with this email] These changes depend on new features in CUDA 4.0. With CUDA 4.0, there is the concept of Unified Virtual Addressing, so the addresses do not overlap; they are all unique within the process. There is an API in CUDA 4.0 that one can use to query whether …
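The preview cuts off before the API is named; one plausible CUDA 4.0 call for this kind of check is cuPointerGetAttribute with CU_POINTER_ATTRIBUTE_MEMORY_TYPE, sketched below (the helper name is ours, not from the patch).

    #include <stdint.h>
    #include <cuda.h>

    /* Returns 1 if buf looks like CUDA device memory, 0 otherwise.
     * Sketch only; any pointer CUDA does not recognize is treated as host. */
    static int is_cuda_device_memory(const void *buf)
    {
        CUmemorytype mem_type = (CUmemorytype)0;
        CUresult res = cuPointerGetAttribute(&mem_type,
                                             CU_POINTER_ATTRIBUTE_MEMORY_TYPE,
                                             (CUdeviceptr)(uintptr_t)buf);
        return (res == CUDA_SUCCESS && mem_type == CU_MEMORYTYPE_DEVICE);
    }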

Re: [OMPI devel] RFC: Add support to send/receive CUDA device memory directly

2011-04-13 Thread Ken Lloyd
Rolf, I haven't had a chance to review the code yet, but how do these changes relate to CUDA 4.0, especially the UVA and GPUDirect 2.0 implementation? Ken On Wed, 2011-04-13 at 09:47 -0700, Rolf vandeVaart wrote: > WHAT: Add support to send data directly from CUDA device memory via MPI calls.

Re: [OMPI devel] RFC: Add support to send/receive CUDA device memory directly

2011-04-13 Thread Brice Goglin
Hello Rolf, This "CUDA device memory" isn't memory mapped in the host, right? Then what does its address look like? When you say "when it is detected that a buffer is CUDA device memory", if the actual device and host address spaces are different, how do you know that device addresses and usual host …

[OMPI devel] RFC: Add support to send/receive CUDA device memory directly

2011-04-13 Thread Rolf vandeVaart
WHAT: Add support to send data directly from CUDA device memory via MPI calls. TIMEOUT: April 25, 2011 DETAILS: When programming in a mixed MPI and CUDA environment, one cannot currently send data directly from CUDA device memory. The programmer first has to move the data into host memory, and …
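To make the motivation concrete, here is a hedged sketch of the user-visible difference the RFC is after; the function names and the use of doubles are illustrative, not part of the proposal.

    #include <stdlib.h>
    #include <mpi.h>
    #include <cuda_runtime.h>

    /* Today: stage the device data through host memory before sending. */
    void send_today(double *d_buf, int count, int dest, MPI_Comm comm)
    {
        double *h_buf = malloc(count * sizeof(double));
        cudaMemcpy(h_buf, d_buf, count * sizeof(double),
                   cudaMemcpyDeviceToHost);
        MPI_Send(h_buf, count, MPI_DOUBLE, dest, 0, comm);
        free(h_buf);
    }

    /* With the proposed support: hand the device pointer straight to MPI. */
    void send_proposed(double *d_buf, int count, int dest, MPI_Comm comm)
    {
        MPI_Send(d_buf, count, MPI_DOUBLE, dest, 0, comm);
    }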

Re: [OMPI devel] Add child to another parent.

2011-04-13 Thread Hugo Meyer
When the proc restarts, it calls orte_routed.init_routes. If you look in the routed cm component, you should see a call to "register_sync" - this is where the proc sends a message to the local daemon, allowing it to "learn" the port/address where the proc resides. I've done this. I had a problem because when I …