On Jul 5, 2007, at 4:16 PM, Glendenning, Lisa wrote:

Ron Brightwell at SNL has asked me to look into optimizing Open MPI's
one-sided operations over Portals.  Does anyone have any guidance or
thoughts for this?

Hi Lisa -

There are currently two implementations of the one-sided interface for Open MPI: pt2pt and rdma.

The pt2pt component is implemented entirely over the interfaces used to implement the MPI-1 point-to-point interface. So it ends up doing lots of copies and is entirely two-sided. It could support async progress with threads, but that doesn't help the XT platform all that much. It was the first one-sided component implemented, mostly because we needed to support protocols like MX and PSM that don't really expose one-sided semantics, and I only wanted to support one new component per release.

The rdma component is implemented over our BTL (Byte Transfer Layer -- the device driver our communication is written over), and can use either callback-based send/receive or true RDMA. The true RDMA path is only used for put/get with contiguous datatypes. The performance on OpenIB is OK, but not great (I'll send you some more details off list). I'd assume that the performance on Portals would be similar. However, the btl_put and btl_get implementation for the Portals BTL was written assuming it would only be used the way the PML (the MPI-1 point-to-point implementation) uses it, so it won't work with the rdma one-sided component at this time. I can go into more detail if you decide that fixing the Portals BTL to support the rdma component is a path you want to look at.
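
To make the contiguous-datatype restriction concrete, here is a rough sketch in C of the kind of path selection described above. None of these names come from Open MPI: toy_datatype, toy_is_contiguous, toy_choose_path, and the btl_has_rdma flag are made up for illustration, and the real component's checks are certainly more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a datatype description (not an Open MPI type):
 * `count` blocks of `block_len` bytes, `stride` bytes between block starts. */
typedef struct {
    size_t count;
    size_t block_len;
    size_t stride;
} toy_datatype;

typedef enum { TOY_PATH_RDMA, TOY_PATH_SENDRECV } toy_path;
typedef enum { TOY_OP_PUT, TOY_OP_GET, TOY_OP_ACCUMULATE } toy_op;

/* A layout is contiguous when there are no gaps between its blocks. */
static bool toy_is_contiguous(const toy_datatype *dt)
{
    assert(dt != NULL);
    return dt->count <= 1 || dt->stride == dt->block_len;
}

/* Assumed rule: true RDMA only for put/get when both sides are contiguous
 * and the transport offers RDMA; everything else uses send/receive. */
static toy_path toy_choose_path(toy_op op,
                                const toy_datatype *origin,
                                const toy_datatype *target,
                                bool btl_has_rdma)
{
    if (btl_has_rdma &&
        (op == TOY_OP_PUT || op == TOY_OP_GET) &&
        toy_is_contiguous(origin) && toy_is_contiguous(target)) {
        return TOY_PATH_RDMA;
    }
    return TOY_PATH_SENDRECV;
}
```

With this toy rule, a put between two contiguous buffers takes the RDMA path, while a strided target or an accumulate falls back to send/receive.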

Then, of course, there's the option of writing a Portals-specific one-sided component. The component interface is pretty straightforward -- it's the MPI-2 one-sided chapter interface functions, plus an initialization function. This is the path towards best performance, but also means the most code to write. If you go this route, the existing code in Open MPI handles the attribute management, but that's about it. Of course, you can always copy freely from the rdma and pt2pt components. There used to be a document somewhere describing how to add a new component, but I think it is horribly out of date. I'll see if I can find it and send it your way.
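
To give a feel for the shape of such an interface, here is a minimal sketch in C of a function table with an initialization hook plus entries modelled on a few of the MPI-2 one-sided calls (MPI_Put, MPI_Get, MPI_Win_fence, MPI_Win_free). This is not Open MPI's actual component structure: toy_win, toy_osc_module, and the loopback implementation are hypothetical, and a real component would also need accumulate, post/start/complete/wait/test, and lock/unlock.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical window object; NOT Open MPI's real window type. */
typedef struct toy_win {
    void  *base;  /* local memory exposed through the window */
    size_t size;  /* size of that memory in bytes */
} toy_win;

/* Hypothetical function table: an init hook plus one-sided operations. */
typedef struct toy_osc_module {
    int (*init)(toy_win *win, void *base, size_t size);
    int (*put)(toy_win *win, const void *origin, size_t len, size_t disp);
    int (*get)(toy_win *win, void *origin, size_t len, size_t disp);
    int (*fence)(toy_win *win);
    int (*free_win)(toy_win *win);
} toy_osc_module;

/* Single-process "loopback" implementation, for illustration only. */
static int loop_init(toy_win *win, void *base, size_t size)
{
    assert(win != NULL);
    win->base = base;
    win->size = size;
    return 0;
}

/* Returns nonzero when [disp, disp + len) fits inside the window. */
static int loop_in_bounds(const toy_win *win, size_t len, size_t disp)
{
    return disp <= win->size && len <= win->size - disp;
}

static int loop_put(toy_win *win, const void *origin, size_t len, size_t disp)
{
    if (!loop_in_bounds(win, len, disp)) return -1;
    memcpy((char *)win->base + disp, origin, len);
    return 0;
}

static int loop_get(toy_win *win, void *origin, size_t len, size_t disp)
{
    if (!loop_in_bounds(win, len, disp)) return -1;
    memcpy(origin, (const char *)win->base + disp, len);
    return 0;
}

/* Nothing is ever outstanding in the loopback case, so fence is a no-op. */
static int loop_fence(toy_win *win) { (void)win; return 0; }

static int loop_free(toy_win *win)
{
    win->base = NULL;
    win->size = 0;
    return 0;
}

static const toy_osc_module toy_loopback = {
    loop_init, loop_put, loop_get, loop_fence, loop_free
};
```

A Portals-specific component would fill in the same kind of table with functions that issue Portals operations instead of local copies.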

Of course, the first step is to get a checkout of the code and get it built. Instructions for getting an SVN checkout of Open MPI (and for building it from there) are available on the web page:

    http://www.open-mpi.org/svn/

Building on the XT platform (if you're going that route) is slightly more complicated, and you probably want to take a look at the horribly out of date wiki page on the subject here:

  https://svn.open-mpi.org/trac/ompi/wiki/CrayXT3


Hopefully, that's enough to get you started. If you have any questions, ask away.

Brian

--
  Brian W. Barrett
  Networking Team, CCS-1
  Los Alamos National Laboratory

