Hi Jeff
I am glad this question was asked.
Thanks to whoever did it.
Acronyms are always a pain, particularly if you don't know them,
and they are in no dictionary.
OFUD, OFED, OPENIB, MCA, BTL, SM, OOB, ... the list goes on and on.
Your answer makes a great start for another FAQ entry,
called, say, "The OpenMPI Hacker's Dictionary".
Cheers,
Gus Correa
Jeff Squyres wrote:
On Dec 2, 2010, at 3:59 AM, 阚圣哲 wrote:
When I use Open MPI's mpirun --mca btl <arg1>, I find that arg1 can be ofud, self, sm,
or openib, but www.open-mpi.org doesn't explain those arguments.
"BTL" stands for "byte transfer layer" -- it is the lowest networking software layer for the
"ob1" MPI transport in Open MPI (ob1 is usually the default transport in Open MPI).
Each BTL supports a different kind of network:
- ofud: experimental UD-based OpenFabrics transport. I would not use this; it
was developed as part of research and was never really finished.
- self: send-to-self (i.e., loopback to the same MPI process)
- sm: shared memory
- openib: generalized OpenFabrics transport.
Open MPI will automatically pick which BTL to use on a per-communication basis, based on which MPI process peer you are communicating with.
The "--mca btl ..." argument to mpirun restricts which BTLs Open MPI will use at run-time.
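As a rough sketch of what that restriction looks like on the command line, the helper below passes a comma-separated BTL list to mpirun. The function name run_with_btls, the process count, and ./my_mpi_app are placeholders made up for illustration; only mpirun and the "--mca btl" option come from Open MPI. A list that starts with ^ (for example ^ofud) asks Open MPI to exclude the named BTLs instead of including them.

```shell
#!/bin/sh
# Hypothetical helper: run an MPI program with an explicit BTL list.
#   $1     = comma-separated BTL list, e.g. "self,sm,openib"
#            (or "^ofud" to exclude a BTL rather than include it)
#   $2     = number of processes
#   $3...  = the program to launch and its arguments
run_with_btls() {
    btls=$1
    nprocs=$2
    shift 2
    mpirun --mca btl "$btls" -np "$nprocs" "$@"
}

# Example call (not executed here; ./my_mpi_app is a placeholder):
#   run_with_btls self,sm,openib 4 ./my_mpi_app
```

When listing BTLs explicitly, it is usually wise to keep "self" in the list, since that is the BTL that handles a process sending to itself.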
I can't understand the meaning of "ofud", or what the difference is between "ofud" and
"openib".
I also can't understand the difference between "ibcm" and "rdmacm" when I use mpirun
--mca btl_openib_cpc_include <arg2>.
There are 4 different ways for the openib BTL to make connections across
OpenFabrics networks:
- oob: the default ("out of band", meaning that it uses TCP sockets)
- xoob: the default when using Mellanox XRC ("out of band with XRC support")
- rdmacm: the default when using iWARP (because iWARP doesn't support OOB or
XOOB)
- ibcm: not currently used; it's an IB-specific method that was never really
finished
Usually, the right CM is just automatically picked -- you shouldn't need to
manually select anything.
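If you do want to override the automatic choice, say while debugging a connection problem, a minimal sketch could look like the following. The helper name run_with_cpc and the application name are placeholders; the parameter btl_openib_cpc_include and the four values are the ones discussed above. The BTL list openib,sm,self is just one plausible choice for an OpenFabrics cluster, not a recommendation.

```shell
#!/bin/sh
# Hypothetical helper: force one openib connection method.
#   $1     = one of: oob, xoob, rdmacm, ibcm
#   $2...  = remaining mpirun arguments (e.g. -np 2 ./my_mpi_app)
run_with_cpc() {
    cpc=$1
    shift
    # Reject anything that is not one of the four methods named above.
    case "$cpc" in
        oob|xoob|rdmacm|ibcm) ;;
        *) echo "unknown connection method: $cpc" >&2; return 1 ;;
    esac
    mpirun --mca btl openib,sm,self \
           --mca btl_openib_cpc_include "$cpc" "$@"
}

# Example call (not executed here; ./my_mpi_app is a placeholder):
#   run_with_cpc rdmacm -np 2 ./my_mpi_app
```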
Maybe www.open-mpi.org could publish a document explaining those arguments
and the principles behind them.
We are lacking in the documentation department; contributions would be greatly
appreciated...
The README file has a bunch about BTLs; that may be helpful reading.
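Beyond the README, the ompi_info command that ships with Open MPI can report what a given installation actually contains. The two small wrappers below are only a sketch: the function names are invented here, and the exact output format (and how many parameters are shown by default) may differ between Open MPI versions.

```shell
#!/bin/sh
# Hypothetical helper: list the BTL components built into this installation.
# ompi_info prints one "MCA <framework>: <component> ..." line per component.
list_btls() {
    ompi_info | grep "MCA btl"
}

# Hypothetical helper: show the run-time parameters of one BTL.
#   $1 = BTL name (defaults to openib)
show_btl_params() {
    ompi_info --param btl "${1:-openib}"
}

# Example calls (not executed here):
#   list_btls
#   show_btl_params sm
```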