Hello,

I am using the latest trunk version of Open MPI to take advantage of the
new CUDA RDMA features (smcuda BTL). RDMA support is superb; however, I have to
pass a parameter manually:

mpirun --mca pml ob1 ...

to have the ob1 upper layer (PML) selected and, consequently, to get smcuda
activated. Otherwise mpirun chooses the cm upper layer, which is wrong for this
setup because smcuda is then not activated. The hardware is:

InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02).

This is the output of
mpirun --mca pml_base_verbose 100

[cas002:05518] select: component cm selected
[cas002:05518] mca: base: close: component v closed
[cas002:05518] mca: base: close: unloading component v
[cas002:05518] mca: base: close: component bfo closed
[cas002:05518] mca: base: close: unloading component bfo
[cas002:05518] mca: base: close: component csum closed
[cas002:05518] mca: base: close: unloading component csum
[cas002:05518] mca: base: close: component dr closed
[cas002:05518] mca: base: close: unloading component dr
[cas002:05518] mca: base: close: component ob1 closed
[cas002:05518] mca: base: close: unloading component ob1
[cas002:05520] mca: base: components_open: component cm open function successful
[cas002:05520] mca: base: components_open: found loaded component csum
[cas002:05520] mca: base: components_open: component csum has no register function
[cas002:05520] mca: base: components_open: component csum open function successful
[cas002:05520] mca: base: components_open: found loaded component dr
[cas002:05520] mca: base: components_open: component dr has no register function
[cas002:05520] mca: base: components_open: component dr open function successful
[cas002:05520] mca: base: components_open: found loaded component ob1
[cas002:05520] mca: base: components_open: component ob1 has no register function
[cas002:05520] mca: base: components_open: component ob1 open function successful
[cas002:05520] select: component v not in the include list
[cas002:05520] select: component bfo not in the include list
[cas002:05520] select: initializing pml component cm
[cas002:05520] select: init returned priority 30
[cas002:05520] select: component csum not in the include list
[cas002:05520] select: component dr not in the include list
[cas002:05520] select: initializing pml component ob1
[cas002:05520] select: init returned failure for component ob1
[cas002:05520] selected cm best priority 30
[cas002:05520] select: component cm selected
[cas002:05520] mca: base: close: component v closed
[cas002:05520] mca: base: close: unloading component v
[cas002:05520] mca: base: close: component bfo closed
[cas002:05520] mca: base: close: unloading component bfo
[cas002:05520] mca: base: close: component csum closed
[cas002:05520] mca: base: close: unloading component csum
[cas002:05520] mca: base: close: component dr closed
[cas002:05520] mca: base: close: unloading component dr
[cas002:05520] mca: base: close: component ob1 closed
[cas002:05520] mca: base: close: unloading component ob1
[cas002:05518] check:select: checking my pml cm against rank=0 pml cm
[cas002:05517] check:select: rank=0
[cas002:05520] check:select: checking my pml cm against rank=0 pml cm
[cas002:05519] check:select: checking my pml cm against rank=0 pml cm
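
To make the relevant lines easier to pick out of this output, here is a minimal sketch of a filter that collects, per process, the priority each PML component returned, which components failed to initialize, and which one was finally selected. The function name summarize_pml_selection and the regular expressions are my own assumptions, inferred only from the log lines quoted above; they are not part of Open MPI, and other versions may print different messages.

```python
import re
from collections import defaultdict

# Patterns inferred from the verbose output quoted above (an assumption
# about the message format, not an Open MPI interface).
PREFIX = re.compile(r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+(?P<msg>.*)$")
INIT = re.compile(r"select: initializing pml component (\w+)")
PRIORITY = re.compile(r"select: init returned priority (\d+)")
FAILURE = re.compile(r"select: init returned failure for component (\w+)")
SELECTED = re.compile(r"select: component (\w+) selected")


def summarize_pml_selection(lines):
    """Return {pid: {"priority": {comp: int}, "failed": [comp], "selected": comp}}."""
    summary = defaultdict(
        lambda: {"priority": {}, "failed": [], "selected": None}
    )
    pending = {}  # pid -> component whose init result has not been seen yet

    for line in lines:
        m = PREFIX.match(line.strip())
        if not m:
            continue  # skips wrapped continuation lines without a [host:pid] prefix
        pid, msg = m.group("pid"), m.group("msg")

        init = INIT.search(msg)
        if init:
            pending[pid] = init.group(1)
            continue

        prio = PRIORITY.search(msg)
        if prio and pid in pending:
            # The priority line does not name the component, so attribute it
            # to the component most recently initialized by this process.
            summary[pid]["priority"][pending.pop(pid)] = int(prio.group(1))
            continue

        fail = FAILURE.search(msg)
        if fail:
            summary[pid]["failed"].append(fail.group(1))
            pending.pop(pid, None)
            continue

        sel = SELECTED.search(msg)
        if sel:
            summary[pid]["selected"] = sel.group(1)

    return dict(summary)
```

Applied to the lines of process 05520 above, this should report cm with priority 30, ob1 as failed, and cm as the selected component, which is exactly the behaviour I am asking about.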

Configure options:
./configure --with-openib --with-cuda --prefix=/home/it1/glaser/local \
    --with-tm=/opt/torque --enable-shared

Does anyone have any idea what causes Open MPI to select cm by default?

Thanks,
Jens.
