> The trouble is when I try to add some "--mca" parameters to force it to
> use TCP/Ethernet, the program seems to hang. I get the headers of the
> "osu_bw" output, but no results, even on the first case (1-byte payload
> per packet). This is occurring on both the IB-enabled nodes and on the
> Ethernet-only nodes. The specific syntax I was using was:
>
>     mpirun --mca btl ^openib --mca btl_tcp_if_exclude ib0 ./osu_bw

When we want to run over TCP and IPoIB on an IB/PSM-equipped cluster, we use:

    --mca btl sm,tcp,self --mca btl_tcp_if_exclude eth0 --mca btl_tcp_if_include ib0 --mca mtl ^psm
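Assembled into one invocation, the settings above would look roughly like this (a sketch: the host list, process count, and interface names eth0/ib0 are placeholders to adapt to your cluster):

```shell
# Tom's MCA settings for TCP over IPoIB on an IB/PSM cluster, collected
# into one variable. Interface names (eth0, ib0) are site-specific.
MCA_ARGS="--mca btl sm,tcp,self \
--mca btl_tcp_if_exclude eth0 \
--mca btl_tcp_if_include ib0 \
--mca mtl ^psm"

# A full run would then be (hosts and process count are placeholders):
#   mpirun $MCA_ARGS -np 2 -H node01,node02 ./osu_bw
echo "$MCA_ARGS"
```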
Based on this, it looks like the following might work for you:

    --mca btl sm,tcp,self --mca btl_tcp_if_exclude ib0 --mca btl_tcp_if_include eth0

(Note that an explicit "sm,tcp,self" list already excludes openib, so adding a second "--mca btl ^openib" is unnecessary; the two btl settings would conflict.) If you don't have ib0 ports configured on the IB nodes, you probably don't need the "--mca btl_tcp_if_exclude ib0".

-Tom

> The problem occurs at least with OpenMPI 1.6.3 compiled with GNU 4.4
> compilers, with 1.6.3 compiled with Intel 13.0.1 compilers, and with
> 1.6.5 compiled with Intel 13.0.1 compilers. I haven't tested any other
> combinations yet.
>
> Any ideas here? It's very possible this is a system configuration
> problem, but I don't know where to look. At this point, any ideas would
> be welcome, either about the specific situation or general pointers on
> mpirun debugging flags to use. I can't find much in the docs yet on
> run-time debugging for OpenMPI, as opposed to debugging the application.
> Maybe I'm just looking in the wrong place.
>
> Thanks,
>
> --
> Lloyd Brown
> Systems Administrator
> Fulton Supercomputing Lab
> Brigham Young University
> http://marylou.byu.edu
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
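On the run-time debugging question quoted above, Open MPI's component verbosity parameters are one place to start (a sketch: the verbosity level 30 and the hosts/paths in the comments are illustrative, not prescribed values):

```shell
# btl_base_verbose makes the BTL framework log which components it
# opens, which it skips, and which interfaces the TCP BTL binds to.
DEBUG_ARGS="--mca btl_base_verbose 30"

# e.g. (hosts, process count, and binary path are placeholders):
#   mpirun $DEBUG_ARGS --mca btl sm,tcp,self -np 2 ./osu_bw
#
# ompi_info lists the MCA parameters a component accepts, with their
# current values:
#   ompi_info --param btl tcp
echo "$DEBUG_ARGS"
```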