Hello, I am running into the following issue while trying to run osu_latency:
--
-bash-3.2$ mpiexec --mca btl openib,self -mca btl_openib_warn_default_gid_prefix 0 -np 2 --hostfile mpihosts /home/jagga/osu-micro-benchmarks-3.3/openmpi/ofed-1.5.2/bin/osu_latency
# OSU MPI Latency Test v3.3
# Size            Latency (us)
[amber04][[10252,1],1][connect/btl_openib_connect_oob.c:325:qp_connect_all] error modifing QP to RTR errno says Invalid argument
[amber04][[10252,1],1][connect/btl_openib_connect_oob.c:815:rml_recv_cb] error in endpoint reply start connect
--------------------------------------------------------------------------
mpiexec has exited due to process rank 1 with PID 6781 on
node amber04 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
--

I can work around this by adding the "--mca btl_openib_cpc_include rdmacm" option. However, I have another host with a different HCA, but with all the same drivers and software versions, on which this same command runs successfully without the rdmacm option.

What could be causing one of my environments to fail while the other works fine without that option?

--
[root@amber03 ~]# ofed_info | grep OFED
MLNX_OFED_LINUX-1.5.2-1.0.0 (OFED-1.5.2-20101020-1520):
MLNX_OFED_LINUX-1.5.2-1.0.0 (/mswg/release/ofed-1.5.2-rpms/rnfs-utils/rnfs-utils-1.1.5-10.OFED.src.rpm):

[root@amber03 ~]# ibv_devinfo
hca_id: mlx4_0
        transport:              InfiniBand (0)
        fw_ver:                 2.7.9294
        node_guid:              78e7:d103:0021:8884
        sys_image_guid:         78e7:d103:0021:8887
        vendor_id:              0x02c9
        vendor_part_id:         26438
        hw_ver:                 0xB0
        board_id:               HP_0200000003
        phys_port_cnt:          2
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        2048 (4)
                        active_mtu:     2048 (4)
                        sm_lid:         1
                        port_lid:       20
                        port_lmc:       0x00
                        link_layer:     IB

                port:   2
                        state:          PORT_ACTIVE (4)
                        max_mtu:        2048 (4)
                        active_mtu:     1024 (3)
                        sm_lid:         0
                        port_lid:       0
                        port_lmc:       0x00
                        link_layer:     Ethernet
--

Any help would be greatly appreciated.

Thanks,
-J