
You should probably use -mca btl tcp,self -mca btl_tcp_if_include ib0.8109
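
For reference, here is a rough sketch of what the full command line could look like. It reuses the mpirun path, machinefile, and benchmark binary from the quoted message below. It assumes that the TCP BTL's btl_tcp_if_include parameter is the one that selects the IPoIB interface, and that the interface suffix is the partition number with the full-membership bit (0x8000) set, which matches the 0x109 / ib0.8109 pair on this cluster. The variable names are only illustrative.

```shell
#!/bin/sh
# Sketch only: paths and file names are copied from the quoted message;
# adjust them for your site.
MPIRUN=/opt/openmpi-ib/1.2.6/bin/mpirun
BENCH=/cluster/pallas/x86_64-ib/IMB-MPI1

# Assumption: the child interface suffix is the P_Key with the
# full-membership bit set, i.e. 0x109 | 0x8000 = 0x8109.
PKEY=$(printf '%x' $((0x109 | 0x8000)))
IFACE="ib0.$PKEY"

# Build the command as a string and print it so it can be checked first.
CMD="$MPIRUN -np 2 -machinefile machinefile -mca btl tcp,self -mca btl_tcp_if_include $IFACE $BENCH"
echo "$CMD"

# To actually launch the benchmark, uncomment the next line:
# $CMD
```

Note that this runs MPI traffic as TCP over IPoIB on the partition's interface, not over native InfiniBand verbs.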


On 10/3/08, Matt Burgess <burgess.m...@gmail.com> wrote:
> Hi,
> I'm trying to get openmpi working over openib partitions. On this cluster,
> the partition number is 0x109. The ib interfaces are pingable over the
> appropriate ib0.8109 interface:
> d2:/opt/openmpi-ib # ifconfig ib0.8109
> ib0.8109  Link encap:UNSPEC  HWaddr 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
>           inet addr:  Bcast:  Mask:
>           inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
>           RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
>           collisions:0 txqueuelen:256
>           RX bytes:102229428 (97.4 Mb)  TX bytes:102324172 (97.5 Mb)
> I have tried the following:
> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
> openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
> -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1
> but I just get a RETRY EXCEEDED ERROR. Is there an MCA parameter I am
> missing?
> I was successful using tcp only:
> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
> tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
> /cluster/pallas/x86_64-ib/IMB-MPI1
> Thanks,
> Matt Burgess
