Running an MPI command with LD_PRELOAD=libsdp.so at the beginning won't cause SDP to be used on remote nodes. You have to find a way to load libsdp.so on all nodes; this might work better:
LD_PRELOAD=libsdp.so mpirun -np 4 env LD_PRELOAD=libsdp.so /there/vasp/20060503/vasp.4.6/vasp.mpi

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bernhard Fischer
> Sent: Friday, August 18, 2006 12:22 PM
> To: Eitan Zahavi
> Cc: openib-general@openib.org
> Subject: Re: [openib-general] [patch] libsdp typo in config_parser
>
> On Fri, Aug 18, 2006 at 10:05:35PM +0300, Eitan Zahavi wrote:
> >Hi Bernhard
> >
> >SDP traffic will not show on the IPoIB counters. It does no go through
> >IPoIB.
>
> That's what i thought, thanks for confirming.
>
> >You can use
> >lsmod | grep ib_sdp
> >to see how many connections are made over SDP.
>
> Running lam via 2 nodes, on 2 CPUs each, i see:
> # lsmod | grep ib_sdp
> ib_sdp                 28184  4
> rdma_cm                27912  1 ib_sdp
> ib_core                53632  12 ib_ucm,ib_uverbs,ib_sdp,rdma_cm,ib_cm,ib_local_sa,ib_umad,ib_ipoib,ib_multicast,ib_sa,ib_mthca,ib_mad
>
> I did start lamboot with libsdp.so preloaded:
> $ LD_PRELOAD=/usr/local/lib64/libsdp.so lamboot l
> $ lamnodes C -c -n
> node13ib.infiniband
> node13ib.infiniband
> node15ib.infiniband
> node15ib.infiniband
> $ LD_PRELOAD=/usr/local/lib64/libsdp.so mpirun -np 4 /there/vasp/20060503/vasp.4.6/vasp.mpi
>
> Still, ifconfig ib0 (which hosts node??ib.infiniband on 10.100.0.0/24)
> shows that the communication is being sent over ipoib as ifconfigs
> counters constantly go up when communicating (only one user is active
> on the system).
> $ /sbin/ifconfig ib0
> ib0       Link encap:UNSPEC  HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
>           inet addr:10.100.0.13  Bcast:10.100.0.255  Mask:255.255.255.0
>           UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
>           RX packets:182037964 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:183607689 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:128
>           RX bytes:189334244937 (180563.2 Mb)  TX bytes:194777918565 (185754.6 Mb)
>
> My libsdp.conf looks like this:
> $ cat /usr/local/etc/libsdp.conf
> #log min-level 1 destination file libsdp.log
> use both connect * 10.100.0.0/24:*
> use both server * 10.100.0.0/24:*
>
> So i fear i'm missing something crucial.
> Ideas?
>
> >Exact number of packets and data can flowing through the IB port can be
> >obtained by :
> >/sys/class/infiniband/mthca0/ports/1/counters/port_rcv_packets
> >/sys/class/infiniband/mthca0/ports/1/counters/port_xmit_packets
>
> $ for i in /sys/class/infiniband/mthca0/ports/1/counters/*packets;do echo -n $i:' ' ; cat $i;done
> /sys/class/infiniband/mthca0/ports/1/counters/port_rcv_packets: 185010549
> /sys/class/infiniband/mthca0/ports/1/counters/port_xmit_packets: 186584856
>
> PS: The different pingpong test (which have outdated names in the openib
> wiki, btw) do work just fine if run from the very same user, so i think
> that the basic verbs communication would work proper.
>
> _______________________________________________
> openib-general mailing list
> openib-general@openib.org
> http://openib.org/mailman/listinfo/openib-general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general