Hello Or,

Thanks for the reply.

I enabled debug, and the results of my test are below. First off, I ran this same test on bonded Ethernet and on a single IB interface with success. I ran the following on the two hosts:

sudo route add -net 224.0.0.0/3 gw 192.168.47.102
socat STDIO UDP4-DATAGRAM:224.1.0.1:6666,bind=:6666,range=192.168.47.0/24,ip-add-membership=224.1.0.1:192.168.47.102

sudo route add -net 224.0.0.0/3 gw 192.168.47.100
socat STDIO UDP4-DATAGRAM:224.1.0.1:6666,bind=:6666,range=192.168.47.0/24,ip-add-membership=224.1.0.1:192.168.47.100

socat sets up a peer-to-peer multicast communication. The expected results are echoed data on the sending end and data on the receiving end. When attempting this test with bonded IB interfaces, I only get the echoed data on the sending end and nothing on the receiving end.

Here are the results from dmesg:

[ 859.128720] bonding: bond3 is being created...
[ 859.129468] bonding: bond3: setting mode to active-backup (1).
[ 859.129501] bonding: bond3: Setting MII monitoring interval to 100.
[ 859.141557] bonding: bond3: doing slave updates when interface is down.
[ 859.141563] bonding: bond3: Adding slave ib0.
[ 859.141566] bonding bond3: master_dev is not up in bond_enslave
[ 859.141567] bonding: bond3: Warning: enslaved VLAN challenged slave ib0. Adding VLANs will be blocked as long as ib0 is part of bond bond3
[ 859.141570] bonding: bond3: Warning: The first slave device specified does not support setting the MAC address. Setting fail_over_mac to active.<7>ib0: bringing up interface
[ 859.182437] ib0: starting multicast thread
[ 859.182568] ib0: joining MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff
[ 859.182580] ib0: restarting multicast task
[ 859.182583] ib0: stopping multicast thread
[ 859.182586] ib0: adding multicast entry for mgid ff12:401b:ffff:0000:0000:0000:0000:0001
[ 859.182589] ib0: starting multicast thread
[ 859.182739] ib0: join completion for ff12:401b:ffff:0000:0000:0000:ffff:ffff (status 0)
[ 859.182951] ib0: Created ah ffff8804379e8680
[ 859.182954] ib0: MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff AV ffff8804379e8680, LID 0xc000, SL 0
[ 859.183088] ib0: joining MGID ff12:401b:ffff:0000:0000:0000:0000:0001
[ 859.183222] ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
[ 859.183354] ib0: Created ah ffff8804389a9880
[ 859.183359] ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff8804389a9880, LID 0xc001, SL 0
[ 859.184369] bonding: bond3: enslaving ib0 as a backup interface with a down link.
[ 859.186365] ib0: successfully joined all multicast groups
[ 859.186385] ib0: restarting multicast task
[ 859.186386] ib0: stopping multicast thread
[ 859.186389] ib0: starting multicast thread
[ 859.186500] ib0: successfully joined all multicast groups
[ 859.188608] bonding: bond3: doing slave updates when interface is down.
[ 859.188613] bonding: bond3: Adding slave ib1.
[ 859.188615] bonding bond3: master_dev is not up in bond_enslave
[ 859.188617] bonding: bond3: Warning: enslaved VLAN challenged slave ib1. Adding VLANs will be blocked as long as ib1 is part of bond bond3
[ 859.221889] ib1: bringing up interface
[ 859.222359] ib1: starting multicast thread
[ 859.222483] ib1: joining MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff
[ 859.222494] ib1: restarting multicast task
[ 859.222498] ib1: stopping multicast thread
[ 859.222500] ib1: adding multicast entry for mgid ff12:401b:ffff:0000:0000:0000:0000:0001
[ 859.222503] ib1: starting multicast thread
[ 859.224240] bonding: bond3: enslaving ib1 as a backup interface with a down link.
[ 859.224634] ib1: join completion for ff12:401b:ffff:0000:0000:0000:ffff:ffff (status 0)
[ 859.224837] ib1: Created ah ffff880436cc8400
[ 859.224841] ib1: MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff AV ffff880436cc8400, LID 0xc000, SL 0
[ 859.224968] ib1: joining MGID ff12:401b:ffff:0000:0000:0000:0000:0001
[ 859.225099] ib1: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
[ 859.225223] ib1: Created ah ffff88043840fec0
[ 859.225228] ib1: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88043840fec0, LID 0xc001, SL 0
[ 859.226956] ib1: successfully joined all multicast groups
[ 859.226961] ib1: restarting multicast task
[ 859.226962] ib1: stopping multicast thread
[ 859.226964] ib1: starting multicast thread
[ 859.227074] ib1: successfully joined all multicast groups
[ 859.228034] ib0: mtu > 2044 will cause multicast packet drops.
[ 859.229779] ib1: mtu > 2044 will cause multicast packet drops.
[ 859.233134] ADDRCONF(NETDEV_UP): bond3: link is not ready
[ 859.233153] bonding: bond3: link status definitely up for interface ib0.
[ 859.233156] bonding: bond3: making interface ib0 the new active one.
[ 859.233167] ib0: restarting multicast task
[ 859.233170] ib0: stopping multicast thread
[ 859.233172] ib0: adding multicast entry for mgid 0001:0000:0000:0000:0000:0000:0000:0000
[ 859.233175] ib0: starting multicast thread
[ 859.233178] bonding: bond3: first active interface up!
[ 859.233180] bonding: bond3: link status definitely up for interface ib1.
[ 859.233289] ib0: joining MGID 0001:0000:0000:0000:0000:0000:0000:0000
[ 859.234904] ADDRCONF(NETDEV_CHANGE): bond3: link becomes ready
[ 859.234944] ib0: restarting multicast task
[ 859.234948] ib0: stopping multicast thread
[ 859.234951] ib0: adding multicast entry for mgid ff12:601b:ffff:0000:0000:0001:ff00:f778
[ 859.234954] ib0: starting multicast thread
[ 859.235069] ib0: joining MGID ff12:601b:ffff:0000:0000:0001:ff00:f778
[ 859.235090] ib0: join completion for 0001:0000:0000:0000:0000:0000:0000:0000 (status -22)
[ 859.235095] ib0: multicast join failed for 0001:0000:0000:0000:0000:0000:0000:0000, status -22
[ 859.235162] ib0: restarting multicast task
[ 859.235163] ib0: stopping multicast thread
[ 859.235166] ib0: adding multicast entry for mgid ff12:401b:ffff:0000:0000:0000:0000:00fb
[ 859.235168] ib0: starting multicast thread
[ 859.235200] ib0: join completion for ff12:601b:ffff:0000:0000:0001:ff00:f778 (status 0)
[ 859.235304] ib0: joining MGID 0001:0000:0000:0000:0000:0000:0000:0000
[ 859.235343] ib0: Created ah ffff88043a9b9440
[ 859.235347] ib0: MGID ff12:601b:ffff:0000:0000:0001:ff00:f778 AV ffff88043a9b9440, LID 0xc002, SL 0
[ 859.235408] ib0: join completion for 0001:0000:0000:0000:0000:0000:0000:0000 (status -22)
[ 859.235412] ib0: multicast join failed for 0001:0000:0000:0000:0000:0000:0000:0000, status -22
[ 859.235481] ib0: joining MGID 0001:0000:0000:0000:0000:0000:0000:0000
[ 859.235592] ib0: join completion for 0001:0000:0000:0000:0000:0000:0000:0000 (status -22)
[ 859.235596] ib0: multicast join failed for 0001:0000:0000:0000:0000:0000:0000:0000, status -22
[ 859.260028] ib0: setting up send only multicast group for ff12:601b:ffff:0000:0000:0000:0000:0016
[ 859.260042] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting join
[ 859.260136] ib0: multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
[ 859.263792] ib0: setting up send only multicast group for ff12:401b:ffff:0000:0000:0000:0000:0016
[ 859.263806] ib0: no multicast record for ff12:401b:ffff:0000:0000:0000:0000:0016, starting join
[ 859.263883] ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22
[ 860.600025] ib0: setting up send only multicast group for ff12:601b:ffff:0000:0000:0000:0000:0002
[ 860.600035] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0002, starting join
[ 860.600149] ib0: multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0002, status -22
[ 863.230303] ib0: joining MGID 0001:0000:0000:0000:0000:0000:0000:0000
[ 863.230406] ib0: join completion for 0001:0000:0000:0000:0000:0000:0000:0000 (status -22)
[ 863.230411] ib0: multicast join failed for 0001:0000:0000:0000:0000:0000:0000:0000, status -22
[ 864.600035] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0002, starting join
[ 864.600124] ib0: multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0002, status -22
[ 868.600034] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0002, starting join
[ 868.600119] ib0: multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0002, status -22
[ 868.620031] ib0: no multicast record for ff12:401b:ffff:0000:0000:0000:0000:0016, starting join
[ 868.620112] ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22
[ 869.100039] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting join
[ 869.100124] ib0: multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
[ 869.600029] bond3: no IPv6 routers present
[ 879.230231] ib0: joining MGID 0001:0000:0000:0000:0000:0000:0000:0000
[ 879.230349] ib0: join completion for 0001:0000:0000:0000:0000:0000:0000:0000 (status -22)
[ 879.230355] ib0: multicast join failed for 0001:0000:0000:0000:0000:0000:0000:0000, status -22
[ 886.919993] ib0: restarting multicast task
[ 886.919997] ib0: stopping multicast thread
[ 886.920002] ib0: adding multicast entry for mgid ff12:401b:ffff:0000:0000:0000:0001:0001
[ 886.920005] ib0: starting multicast thread
[ 886.920140] ib0: joining MGID 0001:0000:0000:0000:0000:0000:0000:0000
[ 886.920244] ib0: join completion for 0001:0000:0000:0000:0000:0000:0000:0000 (status -22)
[ 886.920248] ib0: multicast join failed for 0001:0000:0000:0000:0000:0000:0000:0000, status -22
[ 886.934421] ib0: no multicast record for ff12:401b:ffff:0000:0000:0000:0000:0016, starting join
[ 886.934520] ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22
[ 889.000014] ib0: no multicast record for ff12:401b:ffff:0000:0000:0000:0000:0016, starting join
[ 889.000102] ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22
[ 899.053269] ib0: restarting multicast task
[ 899.053273] ib0: stopping multicast thread
[ 899.053277] ib0: deleting multicast group ff12:401b:ffff:0000:0000:0000:0001:0001
[ 899.053280] ib0: deleting multicast group ff12:401b:ffff:0000:0000:0000:0001:0001
[ 899.053285] ib0: starting multicast thread
[ 899.053430] ib0: joining MGID 0001:0000:0000:0000:0000:0000:0000:0000
[ 899.053540] ib0: join completion for 0001:0000:0000:0000:0000:0000:0000:0000 (status -22)
[ 899.053544] ib0: multicast join failed for 0001:0000:0000:0000:0000:0000:0000:0000, status -22
[ 899.073152] ib0: no multicast record for ff12:401b:ffff:0000:0000:0000:0000:0016, starting join
[ 899.073241] ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22
[ 903.420017] ib0: no multicast record for ff12:401b:ffff:0000:0000:0000:0000:0016, starting join
[ 903.420100] ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22

Thank you,
Dennis P.

On Mon, Apr 20, 2009 at 8:17 AM, Or Gerlitz <[email protected]> wrote:
> Dennis Portello wrote:
> > Regular TCP/IP unicast works, though dmesg is full of warning about
> > multicast failing. Multicast does not work at all.
>
> Unicast IP relies on ARP and IPoIB ARPs use the broadcast multicast group, so
> IB multicast does work on your setup... to see what IB multicast groups are being
> joined by your IPoIB devices, you can use the ipoib debugfs entries
>
> $ mount -t debugfs none /sys/kernel/debug
> $ cat /sys/kernel/debug/ipoib/ibxxx_mcg
>
> see Documentation/infiniband/ipoib.txt for more info
>
> Or.
>
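For anyone sifting through a log like the one above, a small script along these lines might help tally which joins fail. It is only a rough sketch and not part of the original test: the regular expression is based solely on the "multicast join failed" message format seen in this dmesg output, the function names are made up for illustration, and the MGID-to-IPv4 mapping assumes the RFC 4391 layout (0x401b signature, low 28 bits of the group address at the end of the MGID).

```python
import errno
import re
from collections import Counter

# Matches messages such as:
#   "ib0: multicast join failed for ff12:...:0016, status -22"
# The pattern is an assumption derived from the log format in this thread.
FAILED_JOIN = re.compile(
    r"(?P<dev>ib\d+): multicast join failed for "
    r"(?P<mgid>[0-9a-f:]{39}), status (?P<status>-?\d+)"
)


def failed_joins(log_text):
    """Count failed joins per (device, MGID, status) in dmesg text."""
    return Counter(
        (m["dev"], m["mgid"], int(m["status"]))
        for m in FAILED_JOIN.finditer(log_text)
    )


def mgid_to_ipv4_group(mgid):
    """Return the IPv4 group encoded in an IPoIB IPv4 MGID, else None.

    Assumes the RFC 4391 layout. Note that the IPoIB broadcast group
    (all ones in the low 28 bits) shares this encoding, so its result
    here is not a real multicast subscription.
    """
    words = [int(w, 16) for w in mgid.split(":")]
    if len(words) != 8 or words[0] >> 8 != 0xFF or words[1] != 0x401B:
        return None
    low28 = ((words[6] << 16) | words[7]) & 0x0FFFFFFF
    addr = 0xE0000000 | low28  # 224.0.0.0/4 prefix plus the low 28 bits
    return ".".join(str((addr >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def describe(status):
    """Translate a negative kernel status into its errno name."""
    return errno.errorcode.get(-status, "unknown")
```

Feeding the dmesg text to failed_joins() gives a count per device and MGID, describe(-22) reports EINVAL, and mgid_to_ipv4_group("ff12:401b:ffff:0000:0000:0000:0001:0001") gives 224.1.0.1, the group used in the socat test.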
