Re: [Fwd: Re: [openib-general] Solaris IPoIB MTU with OpenSM]
Hi Nitin,

On Tue, 2005-03-15 at 16:15, Nitin Hande wrote:
> This is cool, I have got Solaris IPoIB happily working with the
> OpenSM now. It plumbs, pings and snoops on 0x pkey.

Great. That's good news. I'll work on a real fix for this now.

> On other hand, on my linux node, if I try to use 8001 partition and
> configure IB interface with IP addr (same time while ib0 is using 0x
> pkey), I get the following error, you may want to investigate that
>
> [EMAIL PROTECTED] ~]# echo 0x8001 > /sys/class/net/ib0/create_child
> [EMAIL PROTECTED] ~]# ifconfig ib0.8001 10.10.1.1
> [EMAIL PROTECTED]: multicast join failed for
> ff12:401b:8001:0:0:0::, status -22
> ~]# ib0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
> status -22

I will look into this, but I suspect it is caused by the response to
some request in the join "flow" being more than 1 RMPP packet. Remember
that OpenSM is currently hamstrung in this manner until there is
sufficient RMPP for SA GetTableResps.

Thanks.

-- Hal

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [Fwd: Re: [openib-general] Solaris IPoIB MTU with OpenSM]
    Nitin> On other hand, on my linux node, if I try to use 8001
    Nitin> partition and configure IB interface with IP addr (same
    Nitin> time while ib0 is using 0x pkey), I get the following
    Nitin> error, you may want to investigate that

I think this is probably an OpenSM issue (does OpenSM support multiple
partitions?). On my fabric, running Topspin's embedded SM on a switch,
I can do:

    # modprobe ib_ipoib
    # echo 0x8001 > /sys/class/net/ib0/create_child
    # ifconfig ib0.8001 up

on both systems. On system #1 I have:

    # ifconfig ib0.8001
    ib0.8001  Link encap:UNSPEC  HWaddr 00-13-04-06-FE-80-00-00-00-00-00-00-00-00-00-00
              inet6 addr: fe80::202:c901:7fc:c711/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:128
              RX bytes:0 (0.0 b)  TX bytes:300 (300.0 b)

and on system #2 I'm able to do:

    # ping6 -I ib0.8001 fe80::202:c901:7fc:c711
    PING fe80::202:c901:7fc:c711(fe80::202:c901:7fc:c711) from fe80::202:c901:78c:e461 ib0.8001: 56 data bytes
    64 bytes from fe80::202:c901:7fc:c711: icmp_seq=1 ttl=64 time=4.56 ms
    64 bytes from fe80::202:c901:7fc:c711: icmp_seq=2 ttl=64 time=0.077 ms
    64 bytes from fe80::202:c901:7fc:c711: icmp_seq=3 ttl=64 time=0.065 ms

 - R.
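For anyone matching the child interface against the group in the join
error, here is a minimal sketch of how the two names relate. It assumes
the IPv4 broadcast group layout from RFC 4391 (ff1<scope>:401b:<P_Key>,
ending in ffff:ffff); the address printed in the error above is cut
short, so only its first three groups can be compared. The function
names are made up for illustration and are not part of the ipoib driver.

```python
IPV4_SIGNATURE = 0x401B  # IPoIB signature for IPv4 groups (RFC 4391)


def ipoib_broadcast_mgid(pkey, scope=0x2):
    """Build the IPv4 broadcast MGID string for a partition key.

    Layout assumed from RFC 4391: ff1<scope>:401b:<pkey>:0:0:0:ffff:ffff.
    """
    groups = [0xFF10 | scope, IPV4_SIGNATURE, pkey, 0, 0, 0, 0xFFFF, 0xFFFF]
    return ":".join(f"{g:04x}" for g in groups)


def child_interface_name(parent, pkey):
    """Name of the interface created by writing pkey to create_child."""
    return f"{parent}.{pkey:04x}"


print(child_interface_name("ib0", 0x8001))  # ib0.8001
print(ipoib_broadcast_mgid(0x8001))         # ff12:401b:8001:0000:0000:0000:ffff:ffff
```

The ff12:401b:8001 prefix is the part that shows up in the failed join,
so the group being requested is the broadcast group of partition 0x8001.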
Re: [Fwd: Re: [openib-general] Solaris IPoIB MTU with OpenSM]
Hal,

On Fri, 2005-03-04 at 12:53, Hal Rosenstock wrote:
> Hi again Nitin,
>
> Finally got a chance to work on this. I have a workaround for you for
> now. Real patch later... Let me know if this does the trick for you. It
> did for me.
>
> -- Hal
>
> Index: osm_sa_mcmember_record.c
> ===
> --- osm_sa_mcmember_record.c (revision 1953)
> +++ osm_sa_mcmember_record.c (working copy)
> @@ -1522,9 +1522,11 @@
>    if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
>        (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto Exit;
>
> +#if 0
>    /* if defined MUST match exactly !*/
>    if ((IB_MCR_COMPMASK_MTU_SEL & comp_mask) &&
>        ((p_rcvd_rec->mtu >> 6) != (p_mgrp->mcmember_rec.mtu >> 6))) goto Exit;
> +#endif
>
>    if ((IB_MCR_COMPMASK_MTU & comp_mask) &&
>        ((p_rcvd_rec->mtu & 0x3F) != (p_mgrp->mcmember_rec.mtu & 0x3F))) goto Exit;

This is cool, I have got Solaris IPoIB happily working with the OpenSM
now. It plumbs, pings and snoops on 0x pkey. Here is some output:

[EMAIL PROTECTED] ~]# cat /etc/path_to_inst | grep ibd
"/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/pci15b3,[EMAIL PROTECTED]/[EMAIL PROTECTED],,ipib" 0 "ibd"
"/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/pci15b3,[EMAIL PROTECTED]/[EMAIL PROTECTED],,ipib" 1 "ibd"
[EMAIL PROTECTED] ~]# ifconfig ibd0
ibd0: flags=1000843 mtu 2044 index 3
        inet 192.168.100.111 netmask ff00 broadcast 192.168.100.255
        ipib 0:0:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1
[EMAIL PROTECTED] ~]# ping 192.168.100.112
192.168.100.112 is alive
[EMAIL PROTECTED] ~]# snoop -d ibd1
192.168.100.112 -> *ARP C Who is 192.168.100.111, 192.168.100.111 ?
192.168.100.111 -> 192.168.100.112 ARP R 192.168.100.111, 192.168.100.111 is 0:0:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1
192.168.100.111 -> 192.168.100.112 ICMP Echo request (ID: 641 Sequence number: 0)
192.168.100.112 -> 192.168.100.111 ICMP Echo reply (ID: 641 Sequence number: 0)

This is fantastic. Thanks Hal! BTW, I have not tested it with a
multiple-packet (RMPP) GetTable response.

On other hand, on my linux node, if I try to use 8001 partition and configure IB interface with IP addr (same time while ib0 is using 0x pkey), I get the following error, you may want to investigate that [EMAIL PROTECTED] ~]# echo 0x8001 > /sys/class/net/ib0/create_child [EMAIL PROTECTED] ~]# ifconfig ib0.8001 10.10.1.1 [EMAIL PROTECTED]: multicast join failed for ff12:401b:8001:0:0:0::, status -22 ~]# ib0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 [EMAIL PROTECTED] ~]# ib0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 b0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 b0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 b0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 b0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 b0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 b0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 b0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 b0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 b0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status -22 Thanks Nitin > > > -Forwarded Message- > > From: Hal Rosenstock <[EMAIL PROTECTED]> > To: Nitin Hande <[EMAIL PROTECTED]> > Cc: openib , Tom Duffy <[EMAIL PROTECTED]> > Subject: Re: [openib-general] Solaris IPoIB MTU with OpenSM > Date: 24 Feb 2005 08:42:23 -0500 > > Hi Nitin, > > On Wed, 2005-02-23 at 17:19, Nitin Hande wrote: > > Hal, > > > > [comments below] > > On Wed, 2005-02-23 at 02:19, Hal Rosenstock wrote: > > > On Tue, 2005-02-22 at 22:56, Nitin Hande wrote: > > > > So I tried the latest patches and 
preliminarily things seem to be > > > > working fine. > > > > > > Yipee. > > [snip..] > > > > > > > > > > > So after this test above, I try to run snoop on the solaris interface > > > > and get the following error message from the layer below IPoIB: > > > > > > > > Feb 22 19:50:25 dongon.
[Fwd: Re: [openib-general] Solaris IPoIB MTU with OpenSM]
Hi again Nitin, Finally got a chance to work on this. I have a workaround for you for now. Real patch later... Let me know if this does the trick for you. It did for me. -- Hal Index: osm_sa_mcmember_record.c === --- osm_sa_mcmember_record.c(revision 1953) +++ osm_sa_mcmember_record.c(working copy) @@ -1522,9 +1522,11 @@ if ((IB_MCR_COMPMASK_PROXY & comp_mask) && (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto Exit; +#if 0 /* if defined MUST match exactly !*/ if ((IB_MCR_COMPMASK_MTU_SEL & comp_mask) && ((p_rcvd_rec->mtu >> 6) != (p_mgrp->mcmember_rec.mtu >> 6))) goto Exit; +#endif if ((IB_MCR_COMPMASK_MTU & comp_mask) && ((p_rcvd_rec->mtu & 0x3F) != (p_mgrp->mcmember_rec.mtu & 0x3F))) goto Exit; -Forwarded Message- From: Hal Rosenstock <[EMAIL PROTECTED]> To: Nitin Hande <[EMAIL PROTECTED]> Cc: openib , Tom Duffy <[EMAIL PROTECTED]> Subject: Re: [openib-general] Solaris IPoIB MTU with OpenSM Date: 24 Feb 2005 08:42:23 -0500 Hi Nitin, On Wed, 2005-02-23 at 17:19, Nitin Hande wrote: > Hal, > > [comments below] > On Wed, 2005-02-23 at 02:19, Hal Rosenstock wrote: > > On Tue, 2005-02-22 at 22:56, Nitin Hande wrote: > > > So I tried the latest patches and preliminarily things seem to be > > > working fine. > > > > Yipee. > [snip..] > > > > > > > > So after this test above, I try to run snoop on the solaris interface > > > and get the following error message from the layer below IPoIB: > > > > > > Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE: > > > ibd0: HCA GUID 0002c901097651d0 port 1 PKEY Could not get list of > > > IBA multicast groups > > > > > > My preliminary assumption is that OpenSm is not returning the list of > > > multicast groups that the ibd interface has joined. I will look at the > > > MAD's tomorrow and try to ascertain that. > > > > How does S10 request this ? Remember that if it is a GetTable and > > doesn't fit in a single MAD, it will be broken now. 
If that is the case, > > we will live with this until we have real RMPP. > Below is an an example of a single GetTable request and response between > Solaris and OpenSM. OpenSM is not reporting the MCgroups in case of a > single request/response. I have also provided a MAD output between > Solaris IPoIB driver and IBSRM single GetTable request response below > this example. > > Here is the MAD trace between solaris and OpenSM: > Outgoing MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x12 - SubnAdmGetTable() > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d100ec > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 ec .vQ. > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00 > 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 50: 00 00 00 00 00 00 00 00 00 00 0b 1b 00 00 84 00 > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ... > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > Incoming MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x92 - > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d100ec > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 ec .vQ. > 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01 .8.
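To see why wrapping the selector test in "#if 0" lets the Solaris
request match, here is a small sketch of the two MTU comparisons from
the hunk above. The requested byte 0x84 is taken from the request dump
(offset 0x5e); the stored value 0x04 is an assumption based on the
"selector was coming 04" remark elsewhere in this thread, not something
read out of OpenSM. The function name is made up for illustration.

```python
def mtu_matches(requested, stored, check_selector):
    """Mimic the two MTU tests in the patched hunk (both mask bits set)."""
    if check_selector and (requested >> 6) != (stored >> 6):
        return False  # this is the test the workaround wraps in '#if 0'
    return (requested & 0x3F) == (stored & 0x3F)


REQUESTED = 0x84  # MTU byte in the Solaris request: selector 2, value 4
STORED = 0x04     # assumed group record value: selector 0, value 4

print(mtu_matches(REQUESTED, STORED, check_selector=True))   # False
print(mtu_matches(REQUESTED, STORED, check_selector=False))  # True
```

Under that assumption the low six bits agree (both 4, i.e. 2048 bytes)
and only the selector bits differ, which is consistent with the
workaround being enough to make the record show up.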
Re: [openib-general] Solaris IPoIB MTU with OpenSM
Hi Nitin, On Wed, 2005-02-23 at 17:19, Nitin Hande wrote: > Hal, > > [comments below] > On Wed, 2005-02-23 at 02:19, Hal Rosenstock wrote: > > On Tue, 2005-02-22 at 22:56, Nitin Hande wrote: > > > So I tried the latest patches and preliminarily things seem to be > > > working fine. > > > > Yipee. > [snip..] > > > > > > > > So after this test above, I try to run snoop on the solaris interface > > > and get the following error message from the layer below IPoIB: > > > > > > Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE: > > > ibd0: HCA GUID 0002c901097651d0 port 1 PKEY Could not get list of > > > IBA multicast groups > > > > > > My preliminary assumption is that OpenSm is not returning the list of > > > multicast groups that the ibd interface has joined. I will look at the > > > MAD's tomorrow and try to ascertain that. > > > > How does S10 request this ? Remember that if it is a GetTable and > > doesn't fit in a single MAD, it will be broken now. If that is the case, > > we will live with this until we have real RMPP. > Below is an an example of a single GetTable request and response between > Solaris and OpenSM. OpenSM is not reporting the MCgroups in case of a > single request/response. I have also provided a MAD output between > Solaris IPoIB driver and IBSRM single GetTable request response below > this example. > > Here is the MAD trace between solaris and OpenSM: > Outgoing MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x12 - SubnAdmGetTable() > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d100ec > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 ec .vQ. > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. 
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00 > 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 50: 00 00 00 00 00 00 00 00 00 00 0b 1b 00 00 84 00 > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ... > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > Incoming MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x92 - > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d100ec > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 ec .vQ. > 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01 .8w. 
> 20: 00 00 00 14 00 00 00 00 00 00 00 00 00 07 00 00 > 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00 > 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 It is likely failing the component checking in osm_sa_mcmember_record.c::__osm_sa_mcm_by_comp_mask_cb due to an endian issue. Either you can debug this code or I will early next week. The component mask in the request is 0x80b4 so the only components checked are QKey (0xb1b), MTU (exactly 2048 (4)), PKey (0x), and scope (2). If I don't hear anything by next week, I will work on this then. Thanks. -- Hal > Here is the transaction between IBSRM and Solaris IPoIB driver. > > Outgoing MAD: >
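The component mask reading above can be reproduced with a few lines.
This is only a sketch: the bit order is the MCMemberRecord component
order as I understand it from the IB_MCR_COMPMASK_* definitions (MGID
is bit 0, ProxyJoin is bit 17), and the table and function names are
made up for illustration.

```python
# MCMemberRecord components in component-mask bit order (bit 0 first).
MCR_COMPONENTS = [
    "MGID", "PortGID", "QKey", "MLID", "MTUSelector", "MTU", "TClass",
    "PKey", "RateSelector", "Rate", "PacketLifeTimeSelector",
    "PacketLifeTime", "SL", "FlowLabel", "HopLimit", "Scope",
    "JoinState", "ProxyJoin",
]


def decode_comp_mask(mask):
    """Return the names of the components selected by a component mask."""
    return [name for bit, name in enumerate(MCR_COMPONENTS)
            if mask & (1 << bit)]


print(decode_comp_mask(0x80B4))  # ['QKey', 'MTUSelector', 'MTU', 'PKey', 'Scope']
print(decode_comp_mask(0x8081))  # ['MGID', 'PKey', 'Scope']
```

0x80b4 is the mask in the request above; 0x8081 is the mask from the
MTU traces elsewhere in this thread, where the request selects by MGID
instead of by QKey and MTU.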
Re: [openib-general] Solaris IPoIB MTU with OpenSM
Hal, [comments below] On Wed, 2005-02-23 at 02:19, Hal Rosenstock wrote: > On Tue, 2005-02-22 at 22:56, Nitin Hande wrote: > > So I tried the latest patches and preliminarily things seem to be > > working fine. > > Yipee. [snip..] > > > > > So after this test above, I try to run snoop on the solaris interface > > and get the following error message from the layer below IPoIB: > > > > Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE: > > ibd0: HCA GUID 0002c901097651d0 port 1 PKEY Could not get list of > > IBA multicast groups > > > > My preliminary assumption is that OpenSm is not returning the list of > > multicast groups that the ibd interface has joined. I will look at the > > MAD's tomorrow and try to ascertain that. > > How does S10 request this ? Remember that if it is a GetTable and > doesn't fit in a single MAD, it will be broken now. If that is the case, > we will live with this until we have real RMPP. Below is an an example of a single GetTable request and response between Solaris and OpenSM. OpenSM is not reporting the MCgroups in case of a single request/response. I have also provided a MAD output between Solaris IPoIB driver and IBSRM single GetTable request response below this example. Here is the MAD trace between solaris and OpenSM: Outgoing MAD: BaseVersion: 0x1 MgmtClass: 0x3 - SubnAdm ClassVersion: 0x2 R_Method: 0x12 - SubnAdmGetTable() Status: 0x0 - NO_ERROR ClassSpecific: 0x0 TransactionID: 0x97651d100ec AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 ec .vQ. 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50: 00 00 00 00 00 00 00 00 00 00 0b 1b 00 00 84 00 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ... 
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Incoming MAD: BaseVersion: 0x1 MgmtClass: 0x3 - SubnAdm ClassVersion: 0x2 R_Method: 0x92 - Status: 0x0 - NO_ERROR ClassSpecific: 0x0 TransactionID: 0x97651d100ec AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 ec .vQ. 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01 .8w. 20: 00 00 00 14 00 00 00 00 00 00 00 00 00 07 00 00 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Here is the transaction between IBSRM and Solaris IPoIB driver. 
Outgoing MAD: BaseVersion: 0x1 MgmtClass: 0x3 - SubnAdm ClassVersion: 0x2 R_Method: 0x12 - SubnAdmGetTable() Status: 0x0 - NO_ERROR ClassSpecific: 0x0 TransactionID: 0x8fecc61009a AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef 0: 01 03 02 12 00 00 00 00 08 fe cc 61 00 00 00 9a ...a 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00 40: 00 00 00 00 00 00 00 00 00
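A quick way to check from the dump alone that the OpenSM response
carries no records: compare the RMPP payload length (the 32-bit word at
0x20) with the AttributeOffset (0x2c-0x2d, in 8-byte units). The sketch
below assumes the payload length counts the 20-byte SA header plus the
records; that is an inference from the two responses in this thread
(0x14 here, 0x4c in the 256-byte MTU trace), not a statement about how
every SM fills in the field. The names are made up for illustration.

```python
# SM_Key (8) + AttributeOffset (2) + reserved (2) + ComponentMask (8)
SA_HEADER_BYTES = 20


def records_in_response(payload_length, attribute_offset):
    """Estimate how many records a single-segment GetTableResp carries.

    attribute_offset is in 8-byte units, as in the SA header.
    """
    record_bytes = attribute_offset * 8
    return (payload_length - SA_HEADER_BYTES) // record_bytes


print(records_in_response(0x14, 7))  # 0 (the response in this message)
print(records_in_response(0x4C, 7))  # 1 (the response in the MTU trace)
```

With that reading, the response to the 0x80b4 request is a well-formed
but empty table, which would fit the "Could not get list of IBA
multicast groups" notice from ibd.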
Re: [openib-general] Solaris IPoIB MTU with OpenSM
On Tue, 2005-02-22 at 22:56, Nitin Hande wrote:
> So I tried the latest patches and preliminarily things seem to be
> working fine.

Yipee.

> The PathRecord response is successful and so is the MTU
> correct. I need to spend some more time looking at MAD and confirm it. I
> could configure both interfaces and ping each other this time. Here is
> some output on the solaris side:
>
> [EMAIL PROTECTED] ~]# ifconfig -a
> lo0: flags=2001000849 mtu 8232 index 1
>         inet 127.0.0.1 netmask ff00
> ibd0: flags=1000843 mtu 2044 index 28
>         inet 192.168.100.105 netmask ff00 broadcast 192.168.100.255
>         ipib 0:2c:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1
> .
> [EMAIL PROTECTED] ~]# ping -s 192.168.100.104
> PING 192.168.100.104: 56 data bytes
> 64 bytes from 192.168.100.104: icmp_seq=0. time=0.590 ms
> 64 bytes from 192.168.100.104: icmp_seq=1. time=0.434 ms
> 64 bytes from 192.168.100.104: icmp_seq=2. time=0.365 ms
>
> the other side is an openib interface running OpenSM.
>
> So after this test above, I try to run snoop on the solaris interface
> and get the following error message from the layer below IPoIB:
>
> Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE:
> ibd0: HCA GUID 0002c901097651d0 port 1 PKEY Could not get list of
> IBA multicast groups
>
> My preliminary assumption is that OpenSM is not returning the list of
> multicast groups that the ibd interface has joined. I will look at the
> MADs tomorrow and try to ascertain that.

How does S10 request this? Remember that if it is a GetTable and
doesn't fit in a single MAD, it will be broken now. If that is the case,
we will live with this until we have real RMPP.

-- Hal
Re: [openib-general] Solaris IPoIB MTU with OpenSM
Hal, On Thu, 2005-02-17 at 13:12, Hal Rosenstock wrote: > Hi Nitin, > > On Wed, 2005-02-16 at 17:33, Nitin Hande wrote: > > On Wed, 2005-02-16 at 13:26, Hal Rosenstock wrote: > > > On Wed, 2005-02-16 at 16:08, Nitin Hande wrote: > > > > Hal, > > [snip..] [snip...] > > > > > > > > > > > > Before the patch the selector was coming 04. Do you reply 84 seeing a > > particular component mask and otherwise 01 ??(I think not..) > > I agree that OpenSM responds/should respond the same regardless of the > component mask in the request. > > I was unaware of OpenSM responding with MTU of 01 until now. I have a > theory as to this. Any chance I can get the osm logs from a -V run of > the above ? > > I also have a simple patch below to try which is just to test the > theory. This is off the latest version but should be easy to apply to > any version of osm_sa_mcmember_record.c. > > This is separate from the support for PathRecords with multicast DGID > and/or DLID. I have the changes for this scoped out and should be able > to implement by early next week. So I tried the latest patches and preliminarily things seem to be working fine. The PathRecord response is successful and so is the MTU correct. I need to spend some more time looking at MAD and confirm it. I could configure both interfaces and ping each other this time. Here is some out on the solaris side: [EMAIL PROTECTED] ~]# ifconfig -a lo0: flags=2001000849 mtu 8232 index 1 inet 127.0.0.1 netmask ff00 ibd0: flags=1000843 mtu 2044 index 28 inet 192.168.100.105 netmask ff00 broadcast 192.168.100.255 ipib 0:2c:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1 . [EMAIL PROTECTED] ~]# ping -s 192.168.100.104 PING 192.168.100.104: 56 data bytes 64 bytes from 192.168.100.104: icmp_seq=0. time=0.590 ms 64 bytes from 192.168.100.104: icmp_seq=1. time=0.434 ms 64 bytes from 192.168.100.104: icmp_seq=2. time=0.365 ms the other side is a openib interface runing OpenSM. 
So after this test above, I try to run snoop on the solaris interface and get the following error message from the layer below IPoIB: Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE: ibd0: HCA GUID 0002c901097651d0 port 1 PKEY Could not get list of IBA multicast groups My preliminary assumption is that OpenSm is not returning the list of multicast groups that the ibd interface has joined. I will look at the MAD's tomorrow and try to ascertain that. Thanks Nitin > > Thanks. > > -- Hal > > Index: osm_sa_mcmember_record.c > === > --- osm_sa_mcmember_record.c (revision 1821) > +++ osm_sa_mcmember_record.c (working copy) > @@ -1325,11 +1325,13 @@ >/* copy qkey mlid tclass pkey sl_flow_hop mtu rate pkt_life > sl_flow_hop */ >__copy_from_create_mc_rec(&mcmember_rec, &p_mgrp->mcmember_rec); > > +#if 0 >if(p_mgrp->well_known) >{ > p_mgrp->mcmember_rec.mtu = mtu; > mcmember_rec.mtu = mtu; >} > +#endif > >/* Release the lock as we don't need it. */ >CL_PLOCK_RELEASE( p_rcv->p_lock ); > > > > ___ > openib-general mailing list > openib-general@openib.org > http://openib.org/mailman/listinfo/openib-general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
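Since the 04 / 84 / 01 values keep coming up, here is a small decoder
for the MTU byte of an MCMemberRecord: the top two bits are the
selector and the low six bits are the MTU code, the same split the
OpenSM code uses (mtu >> 6 and mtu & 0x3F). The selector names and the
code-to-bytes table follow the IBA encoding as I understand it; the
helper name is made up for illustration.

```python
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}
SELECTORS = {0: "greater than", 1: "less than", 2: "exactly",
             3: "largest available"}


def decode_mtu_field(value):
    """Split an MCMemberRecord MTU byte into (selector name, MTU in bytes)."""
    selector = value >> 6   # top two bits
    code = value & 0x3F     # low six bits
    return SELECTORS[selector], MTU_BYTES.get(code)


for raw in (0x84, 0x04, 0x01):
    print(hex(raw), decode_mtu_field(raw))
# 0x84 ('exactly', 2048)
# 0x4 ('greater than', 2048)
# 0x1 ('greater than', 256)
```

So 84 and 04 carry the same 2048-byte MTU and differ only in the
selector, while 01 is the 256-byte value seen in the bad responses.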
Re: [openib-general] Solaris IPoIB MTU with OpenSM
Hi Nitin, On Wed, 2005-02-16 at 17:33, Nitin Hande wrote: > On Wed, 2005-02-16 at 13:26, Hal Rosenstock wrote: > > On Wed, 2005-02-16 at 16:08, Nitin Hande wrote: > > > Hal, > [snip..] > > > > > > > > > Here is the trace of 256 sized MTU: > > > > > > Outgoing MAD: > > > BaseVersion: 0x1 > > > MgmtClass: 0x3 - SubnAdm > > > ClassVersion: 0x2 > > > R_Method: 0x12 - SubnAdmGetTable() > > > Status: 0x0 - NO_ERROR > > > ClassSpecific: 0x0 > > > TransactionID: 0x97651d10096 > > > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > > > > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > > > 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96 .vQ. > > > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. > > > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > > > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 > > > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ... > > > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > > > > Incoming MAD: > > > BaseVersion: 0x1 > > > MgmtClass: 0x3 - SubnAdm > > > ClassVersion: 0x2 > > > R_Method: 0x92 - > > > Status: 0x0 - NO_ERROR > > > ClassSpecific: 0x0 > > > TransactionID: 0x97651d10096 > > > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > > > > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > > > 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96 .vQ. > > > 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01 .8w. 
> > > 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00 ...L > > > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > > > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 > > > 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00 > > > 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00 ... > > > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > > > > And on other occassions where OpenSM reports the 2048 sized MTU: > > > > > > Outgoing MAD: > > > BaseVersion: 0x1 > > > MgmtClass: 0x3 - SubnAdm > > > ClassVersion: 0x2 > > > R_Method: 0x12 - SubnAdmGetTable() > > > Status: 0x0 - NO_ERROR > > > ClassSpecific: 0x0 > > > TransactionID: 0x97651d1009a > > > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > > > > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > > > 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a .vQ. > > > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. > > > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > > > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 > > > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ... 
> > > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > c0: 00 00 00 00 00 00 00 00 00 00 0
Re: [openib-general] Solaris IPoIB MTU with OpenSM
On Wed, 2005-02-16 at 13:26, Hal Rosenstock wrote: > On Wed, 2005-02-16 at 16:08, Nitin Hande wrote: > > Hal, [snip..] > > > > > > Here is the trace of 256 sized MTU: > > > > Outgoing MAD: > > BaseVersion: 0x1 > > MgmtClass: 0x3 - SubnAdm > > ClassVersion: 0x2 > > R_Method: 0x12 - SubnAdmGetTable() > > Status: 0x0 - NO_ERROR > > ClassSpecific: 0x0 > > TransactionID: 0x97651d10096 > > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > > 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96 .vQ. > > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. > > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 > > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ... > > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > > Incoming MAD: > > BaseVersion: 0x1 > > MgmtClass: 0x3 - SubnAdm > > ClassVersion: 0x2 > > R_Method: 0x92 - > > Status: 0x0 - NO_ERROR > > ClassSpecific: 0x0 > > TransactionID: 0x97651d10096 > > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > > 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96 .vQ. > > 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01 .8w. 
> > 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00 ...L > > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 > > 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00 > > 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00 ... > > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > > And on other occassions where OpenSM reports the 2048 sized MTU: > > > > Outgoing MAD: > > BaseVersion: 0x1 > > MgmtClass: 0x3 - SubnAdm > > ClassVersion: 0x2 > > R_Method: 0x12 - SubnAdmGetTable() > > Status: 0x0 - NO_ERROR > > ClassSpecific: 0x0 > > TransactionID: 0x97651d1009a > > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > > 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a .vQ. > > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. > > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 > > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ... 
> > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Re: [openib-general] Solaris IPoIB MTU with OpenSM
On Wed, 2005-02-16 at 16:08, Nitin Hande wrote: > Hal, > > On Wed, 2005-02-16 at 06:27, Hal Rosenstock wrote: > > On Tue, 2005-02-15 at 16:36, Nitin Hande wrote: > > > I have a hunch for whats happening here, but before I jump into any > > > conclusions, I am seeing some other issue between Solaris IPoIB driver > > > and OpenSM. After joining the Broadcast group, the PathRecord Response > > > coming from OpenSM signals an error with Invalid GUID. > > > > Is the MTU from the PathRecord used ? Is that the theory ? So these are > > one and the same issue. Thanks. > No, I was more of thinking of an endian issue between IBD and the layer > beneath it during the MCMemberRecord response. The mtu is not dependant > on PathRecord Response. Thanks to Tom, we have figured out a way of > consistently reproducing this on our systems here. The way to reproduce > is (basically start everything fresh): > 1. rmmod {ib_mthca, umad and ipoib}, stop opensm > 2. unplumb ibd driver and modunload ibd on Solaris, > 3. modprobe and restart opensm > 4. plumb ibd interface. > You should see ibd setting the mtu size to 252. Some of the above steps > maybe unecessary. From the trace, it looks like OpenSM is reporting 256 > bytes of MTU to ipoib for MCMemberRecord response. > > Here is the trace of 256 sized MTU: > > Outgoing MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x12 - SubnAdmGetTable() > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d10096 > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96 .vQ. > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. 
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00 ... > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > Incoming MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x92 - > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d10096 > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96 .vQ. > 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01 .8w. > 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00 ...L > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 > 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00 > 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00 ... 
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > And on other occassions where OpenSM reports the 2048 sized MTU: > > Outgoing MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x12 - SubnAdmGetTable() > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d1009a > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a .vQ. > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. > 20: 00 00 00 00 00 00
Re: [openib-general] Solaris IPoIB MTU with OpenSM
Hal,

On Wed, 2005-02-16 at 06:27, Hal Rosenstock wrote:
> On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> > I have a hunch for what's happening here, but before I jump into any
> > conclusions, I am seeing some other issue between Solaris IPoIB driver
> > and OpenSM. After joining the Broadcast group, the PathRecord Response
> > coming from OpenSM signals an error with Invalid GUID.
>
> Is the MTU from the PathRecord used ? Is that the theory ? So these are
> one and the same issue. Thanks.

No, I was thinking more of an endian issue between IBD and the layer
beneath it during the MCMemberRecord response. The MTU does not depend
on the PathRecord Response. Thanks to Tom, we have figured out a way of
consistently reproducing this on our systems here. The way to reproduce
is (basically start everything fresh):

1. rmmod {ib_mthca, umad and ipoib}, stop opensm
2. unplumb ibd driver and modunload ibd on Solaris
3. modprobe and restart opensm
4. plumb ibd interface

You should see ibd setting the MTU size to 252. Some of the above steps
may be unnecessary. From the trace, it looks like OpenSM is reporting an
MTU of 256 bytes to ipoib in the MCMemberRecord response.

Here is the trace of the 256-byte MTU:

Outgoing MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x12 - SubnAdmGetTable()
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d10096
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00  ...
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Incoming MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x92 -
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d10096
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8w.
20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00  ...L
30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00
60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00  ...
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

And on other occasions, where OpenSM reports the 2048-byte MTU:

Outgoing MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x12 - SubnAdmGetTable()
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d1009a
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a  .vQ.
10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
..
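For anyone reading along, the MTU that Nitin refers to can be picked out of the dumps by hand or with a few lines of script. The following is only a sketch: the offsets assume the usual SA MAD layout (24-byte MAD header, 12-byte RMPP header, 20-byte SA header, then the MCMemberRecord starting with MGID, PortGID, Q_Key, MLID and the MTUSelector/MTU octet), and the helper names `parse_dump` and `mcmember_fields` are made up for this illustration, not taken from OpenSM or the Solaris driver.

```python
# Sketch: read the MTU field of an MCMemberRecord out of a MAD hex dump.
# All offsets are assumptions based on the generic SA MAD layout.

SA_DATA_OFFSET = 0x38    # 24-byte MAD header + 12-byte RMPP header + 20-byte SA header
MCMR_QKEY_OFFSET = 32    # record offset after MGID (16) and PortGID (16)
MCMR_MLID_OFFSET = 36    # record offset after Q_Key (4)
MCMR_MTU_OFFSET = 38     # record offset after MLID (2)


def parse_dump(rows, size=256):
    """Place 'offset: xx xx ...' rows into a zero-filled buffer of `size` bytes."""
    buf = bytearray(size)
    for row in rows:
        offset_text, _, rest = row.partition(":")
        base = int(offset_text, 16)
        # Only the first 16 tokens are hex octets; anything after is the ASCII column.
        octets = [int(tok, 16) for tok in rest.split()[:16]]
        buf[base:base + len(octets)] = bytes(octets)
    return bytes(buf)


def mcmember_fields(mad):
    """Return Q_Key, MLID and the split MTU octet of the first record in the MAD."""
    rec = mad[SA_DATA_OFFSET:]
    mtu_octet = rec[MCMR_MTU_OFFSET]
    return {
        "qkey": int.from_bytes(rec[MCMR_QKEY_OFFSET:MCMR_QKEY_OFFSET + 4], "big"),
        "mlid": int.from_bytes(rec[MCMR_MLID_OFFSET:MCMR_MLID_OFFSET + 2], "big"),
        "mtu_selector": mtu_octet >> 6,   # top 2 bits
        "mtu_code": mtu_octet & 0x3F,     # low 6 bits
    }


# Row 0x50 of two responses quoted in this thread: the SubnAdmGetTable
# response above and the earlier SubnAdmSet response.
GETTABLE_RESP = ["50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00"]
SET_RESP = ["50: 00 02 c9 01 09 76 51 d1 00 00 0b 1b c0 00 04 00"]

if __name__ == "__main__":
    for name, rows in (("GetTable response", GETTABLE_RESP), ("Set response", SET_RESP)):
        print(name, mcmember_fields(parse_dump(rows)))
```

Under those assumptions the octet at 0x5e is 0x01 in the GetTable response and 0x04 in the Set response, which lines up with the 256-byte and 2048-byte MTUs discussed in the thread.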
Re: [openib-general] Solaris IPoIB MTU with OpenSM
On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> I have a hunch for what's happening here, but before I jump into any
> conclusions, I am seeing some other issue between Solaris IPoIB driver
> and OpenSM. After joining the Broadcast group, the PathRecord Response
> coming from OpenSM signals an error with Invalid GUID.

Is the MTU from the PathRecord used? Is that the theory? So these are
one and the same issue. Thanks.

-- Hal
Re: [openib-general] Solaris IPoIB MTU with OpenSM
On Tue, 2005-02-15 at 15:57, Hal Rosenstock wrote: > On Tue, 2005-02-15 at 17:45, Nitin Hande wrote: > > Here is the osm log, I think we may have a lead, the dest GID is wrong: > > : > > > > Feb 15 23:29:57 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1 > > joining MLID 0xC001. > > Feb 15 23:29:57 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > > with GUID = 0x. > > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > > Join State != FullMember - required for create. > > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > > Join State != FullMember - required for create. > > Feb 15 23:29:58 [43005960] -> osm_report_notice: Reporting Generic > > Notice type:3 num:66 from LID:0x0001 > > GID:0xfe80,0x0002c9010a99e031 > > Feb 15 23:29:58 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1 > > joining MLID 0xC011. > > Feb 15 23:29:58 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > > with GUID = 0x. > > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > > Join State != FullMember - required for create. > > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > > Join State != FullMember - required for create. > > Feb 15 23:29:59 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > > with GUID = 0x. > > Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > > Join State != FullMember - required for create. > > Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > > Join State != FullMember - required for create. > > Feb 15 23:30:01 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > > with GUID = 0x. > > Feb 15 23:30:03 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > > with GUID = 0x. > > Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > > Join State != FullMember - required for create. 
> > Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> > Join State != FullMember - required for create.
>
> Are you sure ?

No, I was looking at some other location.

> It looks to me like it is trying to get a PathRecord to the broadcast
> MGID as DGID:
> AttributeID: 0x35 - SA_PATHRECORD_ATTRID
> ...
> 40: ff 12 40 1b ff ff 00 00 00 00 00 00 ff ff ff ff
>
> I'm not sure this is necessary as aren't all the parameters returned in
> the SA GetResp MCMemberRecord in response to the Set ? Anyhow it is
> perfectly legal. OpenSM just doesn't support it right now.

Yes, sure. Will wait for the patches.

Thanks
Nitin
Re: [openib-general] Solaris IPoIB MTU with OpenSM
On Tue, 2005-02-15 at 17:45, Nitin Hande wrote: > Here is the osm log, I think we may have a lead, the dest GID is wrong: > : > > Feb 15 23:29:57 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1 > joining MLID 0xC001. > Feb 15 23:29:57 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > with GUID = 0x. > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > Join State != FullMember - required for create. > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > Join State != FullMember - required for create. > Feb 15 23:29:58 [43005960] -> osm_report_notice: Reporting Generic > Notice type:3 num:66 from LID:0x0001 > GID:0xfe80,0x0002c9010a99e031 > Feb 15 23:29:58 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1 > joining MLID 0xC011. > Feb 15 23:29:58 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > with GUID = 0x. > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > Join State != FullMember - required for create. > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > Join State != FullMember - required for create. > Feb 15 23:29:59 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > with GUID = 0x. > Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > Join State != FullMember - required for create. > Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > Join State != FullMember - required for create. > Feb 15 23:30:01 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > with GUID = 0x. > Feb 15 23:30:03 [43005960] -> __osm_pr_rcv_get_end_points: No dest port > with GUID = 0x. > Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > Join State != FullMember - required for create. > Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided > Join State != FullMember - required for create. Are you sure ? 
It looks to me like it is trying to get a PathRecord to the broadcast
MGID as DGID:

AttributeID: 0x35 - SA_PATHRECORD_ATTRID
...
40: ff 12 40 1b ff ff 00 00 00 00 00 00 ff ff ff ff

I'm not sure this is necessary, as aren't all the parameters returned in
the SA GetResp MCMemberRecord in response to the Set? Anyhow, it is
perfectly legal. OpenSM just doesn't support it right now.

-- Hal
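To make the DGID observation concrete, here is a small sketch that classifies the two GIDs seen in the PathRecord request. It assumes only that a multicast GID starts with the octet 0xff and that the low 8 octets are what a unicast lookup would treat as the interface ID; reading octets 4 and 5 of the IPoIB broadcast MGID as the P_Key is an assumption based on the IPoIB MGID layout. The function names are hypothetical.

```python
# Sketch: tell a multicast GID (MGID) from a unicast GID and show what a
# unicast-style lookup would see as the "interface ID" in each case.

def parse_gid(hex_row):
    """Convert 16 space-separated hex octets into a 16-byte GID."""
    gid = bytes(int(tok, 16) for tok in hex_row.split())
    if len(gid) != 16:
        raise ValueError("a GID is exactly 16 octets")
    return gid


def is_multicast(gid):
    """Multicast GIDs begin with 0xff, like IPv6 multicast addresses."""
    return gid[0] == 0xFF


def interface_id(gid):
    """Low 64 bits of the GID, as a unicast port lookup would read them."""
    return int.from_bytes(gid[8:], "big")


def ipoib_pkey(mgid):
    """Octets 4-5 of an IPoIB MGID (assumed to carry the P_Key)."""
    return int.from_bytes(mgid[4:6], "big")


# DGID and SGID as they appear in the PathRecord request in this thread.
DGID = parse_gid("ff 12 40 1b ff ff 00 00 00 00 00 00 ff ff ff ff")
SGID = parse_gid("fe 80 00 00 00 00 00 00 00 02 c9 01 09 76 51 d1")

if __name__ == "__main__":
    for name, gid in (("DGID", DGID), ("SGID", SGID)):
        print(name, "multicast" if is_multicast(gid) else "unicast",
              "interface_id=0x%016x" % interface_id(gid))
```

The SGID's low 64 bits match the port GUID 0x0002c901097651d1 from the OpenSM log, while the DGID's low 64 bits are not a port GUID at all, which fits a port lookup failing on the destination.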
Re: [openib-general] Solaris IPoIB MTU with OpenSM
On Tue, 2005-02-15 at 13:45, Hal Rosenstock wrote: > Hi again Nitin, > > On Tue, 2005-02-15 at 16:36, Nitin Hande wrote: > > After joining the Broadcast group, the PathRecord Response > > coming from OpenSM signals an error with Invalid GUID. I wonder why, > > There appear to be only 2 places in the code (I'm not saying the code is > right) where this can occur. > > osm_sa_path_record.c: > > if( p_sa_mad->comp_mask & IB_PR_COMPMASK_DGID ) > ... > /* > This 'error' is the client's fault (bad gid) so > don't enter it as an error in our own log. > Return an error response to the client. > */ > osm_log( p_rcv->p_log, OSM_LOG_VERBOSE, >"__osm_pr_rcv_get_end_points: " >"No dest port with GUID = 0x%016" PRIx64 ".\n", >cl_ntoh64( p_pr->dgid.unicast.interface_id) ); > > sa_status = IB_SA_MAD_STATUS_INVALID_GID; > > and a similar thing for SGID > > if( p_sa_mad->comp_mask & IB_PR_COMPMASK_SGID ) > { > ... > > /* > This 'error' is the client's fault (bad gid) so > don't enter it as an error in our own log. > Return an error response to the client. > */ > osm_log( p_rcv->p_log, OSM_LOG_VERBOSE, >"__osm_pr_rcv_get_end_points: " >"No source port with GUID = 0x%016" PRIx64 ".\n", >cl_ntoh64( p_pr->sgid.unicast.interface_id) ); > > Can you look in the osm.log to see if the source or dest GID is > implicated ? This will help me chase it down. Thanks. Here is the osm log, I think we may have a lead, the dest GID is wrong: : Feb 15 23:29:57 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1 joining MLID 0xC001. Feb 15 23:29:57 [43005960] -> __osm_pr_rcv_get_end_points: No dest port with GUID = 0x. Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided Join State != FullMember - required for create. Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided Join State != FullMember - required for create. 
Feb 15 23:29:58 [43005960] -> osm_report_notice: Reporting Generic Notice type:3 num:66 from LID:0x0001 GID:0xfe80,0x0002c9010a99e031
Feb 15 23:29:58 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1 joining MLID 0xC011.
Feb 15 23:29:58 [43005960] -> __osm_pr_rcv_get_end_points: No dest port with GUID = 0x.
Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided Join State != FullMember - required for create.
Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided Join State != FullMember - required for create.
Feb 15 23:29:59 [43005960] -> __osm_pr_rcv_get_end_points: No dest port with GUID = 0x.
Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided Join State != FullMember - required for create.
Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided Join State != FullMember - required for create.
Feb 15 23:30:01 [43005960] -> __osm_pr_rcv_get_end_points: No dest port with GUID = 0x.
Feb 15 23:30:03 [43005960] -> __osm_pr_rcv_get_end_points: No dest port with GUID = 0x.
Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided Join State != FullMember - required for create.
Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided Join State != FullMember - required for create.

thanks
Nitin
Re: [openib-general] Solaris IPoIB MTU with OpenSM
Hi Nitin,

On Tue, 2005-02-15 at 16:45, Hal Rosenstock wrote:
> Can you look in the osm.log to see if the source or dest GID is
> implicated ? This will help me chase it down. Thanks.

Both SGID and DGID are in the component mask, but my bet is on the DGID.
OpenSM does not currently support PathRecords for MGIDs. I will work on
fixing this.

-- Hal
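The component mask in the PathRecord request (octets 0x30 to 0x37 of the dump, `00 00 00 00 00 00 18 0c`) can be decoded along these lines. This is a sketch under assumptions: the bit positions are the PathRecord component-mask bits as I understand them from the SA attribute definition (DGID at bit 2, SGID at bit 3, and so on), the table is deliberately partial, and `decode_pr_compmask` is a hypothetical helper rather than anything in OpenSM.

```python
# Sketch: name the PathRecord component-mask bits set in a request.
# The bit table is partial and its positions are assumptions.

PR_COMPMASK_BITS = {
    2: "DGID",
    3: "SGID",
    4: "DLID",
    5: "SLID",
    11: "Reversible",
    12: "NumbPath",
    13: "P_Key",
}


def decode_pr_compmask(mask):
    """Return the names of the set bits, lowest bit first."""
    names = []
    for bit in range(64):
        if mask >> bit & 1:
            # Bits outside the partial table are reported by number.
            names.append(PR_COMPMASK_BITS.get(bit, "bit %d" % bit))
    return names


# Component mask from the SubnAdmGetTable(PathRecord) request in this thread.
REQUEST_MASK = int.from_bytes(bytes.fromhex("000000000000180c"), "big")

if __name__ == "__main__":
    print(hex(REQUEST_MASK), decode_pr_compmask(REQUEST_MASK))
```

With that table, 0x180c comes out as DGID, SGID, Reversible and NumbPath, so both of the GID branches in `__osm_pr_rcv_get_end_points` would be exercised by this request.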
Re: [openib-general] Solaris IPoIB MTU with OpenSM
Hi again Nitin,

On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> After joining the Broadcast group, the PathRecord Response
> coming from OpenSM signals an error with Invalid GUID. I wonder why,

There appear to be only 2 places in the code (I'm not saying the code is
right) where this can occur.

osm_sa_path_record.c:

  if( p_sa_mad->comp_mask & IB_PR_COMPMASK_DGID )
  ...
      /*
        This 'error' is the client's fault (bad gid) so
        don't enter it as an error in our own log.
        Return an error response to the client.
      */
      osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
               "__osm_pr_rcv_get_end_points: "
               "No dest port with GUID = 0x%016" PRIx64 ".\n",
               cl_ntoh64( p_pr->dgid.unicast.interface_id) );

      sa_status = IB_SA_MAD_STATUS_INVALID_GID;

and a similar thing for SGID

  if( p_sa_mad->comp_mask & IB_PR_COMPMASK_SGID )
  {
  ...
      /*
        This 'error' is the client's fault (bad gid) so
        don't enter it as an error in our own log.
        Return an error response to the client.
      */
      osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
               "__osm_pr_rcv_get_end_points: "
               "No source port with GUID = 0x%016" PRIx64 ".\n",
               cl_ntoh64( p_pr->sgid.unicast.interface_id) );

Can you look in the osm.log to see if the source or dest GID is
implicated? This will help me chase it down. Thanks.

-- Hal
Re: [openib-general] Solaris IPoIB MTU with OpenSM
On Tue, 2005-02-15 at 16:36, Nitin Hande wrote: > I have a hunch for whats happening here, Glad to hear this as I don't have a clue :-) > but before I jump into any > conclusions, I am seeing some other issue between Solaris IPoIB driver > and OpenSM. After joining the Broadcast group, the PathRecord Response > coming from OpenSM signals an error with Invalid GUID. I wonder why, I missed this as my decode is manual being 1.0.a style. I will look at this and my traces and get back to you later on this. > Here is the mad trace: > > > Outgoing MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x2 - SubnAdmSet() > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d10034 > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 02 00 00 00 00 09 76 51 d1 00 00 00 34 .vQ4 > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 30: 00 00 00 00 00 01 b0 c7 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > 40: 00 00 00 00 ff ff ff ff fe 80 00 00 00 00 00 00 > 50: 00 02 c9 01 09 76 51 d1 00 00 0b 1b 00 00 00 00 .vQ. > 60: ff ff 00 00 00 00 00 00 21 00 00 00 00 00 00 00 !... 
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > Incoming MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x81 - > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d10034 > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 81 00 00 00 00 09 76 51 d1 00 00 00 34 .vQ4 > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 07 00 00 > 30: 00 00 00 00 00 01 b0 c7 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] > 40: 00 00 00 00 ff ff ff ff fe 80 00 00 00 00 00 00 > 50: 00 02 c9 01 09 76 51 d1 00 00 0b 1b c0 00 04 00 .vQ. > 60: ff ff 03 12 00 00 00 00 21 00 00 00 00 00 00 00 !... 
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > OpenSM responds positively to MCMEMBERRECORD and then: > > Outgoing MAD: > BaseVersion: 0x1 > MgmtClass: 0x3 - SubnAdm > ClassVersion: 0x2 > R_Method: 0x12 - SubnAdmGetTable() > Status: 0x0 - NO_ERROR > ClassSpecific: 0x0 > TransactionID: 0x97651d1003a > AttributeID: 0x35 - SA_PATHRECORD_ATTRID > > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 3a .vQ: > 10: 00 35 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .5.. > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 30: 00 00 00 00 00 00 18 0c 00 00 00 00 00 00 00 00 > 40: ff 12 40 1b ff ff 00 00 00 00 00 00 ff ff ff ff [EMAIL PROTECTED] > 50: fe 80 00 00 00 00 00 00 00 02 c9 01 09 76 51 d1 .vQ. > 60: 00 00 00 00 00 00 00 00 00 81 00 00 00 00 00 00 > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > c0: 00 00 00 00 00
Re: [openib-general] Solaris IPoIB MTU with OpenSM
I have a hunch for whats happening here, but before I jump into any conclusions, I am seeing some other issue between Solaris IPoIB driver and OpenSM. After joining the Broadcast group, the PathRecord Response coming from OpenSM signals an error with Invalid GUID. I wonder why, Here is the mad trace: Outgoing MAD: BaseVersion: 0x1 MgmtClass: 0x3 - SubnAdm ClassVersion: 0x2 R_Method: 0x2 - SubnAdmSet() Status: 0x0 - NO_ERROR ClassSpecific: 0x0 TransactionID: 0x97651d10034 AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef 0: 01 03 02 02 00 00 00 00 09 76 51 d1 00 00 00 34 .vQ4 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 00 01 b0 c7 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] 40: 00 00 00 00 ff ff ff ff fe 80 00 00 00 00 00 00 50: 00 02 c9 01 09 76 51 d1 00 00 0b 1b 00 00 00 00 .vQ. 60: ff ff 00 00 00 00 00 00 21 00 00 00 00 00 00 00 !... 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Incoming MAD: BaseVersion: 0x1 MgmtClass: 0x3 - SubnAdm ClassVersion: 0x2 R_Method: 0x81 - Status: 0x0 - NO_ERROR ClassSpecific: 0x0 TransactionID: 0x97651d10034 AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef 0: 01 03 02 81 00 00 00 00 09 76 51 d1 00 00 00 34 .vQ4 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .8.. 
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 07 00 00 30: 00 00 00 00 00 01 b0 c7 ff 12 40 1b ff ff 00 00 [EMAIL PROTECTED] 40: 00 00 00 00 ff ff ff ff fe 80 00 00 00 00 00 00 50: 00 02 c9 01 09 76 51 d1 00 00 0b 1b c0 00 04 00 .vQ. 60: ff ff 03 12 00 00 00 00 21 00 00 00 00 00 00 00 !... 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 OpenSM responds positively to MCMEMBERRECORD and then: Outgoing MAD: BaseVersion: 0x1 MgmtClass: 0x3 - SubnAdm ClassVersion: 0x2 R_Method: 0x12 - SubnAdmGetTable() Status: 0x0 - NO_ERROR ClassSpecific: 0x0 TransactionID: 0x97651d1003a AttributeID: 0x35 - SA_PATHRECORD_ATTRID 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 3a .vQ: 10: 00 35 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 .5.. 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 00 00 18 0c 00 00 00 00 00 00 00 00 40: ff 12 40 1b ff ff 00 00 00 00 00 00 ff ff ff ff [EMAIL PROTECTED] 50: fe 80 00 00 00 00 00 00 00 02 c9 01 09 76 51 d1 .vQ. 
60: 00 00 00 00 00 00 00 00 00 81 00 00 00 00 00 00 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Incoming MAD: BaseVersion: 0x1 MgmtClass: 0x3 - SubnAdm ClassVersion: 0x2 R_Method: 0x92 - Status: 0x5
[openib-general] Solaris IPoIB MTU with OpenSM
Hi,

Unfortunately, the Solaris 10 IPoIB MTU with OpenSM is back to the
maximum size of 252 again :-( I'm not sure whether this was ever really
fixed, although I do now see the packets indicating an exact MTU of 4
(2048 bytes). I'm not sure what Solaris doesn't like about the OpenSM
response to the MCMemberRecord.

-- Hal
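As a reading aid for the numbers above, here is a minimal sketch of how the MTU code relates to the interface MTU. It assumes the standard InfiniBand MTU encoding (1 = 256 bytes up to 5 = 4096 bytes) and that IPoIB subtracts its 4-byte encapsulation header from the link MTU; the names `IB_MTU_BYTES` and `ipoib_mtu` are illustrative only.

```python
# Sketch: map an InfiniBand MTU code to bytes and to the resulting IPoIB MTU.
# The encoding table and the 4-byte header are assumptions stated in the text.

IB_MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}
IPOIB_ENCAP_HEADER = 4  # bytes taken from the link MTU by the IPoIB header


def ipoib_mtu(mtu_code):
    """Interface MTU that IPoIB would derive from a multicast group MTU code."""
    return IB_MTU_BYTES[mtu_code] - IPOIB_ENCAP_HEADER


if __name__ == "__main__":
    for code in sorted(IB_MTU_BYTES):
        print("code %d: %4d-byte IB MTU, IPoIB MTU %d"
              % (code, IB_MTU_BYTES[code], ipoib_mtu(code)))
```

Seen this way, an interface MTU of 252 corresponds to a group MTU code of 1 (256 bytes), whereas a code of 4 (2048 bytes) would give 2044.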