Re: [Fwd: Re: [openib-general] Solaris IPoIB MTU with OpenSM]

2005-03-15 Thread Hal Rosenstock
Hi Nitin,

On Tue, 2005-03-15 at 16:15, Nitin Hande wrote:
> This is cool, I have got Solaris IPoIB happily working with the
> OpenSM now. It plumbs, pings and snoops on 0xffff pkey.

Great. That's good news. I'll work on a real fix for this now.

> On other hand, on my linux node, if I try to use 8001 partition and
> configure IB interface with IP addr (same time while ib0 is using 0xffff
> pkey), I get the following error, you may want to investigate that
> 
> [EMAIL PROTECTED] ~]# echo 0x8001 > /sys/class/net/ib0/create_child
> [EMAIL PROTECTED] ~]# ifconfig ib0.8001 10.10.1.1
> [EMAIL PROTECTED]: multicast join failed for
> ff12:401b:8001:0:0:0::, status -22
>  ~]# ib0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
> status -22

I will look into this, but I suspect it is caused by the response to
some request in the join "flow" being more than one RMPP packet. Remember
that OpenSM is currently hamstrung in this manner until there is
sufficient RMPP support for SA GetTableResps.

Thanks.

-- Hal

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [Fwd: Re: [openib-general] Solaris IPoIB MTU with OpenSM]

2005-03-15 Thread Roland Dreier
Nitin> On other hand, on my linux node, if I try to use 8001
Nitin> partition and configure IB interface with IP addr (same
Nitin> time while ib0 is using 0xffff pkey), I get the following
Nitin> error, you may want to investigate that

I think this is probably an OpenSM issue (does OpenSM support multiple
partitions?).  On my fabric, running Topspin's embedded SM on a
switch, I can do:

# modprobe ib_ipoib
# echo 0x8001 > /sys/class/net/ib0/create_child
# ifconfig ib0.8001 up

on both systems.  On system #1 I have:

# ifconfig ib0.8001
ib0.8001  Link encap:UNSPEC  HWaddr 00-13-04-06-FE-80-00-00-00-00-00-00-00-00-00-00
  inet6 addr: fe80::202:c901:7fc:c711/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:128
  RX bytes:0 (0.0 b)  TX bytes:300 (300.0 b)

and on system #2 I'm able to do:

# ping6 -I ib0.8001 fe80::202:c901:7fc:c711
PING fe80::202:c901:7fc:c711(fe80::202:c901:7fc:c711) from fe80::202:c901:78c:e461 ib0.8001: 56 data bytes
64 bytes from fe80::202:c901:7fc:c711: icmp_seq=1 ttl=64 time=4.56 ms
64 bytes from fe80::202:c901:7fc:c711: icmp_seq=2 ttl=64 time=0.077 ms
64 bytes from fe80::202:c901:7fc:c711: icmp_seq=3 ttl=64 time=0.065 ms

 - R.


Re: [Fwd: Re: [openib-general] Solaris IPoIB MTU with OpenSM]

2005-03-15 Thread Nitin Hande
Hal,

On Fri, 2005-03-04 at 12:53, Hal Rosenstock wrote:
> Hi again Nitin,
> 
> Finally got a chance to work on this. I have a workaround for you for
> now. Real patch later... Let me know if this does the trick for you. It
> did for me.
> 
> -- Hal
> 
> Index: osm_sa_mcmember_record.c
> ===
> --- osm_sa_mcmember_record.c  (revision 1953)
> +++ osm_sa_mcmember_record.c  (working copy)
> @@ -1522,9 +1522,11 @@
>if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
>(p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto Exit;
>  
> +#if 0
>/* if defined MUST match exactly !*/
>if ((IB_MCR_COMPMASK_MTU_SEL & comp_mask) &&
>((p_rcvd_rec->mtu >> 6) != (p_mgrp->mcmember_rec.mtu >> 6))) goto Exit;
> +#endif
>  
>if ((IB_MCR_COMPMASK_MTU & comp_mask) &&
>((p_rcvd_rec->mtu & 0x3F) != (p_mgrp->mcmember_rec.mtu & 0x3F))) goto 
> Exit;
This is cool, I have got Solaris IPoIB happily working with OpenSM
now. It plumbs, pings and snoops on the 0xffff pkey. Here is some output:

[EMAIL PROTECTED] ~]# cat /etc/path_to_inst | grep ibd
"/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/pci15b3,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],,ipib" 0 "ibd"
"/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/pci15b3,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],,ipib" 1 "ibd"
[EMAIL PROTECTED] ~]# ifconfig ibd0
ibd0: flags=1000843 mtu 2044 index
3
inet 192.168.100.111 netmask ff00 broadcast 192.168.100.255
ipib 0:0:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1 
[EMAIL PROTECTED] ~]# ping 192.168.100.112
192.168.100.112 is alive
[EMAIL PROTECTED] ~]# snoop -d ibd1
192.168.100.112 -> *ARP C Who is 192.168.100.111,
192.168.100.111 ?
192.168.100.111 -> 192.168.100.112 ARP R 192.168.100.111,
192.168.100.111 is 0:0:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1
192.168.100.111 -> 192.168.100.112 ICMP Echo request (ID: 641 Sequence
number: 0)
192.168.100.112 -> 192.168.100.111 ICMP Echo reply (ID: 641 Sequence
number: 0)

This is fantastic. Thanks, Hal!

BTW, I have not yet tested it with a GetTable response that spans
multiple RMPP packets.
 
On the other hand, on my Linux node, if I try to use the 0x8001 partition
and configure the IB interface with an IP address (while ib0 is still
using the 0xffff pkey), I get the following error, which you may want to
investigate:

[EMAIL PROTECTED] ~]# echo 0x8001 > /sys/class/net/ib0/create_child
[EMAIL PROTECTED] ~]# ifconfig ib0.8001 10.10.1.1
[EMAIL PROTECTED]: multicast join failed for
ff12:401b:8001:0:0:0::, status -22
 ~]# ib0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
[EMAIL PROTECTED] ~]# ib0.8001: multicast join failed for
ff12:401b:8001:0:0:0::, status -22
0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status
-22
0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status
-22
b0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
b0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
b0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
b0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status
-22
b0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
b0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
b0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
b0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
0.8001: multicast join failed for ff12:401b:8001:0:0:0::, status
-22
b0.8001: multicast join failed for ff12:401b:8001:0:0:0::,
status -22
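
For reference, the group in the error above appears to be the IPoIB IPv4
broadcast group for the child interface's partition. A minimal Python
sketch of how such an MGID can be assembled, assuming the layout visible in
the MAD traces in this thread (ff12, then the 401b signature, then the
P_Key, then zeros and a trailing ffff:ffff). The function name and its
defaults are illustrative only and are not taken from any driver:

```python
def ipoib_broadcast_mgid(pkey, scope=0x2, signature=0x401B):
    """Build the textual MGID of the IPv4 broadcast group for a partition.

    Assumed layout (eight 16-bit groups):
      ff1<scope> : signature : P_Key : 0 : 0 : 0 : ffff : ffff
    """
    groups = [0xFF10 | scope, signature, pkey, 0, 0, 0, 0xFFFF, 0xFFFF]
    return ":".join(f"{g:04x}" for g in groups)


def mgid_to_bytes(mgid):
    """Convert the fully expanded textual MGID into its 16 raw bytes."""
    return bytes.fromhex(mgid.replace(":", ""))


# Default partition, then the child partition used in the failing case.
print(ipoib_broadcast_mgid(0xFFFF))
print(ipoib_broadcast_mgid(0x8001))
```

With the default partition this reproduces the 16 MGID bytes seen in the
GetTable traces (ff 12 40 1b ff ff ... ff ff ff ff); for 0x8001 only the
P_Key group changes.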

Thanks
Nitin



[Fwd: Re: [openib-general] Solaris IPoIB MTU with OpenSM]

2005-03-04 Thread Hal Rosenstock
Hi again Nitin,

Finally got a chance to work on this. I have a workaround for you for
now. Real patch later... Let me know if this does the trick for you. It
did for me.

-- Hal

Index: osm_sa_mcmember_record.c
===================================================================
--- osm_sa_mcmember_record.c  (revision 1953)
+++ osm_sa_mcmember_record.c  (working copy)
@@ -1522,9 +1522,11 @@
   if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
   (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto Exit;
 
+#if 0
   /* if defined MUST match exactly !*/
   if ((IB_MCR_COMPMASK_MTU_SEL & comp_mask) &&
   ((p_rcvd_rec->mtu >> 6) != (p_mgrp->mcmember_rec.mtu >> 6))) goto Exit;
+#endif
 
   if ((IB_MCR_COMPMASK_MTU & comp_mask) &&
   ((p_rcvd_rec->mtu & 0x3F) != (p_mgrp->mcmember_rec.mtu & 0x3F))) goto Exit;
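
The two checks in the patch split the one-byte MTU field into a 2-bit
selector (mtu >> 6) and a 6-bit MTU code (mtu & 0x3F). A small Python
sketch of that split follows. The lookup tables reflect my understanding of
the InfiniBand encodings and the helper names are made up for
illustration, so treat them as assumptions:

```python
# Assumed encodings: 2-bit selector and the standard IB MTU codes.
MTU_SELECTOR = {0: "greater than", 1: "less than", 2: "exactly", 3: "largest available"}
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}


def split_mtu(field):
    """Return (selector, mtu_code), using the same shifts and masks as the patch."""
    return field >> 6, field & 0x3F


def describe_mtu(field):
    """Render the MTU field as text, e.g. 0x84 -> 'exactly 2048'."""
    selector, code = split_mtu(field)
    return f"{MTU_SELECTOR[selector]} {MTU_BYTES[code]}"


# 0x84 is the MTU byte in the Solaris request, 0x04 the same MTU with selector 0.
for value in (0x84, 0x04, 0x01):
    print(hex(value), split_mtu(value), describe_mtu(value))
```

A request carrying 0x84 and a stored group record carrying 0x04 agree on
the MTU code but differ in the selector bits, which is the comparison the
#if 0 block disables.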



Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-24 Thread Hal Rosenstock
Hi Nitin,

On Wed, 2005-02-23 at 17:19, Nitin Hande wrote:
> Hal, 
> 
> [comments below]
> On Wed, 2005-02-23 at 02:19, Hal Rosenstock wrote:
> > On Tue, 2005-02-22 at 22:56, Nitin Hande wrote:
> > > So I tried the latest patches and preliminarily things seem to be
> > > working fine. 
> > 
> > Yipee.
> [snip..]
> > 
> > > 
> > > So after this test above, I try to run snoop on the solaris interface
> > > and get the following error message from the layer below IPoIB:
> > > 
> > > Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE:
> > > ibd0: HCA GUID 0002c901097651d0 port 1 PKEY  Could not get list of
> > > IBA multicast groups
> > > 
> > > My preliminary assumption is that OpenSm is not returning the list of
> > > multicast groups that the ibd interface has joined. I will look at the
> > > MAD's tomorrow and try to ascertain that.
> > 
> > How does S10 request this ? Remember that if it is a GetTable and
> > doesn't fit in a single MAD, it will be broken now. If that is the case,
> > we will live with this until we have real RMPP.
> Below is an an example of a single GetTable request and response between
> Solaris and OpenSM. OpenSM is not reporting the MCgroups in case of a
> single request/response.  I have also provided a MAD output between
> Solaris IPoIB driver and IBSRM single GetTable request response below
> this example.
> 
> Here is the MAD trace between solaris and OpenSM:
> Outgoing MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x12 - SubnAdmGetTable()
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d100ec
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
>  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 ec  .vQ.
> 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00  
> 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 50: 00 00 00 00 00 00 00 00 00 00 0b 1b 00 00 84 00  
> 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00   ...
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> Incoming MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x92 -
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d100ec
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
>  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 ec  .vQ.
> 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8w.
> 20: 00 00 00 14 00 00 00 00 00 00 00 00 00 07 00 00  
> 30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00  
> 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

It is likely failing the component checking in
osm_sa_mcmember_record.c::__osm_sa_mcm_by_comp_mask_cb due to an endian
issue. Either you can debug this code or I will early next week.

The component mask in the request is 0x80b4, so the only components
checked are QKey (0xb1b), MTU (exactly 2048 (4)), PKey (0xffff), and
scope (2).
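
The mask can be decoded mechanically. The Python sketch below assumes the
MCMemberRecord component bit order used by the IB_MCR_COMPMASK_* constants
(MGID at bit 0 through ProxyJoin at bit 17); the list and the function name
are written out here for illustration and should be checked against the
headers:

```python
# Assumed bit order of the MCMemberRecord component mask, bit 0 first.
MCR_COMPONENTS = [
    "MGID", "PortGID", "Q_Key", "MLID", "MTUSelector", "MTU",
    "TClass", "P_Key", "RateSelector", "Rate",
    "PacketLifeTimeSelector", "PacketLifeTime",
    "SL", "FlowLabel", "HopLimit", "Scope", "JoinState", "ProxyJoin",
]


def decode_comp_mask(mask):
    """List the component names whose bits are set in the mask."""
    return [name for bit, name in enumerate(MCR_COMPONENTS) if mask & (1 << bit)]


print(decode_comp_mask(0x80B4))  # mask from the failing GetTable request
print(decode_comp_mask(0x8081))  # mask from the earlier MGID-based request
```

Note that 0x80b4 sets both the MTU selector bit and the MTU bit, which is
why the MTU has to match "exactly".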

If I don't hear anything by next week, I will work on this then.

Thanks.

-- Hal

> Here is the transaction between IBSRM and Solaris IPoIB driver. 
> 
> Outgoing MAD:
>

Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-23 Thread Nitin Hande
Hal, 

[comments below]
On Wed, 2005-02-23 at 02:19, Hal Rosenstock wrote:
> On Tue, 2005-02-22 at 22:56, Nitin Hande wrote:
> > So I tried the latest patches and preliminarily things seem to be
> > working fine. 
> 
> Yipee.
[snip..]
> 
> > 
> > So after this test above, I try to run snoop on the solaris interface
> > and get the following error message from the layer below IPoIB:
> > 
> > Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE:
> > ibd0: HCA GUID 0002c901097651d0 port 1 PKEY  Could not get list of
> > IBA multicast groups
> > 
> > My preliminary assumption is that OpenSm is not returning the list of
> > multicast groups that the ibd interface has joined. I will look at the
> > MAD's tomorrow and try to ascertain that.
> 
> How does S10 request this ? Remember that if it is a GetTable and
> doesn't fit in a single MAD, it will be broken now. If that is the case,
> we will live with this until we have real RMPP.
Below is an example of a single GetTable request and response between
Solaris and OpenSM. OpenSM does not report the MC groups even when the
exchange is a single request/response. Below that example I have also
included the MADs for the same single GetTable request/response between
the Solaris IPoIB driver and IBSRM.

Here is the MAD trace between Solaris and OpenSM:
Outgoing MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x12 - SubnAdmGetTable()
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d100ec
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 ec  .vQ.
10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00  
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
50: 00 00 00 00 00 00 00 00 00 00 0b 1b 00 00 84 00  
60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00   ...
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
Incoming MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x92 -
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d100ec
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 ec  .vQ.
10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8w.
20: 00 00 00 14 00 00 00 00 00 00 00 00 00 07 00 00  
30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00  
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
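
The reply above can be checked for records without reading the whole dump.
The Python sketch below parses the first 0x38 bytes of the incoming MAD,
assuming the usual SA MAD layout (24-byte MAD header, 12-byte RMPP header,
20-byte SA header); the offsets and the function name are my assumptions
and worth verifying against the specification:

```python
import struct

# Hand-copied from the "Incoming MAD" dump above: the first 0x38 bytes.
reply = bytes.fromhex(
    "01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 ec"
    "00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01"
    "00 00 00 14 00 00 00 00 00 00 00 00 00 07 00 00"
    "00 00 00 00 00 00 80 b4"
)

SA_HEADER_LEN = 20  # SM_Key (8) + AttributeOffset (2) + reserved (2) + ComponentMask (8)


def parse_sa_reply(buf):
    """Extract the fields needed to count the records in an SA reply."""
    method = buf[3]
    attr_id = struct.unpack(">H", buf[16:18])[0]
    _segment, payload_len = struct.unpack(">II", buf[28:36])
    _sm_key, attr_offset, _reserved, comp_mask = struct.unpack(">QHHQ", buf[36:56])
    record_len = attr_offset * 8  # AttributeOffset is in units of 8 bytes
    records = (payload_len - SA_HEADER_LEN) // record_len if record_len else 0
    return {
        "is_response": bool(method & 0x80),  # 0x92 = 0x12 (GetTable) with the response bit
        "base_method": method & 0x7F,
        "attr_id": attr_id,
        "payload_len": payload_len,
        "record_len": record_len,
        "comp_mask": comp_mask,
        "records": records,
    }


info = parse_sa_reply(reply)
print(info)
```

Under these assumptions the payload length of 0x14 covers only the SA
header, so the reply carries no MCMemberRecords at all.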

Here is the transaction between IBSRM and Solaris IPoIB driver. 

Outgoing MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x12 - SubnAdmGetTable()
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x8fecc61009a
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 12 00 00 00 00 08 fe cc 61 00 00 00 9a  ...a
10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
30: 00 00 00 00 00 00 80 b4 00 00 00 00 00 00 00 00  
40: 00 00 00 00 00 00 00 00 00

Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-23 Thread Hal Rosenstock
On Tue, 2005-02-22 at 22:56, Nitin Hande wrote:
> So I tried the latest patches and preliminarily things seem to be
> working fine. 

Yipee.

> The PathRecord response is successful and so is the MTU
> correct. I need to spend some more time looking at MAD and confirm it. I
> could configure both interfaces and ping each other this time. Here is
> some out on the solaris side:
> 
> [EMAIL PROTECTED] ~]# ifconfig -a
> lo0: flags=2001000849 mtu
> 8232 index 1
> inet 127.0.0.1 netmask ff00 
> ibd0: flags=1000843 mtu 2044 index
> 28
> inet 192.168.100.105 netmask ff00 broadcast 192.168.100.255
> ipib 0:2c:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1 
> .
> [EMAIL PROTECTED] ~]# ping -s 192.168.100.104
> PING 192.168.100.104: 56 data bytes
> 64 bytes from 192.168.100.104: icmp_seq=0. time=0.590 ms
> 64 bytes from 192.168.100.104: icmp_seq=1. time=0.434 ms
> 64 bytes from 192.168.100.104: icmp_seq=2. time=0.365 ms
> 
> the other side is an openib interface running OpenSM.
> 
> So after this test above, I try to run snoop on the solaris interface
> and get the following error message from the layer below IPoIB:
> 
> Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE:
> ibd0: HCA GUID 0002c901097651d0 port 1 PKEY  Could not get list of
> IBA multicast groups
> 
> My preliminary assumption is that OpenSm is not returning the list of
> multicast groups that the ibd interface has joined. I will look at the
> MAD's tomorrow and try to ascertain that.

How does S10 request this ? Remember that if it is a GetTable and
doesn't fit in a single MAD, it will be broken now. If that is the case,
we will live with this until we have real RMPP.
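
To put a number on "doesn't fit in a single MAD", here is a rough Python
calculation. It assumes a 256-byte MAD with a 24-byte MAD header, a 12-byte
RMPP header and a 20-byte SA header, and a record stride of 56 bytes (the
AttributeOffset of 7 seen in the GetTable replies in this thread); the
constant and function names are illustrative:

```python
MAD_SIZE = 256
MAD_HEADER = 24
RMPP_HEADER = 12
SA_HEADER = 20
SA_DATA = MAD_SIZE - MAD_HEADER - RMPP_HEADER - SA_HEADER  # bytes left for records


def records_per_mad(attr_offset_units):
    """How many records fit in one MAD when each occupies attr_offset_units * 8 bytes."""
    return SA_DATA // (attr_offset_units * 8)


def needs_rmpp(num_records, attr_offset_units):
    """True when the response cannot be sent as a single MAD."""
    return num_records > records_per_mad(attr_offset_units)


print(SA_DATA, records_per_mad(7))
print(needs_rmpp(3, 7), needs_rmpp(4, 7))
```

So, under these assumptions, a GetTable for MCMemberRecords would need a
multi-packet response as soon as more than three groups match.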

-- Hal



Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-22 Thread Nitin Hande
Hal,


On Thu, 2005-02-17 at 13:12, Hal Rosenstock wrote:
> Hi Nitin,
> 
> On Wed, 2005-02-16 at 17:33, Nitin Hande wrote: 
> > On Wed, 2005-02-16 at 13:26, Hal Rosenstock wrote:
> > > On Wed, 2005-02-16 at 16:08, Nitin Hande wrote:
> > > > Hal,
> > [snip..]
[snip...]
> > > > 
> > > > 
> > 
> > Before the patch the selector was coming 04. Do you reply 84 seeing a
> > particular component mask and otherwise 01 ??(I think not..) 
> 
> I agree that OpenSM responds/should respond the same regardless of the
> component mask in the request.
> 
> I was unaware of OpenSM responding with MTU of 01 until now. I have a
> theory as to this. Any chance I can get the osm logs from a -V run of
> the above ? 
> 
> I also have a simple patch below to try which is just to test the
> theory. This is off the latest version but should be easy to apply to
> any version of osm_sa_mcmember_record.c.
> 
> This is separate from the support for PathRecords with multicast DGID
> and/or DLID. I have the changes for this scoped out and should be able
> to implement by early next week.

So I tried the latest patches and, preliminarily, things seem to be
working fine. The PathRecord response is successful and the MTU is
correct. I need to spend some more time looking at the MADs to confirm it.
This time I could configure both interfaces and ping in each direction.
Here is some output on the Solaris side:

[EMAIL PROTECTED] ~]# ifconfig -a
lo0: flags=2001000849 mtu
8232 index 1
inet 127.0.0.1 netmask ff00 
ibd0: flags=1000843 mtu 2044 index
28
inet 192.168.100.105 netmask ff00 broadcast 192.168.100.255
ipib 0:2c:0:16:fe:80:0:0:0:0:0:0:0:2:c9:1:9:76:51:d1 
.
[EMAIL PROTECTED] ~]# ping -s 192.168.100.104
PING 192.168.100.104: 56 data bytes
64 bytes from 192.168.100.104: icmp_seq=0. time=0.590 ms
64 bytes from 192.168.100.104: icmp_seq=1. time=0.434 ms
64 bytes from 192.168.100.104: icmp_seq=2. time=0.365 ms

The other side is an openib interface running OpenSM.

After the test above, I tried to run snoop on the Solaris interface
and got the following error message from the layer below IPoIB:

Feb 22 19:50:25 dongon.SFBay.Sun.COM ibd: [ID 517869 kern.info] NOTICE:
ibd0: HCA GUID 0002c901097651d0 port 1 PKEY  Could not get list of
IBA multicast groups

My preliminary assumption is that OpenSM is not returning the list of
multicast groups that the ibd interface has joined. I will look at the
MADs tomorrow and try to confirm that.

Thanks
Nitin


 



> 
> Thanks.
> 
> -- Hal
> 
> Index: osm_sa_mcmember_record.c
> ===
> --- osm_sa_mcmember_record.c  (revision 1821)
> +++ osm_sa_mcmember_record.c  (working copy)
> @@ -1325,11 +1325,13 @@
>/* copy qkey mlid tclass pkey sl_flow_hop mtu rate pkt_life
> sl_flow_hop */
>__copy_from_create_mc_rec(&mcmember_rec, &p_mgrp->mcmember_rec);
>  
> +#if 0
>if(p_mgrp->well_known)
>{
>  p_mgrp->mcmember_rec.mtu = mtu;
>  mcmember_rec.mtu = mtu;
>}
> +#endif
>  
>/* Release the lock as we don't need it. */
>CL_PLOCK_RELEASE( p_rcv->p_lock );
> 
> 
> 



Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-17 Thread Hal Rosenstock
Hi Nitin,

On Wed, 2005-02-16 at 17:33, Nitin Hande wrote: 
> On Wed, 2005-02-16 at 13:26, Hal Rosenstock wrote:
> > On Wed, 2005-02-16 at 16:08, Nitin Hande wrote:
> > > Hal,
> [snip..]
> > > 
> > > 
> > > Here is the trace of 256 sized MTU:
> > > 
> > > Outgoing MAD:
> > > BaseVersion: 0x1
> > > MgmtClass: 0x3 - SubnAdm
> > > ClassVersion: 0x2
> > > R_Method: 0x12 - SubnAdmGetTable()
> > > Status: 0x0 - NO_ERROR
> > > ClassSpecific: 0x0
> > > TransactionID: 0x97651d10096
> > > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > > 
> > > 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> > >  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
> > > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
> > > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  ..........@.....
> > > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
> > > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00   ...
> > > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 
> > > Incoming MAD:
> > > BaseVersion: 0x1
> > > MgmtClass: 0x3 - SubnAdm
> > > ClassVersion: 0x2
> > > R_Method: 0x92 -
> > > Status: 0x0 - NO_ERROR
> > > ClassSpecific: 0x0
> > > TransactionID: 0x97651d10096
> > > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > > 
> > > 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> > >  0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
> > > 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8w.
> > > 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00  ...L
> > > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  ..........@.....
> > > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
> > > 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00  
> > > 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00   ...
> > > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 
> > > And on other occasions where OpenSM reports the 2048 sized MTU:
> > > 
> > > Outgoing MAD:
> > > BaseVersion: 0x1
> > > MgmtClass: 0x3 - SubnAdm
> > > ClassVersion: 0x2
> > > R_Method: 0x12 - SubnAdmGetTable()
> > > Status: 0x0 - NO_ERROR
> > > ClassSpecific: 0x0
> > > TransactionID: 0x97651d1009a
> > > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > > 
> > > 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> > >  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a  .vQ.
> > > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
> > > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  ..........@.....
> > > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
> > > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00   ...
> > > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > > c0: 00 00 00 00 00 00 00 00 00 00 0

Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-16 Thread Nitin Hande
On Wed, 2005-02-16 at 13:26, Hal Rosenstock wrote:
> On Wed, 2005-02-16 at 16:08, Nitin Hande wrote:
> > Hal,
[snip..]
> > 
> > 
> > Here is the trace of 256 sized MTU:
> > 
> > Outgoing MAD:
> > BaseVersion: 0x1
> > MgmtClass: 0x3 - SubnAdm
> > ClassVersion: 0x2
> > R_Method: 0x12 - SubnAdmGetTable()
> > Status: 0x0 - NO_ERROR
> > ClassSpecific: 0x0
> > TransactionID: 0x97651d10096
> > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > 
> > 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> >  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
> > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
> > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  ..........@.....
> > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
> > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00   ...
> > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 
> > Incoming MAD:
> > BaseVersion: 0x1
> > MgmtClass: 0x3 - SubnAdm
> > ClassVersion: 0x2
> > R_Method: 0x92 -
> > Status: 0x0 - NO_ERROR
> > ClassSpecific: 0x0
> > TransactionID: 0x97651d10096
> > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > 
> > 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> >  0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
> > 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8w.
> > 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00  ...L
> > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  ..........@.....
> > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
> > 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00  
> > 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00   ...
> > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 
> > And on other occasions where OpenSM reports the 2048 sized MTU:
> > 
> > Outgoing MAD:
> > BaseVersion: 0x1
> > MgmtClass: 0x3 - SubnAdm
> > ClassVersion: 0x2
> > R_Method: 0x12 - SubnAdmGetTable()
> > Status: 0x0 - NO_ERROR
> > ClassSpecific: 0x0
> > TransactionID: 0x97651d1009a
> > AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> > 
> > 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
> >  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a  .vQ.
> > 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
> > 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
> > 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
> > 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00   ...
> > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-16 Thread Hal Rosenstock
On Wed, 2005-02-16 at 16:08, Nitin Hande wrote:
> Hal,
> 
> On Wed, 2005-02-16 at 06:27, Hal Rosenstock wrote:
> > On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> > > I have a hunch for what's happening here, but before I jump into any
> > > conclusions, I am seeing some other issue between Solaris IPoIB driver
> > > and OpenSM. After joining the Broadcast group, the PathRecord Response
> > > coming from OpenSM signals an error with Invalid GUID. 
> > 
> > Is the MTU from the PathRecord used ? Is that the theory ? So these are
> > one and the same issue. Thanks.
> No, I was thinking more of an endian issue between IBD and the layer
> beneath it during the MCMemberRecord response. The MTU is not dependent
> on the PathRecord response. Thanks to Tom, we have figured out a way of
> consistently reproducing this on our systems here. The way to reproduce it
> is (basically start everything fresh):
> 1. rmmod {ib_mthca, umad and ipoib}, stop opensm
> 2. unplumb ibd driver and modunload ibd on Solaris,
> 3. modprobe and restart opensm
> 4. plumb ibd interface.
> You should see ibd setting the MTU to 252. Some of the above steps
> may be unnecessary. From the trace, it looks like OpenSM is reporting an
> MTU of 256 bytes to IPoIB in the MCMemberRecord response.
> 
> Here is the trace of 256 sized MTU:
> 
> Outgoing MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x12 - SubnAdmGetTable()
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d10096
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
> 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
> 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00   ...
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 
> Incoming MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x92 -
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d10096
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
> 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
> 10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8w.
> 20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00  ...L
> 30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
> 40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
> 50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00  
> 60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00   ...
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 
> And on other occasions where OpenSM reports the 2048 sized MTU:
> 
> Outgoing MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x12 - SubnAdmGetTable()
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d1009a
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
> 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a  .vQ.
> 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
> 20: 00 00 00 00 00 00 

Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-16 Thread Nitin Hande
Hal,

On Wed, 2005-02-16 at 06:27, Hal Rosenstock wrote:
> On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> > I have a hunch for what's happening here, but before I jump into any
> > conclusions, I am seeing some other issue between Solaris IPoIB driver
> > and OpenSM. After joining the Broadcast group, the PathRecord Response
> > coming from OpenSM signals an error with Invalid GUID. 
> 
> Is the MTU from the PathRecord used ? Is that the theory ? So these are
> one and the same issue. Thanks.
No, I was thinking more of an endian issue between IBD and the layer
beneath it during the MCMemberRecord response. The MTU is not dependent
on the PathRecord response. Thanks to Tom, we have figured out a way of
consistently reproducing this on our systems here. The way to reproduce it
is (basically start everything fresh):
1. rmmod {ib_mthca, umad and ipoib}, stop opensm
2. unplumb ibd driver and modunload ibd on Solaris,
3. modprobe and restart opensm
4. plumb ibd interface.
You should see ibd setting the MTU to 252. Some of the above steps
may be unnecessary. From the trace, it looks like OpenSM is reporting an
MTU of 256 bytes to IPoIB in the MCMemberRecord response.
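
The MTU in question can be read straight out of the hex dumps. As a minimal
sketch, assuming the usual SA MAD layout (attribute data starting at offset
0x38) and the MCMemberRecord field order MGID, PortGID, Q_Key, MLID, followed
by one byte holding a 2-bit MTU selector and a 6-bit MTU code, the byte of
interest sits at offset 0x5e. The names below are hypothetical and are not
taken from the ibd driver or from OpenSM.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    SA_DATA_OFFSET  = 0x38,            /* assumed start of SA attribute data */
    MCMR_MTU_OFFSET = 16 + 16 + 4 + 2  /* MGID + PortGID + Q_Key + MLID */
};

/* Hypothetical holder for the two sub-fields of the MTU byte. */
struct mcmr_mtu {
    uint8_t selector;  /* upper 2 bits of the byte */
    uint8_t code;      /* lower 6 bits: IB MTU enumeration code */
};

/* Extract the MTU selector and code from a raw SA MAD carrying an
 * MCMemberRecord. Returns 0 on success, -1 if the buffer is too short
 * or a pointer is NULL. */
static int mcmr_get_mtu(const uint8_t *mad, size_t len, struct mcmr_mtu *out)
{
    const size_t off = SA_DATA_OFFSET + MCMR_MTU_OFFSET;  /* 0x5e */

    if (mad == NULL || out == NULL || len <= off)
        return -1;
    out->selector = (uint8_t)(mad[off] >> 6);
    out->code     = (uint8_t)(mad[off] & 0x3f);
    return 0;
}
```

In the 256-byte trace in this message, byte 0x5e of the incoming MAD is 0x01
(MTU code 1), whereas in the earlier join response it was 0x04 (MTU code 4).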

Here is the trace of 256 sized MTU:

Outgoing MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x12 - SubnAdmGetTable()
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d10096
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
60: ff ff 00 00 00 00 00 00 20 00 00 00 00 00 00 00   ...
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

Incoming MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x92 -
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d10096
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 92 00 00 00 00 09 76 51 d1 00 00 00 96  .vQ.
10: 00 38 00 00 ff ff ff ff 01 01 77 00 00 00 00 01  .8w.
20: 00 00 00 4c 00 00 00 00 00 00 00 00 00 07 00 00  ...L
30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
50: 00 00 00 00 00 00 00 00 00 00 0b 1b c0 00 01 00  
60: ff ff 03 12 00 00 00 00 20 00 00 00 00 00 00 00   ...
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

And on other occasions where OpenSM reports the 2048 sized MTU:

Outgoing MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x12 - SubnAdmGetTable()
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d1009a
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 9a  .vQ.
10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
30: 00 00 00 00 00 00 80 81 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
40: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ..

Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-16 Thread Hal Rosenstock
On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> I have a hunch for what's happening here, but before I jump into any
> conclusions, I am seeing some other issue between Solaris IPoIB driver
> and OpenSM. After joining the Broadcast group, the PathRecord Response
> coming from OpenSM signals an error with Invalid GUID. 

Is the MTU from the PathRecord used ? Is that the theory ? So these are
one and the same issue. Thanks.

-- Hal

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-15 Thread Nitin Hande
On Tue, 2005-02-15 at 15:57, Hal Rosenstock wrote:
> On Tue, 2005-02-15 at 17:45, Nitin Hande wrote:
> > Here is the osm log. I think we may have a lead: the dest GID is wrong:
> > 
> > Feb 15 23:29:57 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1
> > joining MLID 0xC001.
> > Feb 15 23:29:57 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> > with GUID = 0x.
> > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> > Join State != FullMember - required for create.
> > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> > Join State != FullMember - required for create.
> > Feb 15 23:29:58 [43005960] -> osm_report_notice: Reporting Generic
> > Notice type:3 num:66 from LID:0x0001
> > GID:0xfe80,0x0002c9010a99e031
> > Feb 15 23:29:58 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1
> > joining MLID 0xC011.
> > Feb 15 23:29:58 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> > with GUID = 0x.
> > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> > Join State != FullMember - required for create.
> > Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> > Join State != FullMember - required for create.
> > Feb 15 23:29:59 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> > with GUID = 0x.
> > Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> > Join State != FullMember - required for create.
> > Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> > Join State != FullMember - required for create.
> > Feb 15 23:30:01 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> > with GUID = 0x.
> > Feb 15 23:30:03 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> > with GUID = 0x.
> > Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> > Join State != FullMember - required for create.
> > Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> > Join State != FullMember - required for create.
> 
> Are you sure ? 
No, I was looking at some other location. 
> 
> It looks to me like it is trying to get a PathRecord to the broadcast
> MGID as DGID:
>   AttributeID: 0x35 - SA_PATHRECORD_ATTRID
> ...
> 40: ff 12 40 1b ff ff 00 00 00 00 00 00 ff ff ff ff
> 
> I'm not sure this is necessary as aren't all the parameters returned in
> the SA GetResp MCMemberRecord in response to the Set ? Anyhow it is
> perfectly legal. OpenSM just doesn't support it right now.
Yes, sure. Will wait for the patches.


Thanks
Nitin

> 
> -- Hal
> 



Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-15 Thread Hal Rosenstock
On Tue, 2005-02-15 at 17:45, Nitin Hande wrote:
> Here is the osm log. I think we may have a lead: the dest GID is wrong:
> 
> Feb 15 23:29:57 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1
> joining MLID 0xC001.
> Feb 15 23:29:57 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> with GUID = 0x.
> Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> Join State != FullMember - required for create.
> Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> Join State != FullMember - required for create.
> Feb 15 23:29:58 [43005960] -> osm_report_notice: Reporting Generic
> Notice type:3 num:66 from LID:0x0001
> GID:0xfe80,0x0002c9010a99e031
> Feb 15 23:29:58 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1
> joining MLID 0xC011.
> Feb 15 23:29:58 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> with GUID = 0x.
> Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> Join State != FullMember - required for create.
> Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> Join State != FullMember - required for create.
> Feb 15 23:29:59 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> with GUID = 0x.
> Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> Join State != FullMember - required for create.
> Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> Join State != FullMember - required for create.
> Feb 15 23:30:01 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> with GUID = 0x.
> Feb 15 23:30:03 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
> with GUID = 0x.
> Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> Join State != FullMember - required for create.
> Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
> Join State != FullMember - required for create.

Are you sure ? 

It looks to me like it is trying to get a PathRecord to the broadcast
MGID as DGID:
  AttributeID: 0x35 - SA_PATHRECORD_ATTRID
...
40: ff 12 40 1b ff ff 00 00 00 00 00 00 ff ff ff ff

I'm not sure this is necessary as aren't all the parameters returned in
the SA GetResp MCMemberRecord in response to the Set ? Anyhow it is
perfectly legal. OpenSM just doesn't support it right now.
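
For illustration, here is a sketch of how an SA could tell that a PathRecord
DGID is a multicast GID rather than a port GID: InfiniBand multicast GIDs
begin with the byte 0xff, as the broadcast MGID in the trace does. The
function name is hypothetical and this is not OpenSM's actual code.

```c
#include <stdint.h>

/* Return 1 if the 16-byte GID is a multicast GID (first byte 0xff),
 * 0 otherwise. */
static int gid_is_multicast(const uint8_t gid[16])
{
    return gid[0] == 0xff;
}
```

With a check like this, a PathRecord query whose DGID is an MGID could be
routed to multicast-group lookup instead of the port-GUID lookup that logs
"No dest port with GUID".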

-- Hal



Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-15 Thread Nitin Hande
On Tue, 2005-02-15 at 13:45, Hal Rosenstock wrote:
> Hi again Nitin,
> 
> On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> > After joining the Broadcast group, the PathRecord Response
> > coming from OpenSM signals an error with Invalid GUID. I wonder why,
> 
> There appear to be only 2 places in the code (I'm not saying the code is
> right) where this can occur.
> 
> osm_sa_path_record.c:
> 
> if( p_sa_mad->comp_mask & IB_PR_COMPMASK_DGID )
> ...
>   /*
> This 'error' is the client's fault (bad gid) so
> don't enter it as an error in our own log.
> Return an error response to the client.
>   */
>   osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
>"__osm_pr_rcv_get_end_points: "
>"No dest port with GUID = 0x%016" PRIx64 ".\n",
>cl_ntoh64( p_pr->dgid.unicast.interface_id) );
> 
>   sa_status = IB_SA_MAD_STATUS_INVALID_GID;
> 
> and a similar thing for SGID
> 
>   if( p_sa_mad->comp_mask & IB_PR_COMPMASK_SGID )
>   {
> ...
> 
>   /*
> This 'error' is the client's fault (bad gid) so
> don't enter it as an error in our own log.
> Return an error response to the client.
>   */
>   osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
>"__osm_pr_rcv_get_end_points: "
>"No source port with GUID = 0x%016" PRIx64 ".\n",
>cl_ntoh64( p_pr->sgid.unicast.interface_id) );
> 
> Can you look in the osm.log to see if the source or dest GID is
> implicated ? This will help me chase it down. Thanks.
Here is the osm log. I think we may have a lead: the dest GID is wrong:

Feb 15 23:29:57 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1
joining MLID 0xC001.
Feb 15 23:29:57 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
with GUID = 0x.
Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
Join State != FullMember - required for create.
Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
Join State != FullMember - required for create.
Feb 15 23:29:58 [43005960] -> osm_report_notice: Reporting Generic
Notice type:3 num:66 from LID:0x0001
GID:0xfe80,0x0002c9010a99e031
Feb 15 23:29:58 [43005960] -> osm_sm_mcgrp_join: Port 0x0002c901097651d1
joining MLID 0xC011.
Feb 15 23:29:58 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
with GUID = 0x.
Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
Join State != FullMember - required for create.
Feb 15 23:29:58 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
Join State != FullMember - required for create.
Feb 15 23:29:59 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
with GUID = 0x.
Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
Join State != FullMember - required for create.
Feb 15 23:30:01 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
Join State != FullMember - required for create.
Feb 15 23:30:01 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
with GUID = 0x.
Feb 15 23:30:03 [43005960] -> __osm_pr_rcv_get_end_points: No dest port
with GUID = 0x.
Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
Join State != FullMember - required for create.
Feb 15 23:30:04 [43005960] -> osm_mcmr_rcv_join_mgrp: ERR 1B10: Provided
Join State != FullMember - required for create.

thanks
Nitin

> 
> -- Hal
> 


Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-15 Thread Hal Rosenstock
Hi Nitin,

On Tue, 2005-02-15 at 16:45, Hal Rosenstock wrote:
> Can you look in the osm.log to see if the source or dest GID is
> implicated ? This will help me chase it down. Thanks.

Both SGID and DGID are in the component mask but my bet is on the DGID.
OpenSM does not currently support PathRecords for MGIDs. I will work
on fixing this.
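
As a rough sketch of the component mask check: the outgoing PathRecord
request in the trace carries the 8 bytes 00 00 00 00 00 00 18 0c at offset
0x30. Assuming the PathRecord component mask assigns bit 2 to DGID and bit 3
to SGID, both are set. The macro and function names below are hypothetical
host-byte-order stand-ins, not OpenSM's IB_PR_COMPMASK_* definitions.

```c
#include <stdint.h>

/* Hypothetical host-order masks for the PathRecord component mask. */
#define PR_MASK_DGID (UINT64_C(1) << 2)
#define PR_MASK_SGID (UINT64_C(1) << 3)

/* Load a big-endian (wire order) 64-bit value from a byte buffer. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Return 1 if the DGID component is selected in the mask. */
static int pr_mask_has_dgid(uint64_t mask)
{
    return (mask & PR_MASK_DGID) != 0;
}

/* Return 1 if the SGID component is selected in the mask. */
static int pr_mask_has_sgid(uint64_t mask)
{
    return (mask & PR_MASK_SGID) != 0;
}
```

The mask decodes to 0x180c, so the low nibble 0xc covers both the DGID and
SGID bits.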

-- Hal



Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-15 Thread Hal Rosenstock
Hi again Nitin,

On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> After joining the Broadcast group, the PathRecord Response
> coming from OpenSM signals an error with Invalid GUID. I wonder why,

There appear to be only 2 places in the code (I'm not saying the code is
right) where this can occur.

osm_sa_path_record.c:

if( p_sa_mad->comp_mask & IB_PR_COMPMASK_DGID )
...
  /*
    This 'error' is the client's fault (bad gid) so
    don't enter it as an error in our own log.
    Return an error response to the client.
  */
  osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
           "__osm_pr_rcv_get_end_points: "
           "No dest port with GUID = 0x%016" PRIx64 ".\n",
           cl_ntoh64( p_pr->dgid.unicast.interface_id) );

  sa_status = IB_SA_MAD_STATUS_INVALID_GID;

and a similar thing for SGID

  if( p_sa_mad->comp_mask & IB_PR_COMPMASK_SGID )
  {
...

  /*
    This 'error' is the client's fault (bad gid) so
    don't enter it as an error in our own log.
    Return an error response to the client.
  */
  osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
           "__osm_pr_rcv_get_end_points: "
           "No source port with GUID = 0x%016" PRIx64 ".\n",
           cl_ntoh64( p_pr->sgid.unicast.interface_id) );

Can you look in the osm.log to see if the source or dest GID is
implicated ? This will help me chase it down. Thanks.

-- Hal



Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-15 Thread Hal Rosenstock
On Tue, 2005-02-15 at 16:36, Nitin Hande wrote:
> I have a hunch for what's happening here,

Glad to hear this as I don't have a clue :-)

>  but before I jump into any
> conclusions, I am seeing some other issue between Solaris IPoIB driver
> and OpenSM. After joining the Broadcast group, the PathRecord Response
> coming from OpenSM signals an error with Invalid GUID. I wonder why,

I missed this as my decode is manual being 1.0.a style. I will look at
this and my traces and get back to you later on this.
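
For anyone decoding these traces by hand, a minimal sketch of the common MAD
header may help. It assumes the standard 24-byte header (BaseVersion,
MgmtClass, ClassVersion, Method, Status, ClassSpecific, TransactionID,
AttributeID, then a reserved field and AttributeModifier); the struct and
function names are hypothetical and are not taken from OpenSM or the Solaris
driver. Bit 7 of the method byte marks a response, which is why the methods
the trace tool leaves unnamed, 0x81 and 0x92, correspond to GetResp
(0x01 | 0x80) and GetTableResp (0x12 | 0x80).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-order view of the fields printed by the trace tool. */
struct mad_hdr {
    uint8_t  base_version;
    uint8_t  mgmt_class;
    uint8_t  class_version;
    uint8_t  method;
    uint16_t status;
    uint64_t tid;
    uint16_t attr_id;
};

/* Decode the first 24 bytes of a MAD (all fields are big-endian on the
 * wire). Returns 0 on success, -1 on a short buffer or NULL pointer. */
static int mad_hdr_decode(const uint8_t *p, size_t len, struct mad_hdr *h)
{
    int i;

    if (p == NULL || h == NULL || len < 24)
        return -1;
    h->base_version  = p[0];
    h->mgmt_class    = p[1];
    h->class_version = p[2];
    h->method        = p[3];
    h->status        = (uint16_t)((p[4] << 8) | p[5]);
    h->tid = 0;
    for (i = 8; i < 16; i++)
        h->tid = (h->tid << 8) | p[i];
    h->attr_id       = (uint16_t)((p[16] << 8) | p[17]);
    return 0;
}

/* Return 1 if the method byte has the response bit (bit 7) set. */
static int mad_method_is_response(uint8_t method)
{
    return (method & 0x80) != 0;
}
```

Fed the first 24 bytes of one of the incoming MADs, this yields class 0x03
(SubnAdm), method 0x92 with the response bit set, and attribute 0x38
(MCMemberRecord).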

> Here is the MAD trace:
> 
> 
> Outgoing MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x2 - SubnAdmSet()
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d10034
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
>  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f 0123456789abcdef
>  0: 01 03 02 02 00 00 00 00 09 76 51 d1 00 00 00 34  .vQ4
> 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 30: 00 00 00 00 00 01 b0 c7 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
> 40: 00 00 00 00 ff ff ff ff fe 80 00 00 00 00 00 00  
> 50: 00 02 c9 01 09 76 51 d1 00 00 0b 1b 00 00 00 00  .vQ.
> 60: ff ff 00 00 00 00 00 00 21 00 00 00 00 00 00 00  !...
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 
> Incoming MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x81 -
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d10034
> AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID
> 
> 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 81 00 00 00 00 09 76 51 d1 00 00 00 34  .vQ4
> 10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 07 00 00  
> 30: 00 00 00 00 00 01 b0 c7 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
> 40: 00 00 00 00 ff ff ff ff fe 80 00 00 00 00 00 00  
> 50: 00 02 c9 01 09 76 51 d1 00 00 0b 1b c0 00 04 00  .vQ.
> 60: ff ff 03 12 00 00 00 00 21 00 00 00 00 00 00 00  !...
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 
> OpenSM responds positively to MCMEMBERRECORD and then:
> 
> Outgoing MAD:
> BaseVersion: 0x1
> MgmtClass: 0x3 - SubnAdm
> ClassVersion: 0x2
> R_Method: 0x12 - SubnAdmGetTable()
> Status: 0x0 - NO_ERROR
> ClassSpecific: 0x0
> TransactionID: 0x97651d1003a
> AttributeID: 0x35 - SA_PATHRECORD_ATTRID
> 
> 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
>  0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 3a  .vQ:
> 10: 00 35 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .5..
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 30: 00 00 00 00 00 00 18 0c 00 00 00 00 00 00 00 00  
> 40: ff 12 40 1b ff ff 00 00 00 00 00 00 ff ff ff ff  [EMAIL PROTECTED]
> 50: fe 80 00 00 00 00 00 00 00 02 c9 01 09 76 51 d1  .vQ.
> 60: 00 00 00 00 00 00 00 00 00 81 00 00 00 00 00 00  
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> c0: 00 00 00 00 00

Re: [openib-general] Solaris IPoIB MTU with OpenSM

2005-02-15 Thread Nitin Hande
I have a hunch for what's happening here, but before I jump to any
conclusions, I am seeing another issue between the Solaris IPoIB driver
and OpenSM. After joining the Broadcast group, the PathRecord Response
coming from OpenSM signals an error with Invalid GUID. I wonder why.
Here is the MAD trace:


Outgoing MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x2 - SubnAdmSet()
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d10034
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f 0123456789abcdef
 0: 01 03 02 02 00 00 00 00 09 76 51 d1 00 00 00 34  .vQ4
10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
30: 00 00 00 00 00 01 b0 c7 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
40: 00 00 00 00 ff ff ff ff fe 80 00 00 00 00 00 00  
50: 00 02 c9 01 09 76 51 d1 00 00 0b 1b 00 00 00 00  .vQ.
60: ff ff 00 00 00 00 00 00 21 00 00 00 00 00 00 00  !...
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

Incoming MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x81 -
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d10034
AttributeID: 0x38 - SA_MCMEMBERRECORD_ATTRID

0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 81 00 00 00 00 09 76 51 d1 00 00 00 34  .vQ4
10: 00 38 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .8..
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 07 00 00  
30: 00 00 00 00 00 01 b0 c7 ff 12 40 1b ff ff 00 00  [EMAIL PROTECTED]
40: 00 00 00 00 ff ff ff ff fe 80 00 00 00 00 00 00  
50: 00 02 c9 01 09 76 51 d1 00 00 0b 1b c0 00 04 00  .vQ.
60: ff ff 03 12 00 00 00 00 21 00 00 00 00 00 00 00  !...
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

OpenSM responds positively to MCMEMBERRECORD and then:

Outgoing MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x12 - SubnAdmGetTable()
Status: 0x0 - NO_ERROR
ClassSpecific: 0x0
TransactionID: 0x97651d1003a
AttributeID: 0x35 - SA_PATHRECORD_ATTRID

0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
 0: 01 03 02 12 00 00 00 00 09 76 51 d1 00 00 00 3a  .vQ:
10: 00 35 00 00 ff ff ff ff 00 00 00 00 00 00 00 00  .5..
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
30: 00 00 00 00 00 00 18 0c 00 00 00 00 00 00 00 00  
40: ff 12 40 1b ff ff 00 00 00 00 00 00 ff ff ff ff  [EMAIL PROTECTED]
50: fe 80 00 00 00 00 00 00 00 02 c9 01 09 76 51 d1  .vQ.
60: 00 00 00 00 00 00 00 00 00 81 00 00 00 00 00 00  
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  

Incoming MAD:
BaseVersion: 0x1
MgmtClass: 0x3 - SubnAdm
ClassVersion: 0x2
R_Method: 0x92 -
Status: 0x5

[openib-general] Solaris IPoIB MTU with OpenSM

2005-02-15 Thread Hal Rosenstock
Hi,

Unfortunately, the Solaris 10 IPoIB MTU with OpenSM is back to the
maximum size of 252 again :-( I'm not sure whether this was ever really
fixed although I do now see the packets indicating an exact MTU of 4
(2048 bytes). I'm not sure what Solaris doesn't like about the OpenSM
response to the MCMemberRecord.
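
For reference, a small sketch of the arithmetic behind these numbers. It
assumes the InfiniBand MTU enumeration (1 = 256, 2 = 512, 3 = 1024, 4 = 2048,
5 = 4096 bytes) and that the IPoIB MTU is the IB MTU minus the 4-byte IPoIB
encapsulation header, which matches the 252 seen here. The function names are
hypothetical.

```c
#include <stdint.h>

#define IPOIB_ENCAP_LEN 4u  /* assumed size of the IPoIB encapsulation header */

/* Map an IB MTU enumeration code to bytes; returns 0 for unknown codes. */
static unsigned ib_mtu_code_to_bytes(uint8_t code)
{
    switch (code) {
    case 1:  return 256;
    case 2:  return 512;
    case 3:  return 1024;
    case 4:  return 2048;
    case 5:  return 4096;
    default: return 0;
    }
}

/* IPoIB MTU implied by an IB MTU code; returns 0 for unknown codes. */
static unsigned ipoib_mtu_from_ib_code(uint8_t code)
{
    unsigned bytes = ib_mtu_code_to_bytes(code);

    return bytes > IPOIB_ENCAP_LEN ? bytes - IPOIB_ENCAP_LEN : 0;
}
```

Under these assumptions an MTU code of 1 gives an IPoIB MTU of 252, while the
code of 4 seen in the packets would give 2044.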

-- Hal
