Re: [RFC PATCH] Set HopLimit based on off subnet Re: SRP issues with OpenSM 3.3.3
On 18:18 Thu 17 Dec , Ira Weiny wrote: > > It might be faster but it would mean a bigger change. It looks like having > the dgid or p_pr structures in some internal opensm structure to carry that > information along would be best but that is a lot of work. Actually now dgid parameter is never used when intra-subnet PR is requested - maybe those parameters (dgid, is_off_subnet, etc.) could be merged into some context struct which is initialized once and passed down to the functions. Likely some general cleanup of osm_sa_path_record.c is needed to see a most obvious coding solution. Sasha -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] Set HopLimit based on off subnet Re: SRP issues with OpenSM 3.3.3
On Thu, Dec 17, 2009 at 9:18 PM, Ira Weiny wrote: > On Wed, 16 Dec 2009 08:29:43 -0500 > Hal Rosenstock wrote: > >> On Tue, Dec 15, 2009 at 9:55 PM, Ira Weiny wrote: > > [snip] > >> > >> > diff --git a/opensm/opensm/osm_sa_path_record.c >> > b/opensm/opensm/osm_sa_path_record.c >> > index be0cd71..1fa83a1 100644 >> > --- a/opensm/opensm/osm_sa_path_record.c >> > +++ b/opensm/opensm/osm_sa_path_record.c >> > @@ -757,6 +757,14 @@ Exit: >> > return (status); >> > } >> > >> > +static int gid_is_off_subnet(IN osm_sa_t * sa, IN const ib_gid_t * p_dgid) >> > +{ >> > + return (!ib_gid_is_link_local(p_dgid) && >> > + !ib_gid_is_multicast(p_dgid) && >> > + ib_gid_get_subnet_prefix(p_dgid) != >> > + sa->p_subn->opt.subnet_prefix); >> > +} >> > + >> > /** >> > **/ >> > static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * >> > p_src_port, >> > @@ -770,16 +778,12 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN >> > const osm_port_t * p_src_port, >> > { >> > const osm_physp_t *p_src_physp; >> > const osm_physp_t *p_dest_physp; >> > - boolean_t is_nonzero_gid = 0; >> > >> > OSM_LOG_ENTER(sa->p_log); >> > >> > p_src_physp = p_src_port->p_physp; >> > >> > if (p_dgid) >> > - is_nonzero_gid = ib_gid_is_notzero(p_dgid); >> > - >> > - if (is_nonzero_gid) >> > p_pr->dgid = *p_dgid; >> > else { >> > p_dest_physp = p_dest_port->p_physp; >> > @@ -799,7 +803,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const >> > osm_port_t * p_src_port, >> > p_pr->hop_flow_raw &= cl_hton32(1 << 31); >> > >> > /* Only set HopLimit if going through a router */ >> > - if (is_nonzero_gid) >> > + if (gid_is_off_subnet(sa, &p_pr->dgid)) >> > p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); >> > >> > p_pr->pkey = p_parms->pkey; >> > @@ -1248,10 +1252,7 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t >> > * sa, >> > memset(p_dgid, 0, sizeof(*p_dgid)); >> > >> > if (comp_mask & IB_PR_COMPMASK_DGID) { >> > - if (!ib_gid_is_link_local(&p_pr->dgid) && >> > - !ib_gid_is_multicast(&p_pr->dgid) && >> > - ib_gid_get_subnet_prefix(&p_pr->dgid) != >> > - sa->p_subn->opt.subnet_prefix) { >> > + if (gid_is_off_subnet(sa, &p_pr->dgid)) { >> > dest_guid = find_router(sa, >> > p_pr->dgid.unicast.prefix); >> > if (!dest_guid) { >> > char gid_str[INET6_ADDRSTRLEN]; >> > >> > >> > This maintains the p_dgid found in pr_rcv_get_end_points but appropriately >> > set's the HopLimit when the DGID is off subnet. >> >> Yes, this looks right to me. Would it be better to only determine >> gid_is_off_subnet once by saving this in a variable and passing it as >> a parameter where needed ? >> > > It might be faster but it would mean a bigger change. Yes. > It looks like having > the dgid or p_pr structures in some internal opensm structure to carry that > information along would be best but that is a lot of work. It could also be done by passing this as an additional parameter through the necessary routines. > Do you think this will adversely affect performance on a big system? I don't know as I haven't measured the difference in the two approaches. Just trying to minimize additions to computing PRs. -- Hal > I plan > to roll this out internally and we will have a chance to test on bigger > systems here. But there are larger out there now. > > Ira > > > -- > Ira Weiny > Math Programmer/Computer Scientist > Lawrence Livermore National Lab > 925-423-8008 > wei...@llnl.gov > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] Set HopLimit based on off subnet Re: SRP issues with OpenSM 3.3.3
On Wed, 16 Dec 2009 08:29:43 -0500 Hal Rosenstock wrote: > On Tue, Dec 15, 2009 at 9:55 PM, Ira Weiny wrote: [snip] > > > > diff --git a/opensm/opensm/osm_sa_path_record.c > > b/opensm/opensm/osm_sa_path_record.c > > index be0cd71..1fa83a1 100644 > > --- a/opensm/opensm/osm_sa_path_record.c > > +++ b/opensm/opensm/osm_sa_path_record.c > > @@ -757,6 +757,14 @@ Exit: > > return (status); > > } > > > > +static int gid_is_off_subnet(IN osm_sa_t * sa, IN const ib_gid_t * p_dgid) > > +{ > > + return (!ib_gid_is_link_local(p_dgid) && > > + !ib_gid_is_multicast(p_dgid) && > > + ib_gid_get_subnet_prefix(p_dgid) != > > + sa->p_subn->opt.subnet_prefix); > > +} > > + > > /** > > **/ > > static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * > > p_src_port, > > @@ -770,16 +778,12 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN > > const osm_port_t * p_src_port, > > { > > const osm_physp_t *p_src_physp; > > const osm_physp_t *p_dest_physp; > > - boolean_t is_nonzero_gid = 0; > > > > OSM_LOG_ENTER(sa->p_log); > > > > p_src_physp = p_src_port->p_physp; > > > > if (p_dgid) > > - is_nonzero_gid = ib_gid_is_notzero(p_dgid); > > - > > - if (is_nonzero_gid) > > p_pr->dgid = *p_dgid; > > else { > > p_dest_physp = p_dest_port->p_physp; > > @@ -799,7 +803,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const > > osm_port_t * p_src_port, > > p_pr->hop_flow_raw &= cl_hton32(1 << 31); > > > > /* Only set HopLimit if going through a router */ > > - if (is_nonzero_gid) > > + if (gid_is_off_subnet(sa, &p_pr->dgid)) > > p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); > > > > p_pr->pkey = p_parms->pkey; > > @@ -1248,10 +1252,7 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t > > * sa, > > memset(p_dgid, 0, sizeof(*p_dgid)); > > > > if (comp_mask & IB_PR_COMPMASK_DGID) { > > - if (!ib_gid_is_link_local(&p_pr->dgid) && > > - !ib_gid_is_multicast(&p_pr->dgid) && > > - ib_gid_get_subnet_prefix(&p_pr->dgid) != > > - sa->p_subn->opt.subnet_prefix) { > > + if (gid_is_off_subnet(sa, &p_pr->dgid)) { > > dest_guid = find_router(sa, > > p_pr->dgid.unicast.prefix); > > if (!dest_guid) { > > char gid_str[INET6_ADDRSTRLEN]; > > > > > > This maintains the p_dgid found in pr_rcv_get_end_points but appropriately > > set's the HopLimit when the DGID is off subnet. > > Yes, this looks right to me. Would it be better to only determine > gid_is_off_subnet once by saving this in a variable and passing it as > a parameter where needed ? > It might be faster but it would mean a bigger change. It looks like having the dgid or p_pr structures in some internal opensm structure to carry that information along would be best but that is a lot of work. Do you think this will adversely affect performance on a big system? I plan to roll this out internally and we will have a chance to test on bigger systems here. But there are larger out there now. Ira -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 wei...@llnl.gov -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] Set HopLimit based on off subnet Re: SRP issues with OpenSM 3.3.3
On Tue, Dec 15, 2009 at 9:55 PM, Ira Weiny wrote: > On Tue, 15 Dec 2009 12:59:11 -0500 > Hal Rosenstock wrote: > >> On Tue, Dec 15, 2009 at 12:18 PM, Ira Weiny wrote: > > [snip] > >> >> > The follow on would be that if they specify a >> > DGID in the PathRecord query and the result is subnet local they must still >> > use a GRH even if the packet is going to stay local? >> >> No; with subnet local traffic, GRH is optional. >> > > Ok, if the GRH is optional with local traffic then I think the appropriate fix > is: > > diff --git a/opensm/opensm/osm_sa_path_record.c > b/opensm/opensm/osm_sa_path_record.c > index be0cd71..1fa83a1 100644 > --- a/opensm/opensm/osm_sa_path_record.c > +++ b/opensm/opensm/osm_sa_path_record.c > @@ -757,6 +757,14 @@ Exit: > return (status); > } > > +static int gid_is_off_subnet(IN osm_sa_t * sa, IN const ib_gid_t * p_dgid) > +{ > + return (!ib_gid_is_link_local(p_dgid) && > + !ib_gid_is_multicast(p_dgid) && > + ib_gid_get_subnet_prefix(p_dgid) != > + sa->p_subn->opt.subnet_prefix); > +} > + > /** > **/ > static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * > p_src_port, > @@ -770,16 +778,12 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const > osm_port_t * p_src_port, > { > const osm_physp_t *p_src_physp; > const osm_physp_t *p_dest_physp; > - boolean_t is_nonzero_gid = 0; > > OSM_LOG_ENTER(sa->p_log); > > p_src_physp = p_src_port->p_physp; > > if (p_dgid) > - is_nonzero_gid = ib_gid_is_notzero(p_dgid); > - > - if (is_nonzero_gid) > p_pr->dgid = *p_dgid; > else { > p_dest_physp = p_dest_port->p_physp; > @@ -799,7 +803,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const > osm_port_t * p_src_port, > p_pr->hop_flow_raw &= cl_hton32(1 << 31); > > /* Only set HopLimit if going through a router */ > - if (is_nonzero_gid) > + if (gid_is_off_subnet(sa, &p_pr->dgid)) > p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); > > p_pr->pkey = p_parms->pkey; > @@ -1248,10 +1252,7 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t * > sa, > memset(p_dgid, 0, sizeof(*p_dgid)); > > if (comp_mask & IB_PR_COMPMASK_DGID) { > - if (!ib_gid_is_link_local(&p_pr->dgid) && > - !ib_gid_is_multicast(&p_pr->dgid) && > - ib_gid_get_subnet_prefix(&p_pr->dgid) != > - sa->p_subn->opt.subnet_prefix) { > + if (gid_is_off_subnet(sa, &p_pr->dgid)) { > dest_guid = find_router(sa, p_pr->dgid.unicast.prefix); > if (!dest_guid) { > char gid_str[INET6_ADDRSTRLEN]; > > > This maintains the p_dgid found in pr_rcv_get_end_points but appropriately > set's the HopLimit when the DGID is off subnet. Yes, this looks right to me. Would it be better to only determine gid_is_off_subnet once by saving this in a variable and passing it as a parameter where needed ? -- Hal > > Ira > > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC PATCH] Set HopLimit based on off subnet Re: SRP issues with OpenSM 3.3.3
On Tue, 15 Dec 2009 12:59:11 -0500 Hal Rosenstock wrote: > On Tue, Dec 15, 2009 at 12:18 PM, Ira Weiny wrote: [snip] > > > The follow on would be that if they specify a > > DGID in the PathRecord query and the result is subnet local they must still > > use a GRH even if the packet is going to stay local? > > No; with subnet local traffic, GRH is optional. > Ok, if the GRH is optional with local traffic then I think the appropriate fix is: diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index be0cd71..1fa83a1 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -757,6 +757,14 @@ Exit: return (status); } +static int gid_is_off_subnet(IN osm_sa_t * sa, IN const ib_gid_t * p_dgid) +{ + return (!ib_gid_is_link_local(p_dgid) && + !ib_gid_is_multicast(p_dgid) && + ib_gid_get_subnet_prefix(p_dgid) != + sa->p_subn->opt.subnet_prefix); +} + /** **/ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * p_src_port, @@ -770,16 +778,12 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * p_src_port, { const osm_physp_t *p_src_physp; const osm_physp_t *p_dest_physp; - boolean_t is_nonzero_gid = 0; OSM_LOG_ENTER(sa->p_log); p_src_physp = p_src_port->p_physp; if (p_dgid) - is_nonzero_gid = ib_gid_is_notzero(p_dgid); - - if (is_nonzero_gid) p_pr->dgid = *p_dgid; else { p_dest_physp = p_dest_port->p_physp; @@ -799,7 +803,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * p_src_port, p_pr->hop_flow_raw &= cl_hton32(1 << 31); /* Only set HopLimit if going through a router */ - if (is_nonzero_gid) + if (gid_is_off_subnet(sa, &p_pr->dgid)) p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); p_pr->pkey = p_parms->pkey; @@ -1248,10 +1252,7 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t * sa, memset(p_dgid, 0, sizeof(*p_dgid)); if (comp_mask & IB_PR_COMPMASK_DGID) { - if (!ib_gid_is_link_local(&p_pr->dgid) && - !ib_gid_is_multicast(&p_pr->dgid) && - ib_gid_get_subnet_prefix(&p_pr->dgid) != - sa->p_subn->opt.subnet_prefix) { + if (gid_is_off_subnet(sa, &p_pr->dgid)) { dest_guid = find_router(sa, p_pr->dgid.unicast.prefix); if (!dest_guid) { char gid_str[INET6_ADDRSTRLEN]; This maintains the p_dgid found in pr_rcv_get_end_points but appropriately set's the HopLimit when the DGID is off subnet. Ira -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
On Tue, Dec 15, 2009 at 12:18 PM, Ira Weiny wrote: > On Tue, 15 Dec 2009 10:15:32 -0700 > Jason Gunthorpe wrote: > >> > > However, I don't understand the comment "Only set HopLimit if going >> > > through a >> > > router"? >> > >> > This is from '#ifdef ROUTER_EXP' days - as far as I could understand >> > HopLimit should be set to IB_HOPLIMIT_MAX for inter-subnet MADs. >> >> If HopLimit is 0 then no GRH is required, if it is non 0 then the user >> of that path needs to apply a GRH. IIRC there are already a few things >> in the kernel that follow this? > > Is that in the spec somewhere? IBA 1.2.1 vol 1 p.229 8.3.6 for one. > The follow on would be that if they specify a > DGID in the PathRecord query and the result is subnet local they must still > use a GRH even if the packet is going to stay local? No; with subnet local traffic, GRH is optional. -- Hal > I think this makes sense > but I wonder if __everyone__ is following that? > > Ira > >> >> Jason > > > -- > Ira Weiny > Math Programmer/Computer Scientist > Lawrence Livermore National Lab > 925-423-8008 > wei...@llnl.gov > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
On Tue, Dec 15, 2009 at 12:15 PM, Jason Gunthorpe wrote: >> > However, I don't understand the comment "Only set HopLimit if going >> > through a >> > router"? >> >> This is from '#ifdef ROUTER_EXP' days - as far as I could understand >> HopLimit should be set to IB_HOPLIMIT_MAX for inter-subnet MADs. > > If HopLimit is 0 then no GRH is required, if it is non 0 then the user I think it is: HopLimit 0 or 1 requires no GRH (GRH is optional in these cases) and other than 0 or 1 requires GRH > of that path needs to apply a GRH. IIRC there are already a few things > in the kernel that follow this? Yes, we started down this path. This was all discussed on the list back prior to SC07 days AFAIR. -- Hal > Jason > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
On Tue, Dec 15, 2009 at 12:09 PM, Ira Weiny wrote: > On Tue, 15 Dec 2009 10:16:42 -0500 > Hal Rosenstock wrote: > >> Ira, >> >> On Mon, Dec 14, 2009 at 7:43 PM, Ira Weiny wrote: >> > Sasha, Hal, >> > >> > I have found that the following patch caused our SRP connected storage to >> > break. >> >> What is causing the SRP target to fail ? Is it a non zero hop limit >> returned in the SA PathRecord ? > > I am not sure exactly because I don't know what the SRP Target is seeing. > (pesky firmware...) :-( > >> >> > patch: 3d20f82edd3246879063b77721d0bcef927bdc48 >> > >> > opensm/osm_sa_path_record.c: separate router guid resolution code >> > >> > Move off subnet destination (router address) resolution code to separate >> > function to improve readability. >> > >> > Signed-off-by: Sasha Khapyorsky >> > >> > I further traced the problem to pr_rcv_build_pr and the patch below fixes >> > the >> > bug. >> >> On quick glance, I'm missing the connection between Sasha's patch and >> this routine setting something bad in the hop limit of the returned >> path record. > > It sets p_dgid even when we are not using a router which results in > is_nonzero_gid being true. Yes, that's the cause of the issue. >> >> > 16:03:34 > git diff >> > diff --git a/opensm/opensm/osm_sa_path_record.c >> > b/opensm/opensm/osm_sa_path_record.c >> > index be0cd71..32c889f 100644 >> > --- a/opensm/opensm/osm_sa_path_record.c >> > +++ b/opensm/opensm/osm_sa_path_record.c >> > @@ -800,7 +800,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const >> > osm_port_t * p_src_port, >> > >> > /* Only set HopLimit if going through a router */ >> > if (is_nonzero_gid) >> > - p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); >> > + p_pr->hop_flow_raw |= IB_HOPLIMIT_MAX; >> >> This may fix this issue but it doesn't look right to me and I think >> would break router scenarios and corrupts other field(s). Path record >> fields are in network (not host) order. > > But HopLimit is an 8bit field? Yes but in terms of the ib_path_rec_t definition hop_flow_raw is a 32 bit field. > I might be wrong on this but I was thinking > this set RawTraffic and parts of Flow label to 1's. I should double check > that. Also, I wonder if the FW on the other side is flipping this wrong? :-/ > I have not been able to determine this and I still have to test with our other > vendor's SRP target... > >> >> > p_pr->pkey = p_parms->pkey; >> > ib_path_rec_set_sl(p_pr, p_parms->sl); >> > >> > >> > "hop_flow_raw" is not really a 32bit value but rather 4 fields: >> > Component Length(bits) Offset(bits) >> > RawTraffic 1 352 >> > Reserved 3 353 >> > FlowLabel 20 356 >> > HopLimit 8 384 >> > >> > >> > However, I don't understand the comment "Only set HopLimit if going >> > through a >> > router"? >> >> Well, in a sense the HopLimit is always set. It's already set to 0 >> right above that code snippet in the patch where: >> p_pr->hop_flow_raw &= cl_hton32(1 << 31); > > True, however, I believe the behavior before was that the entire field is 0 > so the above statement really does nothing. For all practical purposes that is true as raw traffic is deprecated. >> This code is only setting the HopLimit to 255 (as this is very >> primitive so far) when going through a router. Maybe it could be >> stated better. > > That is exactly what Sasha's patch changes. This is not only executed when > going through a router but at all times DGID is specified in the CompMask. I > am willing to accept the fact that perhaps p_dgid should not be set unless we > find a router, however, I __don't__ think that is correct. Searching the spec > I attempted to find where HopLimit was supposed to be 0xFF only when going > through a router and I could not find it; therefore my email, I'm confused > again... :-( When going through a router, it needs to be set to something other than 0 or 1. 255 (max) was used as there's nothing currently to scope it down to something more realistic. See IBA 1.2.1 vol 1 p.229 8.3.6 for one on settings for this field. >> >> > Was the intent to only set p_dgid in pr_rcv_get_end_points if we are >> > heading >> > through a router? I don't think it matters if we are going through a router >> > or not does it? >> >> It's used when DGID is selected in the SA PR query via the comp mask >> and in that instance is used for both the router and non router DGID >> cases. > > So we agree that Sasha's patch is correct? Yes; it restores this back the way it used to be when it worked. -- Hal >> >> > If not, I think the comment needs to be removed in the patch above. >> >> I'm not following why you say this. > > Well, if HopLimit must only be set to 0xff when going through a router and > Sasha's patch is correct, then I think we need another way to determine if > this path is going through a router. In that case the comment is correct
Re: SRP issues with OpenSM 3.3.3
On Tue, 15 Dec 2009 10:31:40 -0700 Jason Gunthorpe wrote: [snip] > > The spec provides no guidance whatsoever on how to tell if a packets > generated from a PR need a GRH or not. The only other choice is to > compare the resulting DGID against all the GID prefixs on the port and > see if any match. > > If a router spec is ever written I'm sure it will clarify this matter, > and everyone will follow. > > The SM decides what hop limit should be so the SM decides if it needs > a GRH. I belive the original long ago version of this always set the > hop limit to 0 unless the DLID was a router port. Ok, then I think we need a new way to tell if we are hitting a router port in pr_rcv_build_pr. I don't think having a non-zero GID is the definition of going through a router port. Ira > > Jason -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 wei...@llnl.gov -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
On Tue, Dec 15, 2009 at 09:18:19AM -0800, Ira Weiny wrote: > On Tue, 15 Dec 2009 10:15:32 -0700 > Jason Gunthorpe wrote: > > > > > However, I don't understand the comment "Only set HopLimit if going > > > > through a > > > > router"? > > > > > > This is from '#ifdef ROUTER_EXP' days - as far as I could understand > > > HopLimit should be set to IB_HOPLIMIT_MAX for inter-subnet MADs. > > > > If HopLimit is 0 then no GRH is required, if it is non 0 then the user > > of that path needs to apply a GRH. IIRC there are already a few things > > in the kernel that follow this? > > Is that in the spec somewhere? The follow on would be that if they specify a > DGID in the PathRecord query and the result is subnet local they must still > use a GRH even if the packet is going to stay local? I think this makes sense > but I wonder if __everyone__ is following that? The spec provides no guidance whatsoever on how to tell if a packets generated from a PR need a GRH or not. The only other choice is to compare the resulting DGID against all the GID prefixs on the port and see if any match. If a router spec is ever written I'm sure it will clarify this matter, and everyone will follow. The SM decides what hop limit should be so the SM decides if it needs a GRH. I belive the original long ago version of this always set the hop limit to 0 unless the DLID was a router port. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
On Tue, 15 Dec 2009 10:15:32 -0700 Jason Gunthorpe wrote: > > > However, I don't understand the comment "Only set HopLimit if going > > > through a > > > router"? > > > > This is from '#ifdef ROUTER_EXP' days - as far as I could understand > > HopLimit should be set to IB_HOPLIMIT_MAX for inter-subnet MADs. > > If HopLimit is 0 then no GRH is required, if it is non 0 then the user > of that path needs to apply a GRH. IIRC there are already a few things > in the kernel that follow this? Is that in the spec somewhere? The follow on would be that if they specify a DGID in the PathRecord query and the result is subnet local they must still use a GRH even if the packet is going to stay local? I think this makes sense but I wonder if __everyone__ is following that? Ira > > Jason -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 wei...@llnl.gov -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
On Tue, 15 Dec 2009 19:03:17 +0200 Sasha Khapyorsky wrote: > Hi Ira, > > On 16:43 Mon 14 Dec , Ira Weiny wrote: > > > > I have found that the following patch caused our SRP connected storage to > > break. > > > > patch: 3d20f82edd3246879063b77721d0bcef927bdc48 > > > > opensm/osm_sa_path_record.c: separate router guid resolution code > > > > Move off subnet destination (router address) resolution code to separate > > function to improve readability. > > > > Signed-off-by: Sasha Khapyorsky > > > > I further traced the problem to pr_rcv_build_pr and the patch below fixes > > the > > bug. > > Nice catch and great investigation! > > > > > 16:03:34 > git diff > > diff --git a/opensm/opensm/osm_sa_path_record.c > > b/opensm/opensm/osm_sa_path_record.c > > index be0cd71..32c889f 100644 > > --- a/opensm/opensm/osm_sa_path_record.c > > +++ b/opensm/opensm/osm_sa_path_record.c > > @@ -800,7 +800,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const > > osm_port_t * p_src_port, > > > > /* Only set HopLimit if going through a router */ > > if (is_nonzero_gid) > > - p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); > > + p_pr->hop_flow_raw |= IB_HOPLIMIT_MAX; > > > > p_pr->pkey = p_parms->pkey; > > ib_path_rec_set_sl(p_pr, p_parms->sl); > > > > > > "hop_flow_raw" is not really a 32bit value but rather 4 fields: > > Component Length(bits) Offset(bits) > > RawTraffic1 352 > > Reserved 3 353 > > FlowLabel 20 356 > > HopLimit 8 384 > > This patch doesn't look correct for me. Instead of placing > IB_HOPLIMIT_MAX in proper place this will put it in RawTraffic and > Reserved fields on little-endian machines (such as x86). And this > changes nothing on a big-endian machine. Right? Yea I think this is wrong. As I said perhaps the FW on the other side is interpreting the fields wrong. I will check with the vendor. > > Following your investigation it seems that SRP target breaks when > IB_HOPLIMIT_MAX is set in SA PR query responses for intra-subnet DGIDs. Yes > > > However, I don't understand the comment "Only set HopLimit if going through > > a > > router"? > > This is from '#ifdef ROUTER_EXP' days - as far as I could understand > HopLimit should be set to IB_HOPLIMIT_MAX for inter-subnet MADs. But I have not found where that is stated. > > > > > Was the intent to only set p_dgid in pr_rcv_get_end_points if we are heading > > through a router? > > Seems that this was an original logic. > > > I don't think it matters if we are going through a router > > or not does it? > > I thought so too, but as we can see it triggers the bug you discovered. > > So immediate fix for this is to move p_dgid setup back to router > section, like this: > > > diff --git a/opensm/opensm/osm_sa_path_record.c > b/opensm/opensm/osm_sa_path_record.c > index 7855be5..fbeef03 100644 > --- a/opensm/opensm/osm_sa_path_record.c > +++ b/opensm/opensm/osm_sa_path_record.c > @@ -1236,6 +1236,8 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t * > sa, > sa_status = IB_SA_MAD_STATUS_INVALID_GID; > goto Exit; > } > + if (p_dgid) > + *p_dgid = p_pr->dgid; > } else > dest_guid = p_pr->dgid.unicast.interface_id; > > @@ -1253,9 +1255,6 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t * > sa, > sa_status = IB_SA_MAD_STATUS_INVALID_GID; > goto Exit; > } > - > - if (p_dgid) > - *p_dgid = p_pr->dgid; > } else { > *pp_dest_port = 0; > if (comp_mask & IB_PR_COMPMASK_DLID) { > > > BTW, could you test this? Unfortunately I don't have SRP target :( Yes this works and was my original fix. However, it did not seem right and I think Hal agrees. > > And seems that as more solid solution we need to cleanup some logic in > osm_sa_path_record.c code. I think so. Ira > > Sasha -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 wei...@llnl.gov -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
> > However, I don't understand the comment "Only set HopLimit if going through > > a > > router"? > > This is from '#ifdef ROUTER_EXP' days - as far as I could understand > HopLimit should be set to IB_HOPLIMIT_MAX for inter-subnet MADs. If HopLimit is 0 then no GRH is required, if it is non 0 then the user of that path needs to apply a GRH. IIRC there are already a few things in the kernel that follow this? Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
On Tue, 15 Dec 2009 10:16:42 -0500 Hal Rosenstock wrote: > Ira, > > On Mon, Dec 14, 2009 at 7:43 PM, Ira Weiny wrote: > > Sasha, Hal, > > > > I have found that the following patch caused our SRP connected storage to > > break. > > What is causing the SRP target to fail ? Is it a non zero hop limit > returned in the SA PathRecord ? I am not sure exactly because I don't know what the SRP Target is seeing. (pesky firmware...) :-( > > > patch: 3d20f82edd3246879063b77721d0bcef927bdc48 > > > > opensm/osm_sa_path_record.c: separate router guid resolution code > > > > Move off subnet destination (router address) resolution code to separate > > function to improve readability. > > > > Signed-off-by: Sasha Khapyorsky > > > > I further traced the problem to pr_rcv_build_pr and the patch below fixes > > the > > bug. > > On quick glance, I'm missing the connection between Sasha's patch and > this routine setting something bad in the hop limit of the returned > path record. It sets p_dgid even when we are not using a router which results in is_nonzero_gid being true. > > > 16:03:34 > git diff > > diff --git a/opensm/opensm/osm_sa_path_record.c > > b/opensm/opensm/osm_sa_path_record.c > > index be0cd71..32c889f 100644 > > --- a/opensm/opensm/osm_sa_path_record.c > > +++ b/opensm/opensm/osm_sa_path_record.c > > @@ -800,7 +800,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const > > osm_port_t * p_src_port, > > > > /* Only set HopLimit if going through a router */ > > if (is_nonzero_gid) > > - p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); > > + p_pr->hop_flow_raw |= IB_HOPLIMIT_MAX; > > This may fix this issue but it doesn't look right to me and I think > would break router scenarios and corrupts other field(s). Path record > fields are in network (not host) order. But HopLimit is an 8bit field? I might be wrong on this but I was thinking this set RawTraffic and parts of Flow label to 1's. I should double check that. Also, I wonder if the FW on the other side is flipping this wrong? :-/ I have not been able to determine this and I still have to test with our other vendor's SRP target... > > > p_pr->pkey = p_parms->pkey; > > ib_path_rec_set_sl(p_pr, p_parms->sl); > > > > > > "hop_flow_raw" is not really a 32bit value but rather 4 fields: > > Component Length(bits) Offset(bits) > > RawTraffic 1 352 > > Reserved 3 353 > > FlowLabel 20 356 > > HopLimit 8 384 > > > > > > However, I don't understand the comment "Only set HopLimit if going through > > a > > router"? > > Well, in a sense the HopLimit is always set. It's already set to 0 > right above that code snippet in the patch where: > p_pr->hop_flow_raw &= cl_hton32(1 << 31); True, however, I believe the behavior before was that the entire field is 0 so the above statement really does nothing. > This code is only setting the HopLimit to 255 (as this is very > primitive so far) when going through a router. Maybe it could be > stated better. That is exactly what Sasha's patch changes. This is not only executed when going through a router but at all times DGID is specified in the CompMask. I am willing to accept the fact that perhaps p_dgid should not be set unless we find a router, however, I __don't__ think that is correct. Searching the spec I attempted to find where HopLimit was supposed to be 0xFF only when going through a router and I could not find it; therefore my email, I'm confused again... :-( > > > Was the intent to only set p_dgid in pr_rcv_get_end_points if we are heading > > through a router? I don't think it matters if we are going through a router > > or not does it? > > It's used when DGID is selected in the SA PR query via the comp mask > and in that instance is used for both the router and non router DGID > cases. So we agree that Sasha's patch is correct? > > > If not, I think the comment needs to be removed in the patch above. > > I'm not following why you say this. Well, if HopLimit must only be set to 0xff when going through a router and Sasha's patch is correct, then I think we need another way to determine if this path is going through a router. In that case the comment is correct but the logic is wrong. Ira > > -- Hal > > > Thanks, > > Ira > > > > -- > > Ira Weiny > > Math Programmer/Computer Scientist > > Lawrence Livermore National Lab > > 925-423-8008 > > wei...@llnl.gov > > -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 wei...@llnl.gov -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
On 10:16 Tue 15 Dec , Hal Rosenstock wrote: > > > patch: 3d20f82edd3246879063b77721d0bcef927bdc48 > > > > opensm/osm_sa_path_record.c: separate router guid resolution code > > > > Move off subnet destination (router address) resolution code to separate > > function to improve readability. > > > > Signed-off-by: Sasha Khapyorsky > > > > I further traced the problem to pr_rcv_build_pr and the patch below fixes > > the > > bug. > > On quick glance, I'm missing the connection between Sasha's patch and > this routine setting something bad in the hop limit of the returned > path record. In this patch I moved *p_dgid initialization out from router only section, as side effect it turns on this section: /* Only set HopLimit if going through a router */ if (is_nonzero_gid) p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); for all SA PR queries where DGID was selected (local subnet or not). Sasha -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
Hi Ira, On 16:43 Mon 14 Dec , Ira Weiny wrote: > > I have found that the following patch caused our SRP connected storage to > break. > > patch: 3d20f82edd3246879063b77721d0bcef927bdc48 > > opensm/osm_sa_path_record.c: separate router guid resolution code > > Move off subnet destination (router address) resolution code to separate > function to improve readability. > > Signed-off-by: Sasha Khapyorsky > > I further traced the problem to pr_rcv_build_pr and the patch below fixes the > bug. Nice catch and great investigation! > > 16:03:34 > git diff > diff --git a/opensm/opensm/osm_sa_path_record.c > b/opensm/opensm/osm_sa_path_record.c > index be0cd71..32c889f 100644 > --- a/opensm/opensm/osm_sa_path_record.c > +++ b/opensm/opensm/osm_sa_path_record.c > @@ -800,7 +800,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const > osm_port_t * p_src_port, > > /* Only set HopLimit if going through a router */ > if (is_nonzero_gid) > - p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); > + p_pr->hop_flow_raw |= IB_HOPLIMIT_MAX; > > p_pr->pkey = p_parms->pkey; > ib_path_rec_set_sl(p_pr, p_parms->sl); > > > "hop_flow_raw" is not really a 32bit value but rather 4 fields: > Component Length(bits) Offset(bits) > RawTraffic1 352 > Reserved 3 353 > FlowLabel 20 356 > HopLimit 8 384 This patch doesn't look correct for me. Instead of placing IB_HOPLIMIT_MAX in proper place this will put it in RawTraffic and Reserved fields on little-endian machines (such as x86). And this changes nothing on a big-endian machine. Right? Following your investigation it seems that SRP target breaks when IB_HOPLIMIT_MAX is set in SA PR query responses for intra-subnet DGIDs. > However, I don't understand the comment "Only set HopLimit if going through a > router"? This is from '#ifdef ROUTER_EXP' days - as far as I could understand HopLimit should be set to IB_HOPLIMIT_MAX for inter-subnet MADs. > > Was the intent to only set p_dgid in pr_rcv_get_end_points if we are heading > through a router? Seems that this was an original logic. > I don't think it matters if we are going through a router > or not does it? I thought so too, but as we can see it triggers the bug you discovered. So immediate fix for this is to move p_dgid setup back to router section, like this: diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index 7855be5..fbeef03 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -1236,6 +1236,8 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t * sa, sa_status = IB_SA_MAD_STATUS_INVALID_GID; goto Exit; } + if (p_dgid) + *p_dgid = p_pr->dgid; } else dest_guid = p_pr->dgid.unicast.interface_id; @@ -1253,9 +1255,6 @@ static ib_net16_t pr_rcv_get_end_points(IN osm_sa_t * sa, sa_status = IB_SA_MAD_STATUS_INVALID_GID; goto Exit; } - - if (p_dgid) - *p_dgid = p_pr->dgid; } else { *pp_dest_port = 0; if (comp_mask & IB_PR_COMPMASK_DLID) { BTW, could you test this? Unfortunately I don't have SRP target :( And seems that as more solid solution we need to cleanup some logic in osm_sa_path_record.c code. Sasha -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SRP issues with OpenSM 3.3.3
Ira, On Mon, Dec 14, 2009 at 7:43 PM, Ira Weiny wrote: > Sasha, Hal, > > I have found that the following patch caused our SRP connected storage to > break. What is causing the SRP target to fail ? Is it a non zero hop limit returned in the SA PathRecord ? > patch: 3d20f82edd3246879063b77721d0bcef927bdc48 > > opensm/osm_sa_path_record.c: separate router guid resolution code > > Move off subnet destination (router address) resolution code to separate > function to improve readability. > > Signed-off-by: Sasha Khapyorsky > > I further traced the problem to pr_rcv_build_pr and the patch below fixes the > bug. On quick glance, I'm missing the connection between Sasha's patch and this routine setting something bad in the hop limit of the returned path record. > 16:03:34 > git diff > diff --git a/opensm/opensm/osm_sa_path_record.c > b/opensm/opensm/osm_sa_path_record.c > index be0cd71..32c889f 100644 > --- a/opensm/opensm/osm_sa_path_record.c > +++ b/opensm/opensm/osm_sa_path_record.c > @@ -800,7 +800,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const > osm_port_t * p_src_port, > > /* Only set HopLimit if going through a router */ > if (is_nonzero_gid) > - p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); > + p_pr->hop_flow_raw |= IB_HOPLIMIT_MAX; This may fix this issue but it doesn't look right to me and I think would break router scenarios and corrupts other field(s). Path record fields are in network (not host) order. > p_pr->pkey = p_parms->pkey; > ib_path_rec_set_sl(p_pr, p_parms->sl); > > > "hop_flow_raw" is not really a 32bit value but rather 4 fields: > Component Length(bits) Offset(bits) > RawTraffic 1 352 > Reserved 3 353 > FlowLabel 20 356 > HopLimit 8 384 > > > However, I don't understand the comment "Only set HopLimit if going through a > router"? Well, in a sense the HopLimit is always set. It's already set to 0 right above that code snippet in the patch where: p_pr->hop_flow_raw &= cl_hton32(1 << 31); This code is only setting the HopLimit to 255 (as this is very primitive so far) when going through a router. Maybe it could be stated better. > Was the intent to only set p_dgid in pr_rcv_get_end_points if we are heading > through a router? I don't think it matters if we are going through a router > or not does it? It's used when DGID is selected in the SA PR query via the comp mask and in that instance is used for both the router and non router DGID cases. > If not, I think the comment needs to be removed in the patch above. I'm not following why you say this. -- Hal > Thanks, > Ira > > -- > Ira Weiny > Math Programmer/Computer Scientist > Lawrence Livermore National Lab > 925-423-8008 > wei...@llnl.gov > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
SRP issues with OpenSM 3.3.3
Sasha, Hal, I have found that the following patch caused our SRP connected storage to break. patch: 3d20f82edd3246879063b77721d0bcef927bdc48 opensm/osm_sa_path_record.c: separate router guid resolution code Move off subnet destination (router address) resolution code to separate function to improve readability. Signed-off-by: Sasha Khapyorsky I further traced the problem to pr_rcv_build_pr and the patch below fixes the bug. 16:03:34 > git diff diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index be0cd71..32c889f 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -800,7 +800,7 @@ static void pr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * p_src_port, /* Only set HopLimit if going through a router */ if (is_nonzero_gid) - p_pr->hop_flow_raw |= cl_hton32(IB_HOPLIMIT_MAX); + p_pr->hop_flow_raw |= IB_HOPLIMIT_MAX; p_pr->pkey = p_parms->pkey; ib_path_rec_set_sl(p_pr, p_parms->sl); "hop_flow_raw" is not really a 32bit value but rather 4 fields: Component Length(bits) Offset(bits) RawTraffic1 352 Reserved 3 353 FlowLabel 20 356 HopLimit 8 384 However, I don't understand the comment "Only set HopLimit if going through a router"? Was the intent to only set p_dgid in pr_rcv_get_end_points if we are heading through a router? I don't think it matters if we are going through a router or not does it? If not, I think the comment needs to be removed in the patch above. Thanks, Ira -- Ira Weiny Math Programmer/Computer Scientist Lawrence Livermore National Lab 925-423-8008 wei...@llnl.gov -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html