On Thu, 2006-09-14 at 07:14, Michael S. Tsirkin wrote:
> Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > Subject: Re: [PATCH] IB/ipoib: use appropriate path selector
> >
> > On Thu, 2006-09-14 at 00:46, Michael S. Tsirkin wrote:
> > > Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > Subject: Re: [PATCH] IB/ipoib: use appropriate path selector
> > > >
> > > > On Wed, 2006-09-13 at 18:08, Michael S. Tsirkin wrote:
> > > > > Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> > > > > > Subject: Re: [PATCH] IB/ipoib: use appropriate path selector
> > > > > >
> > > > > > Michael> IPoIB in linux needs 2K MTU. Therefore it must set mtu
> > > > > > Michael> selector in path record query accordingly.
> > > > > >
> > > > > > Umm -- why does it need a 2K MTU? As far as I know it should work
> > > > > > fine with any MTU, assuming the SA sets the MTU of the broadcast
> > > > > > multicast group correctly.
> > > > >
> > > > > Hmm, you are right, it is just that existing implementations all
> > > > > set that to 2K.
> > > >
> > > > By default yes. It can be configured.
> > > >
> > > > > But there is a silent assumption that MTU of any path is >= broadcast
> > > > > multicast group MTU, and this is what I want to fix.
> > > >
> > > > The spec says:
> > > > "The value (for IB MTU) assigned to the broadcast-GID must not be
> > > > greater than any physical link MTU spanned by the IPoIB subnet".
> > > > so if the broadcast group is improperly setup not to follow this, there
> > > > will be other issues.
> > >
> > > Correct. IPoIB uses broadcast group MTU to get the value reported to
> > > Linux. If some link has a lower MTU IPoIB can not use it.
> > >
> > > > It doesn't need to be included in the PR request.
> > >
> > > I disagree here. If you do not set selector, SA is free to return
> > > a path with lower MTU even though physical link allows higher MTU.
> > > Does it say otherwise somewhere?
> >
> > No but isn't this relying on using PRs in a certain way by IPoIB
> > implementations (and any other UD application) v. connected apps ?
>
> Not really.
>
> Tavor is faster with 1K MTU than with 2K MTU - it does not matter connected or
> not. So, for me, it makes sense for SM to choose 1K if Tavor is involved,
> unless application requested otherwise.
>
> If an application (again, no matter connected or UD) needs a specific MTU it
> should use mtu selector in path query. If it does not, SM is free to choose any
> MTU supported by link, for best performance. If one end is Tavor, this happens
> to be 1K and not the maximum MTU.
>
> So what we have here is IPoIB bug - it requires that path mtu >= bcast group
> mtu, but does not pass this information in query. This only happens to work
> if SM always selects max link MTU for each path query.
>
> Makes sense?

Understood. As I said in a previous email, if it happens that the path MTU <
broadcast group MTU, I think there would be join issues for some nodes out
there.

-- Hal
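
For readers following the thread, the condition being debated can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the patch under discussion: the MTU and selector encodings follow the InfiniBand PathRecord definitions (MTU codes 1 to 5 for 256 to 4096 bytes, selector 0 to 3), while `mtu_bytes`, `sa_may_return`, and `path_usable` are hypothetical helper names that do not come from IPoIB or any SA implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* IB MTU encodings: 1 = 256 bytes, 2 = 512, 3 = 1024, 4 = 2048, 5 = 4096. */
enum mtu_code { MTU_256 = 1, MTU_512, MTU_1024, MTU_2048, MTU_4096 };

/* PathRecord MTU selector encodings. */
enum mtu_selector { SEL_GT = 0, SEL_LT = 1, SEL_EQ = 2, SEL_BEST = 3 };

/* Convert an MTU code to bytes: 256 << (code - 1). */
static int mtu_bytes(enum mtu_code code)
{
    assert(code >= MTU_256 && code <= MTU_4096);
    return 256 << (code - 1);
}

/*
 * Hypothetical helper: may an SA answer a query (sel, requested) with a
 * path whose MTU is `offered`? Note that "greater than" is strict.
 */
static bool sa_may_return(enum mtu_selector sel, enum mtu_code requested,
                          enum mtu_code offered)
{
    switch (sel) {
    case SEL_GT:   return offered > requested;
    case SEL_LT:   return offered < requested;
    case SEL_EQ:   return offered == requested;
    case SEL_BEST: return true; /* simplification: "largest available" is not modelled */
    }
    return false;
}

/* The condition IPoIB relies on: path MTU >= broadcast group MTU. */
static bool path_usable(enum mtu_code path, enum mtu_code bcast)
{
    return mtu_bytes(path) >= mtu_bytes(bcast);
}
```

Under this model, with a 2K broadcast group, `path_usable(MTU_1024, MTU_2048)` is false, which is the case Michael describes when the SM prefers 1K for Tavor. A query sent with `SEL_GT` and `MTU_1024` rules that answer out, because the selector is strict and only 2K or larger satisfies it.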
