Ted H. Kim wrote:
> There is an extensive revision of the dladm
> support for IPoIB coming where Brussels
> support and a change in the administrative
> model will be dealt with. But that may be
> a ways off (est. 2010.Q2?) and in the
> meantime, people are screaming for
> the performance that Connected Mode gives,
> so we don't want to wait for that.

So, let's make the .conf setting Volatile, since we expect to change it
in less than a year to a Brussels setting. This will allow people to use
it, but with an admonition not to get too fond of the driver.conf
setting.

Personally, I think Brussels support is so easy to implement that I'm
not sure I understand why this can't be done almost immediately.

    - Garrett

>
> -ted
>
> Garrett D'Amore wrote:
>> I feel very strongly that I'd prefer to avoid the use of a
>> driver.conf for this, and instead handle it as a Brussels property,
>> at least on Solaris Nevada. (This will support administration via
>> dladm, and ultimately also ndd, though we don't like to say that. ;-)
>>
>> If you need to use a driver.conf for Solaris 10, that's OK I suppose
>> (although an ndd tunable would be better there too, since it doesn't
>> require the driver to be unloaded and reloaded to change the setting
>> -- which can be very challenging for administrators to figure out.)
>>
>> I feel TCR-strong on this -- if it were a full case I'd insist that
>> this be part of the spec before I'd vote to approve.
>>
>> Is the project team amenable to making this change, or do they have
>> some other reason why driver.conf values need to be used instead?
>>
>> Also, I'd like the mtu to be set via Brussels as well, if it isn't
>> already handled that way.
>>
>> - Garrett
>>
>> Ted Kim wrote:
>>> Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
>>> This information is Copyright 2009 Sun Microsystems
>>> 1. Introduction
>>>     1.1. Project/Component Working Name:
>>>          IPoIB Connected Mode
>>>     1.2. Name of Document Author/Supplier:
>>>          Author: Kevin Ge
>>>     1.3  Date of This Document:
>>>          30 October, 2009
>>> 4. Technical Description
>>>
>>> A. Overview
>>> -----------
>>>
>>> This case proposes changes to the Solaris kernel to provide support
>>> for "Connected Mode" in the IPoIB driver ibd(7D) (described in [1]
>>> and [2]).
>>>
>>> The InfiniBand Architecture [3] defines multiple "transport service
>>> types", including Unreliable Datagram (UD), Reliable Connected (RC)
>>> and Unreliable Connected (UC). The current ibd (based on [4]) runs
>>> in "Datagram Mode" over the UD transport service type. Connected
>>> Mode (described in [5]) can use UC and/or RC.
>>>
>>> This IPoIB-CM project uses RC, because of the desire to
>>> inter-operate with Linux which also uses RC. The main advantage of
>>> Connected Mode is better performance (higher throughput and lower
>>> CPU utilization) based on using very large MTUs (see below for more
>>> discussion). Connected Mode, though, can have the disadvantage of
>>> consuming more resources, especially when scaling up to a large
>>> cluster (due to using an InfiniBand connection to each destination).
>>>
>>> Note that this case covers only the changes necessary to support
>>> the IPoIB driver running in Connected Mode over RC. Other
>>> enhancements are outside the scope of this case.
>>>
>>> A micro/patch binding is asserted for this proposal.
>>>
>>> B. Connected Mode IPoIB driver
>>> ------------------------------
>>>
>>> The revised ibd(7D) driver will support both Connected and Datagram
>>> mode. The features from the current Datagram mode ibd driver will
>>> be inherited. The remainder of this section discusses interface
>>> additions for the Connected mode capable driver.
>>>
>>>
>>> B.1 Switching between datagram and connected mode
>>>
>>> The existing ibd driver in OpenSolaris and Solaris 10 does not
>>> ship with a driver .conf file. However, the Connected Mode support
>>> described in this case introduces a new parameter 'enable_rc' that
>>> may be set via the ibd driver .conf file.
>>>
>>> This parameter specifies whether each ibd instance defaults to
>>> using Connected Mode over RC or not.
>>>
>>>   # 1: unicast packets will be sent over Reliable Connected Mode
>>>   # 0: unicast packets will be sent over Unreliable Datagram Mode
>>>   #
>>>   # Each element in the list below maps to the corresponding ibd
>>>   # instance; the first element is for ibd instance 0, the second
>>>   # element is for instance 1 and so on.
>>>   #
>>>   enable_rc=1,1,0,0;
>>>
>>> Please note that Connected Mode support in IPoIB is optional as per
>>> [5]. Therefore, if Connected Mode is not available for a remote
>>> node, the ibd driver will automatically use Datagram mode for that
>>> destination. Thus, the only effect of 'enable_rc' is to decide
>>> whether to try Connected Mode first, and whether to advertise it
>>> as a capability supported by this instance.
>>>
>>> The default value for 'enable_rc' for each instance is 0. Hence
>>> without an ibd.conf file, Datagram mode will be used. We intend to
>>> ship a driver .conf file for ibd in ONNV (and hence OpenSolaris)
>>> with enable_rc set to all ones (enabling Connected Mode by
>>> default on all instances) for the best performance.
>>>
>>> However, for Solaris 10, we have received business guidance to have
>>> an "opt-in" approach due to a desire for greater stability in
>>> established enterprise environments. We will do this by not
>>> shipping the .conf file. Therefore, by default Solaris 10 will use
>>> Datagram mode. It will take an explicit administrator action
>>> (setting enable_rc) to cause Solaris 10 to use Connected Mode.
>>> OFED (Linux IB) originally made Connected Mode opt-in too.
>>> However, later OFED made it the default. We don't intend to change
>>> it later to be the default in Solaris 10. However, Solaris Next,
>>> being descended from ONNV, will have it as default.
>>>
>>> An edited ibd(7D) manpage documenting this change is in the
>>> materials directory.
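
[Illustration, not part of the case materials.] The per-instance
semantics of 'enable_rc' described in B.1 can be sketched in a few lines
of Python. This is only a model of the text above, not the ibd driver's
code: the function names are hypothetical, and treating an instance
beyond the end of the list as the default 0 is an assumption drawn from
"the default value for 'enable_rc' for each instance is 0".

```python
def parse_enable_rc(conf_text):
    """Return the enable_rc list found in ibd.conf-style text, or []."""
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and blanks
        if line.startswith("enable_rc"):
            _, _, value = line.partition("=")
            return [int(v) for v in value.rstrip(";").split(",") if v.strip()]
    return []                                    # no setting: all defaults


def rc_enabled(values, instance):
    """Assumption: instances not covered by the list use the default 0."""
    return instance < len(values) and values[instance] == 1


def transport_for(values, instance, peer_supports_cm):
    """Model of B.1: try RC only if enabled here AND offered by the peer;
    otherwise fall back to Datagram mode (UD) for that destination."""
    if rc_enabled(values, instance) and peer_supports_cm:
        return "RC"
    return "UD"
```

With the sample setting enable_rc=1,1,0,0; instance 0 would use RC toward
a peer that supports Connected Mode and UD toward one that does not,
while instance 2 would always use UD.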
>>>
>>> B.2 Change of default MTU size
>>>
>>> Connected Mode, by virtue of using the RC transport service type,
>>> offers link MTUs of up to 2^31-4 octets in length. Thus, the use of
>>> Connected Mode can offer benefits by supporting very large MTUs.
>>> Datagram Mode using UD is limited to 4092 (4K-4) octets, though
>>> commonly only 2044 (2K-4) is offered.
>>>
>>> Due to the limits of the TCP/IP protocol, it makes sense to only
>>> offer up to 65535 (64K-1) bytes. OFED (i.e. Linux IB) uses a 65520
>>> (64K-16) byte MTU for alignment reasons. To inter-operate with
>>> OFED at the best performance, we also adopt 65520 as the default
>>> MTU for Connected Mode.
>>>
>>>
>>> C. Interfaces
>>> -------------
>>> +-------------------------------------------------------------------+
>>> |                        Interfaces Exported                        |
>>> +---------------------------+------------------+--------------------+
>>> | Interface Name            | Classification   | Comment            |
>>> +---------------------------+------------------+--------------------+
>>> | /kernel/drv/ibd.conf*     | Uncommitted      | Configuration file |
>>> +---------------------------+------------------+--------------------+
>>> * = only for OpenSolaris
>>>
>>>
>>> D. References
>>> -------------
>>>
>>> [1] IP over InfiniBand, PSARC/2001/289
>>>
>>> [2] IPoIB Conversion to GLDv3, PSARC/2007/636
>>>
>>> [3] InfiniBand Architecture Specification Volume 1, Release 1.2.1,
>>>     InfiniBand Trade Association, 2007.
>>>     http://www.infinibandta.org/content/pages.php?pg=technology_download
>>>
>>> [4] Transmission of IP over InfiniBand (IPoIB), RFC 4391, IETF,
>>>     http://www.ietf.org/rfc/rfc4391.txt
>>>
>>> [5] IP over InfiniBand: Connected Mode, RFC 4755, IETF,
>>>     http://www.ietf.org/rfc/rfc4755.txt
>>>
>>> 6. Resources and Schedule
>>>     6.4. Steering Committee requested information
>>>         6.4.1. Consolidation C-team Name:
>>>                ON
>>>     6.5. ARC review type: FastTrack
>>>     6.6. ARC Exposure: open
>>>
>>>
>>
>
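
[Illustration, not part of the case materials.] The MTU figures quoted in
section B.2 of the case are simple power-of-two arithmetic. A quick
Python check, using only the values stated in the case (the dictionary
keys are hypothetical labels chosen for this sketch):

```python
KIB = 1024

# Each entry mirrors a figure quoted in section B.2 of the case.
MTUS = {
    "ud_max": 4 * KIB - 4,        # Datagram Mode upper limit (4K-4)
    "ud_common": 2 * KIB - 4,     # commonly offered Datagram MTU (2K-4)
    "rc_max": 2**31 - 4,          # Connected Mode link MTU ceiling
    "ip_limit": 64 * KIB - 1,     # largest size worth offering to TCP/IP
    "cm_default": 64 * KIB - 16,  # OFED-compatible Connected Mode default
}

for name, value in MTUS.items():
    print(f"{name:>10}: {value}")
```

The Connected Mode default of 64K-16 stays under the 64K-1 ceiling and
is an exact multiple of 16, which is consistent with the alignment
rationale given for the OFED value.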