Ted H. Kim wrote:
> There is an extensive revision of the dladm
> support for IPoIB coming where Brussels
> support and a change in the administrative
> model will be dealt with. But that may be
> a ways off (est. 2010.Q2?) and in the
> meantime, people are screaming for
> the performance that Connected Mode gives,
> so we don't want to wait for that.

So, let's make the .conf setting Volatile, since we expect to change it 
in less than a year to a Brussels setting.  This will allow people to 
use it, but with an admonition not to get too fond of the driver.conf 
setting.

Personally, I think Brussels support is so easy to implement that I'm 
not sure I understand why this can't be done almost immediately.

    - Garrett

>
> -ted
>
> Garrett D'Amore wrote:
>> I feel very strongly that I'd prefer to avoid the use of a 
>> driver.conf for this, and instead handle it as a Brussels property, 
>> at least on Solaris Nevada.  (This will support administration via 
>> dladm, and ultimately also ndd, though we don't like to say that. ;-)
>>
>> If you need to use a driver.conf for Solaris 10, that's OK I suppose 
>> (although an ndd tunable would be better there too, since it doesn't 
>> require the driver to be unloaded and reloaded to change the setting 
>> -- which can be very challenging for administrators to figure out.)
>>
>> I feel TCR-strong on this -- if it were a full case I'd insist that 
>> this be part of the spec before I'd vote to approve.
>>
>> Is the project team amenable to making this change, or do they have 
>> some other reason why driver.conf values need to be used instead?
>>
>> Also, I'd like the mtu to be set via Brussels as well, if it isn't 
>> already handled that way.
>>
>>    - Garrett
>>
>> Ted Kim wrote:
>>> Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
>>> This information is Copyright 2009 Sun Microsystems
>>> 1. Introduction
>>>     1.1. Project/Component Working Name:
>>>      IPoIB Connected Mode
>>>     1.2. Name of Document Author/Supplier:
>>>      Author:  Kevin Ge
>>>     1.3  Date of This Document:
>>>     30 October, 2009
>>> 4. Technical Description
>>>
>>> A. Overview
>>> -----------
>>>
>>>    This case proposes changes to the Solaris kernel to provide support
>>>    for "Connected Mode" in the IPoIB driver ibd(7D) (described in [1]
>>>    and [2]).
>>>
>>>    The Infiniband Architecture [3] defines multiple "transport service
>>>    types", including Unreliable Datagram (UD), Reliable Connected (RC)
>>>    and Unreliable Connected (UC). Current ibd (based on [4]) runs in
>>>    "Datagram Mode" over the UD transport service type. Connected Mode
>>>    (described in [5]) can use UC, RC, or both.
>>>
>>>    This IPoIB-CM project uses RC, because of the desire to
>>>    inter-operate with Linux which also uses RC. The main advantage of
>>>    Connected Mode is better performance (higher throughput and lower
>>>    CPU utilization) based on using very large MTUs (see below for more
>>>    discussion). Connected Mode, though, can have the disadvantage of
>>>    consuming more resources, especially when scaling up to a large
>>>    cluster (due to using an InfiniBand connection to each destination).
>>>
>>>    Note that this case covers only the changes necessary to support
>>>    the IPoIB driver running in Connected Mode over RC. Other
>>>    enhancements are outside the scope of this case.
>>>
>>>    A micro/patch binding is asserted for this proposal.
>>>
>>> B. Connected Mode IPoIB driver
>>> ------------------------------
>>>
>>>    The revised ibd(7D) driver will support both Connected and Datagram
>>>    mode. The features from the current Datagram mode ibd driver will
>>>    be inherited. The remainder of this section discusses interface
>>>    additions for the Connected mode capable driver.
>>>
>>>
>>> B.1 Switching between datagram and connected mode
>>>
>>>    The existing ibd driver in OpenSolaris and Solaris 10 does not
>>>    ship with a driver .conf file. However, the Connected Mode support
>>>    described in this case introduces a new parameter 'enable_rc' that
>>>    may be set via the ibd driver .conf file.
>>>
>>>    This parameter specifies whether each ibd instance defaults to
>>>    using Connected Mode over RC or not.
>>>
>>>        # 1: unicast packets will be sent over Reliable Connected Mode
>>>        # 0: unicast packets will be sent over Unreliable Datagram Mode
>>>        #
>>>        # Each element in the list below maps to the corresponding ibd
>>>        # instance; the first element is for ibd instance 0, the second
>>>        # element is for instance 1 and so on.
>>>        #
>>>        enable_rc=1,1,0,0;
>>>
>>>    Please note that Connected Mode support in IPoIB is optional as per
>>>    [5]. Therefore, if Connected Mode is not available for a remote
>>>    node, the Datagram mode will automatically be used for that
>>>    destination by the ibd driver. As a result, the only meaning of
>>>    'enable_rc' is to decide whether to try Connected Mode first or
>>>    not, and whether to advertise this as a capability supported by
>>>    this instance or not.
>>>
>>>    The default value for 'enable_rc' for each instance is 0. Hence
>>>    without an ibd.conf file, Datagram mode will be used. We intend to
>>>    ship a driver .conf file for ibd in ONNV (and hence OpenSolaris)
>>>    with enable_rc set to all ones (enabling Connected Mode by
>>>    default on all instances) for the best performance.
>>>
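
To illustrate the per-instance list and the fallback behaviour described in the quoted section B.1, here is a minimal Python sketch. It is not the driver's code: the function names are hypothetical, and treating instances beyond the end of the list as 0 is an assumption based on the stated per-instance default.

```python
def parse_enable_rc(conf_text):
    """Return the per-instance enable_rc flags found in ibd.conf-style text."""
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line.startswith("enable_rc"):
            _, _, value = line.partition("=")
            return [int(v) for v in value.rstrip(";").split(",") if v.strip()]
    return []  # no setting present: every instance keeps the default


def rc_enabled(flags, instance):
    """Flag for one ibd instance (assumption: missing entries mean 0)."""
    return instance < len(flags) and flags[instance] == 1


def unicast_mode(flags, instance, peer_supports_cm):
    """Try Connected Mode only if enabled locally and offered by the peer."""
    if rc_enabled(flags, instance) and peer_supports_cm:
        return "connected"
    return "datagram"  # fallback, and the default when enable_rc is 0
```

With `enable_rc=1,1,0,0;`, instances 0 and 1 would try Connected Mode first and drop back to Datagram Mode for any destination that does not offer it, while instances 2 and 3 would always use Datagram Mode.
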
>>>    However, for Solaris 10, we have received business guidance to have
>>>    an "opt-in" approach due to a desire for greater stability in
>>>    established enterprise environments. We will do this by not
>>>    shipping the .conf file. Therefore, by default Solaris 10 will use
>>>    Datagram mode. It will take an explicit administrator action
>>>    (setting enable_rc) to cause Solaris 10 to use Connected Mode.
>>>    OFED (Linux IB) originally made Connected Mode opt-in too. However,
>>>    later OFED made it the default. We don't intend to change it later
>>>    to be the default in Solaris 10. However, Solaris Next, being
>>>    descended from ONNV, will have it as default.
>>>
>>>    An edited ibd(7D) manpage documenting this change is in the
>>>    materials directory.
>>>
>>> B.2 Change of default MTU size
>>>
>>>    Connected Mode by virtue of using the RC transport service type
>>>    offers link MTUs of up to 2^31-4 octets in length. Thus, the use of
>>>    Connected Mode can offer benefits by supporting very large MTUs.
>>>    Datagram Mode using UD is limited to 4092 (4K-4) octets, though
>>>    commonly only 2044 (2K-4) is offered.
>>>
>>>    Due to the limits of the TCP/IP protocol, it makes sense to only
>>>    offer up to 65535 (64K-1) bytes. OFED (i.e. Linux IB) uses 65520
>>>    (64K-16) byte MTU for alignment reasons. To inter-operate with
>>>    OFED at the best performance, we also adopt 65520 as the default
>>>    MTU of the Connected Mode.
>>>
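
The MTU figures quoted in section B.2 can be checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and reading the "minus 4" as the 4-octet IPoIB encapsulation header of RFC 4391 is an assumption about where that offset comes from.

```python
IPOIB_HEADER = 4  # octets; assumed to be the RFC 4391 encapsulation header


def ud_link_mtu(ib_mtu):
    """Link MTU left for IP when running over UD with a given IB MTU."""
    return ib_mtu - IPOIB_HEADER


RC_MAX_MTU = 2**31 - IPOIB_HEADER  # upper bound quoted for Connected Mode
IP_MAX_LEN = 2**16 - 1             # 65535, largest length IP can describe
CM_DEFAULT_MTU = 2**16 - 16        # 65520, the OFED-compatible default

print(ud_link_mtu(4096), ud_link_mtu(2048))  # Datagram Mode limits
print(CM_DEFAULT_MTU, CM_DEFAULT_MTU % 16)   # default MTU and its alignment
```

The chosen default fits under the IP limit while staying a multiple of 16, which is consistent with the alignment reason given for the OFED value.
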
>>>
>>> C. Interfaces
>>> -------------
>>> +-------------------------------------------------------------------+
>>> |                     Interfaces Exported                           |
>>> +---------------------------+------------------+--------------------+
>>> |    Interface Name         |  Classification  |      Comment       |
>>> +---------------------------+------------------+--------------------+
>>> |/kernel/drv/ibd.conf*      |   Uncommitted    | Configuration file |
>>> +---------------------------+------------------+--------------------+
>>>  * = only for OpenSolaris
>>>
>>>
>>> D. References
>>> -------------
>>>
>>>    [1] IP over InfiniBand, PSARC/2001/289
>>>
>>>    [2] IPoIB Conversion to GLDv3, PSARC/2007/636
>>>
>>>    [3] InfiniBand Architecture Specification Volume 1, Release 1.2.1,
>>>        InfiniBand Trade Association, 2007.
>>>        http://www.infinibandta.org/content/pages.php?pg=technology_download
>>>
>>>    [4] Transmission of IP over InfiniBand (IPoIB), RFC 4391, IETF,
>>>        http://www.ietf.org/rfc/rfc4391.txt
>>>
>>>    [5] IP over InfiniBand: Connected Mode, RFC 4755, IETF,
>>>        http://www.ietf.org/rfc/rfc4755.txt
>>>
>>> 6. Resources and Schedule
>>>     6.4. Steering Committee requested information
>>>        6.4.1. Consolidation C-team Name:
>>>         ON
>>>     6.5. ARC review type: FastTrack
>>>     6.6. ARC Exposure: open
>>>
>>>   
>>
>
