okay, volatile it is.

-ted


Garrett D'Amore wrote:
> Ted H. Kim wrote:
>> There is an extensive revision of the dladm
>> support for IPoIB coming where Brussels
>> support and a change in the administrative
>> model will be dealt with. But that may be
>> a ways off (est. 2010.Q2?) and in the
>> meantime, people are screaming for
>> the performance that Connected Mode gives,
>> so we don't want to wait for that.
> 
> So, let's make the .conf setting Volatile, since we expect to change it 
> in less than a year to a Brussels setting.  This will allow people to 
> use it, but with an admonition not to get too fond of the driver.conf 
> setting.
> 
> Personally, I think Brussels support is so easy to implement that I'm 
> not sure I understand why this can't be done almost immediately.
> 
>    - Garrett
> 
>>
>> -ted
>>
>> Garrett D'Amore wrote:
>>> I feel very strongly that I'd prefer to avoid the use of a 
>>> driver.conf for this, and instead handle it as a Brussels property, 
>>> at least on Solaris Nevada.  (This will support administration via 
>>> dladm, and ultimately also ndd, though we don't like to say that. ;-)
>>>
>>> If you need to use a driver.conf for Solaris 10, that's OK I suppose 
>>> (although an ndd tunable would be better there too, since it doesn't 
>>> require the driver to be unloaded and reloaded to change the setting 
>>> -- which can be very challenging for administrators to figure out.)
>>>
>>> I feel TCR-strong on this -- if it were a full case I'd insist that 
>>> this be part of the spec before I'd vote to approve.
>>>
>>> Is the project team amenable to making this change, or do they have 
>>> some other reason why driver.conf values need to be used instead?
>>>
>>> Also, I'd like the mtu to be set via Brussels as well, if it isn't 
>>> already handled that way.
>>>
>>>    - Garrett
>>>
>>> Ted Kim wrote:
>>>> Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
>>>> This information is Copyright 2009 Sun Microsystems
>>>> 1. Introduction
>>>>     1.1. Project/Component Working Name:
>>>>      IPoIB Connected Mode
>>>>     1.2. Name of Document Author/Supplier:
>>>>      Author:  Kevin Ge
>>>>     1.3  Date of This Document:
>>>>     30 October, 2009
>>>> 4. Technical Description
>>>>
>>>> A. Overview
>>>> -----------
>>>>
>>>>    This case proposes changes to the Solaris kernel to provide support
>>>>    for "Connected Mode" in the IPoIB driver ibd(7D) (described in [1]
>>>>    and [2]).
>>>>
>>>>    The Infiniband Architecture [3] defines multiple "transport service
>>>>    types", including Unreliable Datagram (UD), Reliable Connected (RC)
>>>>    and Unreliable Connected (UC). Current ibd (based on [4]) runs in
>>>>    "Datagram Mode" over the UD transport service type. Connected Mode
>>>>    (described in [5]) can use UC and/or RC.
>>>>
>>>>    This IPoIB-CM project uses RC, because of the desire to
>>>>    inter-operate with Linux which also uses RC. The main advantage of
>>>>    Connected Mode is better performance (higher throughput and lower
>>>>    CPU utilization) based on using very large MTUs (see below for more
>>>>    discussion). Connected Mode, though, can have the disadvantage of
>>>>    consuming more resources, especially when scaling up to a large
>>>>    cluster (due to using an InfiniBand connection to each destination).
>>>>
>>>>    Note that this case covers only the changes necessary to support
>>>>    the IPoIB driver running in Connected Mode over RC. Other
>>>>    enhancements are outside the scope of this case.
>>>>
>>>>    A micro/patch binding is asserted for this proposal.
>>>>
>>>> B. Connected Mode IPoIB driver
>>>> ------------------------------
>>>>
>>>>    The revised ibd(7D) driver will support both Connected and Datagram
>>>>    mode. The features from the current Datagram mode ibd driver will
>>>>    be inherited. The remainder of this section discusses interface
>>>>    additions for the Connected mode capable driver.
>>>>
>>>>
>>>> B.1 Switching between datagram and connected mode
>>>>
>>>>    The existing ibd driver in OpenSolaris and Solaris 10 does not
>>>>    ship with a driver .conf file. However, the Connected Mode support
>>>>    described in this case introduces a new parameter 'enable_rc' that
>>>>    may be set via the ibd driver .conf file.
>>>>
>>>>    This parameter specifies whether each ibd instance defaults to
>>>>    using Connected Mode over RC or not.
>>>>
>>>>        # 1: unicast packets will be sent over Reliable Connected Mode
>>>>        # 0: unicast packets will be sent over Unreliable Datagram Mode
>>>>        #
>>>>        # Each element in the list below maps to the corresponding ibd
>>>>        # instance; the first element is for ibd instance 0, the second
>>>>        # element is for instance 1 and so on.
>>>>        #
>>>>        enable_rc=1,1,0,0;
>>>>
>>>>    Please note that Connected Mode support in IPoIB is optional as per
>>>>    [5]. If Connected Mode is not available for a remote node, the ibd
>>>>    driver automatically uses Datagram mode for that destination.
>>>>    Therefore, the only effect of 'enable_rc' is to decide whether to
>>>>    try Connected Mode first, and whether to advertise it as a
>>>>    capability supported by this instance.
>>>>
>>>>    The default value for 'enable_rc' for each instance is 0. Hence
>>>>    without an ibd.conf file, Datagram mode will be used. We intend to
>>>>    ship a driver .conf file for ibd in ONNV (and hence OpenSolaris)
>>>>    with enable_rc set to all ones (enabling Connected Mode by
>>>>    default on all instances) for the best performance.
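
For readers who want to see how the per-instance list reads, here is a minimal sketch in Python of the mapping described above: element N of 'enable_rc' applies to ibd instance N. The function names are hypothetical and this is not the driver's own parsing code. It also assumes that an instance beyond the end of the list falls back to the stated default of 0, which the case does not spell out.

```python
def parse_enable_rc(conf_line):
    """Parse a line such as 'enable_rc=1,1,0,0;' into a list of ints."""
    name, _, value = conf_line.strip().rstrip(";").partition("=")
    if name.strip() != "enable_rc":
        raise ValueError("not an enable_rc setting: %r" % conf_line)
    return [int(v) for v in value.split(",")]


def mode_for_instance(flags, instance):
    """Return the mode an ibd instance tries first for unicast traffic."""
    # Assumption: instances past the end of the list use the default of 0.
    enabled = flags[instance] if instance < len(flags) else 0
    return "connected (RC)" if enabled == 1 else "datagram (UD)"


flags = parse_enable_rc("enable_rc=1,1,0,0;")
for inst in range(5):
    print("ibd%d: %s" % (inst, mode_for_instance(flags, inst)))
```

With the sample line from the case, instances 0 and 1 try Connected Mode first and the rest stay in Datagram mode. The per-destination fallback to Datagram mode for peers without Connected Mode support is not modeled here.
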
>>>>
>>>>    However, for Solaris 10, we have received business guidance to have
>>>>    an "opt-in" approach due to a desire for greater stability in
>>>>    established enterprise environments. We will do this by not
>>>>    shipping the .conf file. Therefore, by default Solaris 10 will be
>>>>    Datagram mode. It will take an explicit administrator action
>>>>    (setting enable_rc) to cause Solaris 10 to use Connected Mode.
>>>>    OFED (Linux IB) originally made Connected Mode opt-in too. However,
>>>>    later OFED made it the default. We don't intend to change it later
>>>>    to be the default in Solaris 10. However, Solaris Next, being
>>>>    descended from ONNV, will have it as default.
>>>>
>>>>    An edited ibd(7D) manpage documenting this change is in the
>>>>    materials directory.
>>>>
>>>> B.2 Change of default MTU size
>>>>
>>>>    Connected Mode by virtue of using the RC transport service type
>>>>    offers link MTUs of up to 2^31-4 octets in length. Thus, the use of
>>>>    Connected Mode can offer benefits by supporting very large MTUs.
>>>>    Datagram Mode using UD is limited to 4092 (4K-4) octets, though
>>>>    commonly only 2044 (2K-4) is offered.
>>>>
>>>>    Due to the limits of the TCP/IP protocol, it makes sense to only
>>>>    offer up to 65535 (64K-1) bytes. OFED (i.e. Linux IB) uses 65520
>>>>    (64K-16) byte MTU for alignment reasons. To inter-operate with
>>>>    OFED at the best performance, we also adopt 65520 as the default
>>>>    MTU for Connected Mode.
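
The MTU figures quoted in B.2 can be checked with a few lines of arithmetic. This is only a sketch of the numbers in the text; the variable names are made up for illustration. The case says only that 65520 was chosen "for alignment reasons", so the divisibility check below is an observation and not a statement of which alignment OFED targets.

```python
KIB = 1024

ud_max_mtu = 4 * KIB - 4        # 4092, Datagram Mode upper bound
ud_common_mtu = 2 * KIB - 4     # 2044, commonly offered in Datagram Mode
rc_link_limit = 2**31 - 4       # link MTU ceiling over RC
ip_limit = 64 * KIB - 1         # 65535, the TCP/IP limit cited in B.2
cm_default_mtu = 64 * KIB - 16  # 65520, the OFED-compatible default

# The chosen default fits under both the IP limit and the RC link limit.
assert cm_default_mtu <= ip_limit <= rc_link_limit

# 65520 is a multiple of 16, whereas 65535 is not.
assert cm_default_mtu % 16 == 0
assert ip_limit % 16 != 0

# Rough payload ratio against the common Datagram Mode MTU.
print(cm_default_mtu // ud_common_mtu)
```
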
>>>>
>>>>
>>>> C. Interfaces
>>>> -------------
>>>> +-------------------------------------------------------------------+
>>>> |                     Interfaces Exported                           |
>>>> +---------------------------+------------------+--------------------+
>>>> |    Interface Name         |  Classification  |      Comment       |
>>>> +---------------------------+------------------+--------------------+
>>>> |/kernel/drv/ibd.conf*      |   Uncommitted    | Configuration file |
>>>> +---------------------------+------------------+--------------------+
>>>>  * = only for OpenSolaris
>>>>
>>>>
>>>> D. References
>>>> -------------
>>>>
>>>>    [1] IP over InfiniBand, PSARC/2001/289
>>>>
>>>>    [2] IPoIB Conversion to GLDv3, PSARC/2007/636
>>>>
>>>>    [3] InfiniBand Architecture Specification Volume 1, Release 1.2.1,
>>>>        InfiniBand Trade Association, 2007.
>>>>        http://www.infinibandta.org/content/pages.php?pg=technology_download
>>>>
>>>>    [4] Transmission of IP over InfiniBand (IPoIB), RFC 4391, IETF,
>>>>        http://www.ietf.org/rfc/rfc4391.txt
>>>>
>>>>    [5] IP over InfiniBand: Connected Mode, RFC 4755, IETF,
>>>>        http://www.ietf.org/rfc/rfc4755.txt
>>>>
>>>> 6. Resources and Schedule
>>>>     6.4. Steering Committee requested information
>>>>        6.4.1. Consolidation C-team Name:
>>>>         ON
>>>>     6.5. ARC review type: FastTrack
>>>>     6.6. ARC Exposure: open
>>>>
>>>>   
>>>
>>
> 

-- 
Ted H. Kim
Sun Microsystems, Inc.                  ted.kim at sun.com
222 North Sepulveda Blvd., 10th Floor   (310) 341-1116
El Segundo, CA  90245                   (310) 341-1120 FAX
