I am sponsoring this fasttrack for Peter Cudhea. Requested binding is 
Patch, timeout is 02/25/2009. There is an IO controller profile 
attributes table, described in section 4.2.3, in the case materials 
directory.

- John

This information is Copyright 2009 Sun Microsystems
1. Introduction
   1.1. Project/Component Working Name:
    COMSTAR Infiniband SRP Target
   1.2. Name of Document Author/Supplier:
    Peter.Cudhea at sun.com
   1.3. Date of This Document:
    02/12/09
4. Technical Description

COMSTAR Infiniband SRP Target
-----------------------------
4.1.  Problem

    OpenSolaris currently lacks a target driver for the SCSI RDMA
    Protocol (SRP).  SRP accelerates the SCSI protocol by mapping
    the data transfer phases of SCSI commands to RDMA
    operations. As a result, an SRP initiator should be able to
    read data from and write data to a COMSTAR SRP target at high
    data rates with relatively low CPU utilization.

    SRP is an alternative to iSER (PSARC 2008/395) for accessing
    SCSI storage over an Infiniband fabric. Both protocols are
    seeing demand in the market.  In particular, we need an SRP
    target in COMSTAR (PSARC 2007/523) to enable VMware
    connectivity to OpenSolaris based open storage.  VMware ESX,
    for example, supports only SRP (not iSER) for block-based
    storage connectivity over Infiniband.

4.2. Proposal

    The project will deliver a target implementation of the SCSI
    RDMA Protocol represented as a COMSTAR STMF port provider.  We
    include a minimal implementation of the Infiniband Device
    Management Agent as a consumer of IBTF (PSARC 2002/132).  This
    agent allows initiator systems to query the capabilities of
    the target.  The SRP port provider will register its targets
    with the IB DM Agent to allow the targets to be discovered by
    SRP initiators.

4.2.1. COMSTAR SRP Target (srpt)

    When the SRP target service is enabled, it will register as a
    COMSTAR port provider using STMF.  This port provider will use
    the IB transport framework (IBTF) to enumerate all the HCAs on
    the system by GUID.  Each IB HCA will be reflected to STMF as
    a COMSTAR target named 'eui.<HCA-GUID>'.  For example, for an
    IB HCA with a HCA GUID of 0003BA0001002E49 the STMF target
    name will be 'eui.0003BA0001002E49'.
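
    The naming rule above can be sketched in a few lines.  This is
    an illustration only: the function name stmf_target_name is
    hypothetical, and the choice of 16 zero-padded uppercase hex
    digits is an assumption inferred from the example in the text.

```python
def stmf_target_name(hca_guid: int) -> str:
    """Return the STMF target name for a 64-bit HCA GUID (illustrative)."""
    if not 0 <= hca_guid < 2**64:
        raise ValueError("HCA GUID must fit in 64 bits")
    # Assumption: 16 zero-padded uppercase hex digits, as in the example.
    return "eui.{:016X}".format(hca_guid)


print(stmf_target_name(0x0003BA0001002E49))
```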

    STMF commands may then be used to assign these targets to host
    groups and to create views that determine which backing stores
    are accessible to which targets. STMF commands may also be
    used to mark each target as either offline or online.

    All of the physical IB ports on an HCA will be treated as part of
    the same STMF target.  In IB target terms, each Host Channel
    Adapter (HCA) is treated as a Target Channel Adapter (TCA)
    with a single IO Unit containing a single IO controller.
    Multiple physical ports on the HCA are not exposed as separate
    virtual target-side resources.

    When the SRP service is enabled and the STMF target for a
    particular HCA is marked online, each port on that HCA will be
    configured to listen for incoming connections to the SRP
    service.  A virtual I/O Controller representing the target
    will also be registered with the minimal IB DM Agent as
    described in section 4.2.2.
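
    The listening rule in the paragraph above can be modeled as
    follows.  This is a sketch, not the srpt implementation: the
    Hca class and the listening_ports function are hypothetical
    names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hca:
    guid: int
    num_ports: int
    target_online: bool  # state of the STMF target for this HCA


def listening_ports(service_enabled: bool,
                    hcas: List[Hca]) -> List[Tuple[int, int]]:
    """Return (HCA GUID, port number) pairs that accept SRP connections."""
    if not service_enabled:
        return []
    # Every port of an online HCA listens; offline HCAs contribute nothing.
    return [(hca.guid, port)
            for hca in hcas if hca.target_online
            for port in range(1, hca.num_ports + 1)]  # IB ports start at 1
```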

    The SRP target capability will be represented as an SMF
    service using the FMRI svc:/system/ibsrp/target:default.  This
    service will be disabled by default.  No new 'rights profiles'
    will be defined; each administrative method for this service
    (e.g. start and stop) will use a credential with user=root,
    group=root, and privileges=basic,sys_devices.  The
    ibsrp/target service will be dependent on STMF
    (svc:/system/stmf:default) and on the IB Device Management
    Agent described in section 4.2.2 (svc:/system/ibdma:default).
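
    The dependencies above imply a start order for the three
    services.  The sketch below derives one such order from the
    FMRIs named in the text; it only models the dependency graph
    and does not call SMF.  The helper name start_order is
    hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later

# Each service maps to the set of services it depends on.
DEPENDENCIES = {
    "svc:/system/ibsrp/target:default": {
        "svc:/system/stmf:default",
        "svc:/system/ibdma:default",
    },
}


def start_order(deps):
    """Return the services in an order that starts dependencies first."""
    return list(TopologicalSorter(deps).static_order())


for fmri in start_order(DEPENDENCIES):
    print(fmri)
```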

    The SRP target will be implemented as a pseudo device driver,
    'srpt', which is a child of the Infiniband 'ib' nexus.

4.2.2 Minimal Infiniband Device-Management Agent (ibdma)

    In order for IB initiators to enumerate the available storage,
    this project also includes a minimal "device management agent"
    for Infiniband, as described in section 16.3 of the Infiniband
    Architecture Spec.  (Section 16.3 corresponds to version 1 of
    the IB Device Management protocol, which is distinct from the
    version 2 Device Management protocol as defined in Annex A8.)

    Infiniband "device management" services are used by IB
    initiator systems to enumerate the IO Units (IOUs) and IO
    Controllers (IOCs) on the target system, and to query the
    services supported on each IO Controller.  For this case, SRP
    initiators in particular use DM services to enumerate all the
    IOCs that are SRP-capable IOCs, and to enumerate the SRP
    services that are available through each IOC.

    SRP is the only target-side service relying on IB DM services
    for discovery that the Infiniband group expects OpenSolaris to
    support.  If other target-side services are
    added in the future that require a more fully-functioning DM
    Agent, then this minimal agent can be expanded and extended as
    required.  While the API we introduce in section 4.2.2.3 to
    register new services as IO Controllers is somewhat general,
    and could in principle be used by other target-side services,
    we are keeping the new API interface as Project Private to
    provide maximum flexibility to enhance or extend the details
    in the future.

    We implement only the query-oriented subset of the DM protocol
    that is necessary for SRP discovery.  The full DM protocol,
    among other uses, allows initiator systems to request
    target-side device management services such as testing devices
    and retrieving device diagnostic codes.  The SRP spec in
    section B.5 spells out its requirements:
        The IB I/O unit shall include an IB device management
        agent to provide the IOUnitInfo, IOControllerProfile,
        and ServiceEntries attributes.

    The query-oriented subset we implement falls short of full
    compliance with the IB DM protocol as specified in the IB Arch
    spec.  This subset is consistent with how initiators actually
    use DM Agent services for discovery.  We return appropriate
    error codes (either "method not supported" or "combination of
    method and attribute not supported") for all other DM
    requests.

    The IB Device Management agent will be represented as an SMF
    service using the FMRI svc:/system/ibdma:default.  This
    service will be disabled by default.  No new 'rights profiles'
    will be defined; each administrative method for this service
    (e.g. start and stop) will use a credential with user=root,
    group=root, and privileges=basic,sys_devices.

    The IB DM agent will be implemented as a pseudo device driver
    'ibdma' under /kernel/misc.

4.2.2.1 Listening for DM agent requests

    To support this case, the IB group will add a new value
    IBT_DMA to the enumeration ibt_clnt_modinfo_t that is used as
    an argument to ibt_attach.  The IBT_DMA value is reserved for
    Sun Internal use only, and is used to directly support the IB
    Device Management Agent.

    The minimal IB DM agent, as specified in section B.5 of the
    SRP spec (SCSI Architecture Mapping), will respond to DevMgtGet
    requests with the following Attribute IDs:

    Attribute ID  Attribute Name       OpenSolaris Comments
    0x01          ClassPortInfo        "HELLO" message to determine
                                       protocol class match
    0x10          IOUnitInfo           Enumerate the "virtual IOCs"
                                       available on an IO Unit.
                                       Each HCA is represented as an
                                       IO Unit.
    0x11          IOControllerProfile  Retrieve IO Controller Profile
                                       information for a specific IOC
    0x12          ServiceEntries       Enumerate Service Names and
                                       Service IDs for a given IOC
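
    The request filtering described above might look like the
    following sketch.  It works on method and attribute names
    rather than wire values, and the function name dm_status is
    hypothetical.  Which of the two error codes applies to which
    kind of request is an assumption: here an unsupported method
    is rejected first, and a DevMgtGet for any other attribute is
    reported as an unsupported combination.

```python
# The four attributes the minimal agent answers, as named in the table.
SUPPORTED_GET_ATTRIBUTES = frozenset({
    "ClassPortInfo",
    "IOUnitInfo",
    "IOControllerProfile",
    "ServiceEntries",
})


def dm_status(method: str, attribute: str) -> str:
    """Classify a Device Management request by method and attribute name."""
    if method != "DevMgtGet":
        # Assumption: every method other than DevMgtGet is rejected outright.
        return "method not supported"
    if attribute not in SUPPORTED_GET_ATTRIBUTES:
        return "combination of method and attribute not supported"
    return "ok"
```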

    The manual page changes for adding IBT_DMA are:

    ------- ibt_attach.9f -------
    75a76,77
    >       IBT_DMA                 For Sun Internal use only.

    ------- ibt_clnt_modinfo_t.9s -------
    55a56,57
    >           IBT_DMA             For Sun Internal use only.


4.2.2.2 Enumeration of HCAs as I/O Units

    Infiniband target services are made available through I/O
    Controllers that are associated with I/O Units. See for
    example Figure B.3 in Annex B of the SRP Specification for an
    SRP-specific picture of this architecture.  In the general IB
    architecture, a particular I/O Unit could be visible through
    several different Target CAs and thus via the different IB
    ports on those CAs.  The IO Controllers and services available
    through an IOU are generally consistent no matter which port
    is used to make a connection to that IOU.  In simple terms,
    once a connection has reached an IO Unit, it can make use of
    the services provided by that I/O Unit.

    For the minimal IB DM Agent, we choose to represent each HCA
    as a separate I/O Unit.  The key to the minimal IB DM Agent is
    that it requires no administration. By using the HCA GUID as
    the I/O Unit GUID, we avoid the need to administer I/O Units
    as separate entities.  Alternative no-administration models
    would be to treat the entire target system as a single I/O
    Unit or to treat each individual port on each HCA as an I/O
    Unit.  The current HCA-as-an-IOU model was chosen because it
    is familiar to current users of IB target-side services on
    other systems such as Linux.  This model works the way users
    of IB services expect it to.

    As soon as the ibdma service is enabled, the target system
    will:
    o Modify the "port profile" for each HCA port to indicate that
      "device management is supported" on that port.  This is part
      of the "port capabilities mask".
    o Listen for IB Device Management MADs and respond with errors
      to those requests that are not supported.
    o Enumerate each HCA on the system as an available IO Unit.
    o Until specific target-side services are registered using the
      API defined in section 4.2.2.3, the IO Units will report
      they have no contained IO Controllers.  Virtual IOCs
      are tied to a specific service and do not exist until a
      specific service is enabled.
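
    The first step in the list above amounts to setting one bit
    in the port capability mask.  The sketch below assumes that
    the "device management is supported" flag is bit 19 of the
    PortInfo CapabilityMask; that position should be checked
    against the IB Architecture Specification.  The constant and
    function names are hypothetical.

```python
# Assumption: bit 19 of PortInfo CapabilityMask is IsDeviceManagementSupported.
IS_DEVICE_MANAGEMENT_SUPPORTED = 1 << 19


def advertise_device_management(cap_mask: int) -> int:
    """Return the capability mask with the device management bit set."""
    # OR leaves every other capability bit untouched.
    return cap_mask | IS_DEVICE_MANAGEMENT_SUPPORTED


print(hex(advertise_device_management(0x0)))
```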

4.2.2.3 Registering a Virtual I/O Controller for a Supported Service

    In IB, each IOC is associated with an "IO Controller profile"
    that defines the specific services that are available via that
    IOC.  For example, as described above, the Infiniband Annex to
    the SRP specification defines both the IO Controller profile
    and the service name to use for SRP.

    The IB DM Agent provides the following functions for managing
    virtual IO Controllers:

    ibdma_ioc_register      Register a "Virtual IO Controller".
    ibdma_ioc_unregister    Unregister a "Virtual IO Controller".
    ibdma_ioc_update        Modify the characteristics of an
                            existing virtual IO Controller.

    To register a new virtual IO Controller, the caller specifies
    the GUID of the parent IO Unit, an IOC profile describing the
    virtual service, and a specific list of target-side Service
    Entries associated with the virtual IOC.  As far as the IB DM
    Agent is concerned, the set of advertised IOCs is arbitrary.
    A single IO Unit could support multiple different target-side
    services each of which would be represented as a separate
    virtual IO Controller.  Similarly, the target-side service can
    decide whether or not to create multiple virtual IOCs
    representing the different physical ports on the HCA.  It is
    up to each service individually to determine the particular IO
    controllers to emulate.   The IB DM Agent simply advertises
    these virtual IO controllers to initiator systems.
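
    The registration model can be illustrated with a small
    in-memory sketch.  It is not the ibdma kernel interface: the
    class names, method signatures, and field layout below are
    hypothetical, and only the three operation names and their
    inputs (parent IO Unit GUID, IOC profile, Service Entries)
    come from the text.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(eq=False)  # compare by identity, so each IOC is a distinct handle
class VirtualIoc:
    iou_guid: int
    profile: Dict[str, object]
    services: List[Tuple[str, int]]  # (service name, service ID)


class DmAgent:
    """Toy model: one IO Unit per HCA GUID, each holding virtual IOCs."""

    def __init__(self, hca_guids: Iterable[int]) -> None:
        # Every IO Unit starts with no contained IO Controllers.
        self._iocs: Dict[int, List[VirtualIoc]] = {g: [] for g in hca_guids}

    def ioc_register(self, iou_guid, profile, services) -> VirtualIoc:
        if iou_guid not in self._iocs:
            raise KeyError("unknown IO Unit GUID")
        ioc = VirtualIoc(iou_guid, dict(profile), list(services))
        self._iocs[iou_guid].append(ioc)
        return ioc

    def ioc_update(self, ioc: VirtualIoc, profile, services) -> None:
        ioc.profile = dict(profile)
        ioc.services = list(services)

    def ioc_unregister(self, ioc: VirtualIoc) -> None:
        self._iocs[ioc.iou_guid].remove(ioc)

    def iocs_of(self, iou_guid: int) -> List[VirtualIoc]:
        return list(self._iocs[iou_guid])
```

    In this model the agent stores whatever IOCs it is given and
    makes no decisions of its own, which mirrors the statement
    that each service decides which IO controllers to emulate.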

4.2.3. The SRP Virtual I/O Controller

    For SRP, a single virtual IO Controller is created for each
    HCA.  Each virtual IOC is marked as being SRP-capable by
    using parameters from the standard SRP IO Controller Profile
    which appears in Annex B of the SRP spec.  The specific
    parameters used in this IO profile are available in the
    materials directory for this case in a document called
    SRPT-IOC-parameters.txt.

4.3. Risks and Assumptions

    The current implementation relies on IB "shared receive
    queues" which are not available in all IB HCAs or drivers.
    All Sun supported HCAs and drivers do support this capability.
    In particular, the tavor, arbel, and (since snv_107) hermon
    drivers do support this capability.

4.4. How will you know when you are done?:
    Linux and VMware ESX initiators based on the OFED stack can
    reliably access COMSTAR storage through the SRP target port
    provider.

4.5. Interfaces:

        --------------------------------------------------------------
        EXPORTED INTERFACES
        Interface                 Level            Comments
        --------------------------------------------------------------
        ibdma_ioc_register        Project Private
        ibdma_ioc_unregister      Project Private
        ibdma_ioc_update          Project Private
        eui.<HCA-GUID>            Committed        Naming convention for
                                                   STMF SRP targets
        IOU GUID = HCA GUID       Committed        SRP initiators
                                                   see stable targets
        IOC GUID = HCA GUID       Committed        SRP initiators
                                                   see stable targets
        SCSI RDMA Protocol        Committed        Defined by T10
                                                   standard for SRP
        Device Management         Committed        Section 16.3
        Protocol (query subset)                    of IB architecture
        IBT_DMA                   Consolidation    arg to ibt_attach
                                  Private

        --------------------------------------------------------------
        IMPORTED INTERFACES
        Interface                 Level            Comments
        --------------------------------------------------------------
        IBTF                      Consolidation
                                  Private
        STMF                      Consolidation
                                  Private
        IBT_DMA                   Consolidation    Exported from IBTF,
                                  Private          imported to SRP


        --------------------------------------------------------------

4.6. Doc Impact:
    Man pages for srpt and ibdma device drivers.

    IBTF internal man page updates (as described in 4.2.2.1) for
    ibt_attach(9f) and ibt_clnt_modinfo_t(9s).

    We will need to add a section in the OpenSolaris COMSTAR
    Administration Guide that describes how to provision different
    kinds of COMSTAR storage targets.  Currently, this guide
    describes only how to provision Fibre Channel storage targets.
    The new documentation should be coordinated among all the new
    COMSTAR port providers such as iSCSI, iSER, FCoE, and SAS.

4.7. Admin/Config Impact:
    None

4.10. Packaging & Delivery:
    Packages
        SUNWsrptr
            x86:
            /kernel/drv/srpt
            /kernel/drv/amd64/srpt
            /kernel/drv/srpt.conf
            /lib/svc/method/svc-srpt
            /var/svc/manifest/system/ibsrp/target.xml
            sparc:
            /kernel/drv/sparcv9/srpt
            /kernel/drv/srpt.conf
            /lib/svc/method/svc-srpt
            /var/svc/manifest/system/ibsrp/target.xml
        SUNWibdmar
            x86:
            /kernel/misc/ibdma
            /kernel/misc/amd64/ibdma
            /lib/svc/method/svc-ibdma
            /var/svc/manifest/system/ibdma.xml
            sparc:
            /kernel/misc/sparcv9/ibdma
            /lib/svc/method/svc-ibdma
            /var/svc/manifest/system/ibdma.xml

5. References
    Infiniband Transport Framework (IBTF) [PSARC 2002/132]
    COMSTAR SCSI Transport Framework (STMF) [PSARC 2007/523]
    SCSI Architecture Model 2 (SAM-2, T10/1157-D) (t10.org)
    SCSI-3 Primary Commands (SPC-3, T10/1416-D) (t10.org)
    SCSI RDMA Protocol (SRP, T10/1415-D), revision 16a,
        July 3, 2002
        (http://www.t10.org/ftp/t10/drafts/srp/srp-r16a.pdf)
    Infiniband Architecture Specification v1.2.1
        (Infiniband Trade Association)
        (http://www.infinibandta.org/specs/)


