Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
    1.1. Project/Component Working Name:
         Dynamic Ring Grouping on NICs
    1.2. Name of Document Author/Supplier:
         Author:  Venu Iyer
    1.3. Date of This Document:
         18 September, 2009
4. Technical Description
I'm filing this fasttrack for Venu Iyer. The release binding is patch.
The interface taxonomy is Uncommitted.

Background
==========

Project Crossbow (PSARC/2006/357) enables creating hardware-based MAC
clients (some of these MAC clients are data links such as VNICs) on both
the RX and TX sides. We define hardware-based MAC clients as having dedicated
hardware resources: an RX hardware-based MAC client has one or more RX
rings for its exclusive use, while a TX hardware-based MAC client has one or
more TX rings for its exclusive use. MAC clients that are not hardware-based
(RX or TX) share hardware resources with other MAC clients; such MAC
clients do not have any TX or RX rings exclusively reserved for them.
A MAC client may be hardware-based on RX but not on TX (and vice versa).

Currently, when a NIC registers with MAC, it informs MAC whether it supports
dedicated hardware RX or TX rings. MAC assigns hardware rings to MAC
clients as groups, where a group may contain one or more hardware rings.

dladm show-phys is currently used to show how RX rings are used by MAC
clients.

    # dladm show-phys -H nxge4
    LINK         GROUP    GROUPTYPE RINGS         CLIENTS
    nxge4        0        RX        3             nxge4
    nxge4        1        RX        1             vnic1

This shows one RX hardware-based MAC client, vnic1, with 1 ring.
nxge4, the primary MAC client, is using 3 rings, but will share
these with any other MAC client that is subsequently created on the
data link nxge4 (i.e. if vnic2 is created on nxge4, MAC clients vnic2 and
nxge4 will share group 0, and hence the 3 rings), e.g.:

    # dladm show-phys -H nxge4
    LINK         GROUP    GROUPTYPE RINGS         CLIENTS
    nxge4        0        RX        3             nxge4,vnic2
    nxge4        1        RX        1             vnic1
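
The listing above can be post-processed by a script. The following is a
minimal sketch, not part of this proposal, that parses the current
GROUP-based output and reports which clients hold a group exclusively.
The function names are hypothetical, and treating group 0 as the shared
default group is an assumption drawn from the example above.

```python
from collections import namedtuple

# One row of the LINK/GROUP/GROUPTYPE/RINGS/CLIENTS table.
GroupRow = namedtuple("GroupRow", "link group grouptype rings clients")


def parse_show_phys_groups(text):
    """Parse the GROUP-based 'dladm show-phys -H' table into rows."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields or fields[0] == "LINK":
            continue  # skip blank lines and the header
        link, group, grouptype, rings, clients = fields
        rows.append(GroupRow(link, int(group), grouptype, int(rings),
                             clients.split(",")))
    return rows


def exclusive_clients(rows, default_group=0):
    """Map each client that is alone in a non-default group to its ring count.

    Assumption: the default group (0 here) is shared, so its clients are
    never reported as hardware-based.
    """
    return {r.clients[0]: r.rings
            for r in rows
            if r.group != default_group and len(r.clients) == 1}


SAMPLE = """
LINK         GROUP    GROUPTYPE RINGS         CLIENTS
nxge4        0        RX        3             nxge4,vnic2
nxge4        1        RX        1             vnic1
"""

if __name__ == "__main__":
    print(exclusive_clients(parse_show_phys_groups(SAMPLE)))
```

With the sample listing, only vnic1 is reported, with 1 ring; nxge4 and
vnic2 share group 0 and are left out.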

Information about TX rings is not shown by the show-phys subcommand.

Today, an administrator can specify that a VNIC must be hardware-based on the
RX side (using the -H option to dladm create-vnic). However, there is
no way for an administrator to specify:

    o that a MAC client (VNIC or primary MAC client) should be
      software-based, i.e. should not have any dedicated hardware resource,

    o that a MAC client should be hardware-based or software-based on TX,

    o the number of RX or TX rings needed for a MAC client.

Proposal
========

This proposal gives administrators control over whether a MAC client
should be hardware-based or not (on RX and on TX), and also allows them to
specify the number of RX or TX rings that a MAC client needs if it is
hardware-based.

We introduce two properties for a link:

    rxringcnt: The number of RX rings needed.
    txringcnt: The number of TX rings needed.

The values for these properties can be:

    0     : This link must not be assigned any hardware rings of the
            specified type.

    x > 0 : This link needs x rings.

If the property is not specified for a link, the system will attempt
to maximize hardware resource utilization by making this MAC client
hardware-based, depending on ring availability.
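
The three cases above (0, x > 0, and property not specified) can be
summarized as a small decision function. This is only an illustrative
sketch under stated assumptions, not the MAC layer implementation: the
names resolve_ring_count and RingRequestError are hypothetical, and the
"one ring if available" behavior for an unspecified property follows the
man page text for rxringcnt and txringcnt in this case.

```python
class RingRequestError(Exception):
    """Raised when an explicit ring count cannot be satisfied."""


def resolve_ring_count(requested, available):
    """Return how many rings to reserve for a MAC client.

    requested: None if rxringcnt/txringcnt was not specified,
               otherwise a non-negative integer.
    available: number of rings of that type currently free.
    """
    if requested is None:
        # Property not specified: assume one ring is taken if any is
        # free, otherwise the client is software-based.
        return 1 if available > 0 else 0
    if requested < 0:
        raise ValueError("ring count must not be negative")
    if requested == 0:
        # Explicitly software-based: never assign hardware rings.
        return 0
    if requested > available:
        # A non-zero request is all-or-nothing.
        raise RingRequestError(
            "requested %d rings, only %d available" % (requested, available))
    return requested


if __name__ == "__main__":
    print(resolve_ring_count(None, 3))
    print(resolve_ring_count(0, 3))
    print(resolve_ring_count(2, 3))
```

The point of the sketch is that a non-zero request either succeeds in full
or fails; it is never silently reduced to a smaller number of rings.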

E.g.:

    # dladm create-vnic -p rxringcnt=0 -l nxge0 vnic1

creates vnic1, which will not be RX hardware-based.

    # dladm create-vnic -p txringcnt=2 -l nxge0 vnic2

creates vnic2, which will be TX hardware-based with 2 TX rings.

    # dladm create-vnic -p rxringcnt=2,txringcnt=2 -l nxge0 vnic3

creates vnic3, which will be both RX and TX hardware-based, with 2 RX
rings and 2 TX rings.

The RX or TX rings assigned to an existing link, say nxge0, can be
modified using set-linkprop.

E.g., if nxge0 needs to be given 2 RX rings:

    # dladm set-linkprop -p rxringcnt=2 nxge0

or for a VNIC, say vnic1, as:

    # dladm set-linkprop -p txringcnt=2 vnic1

The rings assigned to a link can be viewed using show-linkprop as:

    # dladm show-linkprop nxge0
    LINK         PROPERTY        PERM VALUE        DEFAULT      POSSIBLE
    ...
    nxge0        rxringcnt       rw   2            --           0-4
    nxge0        txringcnt       rw   5            --           0-6
    ...


These new properties obsolete the -H option of dladm create-vnic (i.e.
the -H option will be removed).

Given that we allow specifying RX and TX rings for links, we need a way
to display how many rings are available for use. Additionally, we need to
provide the number of hardware-based MAC clients that can be created on the RX
and TX side.

We introduce 4 additional read-only properties to display this information:

    rxringavailcnt: The total number of RX rings available for use,
                    i.e. not exclusively given to any MAC client.

    txringavailcnt: The total number of TX rings available for use.

    rxhwavailclnt:  The total number of additional RX hardware-based MAC
                    clients that can be created. Each of these
                    hardware-based MAC clients could have one or more RX
                    rings assigned to it.

    txhwavailclnt:  The total number of additional TX hardware-based MAC
                    clients that can be created. Each of these
                    hardware-based MAC clients could have one or more TX
                    rings assigned to it.

The counts reflect the current utilization/availability of the resources
listed. The counts change as hardware-based MAC clients are created or
destroyed, or when their hardware properties (i.e. rxringcnt and txringcnt)
are changed.
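
To illustrate how these counters move, here is a toy model of one side
(RX or TX) of a NIC. It is a sketch under simplifying assumptions, not the
MAC layer's actual bookkeeping: the RingPool class, its method names, and
the example sizes (4 rings, 2 hardware-based clients) are all hypothetical,
and rings held by the shared default group are ignored.

```python
class RingPool:
    """Toy model of the ring and hardware-client counters for one side."""

    def __init__(self, total_rings, max_hw_clients):
        self.total_rings = total_rings
        self.max_hw_clients = max_hw_clients
        self.reserved = {}  # client name -> rings held exclusively

    @property
    def ring_avail_cnt(self):
        """Rings not exclusively given to any client (cf. rxringavailcnt)."""
        return self.total_rings - sum(self.reserved.values())

    @property
    def hw_avail_clnt(self):
        """Additional hardware-based clients possible (cf. rxhwavailclnt)."""
        return self.max_hw_clients - len(self.reserved)

    def set_ring_cnt(self, client, count):
        """Model setting rxringcnt/txringcnt on a client; 0 releases rings."""
        if count < 0:
            raise ValueError("ring count must not be negative")
        if count == 0:
            self.reserved.pop(client, None)
            return
        current = self.reserved.get(client, 0)
        if client not in self.reserved and self.hw_avail_clnt == 0:
            raise RuntimeError("no more hardware-based clients can be created")
        if count - current > self.ring_avail_cnt:
            raise RuntimeError("not enough rings available")
        self.reserved[client] = count


if __name__ == "__main__":
    pool = RingPool(total_rings=4, max_hw_clients=2)
    pool.set_ring_cnt("vnic1", 2)
    print(pool.ring_avail_cnt, pool.hw_avail_clnt)
    pool.set_ring_cnt("vnic1", 0)
    print(pool.ring_avail_cnt, pool.hw_avail_clnt)
```

In this model both counters drop when a client becomes hardware-based, only
the ring counter moves when an existing client's ring count changes, and
both recover when the client's count is set back to 0.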

E.g.:
    # dladm show-linkprop nxge0
    LINK         PROPERTY        PERM VALUE        DEFAULT      POSSIBLE
    ...
    nxge0        rxringavailcnt  r-   3            --           0-4
    nxge0        txringavailcnt  r-   5            --           0-6
    nxge0        rxhwavailclnt   r-   1            --           0-1
    nxge0        txhwavailclnt   r-   5            --           0-5
    ...

This indicates that there are 3 RX rings available (out of a maximum of 4) and
5 TX rings available (out of a maximum of 6). Additionally, only one more RX
hardware-based MAC client can be created, while 5 more TX hardware-based MAC
clients can be created. We provide two separate counters (rxhwavailclnt and
txhwavailclnt) because the hardware virtualizes on RX based on groups
of rings (i.e. a MAC address is added to a group); to keep things consistent,
we do the same on TX.
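
The same reading of the table can be done mechanically. The sketch below,
which is not part of this proposal, parses the show-linkprop listing shown
above and derives how many rings are currently not available by subtracting
VALUE from the upper bound in POSSIBLE. The helper names are hypothetical,
and the "low-high" form of the POSSIBLE column is assumed from the example
output.

```python
def parse_linkprops(text):
    """Parse 'dladm show-linkprop' rows into {property: {value, min, max}}."""
    props = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] == "LINK":
            continue  # skip the header, "..." filler and blank lines
        _link, name, _perm, value, _default, possible = fields
        low, _, high = possible.partition("-")
        props[name] = {"value": int(value), "min": int(low), "max": int(high)}
    return props


def rings_not_available(props, side):
    """Rings of the given side ('rx' or 'tx') that are not free right now."""
    entry = props[side + "ringavailcnt"]
    return entry["max"] - entry["value"]


SAMPLE = """
LINK         PROPERTY        PERM VALUE        DEFAULT      POSSIBLE
...
nxge0        rxringavailcnt  r-   3            --           0-4
nxge0        txringavailcnt  r-   5            --           0-6
nxge0        rxhwavailclnt   r-   1            --           0-1
nxge0        txhwavailclnt   r-   5            --           0-5
...
"""

if __name__ == "__main__":
    props = parse_linkprops(SAMPLE)
    print(rings_not_available(props, "rx"), rings_not_available(props, "tx"))
```

For the sample listing this yields 1 RX ring and 1 TX ring not available,
matching the "3 out of 4" and "5 out of 6" reading above.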

dladm show-phys will be modified to display the ring information on TX and
RX.

E.g.:
    # dladm show-phys -H nxge0
    LINK         RINGTYPE RINGS                CLIENTS
    nxge0        RX       0                    <mcast>
    nxge0        TX       0,5                  vnic1
    nxge0        RX       1-3                  vnic1

which means vnic1 has exclusive use of 3 RX rings and 5 TX rings.

---------------------------------
n.


!      dladm show-phys [-P] [[-p] -o field[,...]] [phys-link]

           Show the physical device and attributes of all  physical
           links,  or  of the named physical link. Without -P, only
--- 414,420 ----
           tion.


!      dladm show-phys [-P] [[-p] -o field[,...]] [-H] [phys-link]

           Show the physical device and attributes of all  physical
           links,  or  of the named physical link. Without -P, only
***************
*** 421,426 ****
--- 421,445 ----
           physical links that are available on the running  system
           are displayed.

+       -H
+            Show hardware resource usage, as returned by the NIC
+            driver. Output from -H displays the following elements:
+
+            LINK
+               A physical device corresponding to a NIC driver.
+
+            RINGTYPE
+               The type of the ring. This is either RX or TX.
+
+            RINGS
+               The ring index. A ring is a hardware resource, which
+               typically maps to a DMA channel, that can be programmed
+               for specific use. E.g. an RX ring can be programmed to
+               receive only packets belonging to a specific MAC address.
+
+            CLIENTS
+               MAC clients using the rings.
+
           -o field, --output=field

               A case-insensitive, comma-separated list  of  output
***************
*** 499,505 ****
               purge the link's configuration from the system.



-
       dladm create-aggr [-t] [-R root-dir] [-P policy] [-L mode]
       [-T time] [-u address] -l ether-link1 [-l ether-link2...]
       aggr-link
--- 518,523 ----
***************
*** 2432,2440 ****
--- 2450,2494 ----
           default is high.


+      rxringavailcnt

+          A read-only property that specifies the number of rings available
+        on the receive side.

+      rxringcnt

+          Specifies the number of receive rings for the MAC client.
+        A value of 0 means this MAC client should not be assigned any RX
+        ring.  A non-0 value means reserve that many rings for this MAC
+        client, if available, and fail if not. If this property is not
+        specified, the MAC client may get one RX ring, if available, or
+        will be software-based.
+
+      rxhwavailclnt
+
+          A read-only property that specifies the number of additional
+        RX hardware-based MAC clients that can be created.
+
+      txringavailcnt
+
+          A read-only property that specifies the number of rings available
+        on the transmit side.
+
+      txringcnt
+
+          Specifies the number of transmit rings for the MAC client.
+        A value of 0 means this MAC client should not be assigned any
+        TX ring. A non-0 value means reserve that many rings for this MAC
+        client, if available, and fail if not. If this property is not
+        specified, the MAC client may get one TX ring, if available, or
+        will be software-based.
+
+      txhwavailclnt
+
+          A read-only property that specifies the number of additional
+        TX hardware-based MAC clients that can be created.
+
+

  SunOS 5.11          Last change: 16 Mar 2009                   37



6. Resources and Schedule
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
                ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open
