Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
    1.1. Project/Component Working Name:
         Dynamic Ring Grouping on NICs
    1.2. Name of Document Author/Supplier:
         Author: Venu Iyer
    1.3  Date of This Document:
         18 September, 2009
4. Technical Description

I'm filing this fasttrack for Venu Iyer. The release binding is patch.
The interface taxonomy is Uncommitted.

Background
==========

Project Crossbow (PSARC/2006/357) enables creating hardware-based MAC
clients (some of these MAC clients are data links such as VNICs) on both
the RX and TX sides. We define hardware-based MAC clients as having
dedicated hardware resources: an RX hardware-based MAC client has one or
more RX rings for its exclusive use, while a TX hardware-based MAC client
has one or more TX rings for its exclusive use. MAC clients that are not
hardware-based (RX or TX) share hardware resources with other MAC
clients; such MAC clients do not have any TX or RX rings reserved
exclusively for them. A MAC client may be hardware-based on RX but not on
TX (and vice versa).

Currently, when a NIC registers with MAC, it informs MAC whether it
supports dedicated hardware RX or TX rings. MAC assigns hardware rings to
MAC clients as groups, where a group may contain one or more hardware
rings. dladm show-phys is currently used to show how RX rings are used by
MAC clients.

    # dladm show-phys -H nxge4
    LINK     GROUP   GROUPTYPE   RINGS   CLIENTS
    nxge4    0       RX          3       nxge4
    nxge4    1       RX          1       vnic1

This says we have one RX hardware-based MAC client, vnic1, with 1 ring.
nxge4, the primary MAC client, is using 3 rings, but will share these
with any other MAC client that is subsequently created on the data link
nxge4 (i.e. if vnic2 is created on nxge4, MAC clients vnic2 and nxge4
will share group 0, and hence the 3 rings), e.g.:

    # dladm show-phys -H nxge4
    LINK     GROUP   GROUPTYPE   RINGS   CLIENTS
    nxge4    0       RX          3       nxge4,vnic2
    nxge4    1       RX          1       vnic1

Information about TX rings is not shown by the show-phys subcommand.

Today, an administrator can specify that a VNIC must be hardware-based on
the RX side (using the -H option to dladm create-vnic). However, there is
no way for an administrator to specify

  o that a MAC client (VNIC or primary MAC client) should be
    software-based, i.e. should not have any dedicated hardware resource,

  o that a MAC client should be hardware-based or software-based on TX,

  o the number of RX or TX rings needed for a MAC client.

Proposal
========

This proposal gives administrators control over whether a MAC client
should be hardware-based or not (RX and TX), and also allows
administrators to specify the number of RX or TX rings that a MAC client
needs, if it is hardware-based.

We introduce two properties for a link:

    rxringcnt: The number of RX rings needed.
    txringcnt: The number of TX rings needed.

The values for these properties can be:

    0     : This link must not be assigned any hardware rings of the
            specified type.
    x > 0 : This link needs x rings.

If the property is not specified for a link, the system will attempt to
maximize hardware resource utilization by making this MAC client
hardware-based, depending on ring availability.

E.g.:

    # dladm create-vnic -p rxringcnt=0 -l nxge0 vnic1

Will create vnic1, which will not be RX hardware-based.

    # dladm create-vnic -p txringcnt=2 -l nxge0 vnic2

Will create vnic2, which will be TX hardware-based with 2 TX rings.

    # dladm create-vnic -p rxringcnt=2,txringcnt=2 -l nxge0 vnic3

Will create vnic3, which will be both RX and TX hardware-based with
2 RX rings and 2 TX rings.

Modifying the RX or TX rings assigned to an existing link, say nxge0, can
be done using set-linkprop, e.g. if nxge0 needs to be given 2 RX rings:

    # dladm set-linkprop -p rxringcnt=2 nxge0

or for a VNIC, say vnic1, as:

    # dladm set-linkprop -p txringcnt=2 vnic1

The rings assigned to a link can be viewed using show-linkprop as:

    # dladm show-linkprop nxge0
    LINK     PROPERTY     PERM   VALUE   DEFAULT   POSSIBLE
    ...
    nxge0    rxringcnt    rw     2       --        0-4
    nxge0    txringcnt    rw     5       --        0-6
    ...

These new properties obsolete the -H option of dladm create-vnic (i.e. the
-H option will be removed).

Given that we allow specifying RX and TX rings for links, we need a way to
display how many rings are available for use. Additionally, we need to
provide the number of hardware-based MAC clients that can be created on
the RX and TX sides.

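The request semantics above, and the availability bookkeeping they imply,
can be sketched in a few lines of Python. This is only an illustrative
model, not the MAC layer implementation: RingPool, RingsUnavailable and
set_ring_count are hypothetical names, ring groups and the rings kept by
the default group are ignored, and granting a single ring when the
property is unset follows the wording of the man page text in this case.

```python
from typing import Dict, Optional


class RingsUnavailable(Exception):
    """An explicit ring request could not be satisfied."""


class RingPool:
    """Hypothetical model of one ring type (RX or TX) on a single NIC."""

    def __init__(self, total: int) -> None:
        if total < 0:
            raise ValueError("total must be non-negative")
        self.total = total
        self.assigned: Dict[str, int] = {}  # client name -> dedicated rings

    @property
    def available(self) -> int:
        """Rings not exclusively given to any MAC client."""
        return self.total - sum(self.assigned.values())

    def set_ring_count(self, client: str, requested: Optional[int]) -> int:
        """Apply an rxringcnt/txringcnt-style request; return rings granted."""
        # Rings the client already holds count as free for its own request.
        free = self.available + self.assigned.get(client, 0)
        if requested is None:
            # Property unset: best effort, one ring if any is free,
            # otherwise the client stays software-based (assumption).
            granted = 1 if free > 0 else 0
        elif requested < 0:
            raise ValueError("ring count must be non-negative")
        elif requested > free:
            # A non-zero request fails if it cannot be met in full.
            raise RingsUnavailable(
                f"{client}: wants {requested}, only {free} free")
        else:
            granted = requested  # 0 means software-based
        self.assigned[client] = granted
        return granted


if __name__ == "__main__":
    rx = RingPool(total=4)
    rx.set_ring_count("vnic1", 0)   # software-based on RX
    rx.set_ring_count("vnic3", 2)   # two dedicated RX rings
    print(rx.available)             # rings still free for other clients
```

In this model the available count is derived from the assignments rather
than stored, so it changes automatically whenever a client is created or
its ring count is modified.
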
We introduce 4 additional read-only properties to display this
information:

    rxringavailcnt: The total number of RX rings available for use, i.e.
                    not exclusively given to any MAC client.

    txringavailcnt: The total number of TX rings available for use.

    rxhwavailclnt:  The total number of additional RX hardware-based MAC
                    clients that can be created. Each of these
                    hardware-based MAC clients could have one or more RX
                    rings assigned to it.

    txhwavailclnt:  The total number of additional TX hardware-based MAC
                    clients that can be created. Each of these
                    hardware-based MAC clients could have one or more TX
                    rings assigned to it.

The counts reflect the current utilization/availability of the resources
listed. The counts change as hardware-based MAC clients are created or
destroyed, or when their hardware properties (i.e. rxringcnt and
txringcnt) are changed.

E.g.:

    # dladm show-linkprop nxge0
    LINK     PROPERTY         PERM   VALUE   DEFAULT   POSSIBLE
    ...
    nxge0    rxringavailcnt   r-     3       --        0-4
    nxge0    txringavailcnt   r-     5       --        0-6
    nxge0    rxhwavailclnt    r-     1       --        0-1
    nxge0    txhwavailclnt    r-     5       --        0-5
    ...

This indicates that there are 3 RX rings available (out of a maximum of 4)
and 5 TX rings available (out of a maximum of 6). Additionally, only one
more RX hardware-based MAC client can be created, while 5 more TX
hardware-based MAC clients can be created.

We provide two separate counters (rxhwavailclnt and txhwavailclnt) because
the hardware virtualizes on RX based on groups of rings (i.e. a MAC
address is added to a group); we do the same on TX to keep it consistent.

dladm show-phys will be modified to display the ring information on TX
and RX. E.g.:

    # dladm show-phys -H nxge0
    LINK     RINGTYPE   RINGS   CLIENTS
    nxge0    RX         0       <mcast>
    nxge0    TX         0,5     vnic1
    nxge0    RX         1-3     vnic1

which means vnic1 has exclusive use of 3 RX rings and 5 TX rings.

---------------------------------
  n.

! dladm show-phys [-P] [[-p] -o field[,...]] [phys-link]

  Show the physical device and attributes of all physical
  links, or of the named physical link.
  Without -P, only
--- 414,420 ----
  tion.

! dladm show-phys [-P] [[-p] -o field[,...]] [-H] [phys-link]

  Show the physical device and attributes of all physical
  links, or of the named physical link. Without -P, only
***************
*** 421,426 ****
--- 421,445 ----
  physical links that are available on the running system are
  displayed.

+ -H
+     Show hardware resource usage, as returned by the NIC
+     driver. Output from -H displays the following elements:
+
+     LINK
+         A physical device corresponding to a NIC driver.
+
+     RINGTYPE
+         The type of the ring. This is either RX or TX.
+
+     RINGS
+         The ring index. A ring is a hardware resource, which
+         typically maps to a DMA channel, that can be programmed
+         for specific use. E.g. an RX ring can be programmed to
+         receive only packets belonging to a specific MAC address.
+
+     CLIENTS
+         MAC clients using the rings.
+
  -o field, --output=field
      A case-insensitive, comma-separated list of output
***************
*** 499,505 ****
  purge the link's configuration from the system.
- dladm create-aggr [-t] [-R root-dir] [-P policy] [-L mode] [-T time] [-u address] -l ether-link1 [-l ether-link2...] aggr-link
--- 518,523 ----
***************
*** 2432,2440 ****
--- 2450,2494 ----
  default is high.

+ rxringavailcnt
+     A read-only property that specifies the number of rings available
+     on the receive side.
+ rxringcnt
+     Specifies the number of receive rings for the MAC client.
+     A value of 0 means this MAC client should not be assigned any RX
+     ring. A non-0 value means reserve that many rings for this MAC
+     client, if available, and fail if not. If this property is not
+     specified the MAC client may get one RX ring, if available, or
+     will be software-based.
+
+ rxhwavailclnt
+
+     A read-only property that specifies the number of additional
+     RX hardware-based MAC clients that can be created.
+
+ txringavailcnt
+
+     A read-only property that specifies the number of rings available
+     on the transmit side.
+
+ txringcnt
+
+     Specifies the number of transmit rings for the MAC client.
+     A value of 0 means this MAC client should not be assigned any
+     TX ring. A non-0 value means reserve that many rings for this MAC
+     client, if available, and fail if not. If this property is not
+     specified the MAC client may get one TX ring, if available, or
+     will be software-based.
+
+ txhwavailclnt
+
+     A read-only property that specifies the number of additional
+     TX hardware-based MAC clients that can be created.
+
+
  SunOS 5.11          Last change: 16 Mar 2009          37

6. Resources and Schedule
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
               ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open