Please find attached the revised text with changes addressing your comments. Let me know if I have missed anything.
thanks,

-venu

On Mon, 28 Sep 2009, Sebastien Roy wrote:

>
> On Mon, 2009-09-28 at 12:38 -0700, venugopal iyer wrote:
>> Thanks, Seb.
>>
>> On Mon, 28 Sep 2009, Sebastien Roy wrote:
>>
>>> On Fri, 2009-09-18 at 12:29 -0700, Kais Belgaied wrote:
>>>> Background
>>>> ==========
>>>>
>>>> Project Crossbow (PSARC/2006/357) enables creating hardware-based MAC
>>>> clients (some of these MAC clients are data links such as VNICs) both on
>>>> the RX and TX side. We define hardware-based MAC clients as having
>>>> dedicated hardware resources; a RX hardware-based MAC client will have
>>>> one or more RX ring for exclusive use while a TX hardware-based MAC
>>>> client will have one or more TX ring for exclusive use. MAC clients
>>>> that are not hardware-based (RX or TX) share hardware resources with
>>>> other MAC clients, such MAC clients will not have any TX or RX rings
>>>> exclusively reserved for them. MAC clients may be hardware-based on RX,
>>>> but not on TX (and vice-versa).
>>>
>>> To me, that last sentence is essentially saying, "MAC clients may not be
>>> hardware-based on both RX and TX". Is that what you meant?
>>
>> No. It was just to note that it *need* not be h/w on both. How about
>> "MAC clients could be hardware based on either TX, RX, both or neither"
>> or some such.
>
> Yes, that would be more clear.
>
>>>> If the property is not specified for a link, the system will attempt
>>>> to maximize the hardware resource utilization by making this MAC client
>>>> hardware-based depending on rings availability.
>>>
>>> This is slightly ambiguous to me. Does that mean that if I neglect to
>>> override the default, that the first MAC client gets all of the hardware
>>> resources while subsequent clients get none?
>>
>>
>> That's not the case. Today, when we create a VNIC and don't specify the
>> -H option we try to give it one ring (assuming we have free rings).
>> That won't change, if we don't specify rxringcnt then we will try to
>> give the new MAC client 1 ring and make it hardware based. We do
>> this till there are free rings available. When there aren't any
>> free rings, MAC clients that don't specify this property will be
>> software based.
>
> Okay, that's fine. Please clarify that in the spec because your use of
> the word "maximize" could imply differently.
>
>>> I would think that unless otherwise specified, a sensible default
>>> behavior would result in evenly distributing resources among clients.
>>>
>>>> dladm show-phys will be modified to display the ring information on TX and
>>>> RX.
>>>>
>>>> e.g:
>>>> # dladm show-phys -H nxge0
>>>> LINK     RINGTYPE  RINGS  CLIENTS
>>>> nxge0    RX        0      <mcast>
>>>> nxge0    TX        0,5    vnic1
>>>> nxge0    RX        1-3    vnic1
>>>>
>>>> which means vnic1 has exclusive use of 3 RX rings and 5 TX rings.
>>>
>>> I would interpret the above as "vnic1 has transmit rings 0 and
>>> 5" (because of the comma), and "vnic1 has receive rings 1, 2, and
>>> 3" (because of the hyphen). Am I reading that right?
>>
>> yes.
>
> So it does not have access to 5 TX rings as you state, but rather 2.
>
>>>
>>> How do I view the total number of hardware resources available on nxge0?
>>
>> the read-only rings property (rxringavailcnt, txringavailcnt) we are
>> introducing is for that purpose.
>>
>> let me know if I missed answering any of your questions.
>
> Got it, thanks.
>
> +1 from me.
>
> -Seb
>
>
-------------- next part --------------
Background
==========

Project Crossbow (PSARC/2006/357) enables creating hardware-based MAC
clients (some of these MAC clients are data links such as VNICs) both on
the RX and TX side. We define hardware-based MAC clients as having
dedicated hardware resources; an RX hardware-based MAC client will have
one or more RX rings for exclusive use, while a TX hardware-based MAC
client will have one or more TX rings for exclusive use.
MAC clients that are not hardware-based (RX or TX) share hardware
resources with other MAC clients; such MAC clients will not have any TX
or RX rings exclusively reserved for them. MAC clients could be
hardware-based on either TX, RX, both or neither.

Currently, when a NIC registers with MAC it informs MAC if it supports
dedicated hardware RX or TX rings. MAC assigns hardware rings to MAC
clients as groups, where a group may contain 1 or more hardware rings.

dladm show-phys is currently used to show how RX rings are used by MAC
clients.

 # dladm show-phys -H nxge4
 LINK     GROUP  GROUPTYPE  RINGS  CLIENTS
 nxge4    0      RX         3      nxge4
 nxge4    1      RX         1      vnic1

which says we have 1 RX hardware-based MAC client, vnic1, with 1 ring.
nxge4, the primary MAC client, is using 3 rings, but will share these
with any other MAC client that is subsequently created on the data link
nxge4 (i.e. if vnic2 is created on nxge4, MAC clients vnic2 and nxge4
will share group 0, and hence the 3 rings), e.g.:

 # dladm show-phys -H nxge4
 LINK     GROUP  GROUPTYPE  RINGS  CLIENTS
 nxge4    0      RX         3      nxge4,vnic2
 nxge4    1      RX         1      vnic1

Information about TX rings is not shown by the show-phys subcommand.

Today, an administrator can specify that a VNIC must be hardware-based on
the RX side (using the -H option to dladm create-vnic). However, there is
no way for an administrator to specify

 o that a MAC client (VNIC or primary MAC client) should be software
   based, i.e. should not have any dedicated hardware resource,
 o that a MAC client should be hardware or software based on TX,
 o the number of RX or TX rings needed for a MAC client.

Proposal
========

This proposal gives administrative control over whether a MAC client
should be hardware-based or not (RX and TX) and also allows administrators
to specify the number of RX or TX rings that a MAC client needs, if it is
hardware-based.

We introduce two properties for a link:

 rxrings: The number of RX rings needed.
 txrings: The number of TX rings needed.

The values for these properties could be:

 0     : This link must not be assigned any hardware rings of the
         specified type.
 x > 0 : This link needs x rings.

If the property is not specified for a link, the system will attempt to
make the MAC client hardware-based by assigning it one ring, depending on
availability of rings. If rings are not available the MAC client will be
software based.

E.g:

 # dladm create-vnic -p rxrings=0 -l nxge0 vnic1

Will create vnic1 which will not be RX hardware-based.

 # dladm create-vnic -p txrings=2 -l nxge0 vnic2

Will create vnic2 that will be TX hardware-based with 2 TX rings.

 # dladm create-vnic -p rxrings=2,txrings=2 -l nxge0 vnic3

Will create vnic3 which will be both RX and TX hardware-based with 2 RX
rings and 2 TX rings.

Modifying the RX or TX rings assigned to an existing link, say nxge0, can
be done using set-linkprop, e.g. if nxge0 needs to be given 2 RX rings:

 # dladm set-linkprop -p rxrings=2 nxge0

or for a VNIC, say vnic1, as:

 # dladm set-linkprop -p txrings=2 vnic1

The rings assigned to a link can be viewed using show-linkprop as:

 # dladm show-linkprop nxge0
 LINK     PROPERTY   PERM  VALUE  DEFAULT  POSSIBLE
 ...
 nxge0    rxrings    rw    2      --       0-4
 nxge0    txrings    rw    5      --       0-6
 ...

These new properties obsolete the -H option of dladm create-vnic (i.e.
the -H option will be removed).

Given that we allow specifying RX and TX rings for links, we need a way
to display how many rings are available for use. Additionally, we need to
provide the number of hardware-based MAC clients that can be created on
the RX and TX side. We introduce 4 additional read-only properties to
display this information:

 rxringsavail : The total number of RX rings available for use, i.e. not
                exclusively given to any MAC client.
 txringsavail : The total number of TX rings available for use.
 rxhwclntavail: The total number of additional RX hardware-based MAC
                clients that can be created. Each of these hardware-based
                MAC clients could have one or more RX rings assigned to
                it.
 txhwclntavail: The total number of additional TX hardware-based MAC
                clients that can be created. Each of these hardware-based
                MAC clients could have one or more TX rings assigned to
                it.

The counts reflect the current utilization/availability of the resources
listed. The counts change as hardware-based MAC clients are created or
destroyed or when their hardware properties (i.e. rxrings and txrings)
are changed.

e.g:

 # dladm show-linkprop nxge0
 LINK     PROPERTY       PERM  VALUE  DEFAULT  POSSIBLE
 ...
 nxge0    rxringsavail   r-    3      --       0-4
 nxge0    txringsavail   r-    5      --       0-6
 nxge0    rxhwclntavail  r-    1      --       0-1
 nxge0    txhwclntavail  r-    5      --       0-5
 ...

This indicates that there are 3 RX rings available (out of a maximum of
4) and 5 TX rings available (out of a maximum of 6). Additionally, only
one more RX hardware-based MAC client can be created, while 5 more TX
hardware-based MAC clients can be created.

We provide two separate counters (rxhwclntavail and txhwclntavail)
because the hardware virtualizes on RX based on groups of rings (i.e. a
MAC address is added to a group) and to keep it consistent on the TX we
do the same.

dladm show-phys will be modified to display the ring information on TX
and RX.

e.g:

 # dladm show-phys -H nxge0
 LINK     RINGTYPE  RINGS  CLIENTS
 nxge0    RX        0      <mcast>
 nxge0    TX        0,5    vnic1
 nxge0    RX        1-3    vnic1

which means vnic1 has exclusive use of 3 RX rings and 2 TX rings.

---------------------------------
dladm(1m)

!    dladm show-phys [-P] [[-p] -o field[,...]] [phys-link]

         Show the physical device and attributes of all physical
         links, or of the named physical link. Without -P, only
--- 414,420 ----
         tion.

!    dladm show-phys [-P] [[-p] -o field[,...]] [-H] [phys-link]

         Show the physical device and attributes of all physical
         links, or of the named physical link. Without -P, only
***************
*** 421,426 ****
--- 421,445 ----
         physical links that are available on the running system
         are displayed.

+        -H
+            Show hardware resource usage, as returned by the NIC
+            driver.
+            Output from -H displays the following elements:
+
+            LINK
+                A physical device corresponding to a NIC driver.
+
+            RINGTYPE
+                The type of the ring. This is either RX or TX.
+
+            RINGS
+                The ring index. A ring is a hardware resource, which
+                typically maps to a DMA channel, that can be programmed
+                for specific use. E.g. an RX ring can be programmed to
+                receive only packets belonging to a specific MAC address.
+
+            CLIENTS
+                MAC clients using the rings.
+
         -o field, --output=field
             A case-insensitive, comma-separated list of output
***************
*** 499,505 ****
         purge the link's configuration from the system.

-
     dladm create-aggr [-t] [-R root-dir] [-P policy] [-L mode] [-T time]
         [-u address] -l ether-link1 [-l ether-link2...] aggr-link
--- 518,523 ----
***************
*** 2432,2440 ****
--- 2450,2494 ----
             default is high.

+        rxringsavail
+            A read-only property that specifies the number of rings available
+            on the receive side.
+        rxrings
+            Specifies the number of receive rings for the MAC client.
+            A value of 0 means this MAC client should not be assigned any RX
+            ring. A non-0 value means reserve that many rings for this MAC
+            client, if available, and fail if not. If this property is not
+            specified the MAC client may get one RX ring, if available, or
+            will be software based.
+
+        rxhwclntavail
+
+            A read-only property that specifies the number of additional
+            RX hardware-based MAC clients that can be created.
+
+        txringsavail
+
+            A read-only property that specifies the number of rings available
+            on the transmit side.
+
+        txrings
+
+            Specifies the number of transmit rings for the MAC client.
+            A value of 0 means this MAC client should not be assigned any
+            TX ring. A non-0 value means reserve that many rings for this MAC
+            client, if available, and fail if not. If this property is not
+            specified the MAC client may get one TX ring, if available, or
+            will be software based.
+
+        txhwclntavail
+
+            A read-only property that specifies the number of additional
+            TX hardware-based MAC clients that can be created.
+
+
  SunOS 5.11          Last change: 16 Mar 2009                    37
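
For anyone who wants to check their reading of the rxrings/txrings
semantics in the man page text above, here is a rough Python sketch. It is
only an illustrative model, not the MAC layer code: the function names
assign_rings and count_rings are made up for this note, and count_rings
assumes the RINGS column of the proposed show-phys -H output is a
comma-separated list of single indices or low-high ranges.

```python
from typing import Optional


def assign_rings(requested: Optional[int], available: int) -> int:
    """Model how many rings a MAC client gets for one direction (RX or TX).

    requested is the rxrings/txrings value, or None when the property is
    not specified. available is the number of free rings (rxringsavail or
    txringsavail). A return value of 0 means the client is software based.
    """
    if requested is None:
        # Property not specified: one ring if any is free, else software based.
        return 1 if available > 0 else 0
    if requested < 0:
        raise ValueError("ring count must be >= 0")
    if requested == 0:
        # Explicitly software based: never assign a ring.
        return 0
    if requested > available:
        # A non-0 request is reserved in full or the operation fails.
        raise ValueError(
            f"requested {requested} rings, only {available} available"
        )
    return requested


def count_rings(spec: str) -> int:
    """Count the rings named by a RINGS value such as "0,5" or "1-3"."""
    total = 0
    for part in spec.split(","):
        low, dash, high = part.partition("-")
        # "1-3" is an inclusive range; a bare index is a single ring.
        total += (int(high) - int(low) + 1) if dash else 1
    return total
```

Applied to the show-phys -H example in the spec, count_rings("0,5") gives
2 TX rings and count_rings("1-3") gives 3 RX rings for vnic1.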