On Mon, Jun 25, 2018 at 11:04 AM, Amritha Nambiar <amritha.namb...@intel.com> wrote:
> Signed-off-by: Amritha Nambiar <amritha.namb...@intel.com>
Acked-by: Tom Herbert <t...@quantonium.net>

> ---
> Documentation/ABI/testing/sysfs-class-net-queues | 11 ++++
> Documentation/networking/scaling.txt | 57 ++++++++++++++++++----
> 2 files changed, 58 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-class-net-queues b/Documentation/ABI/testing/sysfs-class-net-queues
> index 0c0df91..978b763 100644
> --- a/Documentation/ABI/testing/sysfs-class-net-queues
> +++ b/Documentation/ABI/testing/sysfs-class-net-queues
> @@ -42,6 +42,17 @@ Description:
> network device transmit queue. Possible vaules depend on the
> number of available CPU(s) in the system.
>
> +What: /sys/class/<iface>/queues/tx-<queue>/xps_rxqs
> +Date: June 2018
> +KernelVersion: 4.18.0
> +Contact: netdev@vger.kernel.org
> +Description:
> + Mask of the receive queue(s) currently enabled to participate
> + into the Transmit Packet Steering packet processing flow for this
> + network device transmit queue. Possible values depend on the
> + number of available receive queue(s) in the network device.
> + Default is disabled.
> +
> What: /sys/class/<iface>/queues/tx-<queue>/byte_queue_limits/hold_time
> Date: November 2011
> KernelVersion: 3.3
> diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
> index f55639d..8336116 100644
> --- a/Documentation/networking/scaling.txt
> +++ b/Documentation/networking/scaling.txt
> @@ -366,8 +366,13 @@ XPS: Transmit Packet Steering
>
> Transmit Packet Steering is a mechanism for intelligently selecting
> which transmit queue to use when transmitting a packet on a multi-queue
> -device. To accomplish this, a mapping from CPU to hardware queue(s) is
> -recorded. The goal of this mapping is usually to assign queues
> +device. This can be accomplished by recording two kinds of maps, either
> +a mapping of CPU to hardware queue(s) or a mapping of receive queue(s)
> +to hardware transmit queue(s).
> +
> +1. XPS using CPUs map
> +
> +The goal of this mapping is usually to assign queues
> exclusively to a subset of CPUs, where the transmit completions for
> these queues are processed on a CPU within this set. This choice
> provides two benefits. First, contention on the device queue lock is
> @@ -377,12 +382,35 @@ transmit queue). Secondly, cache miss rate on transmit completion is
> reduced, in particular for data cache lines that hold the sk_buff
> structures.
>
> -XPS is configured per transmit queue by setting a bitmap of CPUs that
> -may use that queue to transmit. The reverse mapping, from CPUs to
> -transmit queues, is computed and maintained for each network device.
> -When transmitting the first packet in a flow, the function
> -get_xps_queue() is called to select a queue. This function uses the ID
> -of the running CPU as a key into the CPU-to-queue lookup table. If the
> +2. XPS using receive queues map
> +
> +This mapping is used to pick transmit queue based on the receive
> +queue(s) map configuration set by the administrator. A set of receive
> +queues can be mapped to a set of transmit queues (many:many), although
> +the common use case is a 1:1 mapping. This will enable sending packets
> +on the same queue associations for transmit and receive. This is useful for
> +busy polling multi-threaded workloads where there are challenges in
> +associating a given CPU to a given application thread. The application
> +threads are not pinned to CPUs and each thread handles packets
> +received on a single queue. The receive queue number is cached in the
> +socket for the connection. In this model, sending the packets on the same
> +transmit queue corresponding to the associated receive queue has benefits
> +in keeping the CPU overhead low. Transmit completion work is locked into
> +the same queue-association that a given application is polling on. This
> +avoids the overhead of triggering an interrupt on another CPU. When the
> +application cleans up the packets during the busy poll, transmit completion
> +may be processed along with it in the same thread context and so result in
> +reduced latency.
> +
> +XPS is configured per transmit queue by setting a bitmap of
> +CPUs/receive-queues that may use that queue to transmit. The reverse
> +mapping, from CPUs to transmit queues or from receive-queues to transmit
> +queues, is computed and maintained for each network device. When
> +transmitting the first packet in a flow, the function get_xps_queue() is
> +called to select a queue. This function uses the ID of the receive queue
> +for the socket connection for a match in the receive queue-to-transmit queue
> +lookup table. Alternatively, this function can also use the ID of the
> +running CPU as a key into the CPU-to-queue lookup table. If the
> ID matches a single queue, that is used for transmission. If multiple
> queues match, one is selected by using the flow hash to compute an index
> into the set.
> @@ -404,11 +432,15 @@ acknowledged.
>
> XPS is only available if the kconfig symbol CONFIG_XPS is enabled (on by
> default for SMP). The functionality remains disabled until explicitly
> -configured. To enable XPS, the bitmap of CPUs that may use a transmit
> -queue is configured using the sysfs file entry:
> +configured. To enable XPS, the bitmap of CPUs/receive-queues that may
> +use a transmit queue is configured using the sysfs file entry:
>
> +For selection based on CPUs map:
> /sys/class/net/<dev>/queues/tx-<n>/xps_cpus
>
> +For selection based on receive-queues map:
> +/sys/class/net/<dev>/queues/tx-<n>/xps_rxqs
> +
> == Suggested Configuration
>
> For a network device with a single transmission queue, XPS configuration
> @@ -421,6 +453,11 @@ best CPUs to share a given queue are probably those that share the cache
> with the CPU that processes transmit completions for that queue
> (transmit interrupts).
>
> +For transmit queue selection based on receive queue(s), XPS has to be
> +explicitly configured mapping receive-queue(s) to transmit queue(s). If the
> +user configuration for receive-queue map does not apply, then the transmit
> +queue is selected based on the CPUs map.
> +
> Per TX Queue rate limitation:
> =============================
>
>
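
For anyone reading this in the archive, the lookup order the patch documents (receive-queue map first, CPU map as the fallback, flow hash to break ties) can be sketched roughly as below. This is only an illustration and not the kernel's get_xps_queue(): the names rxq_mask and select_tx_queue, the dict-based maps, and the use of a plain modulo to turn the flow hash into an index are all assumptions made for the example. rxq_mask also assumes queue indices small enough that the mask fits in a single hex word.

```python
def rxq_mask(rx_queues):
    """Build the hex bitmap string for a set of receive queue indices.

    Hypothetical helper: bit N set means receive queue N may use the
    transmit queue whose xps_rxqs file the mask is written to.
    """
    mask = 0
    for q in rx_queues:
        mask |= 1 << q
    return format(mask, "x")


def select_tx_queue(rxqs_map, cpus_map, rx_queue, cpu, flow_hash):
    """Pick a transmit queue, trying the receive-queue map before the CPU map.

    rxqs_map / cpus_map are illustrative reverse maps:
    {rx queue or CPU id: [transmit queue ids]}.
    Returns None when neither map yields a candidate.
    """
    candidates = []
    if rx_queue is not None:
        candidates = rxqs_map.get(rx_queue, [])
    if not candidates:
        # The receive-queue map does not apply: fall back to the CPUs map.
        candidates = cpus_map.get(cpu, [])
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Several queues match: use the flow hash to index into the set
    # (modulo is a simplification chosen for this sketch).
    return candidates[flow_hash % len(candidates)]
```

With a 1:1 mapping such as {2: [5]}, a socket whose cached receive queue is 2 would transmit on queue 5 whichever CPU the thread happens to run on. That is the property the busy-polling use case in the patch relies on.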