> > Few comments inline. Thanks for the feedback Ilya.
> > > The 'options:n_rxq_desc' and 'n_txq_desc' fields allow the number of rx > > and tx descriptors for dpdk ports to be modified. By default the values > > are set to 2048, but can be modified to an integer between 1 and 4096 > > that is a power of two. The values can be modified at runtime, however > > require the NIC to restart when changed. > > > > Signed-off-by: Ciara Loftus <ciara.loftus at intel.com> > > > > --- > > v3: > > * Make queue sizes per-port rather than global > > * Check if queue size is power of 2 - fail if so. > > > > v2: > > * Rebase > > > > INSTALL.DPDK-ADVANCED.md | 16 ++++++++++++++-- > > NEWS | 2 ++ > > lib/netdev-dpdk.c | 48 > +++++++++++++++++++++++++++++++++++++++++++----- > > vswitchd/vswitch.xml | 22 ++++++++++++++++++++++ > > 4 files changed, 81 insertions(+), 7 deletions(-) > > > > diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md > > index d7b9873..488e84f 100644 > > --- a/INSTALL.DPDK-ADVANCED.md > > +++ b/INSTALL.DPDK-ADVANCED.md > > @@ -257,7 +257,19 @@ needs to be affinitized accordingly. > > The rx queues are assigned to pmd threads on the same NUMA node in a > > round-robin fashion. > > > > -### 4.4 Exact Match Cache > > +### 4.4 DPDK Physical Port Queue Sizes > > + `ovs-vsctl set Interface dpdk0 options:n_rxq_desc=<integer>` > > + `ovs-vsctl set Interface dpdk0 options:n_txq_desc=<integer>` > > + > > + The command above sets the number of rx/tx descriptors that the NIC > > + associated with dpdk0 will be initialised with. > > + > > + Different 'n_rxq_desc' and 'n_txq_desc' configurations yield different > > + benefits in terms of throughput and latency for different scenarios. > > + Generally, smaller queue sizes can have a positive impact for latency at > the > > + expense of throughput. The opposite is often true for larger queue sizes. > > Here we can mention that increasing the number of rx descriptors may lead > to performance degradation because of using non-vectorized rx functions. > At least this is true for i40e and maybe ixgbe dpdk drivers. Setting > 'n_rxq_desc=4096' for them will lead to disabling of vectorized rx. It seems the same applies for ixgbe (IXGBE_MAX_RING_DESC=4096) http://dpdk.org/doc/guides/nics/ixgbe.html#rx-constraints I will include this info in the next version. > > > + > > +### 4.5 Exact Match Cache > > > > Each pmd thread contains one EMC. After initial flow setup in the > > datapath, the EMC contains a single table and provides the lowest level > > @@ -274,7 +286,7 @@ needs to be affinitized accordingly. > > avoiding datapath classifier lookups is to have multiple pmd threads > > running. This can be done as described in section 4.2. > > > > -### 4.5 Rx Mergeable buffers > > +### 4.6 Rx Mergeable buffers > > > > Rx Mergeable buffers is a virtio feature that allows chaining of multiple > > virtio descriptors to handle large packet sizes. As such, large packets > > diff --git a/NEWS b/NEWS > > index 21ab538..901886d 100644 > > --- a/NEWS > > +++ b/NEWS > > @@ -125,6 +125,8 @@ v2.6.0 - xx xxx xxxx > > * Remove dpdkvhostcuse port type. > > * OVS client mode for vHost and vHost reconnect (Requires QEMU 2.7) > > * 'dpdkvhostuserclient' port type. > > + * New option 'n_rxq_desc' and 'n_txq_desc' fields for DPDK interfaces > > + which set the number of rx and tx descriptors to use for the given > port. > > - Increase number of registers to 16. > > - ovs-benchmark: This utility has been removed due to lack of use and > > bitrot. > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > > index 89bdc4d..228993f 100644 > > --- a/lib/netdev-dpdk.c > > +++ b/lib/netdev-dpdk.c > > @@ -132,8 +132,9 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / > ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF)) > > > > #define SOCKET0 0 > > > > -#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue, > Max (n+32<=4096)*/ > > -#define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue, > Max (n+32<=4096)*/ > > +#define NIC_PORT_DEFAULT_RXQ_SIZE 2048 /* Default size of Physical > NIC RXQ */ > > +#define NIC_PORT_DEFAULT_TXQ_SIZE 2048 /* Default size of Physical > NIC TXQ */ > > +#define NIC_PORT_MAX_Q_SIZE 4096 /* Maximum size of Physical > NIC Queue */ > > > > #define OVS_VHOST_MAX_QUEUE_NUM 1024 /* Maximum number of > vHost TX queues. */ > > #define OVS_VHOST_QUEUE_MAP_UNKNOWN (-1) /* Mapping not > initialized. */ > > @@ -372,6 +373,12 @@ struct netdev_dpdk { > > int requested_mtu; > > int requested_n_txq; > > int requested_n_rxq; > > + int requested_rxq_size; > > + int requested_txq_size; > > + > > + /* Number of rx/tx descriptors for physical devices */ > > + int rxq_size; > > + int txq_size; > > > > /* Socket ID detected when vHost device is brought up */ > > int requested_socket_id; > > @@ -646,7 +653,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk > *dev, int n_rxq, int n_txq) > > } > > > > for (i = 0; i < n_txq; i++) { > > - diag = rte_eth_tx_queue_setup(dev->port_id, i, > NIC_PORT_TX_Q_SIZE, > > + diag = rte_eth_tx_queue_setup(dev->port_id, i, dev->txq_size, > > dev->socket_id, NULL); > > if (diag) { > > VLOG_INFO("Interface %s txq(%d) setup error: %s", > > @@ -662,7 +669,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk > *dev, int n_rxq, int n_txq) > > } > > > > for (i = 0; i < n_rxq; i++) { > > - diag = rte_eth_rx_queue_setup(dev->port_id, i, > NIC_PORT_RX_Q_SIZE, > > + diag = rte_eth_rx_queue_setup(dev->port_id, i, dev->rxq_size, > > dev->socket_id, NULL, > > dev->dpdk_mp->mp); > > if (diag) { > > @@ -837,6 +844,10 @@ netdev_dpdk_init(struct netdev *netdev, > unsigned int port_no, > > netdev->n_txq = NR_QUEUE; > > dev->requested_n_rxq = netdev->n_rxq; > > dev->requested_n_txq = netdev->n_txq; > > + dev->rxq_size = NIC_PORT_DEFAULT_RXQ_SIZE; > > + dev->txq_size = NIC_PORT_DEFAULT_TXQ_SIZE; > > + dev->requested_rxq_size = dev->rxq_size; > > + dev->requested_txq_size = dev->txq_size; > > > > /* Initialize the flow control to NULL */ > > memset(&dev->fc_conf, 0, sizeof dev->fc_conf); > > @@ -1051,6 +1062,8 @@ netdev_dpdk_get_config(const struct netdev > *netdev, struct smap *args) > > smap_add_format(args, "configured_rx_queues", "%d", netdev- > >n_rxq); > > smap_add_format(args, "requested_tx_queues", "%d", dev- > >requested_n_txq); > > smap_add_format(args, "configured_tx_queues", "%d", netdev- > >n_txq); > > + smap_add_format(args, "rxq_descriptors", "%d", dev->rxq_size); > > + smap_add_format(args, "txq_descriptors", "%d", dev->txq_size); > > smap_add_format(args, "mtu", "%d", dev->mtu); > > ovs_mutex_unlock(&dev->mutex); > > > > @@ -1069,6 +1082,21 @@ dpdk_set_rxq_config(struct netdev_dpdk *dev, > const struct smap *args) > > } > > } > > > > +static void > > +dpdk_process_queue_size(struct netdev *netdev, const struct smap > *args, > > + char *flag, int *new_size) > > +{ > > + int queue_size; > > + > > + queue_size = smap_get_int(args, flag, 0); > > + if (queue_size > 0 && queue_size <= NIC_PORT_MAX_Q_SIZE > > + && rte_is_power_of_2(queue_size) > > I'm suggesting to use OVS internal functions to do the check: 'IS_POW2()' > macro or function 'is_pow2()'. > > > + && queue_size != *new_size) { > > + *new_size = queue_size; > > + netdev_request_reconfigure(netdev); > > + } > > Also, some sane error message would be nice in case of wrong value. This is generally avoided for set_config since it is called so often. An error isn't printed in when setting n_rxq for example. If the user sets a bogus value you can end up with many prints of the same log. And these logs again later when you set another value eg. n_rxq. For this reason I don't plan to make this change. Thanks, Ciara > > > +} > > + > > static int > > netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args) > > { > > @@ -1078,6 +1106,11 @@ netdev_dpdk_set_config(struct netdev > *netdev, const struct smap *args) > > > > dpdk_set_rxq_config(dev, args); > > > > + dpdk_process_queue_size(netdev, args, "n_rxq_desc", > > + &dev->requested_rxq_size); > > + dpdk_process_queue_size(netdev, args, "n_txq_desc", > > + &dev->requested_txq_size); > > + > > /* Flow control support is only available for DPDK Ethernet ports. */ > > bool rx_fc_en = false; > > bool tx_fc_en = false; > > @@ -2923,7 +2956,9 @@ netdev_dpdk_reconfigure(struct netdev > *netdev) > > > > if (netdev->n_txq == dev->requested_n_txq > > && netdev->n_rxq == dev->requested_n_rxq > > - && dev->mtu == dev->requested_mtu) { > > + && dev->mtu == dev->requested_mtu > > + && dev->rxq_size == dev->requested_rxq_size > > + && dev->txq_size == dev->requested_txq_size) { > > /* Reconfiguration is unnecessary */ > > > > goto out; > > @@ -2938,6 +2973,9 @@ netdev_dpdk_reconfigure(struct netdev > *netdev) > > netdev->n_txq = dev->requested_n_txq; > > netdev->n_rxq = dev->requested_n_rxq; > > > > + dev->rxq_size = dev->requested_rxq_size; > > + dev->txq_size = dev->requested_txq_size; > > + > > rte_free(dev->tx_q); > > err = dpdk_eth_dev_init(dev); > > netdev_dpdk_alloc_txq(dev, netdev->n_txq); > > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > > index e73023d..1e96ada 100644 > > --- a/vswitchd/vswitch.xml > > +++ b/vswitchd/vswitch.xml > > @@ -2375,6 +2375,28 @@ > > Only supported by dpdkvhostuserclient interfaces. > > </p> > > </column> > > + > > + <column name="options" key="n_rxq_desc" > > + type='{"type": "integer", "minInteger": 1, "maxInteger": > > 4096}'> > > + <p> > > + Specifies the rx queue size (number rx descriptors) for dpdk > > ports. > > + The value must be a multiple of 2, less than 4096 and supported > > s/multiple/power/ > > > + by the hardware of the device being configured. > > + If not specified or an incorrect value is specified, 2048 rx > > + descriptors will be used by default. > > + </p> > > + </column> > > + > > + <column name="options" key="n_txq_desc" > > + type='{"type": "integer", "minInteger": 1, "maxInteger": > > 4096}'> > > + <p> > > + Specifies the tx queue size (number tx descriptors) for dpdk > > ports. > > + The value must be a multiple of 2, less than 4096 and supported > > s/multiple/power/ > > > + by the hardware of the device being configured. > > + If not specified or an incorrect value is specified, 2048 tx > > + descriptors will be used by default. > > + </p> > > + </column> > > </group> > > > > <group title="MTU"> > > -- > > 2.4.3 _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev