On 05/29/2017 02:31 PM, Ilya Maximets wrote:
> On 29.05.2017 16:26, Kevin Traynor wrote:
>> On 05/29/2017 01:22 PM, Ilya Maximets wrote:
>>> On 26.05.2017 20:14, Kevin Traynor wrote:
>>>> On 05/26/2017 03:55 PM, Ilya Maximets wrote:
>>>>> On 10.03.2017 07:27, Daniele Di Proietto wrote:
>>>>>> 2017-02-21 6:49 GMT-08:00 Ilya Maximets <i.maxim...@samsung.com>:
>>>>>>> Reconfiguration of HW NICs may lead to packet drops.
>>>>>>> In the current model, all physical ports are reconfigured each
>>>>>>> time the number of PMD threads changes. Since we are not stopping
>>>>>>> threads on pmd-cpu-mask changes, this patch helps to further
>>>>>>> decrease a port's downtime by setting the maximum possible number
>>>>>>> of wanted tx queues to avoid unnecessary reconfigurations.
>>>>>>>
>>>>>>> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
>>>>>>
>>>>>> I haven't thought this through a lot, but the last big series we pushed
>>>>>> on master went exactly in the opposite direction, i.e. it created one txq
>>>>>> for each thread in the datapath.
>>>>>>
>>>>>> I thought this was a good idea because:
>>>>>>
>>>>>> * On some systems with hyperthreading we can have a lot of cpus (we
>>>>>>   received reports of systems with 72 cores). If you want to use only
>>>>>>   a couple of cores, you're still forced to have a lot of unused txqs,
>>>>>>   which prevents you from having lockless tx.
>>>>>> * We thought that reconfiguring the number of pmds would not be a
>>>>>>   frequent operation.
>>>>>>
>>>>>> Why do you want to reconfigure the threads that often? Is it to be
>>>>>> able to support more throughput quickly?
>>>>>
>>>>> Yes.
>>>>>
>>>>>> In this case I think we shouldn't use the number of cpus,
>>>>>> but something else coming from the user, so that the administrator can
>>>>>> balance how quickly pmd threads can be reconfigured vs how many txqs
>>>>>> should be on the system.
>>>>>> I'm not sure how to make this user friendly though.
>>>>>>
>>>>>> What do you think?
>>>>>
>>>>> Right now I'm thinking about a combined solution which respects both
>>>>> issues (too large a number of TXQs and frequent device reconfiguration).
>>>>> I think we can implement an additional function to get a port's
>>>>> limitations. For now we can request the maximum number of TX queues
>>>>> from the netdev and use it while the number of cores is less than the
>>>>> number of queues. Something like this:
>>>>>
>>>>> lib/netdev-dpdk.c:
>>>>> uint32_t netdev_dpdk_get_max_txqs(struct netdev *netdev)
>>>>> {
>>>>>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>>>     struct rte_eth_dev_info dev_info;
>>>>>
>>>>>     ovs_mutex_lock(&dev->mutex);
>>>>>     rte_eth_dev_info_get(dev->port_id, &dev_info);
>>>>>     ovs_mutex_unlock(&dev->mutex);
>>>>>
>>>>>     return dev_info.max_tx_queues;
>>>>> }
>>>>>
>>>>> lib/dpif-netdev.c:reconfigure_datapath():
>>>>>
>>>>> <----->
>>>>>     max_tx_queues = netdev_get_max_txqs(port->netdev);
>>>>>     number_of_threads = cmap_count(&dp->poll_threads);
>>>>>     wanted_txqs = MAX(max_tx_queues, number_of_threads);
>>>>> <----->
>>>>>
>>>>> In this implementation there will be no additional locking if the
>>>>> number of threads is less than or equal to the maximum possible number
>>>>> of tx queues in HW.
>>>>>
>>>>> What do you think? Ian? Darrell?
>>>>>
>>>>
>>>> I'm not sure about using rte_eth_dev_info_get() as IIRC there were
>>>> problems previously with it reporting a number, but then when you went
>>>> to configure them they were not all available depending on the mode.
>>>> That was why the trial-and-error queue configure was put in.
>>>>
>>>> What about replacing 'max_tx_queues' above with a number like 16 that is
>>>> likely to be supported by the ports and unlikely to be exceeded by
>>>> number_of_threads?
>>>>
>>>> Kevin.
>>>
>>> Hi Kevin. Thank you for reminding me of this issue.
>>>
>>> But I think that magic numbers are not a good solution.
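[Editor's note] The selection logic sketched above can be modeled in isolation. This is a minimal, hedged sketch, not OVS code: `wanted_txqs()` and `tx_needs_locking()` are invented names, `hw_max_txqs` stands in for the value `rte_eth_dev_info_get()` would report, and `n_threads` stands in for `cmap_count(&dp->poll_threads)`.

```c
/* Stand-alone model of the proposed queue-count selection. All names are
 * hypothetical stand-ins for the OVS/DPDK values discussed in the thread. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Request the HW maximum so that later PMD-thread changes need no port
 * reconfiguration, but never request fewer queues than there are threads. */
static int
wanted_txqs(int hw_max_txqs, int n_threads)
{
    return MAX(hw_max_txqs, n_threads);
}

/* Tx remains lockless only while every thread can own its own queue. */
static int
tx_needs_locking(int configured_txqs, int n_threads)
{
    return configured_txqs < n_threads;
}
```

With this model, a NIC reporting 16 queues and 4 PMD threads stays lockless across thread-count changes up to 16, which is the reconfiguration-avoidance property the proposal is after.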
>>>
>>> One issue in my first implementation is that the desired number of
>>> queues is not actually the same as the required number.
>>>
>>> How about something like this:
>>> <----------------------------------------------------------------->
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index 97f72d3..1a18e4f 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -3448,7 +3448,7 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>  {
>>>      struct dp_netdev_pmd_thread *pmd;
>>>      struct dp_netdev_port *port;
>>> -    int wanted_txqs;
>>> +    int needed_txqs, wanted_txqs;
>>>
>>>      dp->last_reconfigure_seq = seq_read(dp->reconfigure_seq);
>>>
>>> @@ -3456,7 +3456,15 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>       * on the system and the user configuration. */
>>>      reconfigure_pmd_threads(dp);
>>>
>>> -    wanted_txqs = cmap_count(&dp->poll_threads);
>>> +    /* We need 1 Tx queue for each thread to avoid locking, but we will try
>>> +     * to allocate the maximum possible value to minimize the number of port
>>> +     * reconfigurations. */
>>> +    needed_txqs = cmap_count(&dp->poll_threads);
>>> +    /* (n_cores + 1) is the maximum that we might need to have.
>>> +     * The additional queue is for non-PMD threads. */
>>> +    wanted_txqs = ovs_numa_get_n_cores();
>>> +    ovs_assert(wanted_txqs != OVS_CORE_UNSPEC);
>>> +    wanted_txqs++;
>>>
>>>      /* The number of pmd threads might have changed, or a port can be new:
>>>       * adjust the txqs. */
>>> @@ -3469,9 +3477,13 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>
>>>      /* Check for all the ports that need reconfiguration. We cache this in
>>>       * 'port->reconfigure', because netdev_is_reconf_required() can change at
>>> -     * any time. */
>>> +     * any time. Also mark for reconfiguration all ports which will likely
>>> +     * change their 'dynamic_txqs' parameter. It's required to stop using
>>> +     * them before changing this setting. */
>>>      HMAP_FOR_EACH (port, node, &dp->ports) {
>>> -        if (netdev_is_reconf_required(port->netdev)) {
>>> +        if (netdev_is_reconf_required(port->netdev)
>>> +            || (port->dynamic_txqs
>>> +                != netdev_n_txq(port->netdev) < needed_txqs)) {
>>
>> I'm not quite sure why there needs to be a reconfigure here. It will
>> again try with the same wanted_txqs. Is it not enough to just update the
>> dynamic_txqs flag as is done later?
>
> It's not enough because reconfiguration will not be requested if we're
> trying to set up the same number of queues a second time.
>
My question is - why request a reconfigure at all based on needed_txqs? We
already configured requesting wanted_txqs (= ovs_numa_get_n_cores()), and now
we would configure again requesting the same wanted_txqs, regardless of the
new needed_txqs value. Could we just update the dynamic_txqs flag (as you
have below) to reflect any new difference between the configured txqs and
needed_txqs? Maybe I missed some part?

Kevin.

> See lib/netdev-dpdk.c:netdev_dpdk_set_tx_multiq() for details:
>
>     if (dev->requested_n_txq == n_txq) {
>         goto out;
>     }
>
>>>              port->need_reconfigure = true;
>>>          }
>>>      }
>>> @@ -3505,7 +3517,7 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>          seq_change(dp->port_seq);
>>>          port_destroy(port);
>>>      } else {
>>> -        port->dynamic_txqs = netdev_n_txq(port->netdev) < wanted_txqs;
>>> +        port->dynamic_txqs = netdev_n_txq(port->netdev) < needed_txqs;
>>>      }
>>>  }
>>>
>>> <----------------------------------------------------------------->
>>>
>>> Best regards, Ilya Maximets.
>>>
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Daniele
>>>>>>
>>>>>>> ---
>>>>>>>  lib/dpif-netdev.c | 6 +++++-
>>>>>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>>>>>> index 6e575ab..e2b4f39 100644
>>>>>>> --- a/lib/dpif-netdev.c
>>>>>>> +++ b/lib/dpif-netdev.c
>>>>>>> @@ -3324,7 +3324,11 @@ reconfigure_datapath(struct dp_netdev *dp)
>>>>>>>       * on the system and the user configuration. */
>>>>>>>      reconfigure_pmd_threads(dp);
>>>>>>>
>>>>>>> -    wanted_txqs = cmap_count(&dp->poll_threads);
>>>>>>> +    /* We need 1 Tx queue for each possible cpu core. */
>>>>>>> +    wanted_txqs = ovs_numa_get_n_cores();
>>>>>>> +    ovs_assert(wanted_txqs != OVS_CORE_UNSPEC);
>>>>>>> +    /* And 1 Tx queue for non-PMD threads. */
>>>>>>> +    wanted_txqs++;
>>>>>>>
>>>>>>>      /* The number of pmd threads might have changed, or a port can be new:
>>>>>>>       * adjust the txqs. */
>>>>>>> --
>>>>>>> 2.7.4
>>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> dev mailing list
>>>>> d...@openvswitch.org
>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
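[Editor's note] The needed/wanted split and the reconfigure short-circuit debated above can be modeled in isolation. This is a hedged sketch with invented names (`struct fake_port` and the helper functions are illustrative, not the actual OVS data structures); only the short-circuit shape of `set_tx_multiq()` is taken from the quoted snippet.

```c
#include <stdbool.h>

/* Hypothetical model: 'needed_txqs' is what lockless tx requires (one queue
 * per thread); 'wanted_txqs' is what is requested from the device
 * (n_cores + 1, the extra queue being for non-PMD threads). */
struct fake_port {
    int requested_n_txq;   /* Last queue count requested from the device. */
    int n_txq;             /* Queue count the device actually configured. */
    bool dynamic_txqs;     /* True if locked (dynamic) txq selection is on. */
};

/* Mirrors the short-circuit quoted from netdev_dpdk_set_tx_multiq():
 * requesting the same number a second time triggers no reconfiguration. */
static bool
set_tx_multiq(struct fake_port *port, int n_txq)
{
    if (port->requested_n_txq == n_txq) {
        return false;      /* No reconfiguration requested. */
    }
    port->requested_n_txq = n_txq;
    return true;
}

/* True iff the first of two identical requests triggers a reconfiguration
 * and the repeated request is ignored. */
static bool
repeat_request_is_ignored(int n_txq)
{
    struct fake_port port = { 0, 0, false };
    bool first = set_tx_multiq(&port, n_txq);
    bool second = set_tx_multiq(&port, n_txq);
    return first && !second;
}

/* Returns the dynamic_txqs flag for a port that got 'got_txqs' queues when
 * 'needed_txqs' are required for lockless operation. */
static bool
dynamic_txqs_after_reconfigure(int got_txqs, int needed_txqs)
{
    struct fake_port port = { got_txqs, got_txqs, false };
    port.dynamic_txqs = port.n_txq < needed_txqs;
    return port.dynamic_txqs;
}
```

In this model, Kevin's point holds: repeating the same request is a no-op, so only the `dynamic_txqs` flag changes when `needed_txqs` moves while the configured queue count stays the same.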