On 08/17/2017 05:21 PM, Jan Scheurich wrote:
> Good discussion. Some thoughts:
>
> 1. Prioritizing queues by assigning them to dedicated PMDs is a simple
> and effective but very crude method, considering that you have to
> reserve an entire (logical) core for that. So I am all for a more
> economic and perhaps slightly less deterministic option!
>
Sure - if you have the ability to effectively prioritize traffic on all
ports, then I agree. At present you would only be able to prioritize
traffic from a 2-rxq i40e, which would mean any other high-priority
traffic may get penalized if it lands on the same pmd. I'm not sure such
a limited use case would really be useful.

Kevin.

> 2. Offering the option to prioritize certain queues in OVS-DPDK is a
> highly desirable feature. We have at least one important use case in
> OpenStack (prioritizing "in-band" infrastructure control plane traffic
> over tenant data, in case both are carried on the same physical
> network). In our case the traffic separation would be done per VLAN.
> Can we add this to the list of supported filters?
>
> 3. It would be nice to be able to combine priority queues with filters
> with a number of RSS queues without a filter. Is this an XL710 HW
> limitation or only a limitation of the drivers and DPDK APIs?
>
> BR, Jan
>
>
>> From: ovs-dev-boun...@openvswitch.org
>> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of O Mahony, Billy
>> Sent: Thursday, 17 August, 2017 18:07
>> To: Kevin Traynor <ktray...@redhat.com>; d...@openvswitch.org
>> Subject: Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive traffic
>>
>> Hi Kevin,
>>
>> Thanks for the comments - more inline.
>>
>> Billy.
>>
>>> -----Original Message-----
>>> From: Kevin Traynor [mailto:ktray...@redhat.com]
>>> Sent: Thursday, August 17, 2017 3:37 PM
>>> To: O Mahony, Billy <billy.o.mah...@intel.com>; d...@openvswitch.org
>>> Subject: Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive
>>> traffic
>>>
>>> Hi Billy,
>>>
>>> I just happened to be about to send a reply to the previous patchset,
>>> so adding comments here instead.
>>>
>>> On 08/17/2017 03:24 PM, Billy O'Mahony wrote:
>>>> Hi All,
>>>>
>>>> v2: Addresses various review comments; applies cleanly on 0bedb3d6.
>>>>
>>>> This patch set provides a method to request ingress scheduling on
>>>> interfaces. It also provides an implementation of same for DPDK
>>>> physical ports.
>>>>
>>>> This allows specific packet types to be:
>>>> * forwarded to their destination port ahead of other packets,
>>>> and/or
>>>> * less likely to be dropped in an overloaded situation.
>>>>
>>>> It was previously discussed at
>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2017-May/044395.html
>>>> and RFC'd at
>>>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335237.html
>>>>
>>>> Limitations of this patch:
>>>> * The patch uses the Flow Director filter API in DPDK and has only
>>>>   been tested on a Fortville (XL710) NIC.
>>>> * Prioritization is limited to:
>>>>   ** eth_type
>>>>   ** fully specified 5-tuple (src & dst IP addresses and port
>>>>      numbers) for UDP & TCP packets
>>>> * ovs-appctl dpif-netdev/pmd-*-show output should indicate rxq
>>>>   prioritization.
>>>> * any requirements for a more granular prioritization mechanism
>>>>
>>>
>>> In general I like the idea of splitting priority traffic to a specific
>>> queue, but I have concerns about the implementation. I shared most of
>>> these when we met already, but am adding them here too. Not a detailed
>>> review.
>> [[BO'M]] No worries. If we get the high level sorted out first, the
>> details will fall into place :)
>>>
>>> - It is using the deprecated DPDK filter API.
>>> http://dpdk.org/doc/guides/rel_notes/deprecation.html
>> [[BO'M]] Yes, it looks like a move to the shiny new Flow API is in
>> order.
>>>
>>> - It is an invasive change that seems to be for only one Intel NIC in
>>> the DPDK datapath. Even then it is very limited, as it only works when
>>> that Intel NIC is using exactly 2 rx queues.
>> [[BO'M]] That's the current case, but it is really a limitation of the
>> Flow Director API/DPDK/XL710 combination. Maybe the Flow API will allow
>> us to RSS over many queues and place the prioritized traffic on another
>> queue.
>>>
>>> - It's a hardcoded, opaque QoS which will have a negative impact on
>>> whichever queues happen to land on the same pmd, so it's unpredictable
>>> which queues will be affected. It could affect other latency-sensitive
>>> traffic that cannot be prioritized because of the limitations above.
>>>
>>> - I guess multiple priority queues could land on the same pmd and
>>> starve each other?
>> [[BO'M]] Interaction with pmd assignment is definitely an issue that
>> needs to be addressed. I know there is work in flight in that regard,
>> so it will be easier to address that when the in-flight work lands.
>>>
>>> I think a more general, less restricted scheme using the DPDK rte_flow
>>> API, with controls on the effects on other traffic, is needed. Perhaps
>>> if a user is very concerned with latency on traffic from a port, they
>>> would be OK with dedicating a pmd to it.
>> [[BO'M]] You are proposing to prioritize queues by allocating a single
>> pmd to them rather than by changing the pmd's read algorithm to favor
>> prioritized queues? For sure, that could be another implementation of
>> the solution.
>>
>> If we look at the patch set as containing two distinct things, as per
>> the cover letter ("the patch set provides a method to request ingress
>> scheduling on interfaces. It also provides an implementation of same
>> for DPDK physical ports"), then this would change the second part but
>> the first would still be valid. Each port type would in any case have
>> to come up with its own implementation; it's just that for non-physical
>> ports, which cannot offload the prioritization decision, it's not worth
>> the effort, as was noted in an earlier RFC.
>>
>>>
>>> thanks,
>>> Kevin.
>>>
>>>> Initial results:
>>>> * even when userspace OVS is very much overloaded and dropping
>>>>   significant numbers of packets, the drop rate for prioritized
>>>>   traffic is running at 1/1000th of the drop rate for non-prioritized
>>>>   traffic.
>>>>
>>>> * the latency profile of prioritized traffic through userspace OVS
>>>>   is also much improved
>>>>
>>>>  1e0 |*
>>>>      |*
>>>> 1e-1 |*                              | Non-prioritized pkt latency
>>>>      |*                              * Prioritized pkt latency
>>>> 1e-2 |*
>>>>      |*
>>>> 1e-3 |*  |
>>>>      |*  |
>>>> 1e-4 |*  |    |     |
>>>>      |*  |*   |     |
>>>> 1e-5 |*  |*   |     |     |
>>>>      |*  |*   |*    |     |     |
>>>> 1e-6 |*  |*   |*    |*    |     |
>>>>      |*  |*   |*    |*    |*    |
>>>> 1e-7 |*  |*   |*    |*    |*    |*
>>>>      |*  |*   |*    |*    |*    |*
>>>> 1e-8 |*  |*   |*    |*    |*    |*
>>>>      0-1 1-20 20-40 40-50 50-60 60-70 ... 120-400
>>>>                    Latency (us)
>>>>
>>>>   Proportion of packets per latency bin @ 80% Max Throughput
>>>>                    (Log scale)
>>>>
>>>> Regards,
>>>> Billy.
>>>>
>>>> Billy O'Mahony (4):
>>>>   netdev: Add set_ingress_sched to netdev api
>>>>   netdev-dpdk: Apply ingress_sched config to dpdk phy ports
>>>>   dpif-netdev: Add rxq prioritization
>>>>   docs: Document ingress scheduling feature
>>>>
>>>>  Documentation/howto/dpdk.rst    |  31 +++++++
>>>>  include/openvswitch/ofp-parse.h |   3 +
>>>>  lib/dpif-netdev.c               |  25 ++++--
>>>>  lib/netdev-bsd.c                |   1 +
>>>>  lib/netdev-dpdk.c               | 192 +++++++++++++++++++++++++++++++++++++++-
>>>>  lib/netdev-dummy.c              |   1 +
>>>>  lib/netdev-linux.c              |   1 +
>>>>  lib/netdev-provider.h           |  10 +++
>>>>  lib/netdev-vport.c              |   1 +
>>>>  lib/netdev.c                    |  22 +++++
>>>>  lib/netdev.h                    |   1 +
>>>>  vswitchd/bridge.c               |   4 +
>>>>  vswitchd/vswitch.xml            |  31 +++++++
>>>>  13 files changed, 315 insertions(+), 8 deletions(-)

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
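For readers trying to picture the alternative Billy mentions above
(changing the pmd's read algorithm to favor prioritized rxqs rather than
dedicating a pmd to them), here is a rough, self-contained sketch. All
identifiers (pmd_rxq, PRIO_BOOST, pmd_receive_and_process) are stand-ins
invented for illustration, not names from the patch set or from
lib/dpif-netdev.c.

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rxq descriptor; 'prioritized' would be set from the
 * port's ingress scheduling configuration. */
struct pmd_rxq {
    struct pmd_rxq *next;
    bool prioritized;
};

/* Stub: in OVS this is roughly where netdev_rxq_recv() would be called
 * and the resulting burst pushed through packet processing. */
static void
pmd_receive_and_process(struct pmd_rxq *q)
{
    (void) q;
}

/* Made-up knob: how many extra polls a prioritized rxq gets per cycle,
 * so its packets see shorter queueing delay and fewer rx-ring drops. */
#define PRIO_BOOST 4

static void
pmd_poll_cycle(struct pmd_rxq *rxqs)
{
    struct pmd_rxq *q;
    int i;

    /* Drain prioritized queues several times per cycle. */
    for (i = 0; i < PRIO_BOOST; i++) {
        for (q = rxqs; q; q = q->next) {
            if (q->prioritized) {
                pmd_receive_and_process(q);
            }
        }
    }
    /* Every queue, prioritized or not, still gets one regular poll. */
    for (q = rxqs; q; q = q->next) {
        pmd_receive_and_process(q);
    }
}

The final unconditional pass is what keeps best-effort queues from being
starved outright; it also bears on Kevin's concern about several
priority queues landing on the same pmd: under a scheme like this they
would dilute each other's boost but never block other queues completely.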