>-----Original Message-----
>From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Jan Scheurich
>Sent: Thursday, June 16, 2016 2:56 PM
>To: dev@openvswitch.org
>Subject: [ovs-dev] [RFC Patch] dpif-netdev: Sorted subtable vectors per
>in_port in dpcls
>
>The user-space datapath (dpif-netdev) consists of a first-level "exact match
>cache" (EMC) matching on 5-tuples and the normal megaflow classifier. With
>many parallel packet flows (e.g. TCP connections) the EMC becomes
>inefficient and the OVS forwarding performance is determined by the
>megaflow classifier.
>
>The megaflow classifier (dpcls) consists of a variable number of hash tables
>(aka subtables), each containing megaflow entries with the same mask of
>packet header and metadata fields to match upon. A dpcls lookup matches a
>given packet against all subtables in sequence until it hits a match. As
>megaflow cache entries are by construction non-overlapping, the first match
>is the only match.
>
>Today the order of the subtables in the dpcls is essentially random, so on
>average a dpcls lookup has to visit N/2 subtables for a hit, where N is the
>total number of subtables. Even though every single hash-table lookup is
>fast, the performance of the current dpcls degrades when there are many
>subtables.
>
>How the patch addresses this issue:
>
>In reality there is often a strong correlation between the ingress port and
>a small subset of subtables that have hits. The entire megaflow cache
>typically decomposes nicely into partitions that are hit only by packets
>entering from a range of similar ports (e.g. traffic from Phy -> VM vs.
>traffic from VM -> Phy).
>
>Therefore, keeping a separate list of subtables per ingress port, sorted by
>frequency of hits, reduces the average number of subtable lookups in the
>dpcls to a minimum, even if the total number of subtables gets large.
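Just to check my understanding of the lookup path with this patch (picking up
the 32-vector split and the periodic re-sort described further down in the
mail): each dpcls keeps per-in_port vectors of subtables sorted by hit
frequency, and a lookup only walks the vector selected by hashing the ingress
port. Below is a rough sketch of how I picture it, in illustrative C only --
the struct layout, constants, names and hash helpers are mine for discussion,
not taken from the patch:

/* Illustrative sketch only, not the actual patch code. */

#include <stddef.h>
#include <stdint.h>

#define NUM_PORT_VECTORS 32   /* Subtable vectors per dpcls (in_port hash). */
#define NUM_COUNT_SLOTS  32   /* Hit-count slots per vector. */

struct flow;                  /* Parsed packet headers and metadata. */
struct dpcls_rule;            /* A megaflow entry. */

/* One hash table per distinct megaflow mask. */
struct subtable {
    struct dpcls_rule *(*lookup)(struct subtable *, const struct flow *);
    /* mask, hash map of megaflow entries, ... */
};

struct subtable_vector {
    struct subtable **subtables;        /* Sorted by descending hit count. */
    size_t n_subtables;
    uint64_t hits[NUM_COUNT_SLOTS];     /* Indexed by subtable pointer hash. */
};

struct dpcls {
    struct subtable_vector vectors[NUM_PORT_VECTORS];
    uint64_t lookup_cnt;      /* Subtables visited, for the stats line. */
    uint64_t hit_cnt;         /* Successful megaflow matches. */
};

static inline uint32_t hash_u32(uint32_t x) { return x * 0x9e3779b1u; }

static inline uint32_t hash_ptr(const void *p)
{
    return hash_u32((uint32_t) ((uintptr_t) p >> 4));
}

/* Walk only the vector selected by the ingress port, in sorted order, so a
 * hit in a "hot" subtable is found after very few probes. */
struct dpcls_rule *
dpcls_lookup_port(struct dpcls *cls, uint32_t in_port, const struct flow *flow)
{
    struct subtable_vector *vec =
        &cls->vectors[hash_u32(in_port) % NUM_PORT_VECTORS];
    size_t i;

    for (i = 0; i < vec->n_subtables; i++) {
        struct subtable *st = vec->subtables[i];
        struct dpcls_rule *rule = st->lookup(st, flow);

        cls->lookup_cnt++;
        if (rule) {
            /* Count the hit; a periodic pass (e.g. once per second) would
             * re-sort vec->subtables by these counters.  "avg. subtable
             * lookups per hit" would then be lookup_cnt / hit_cnt. */
            vec->hits[hash_ptr(st) % NUM_COUNT_SLOTS]++;
            cls->hit_cnt++;
            return rule;   /* Megaflows don't overlap: first hit is final. */
        }
    }
    return NULL;
}

Is that roughly the structure you have in mind?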
I like the proposed approach of subtable prioritization for each ingress
port, thereby reducing the lookup time. +1 on this approach.

>
>The patch introduces 32 subtable vectors per dpcls and hashes the ingress
>port to select the subtable vector. The patch also counts matches per 32
>slots in each vector (hashing the subtable pointer to obtain the slot) and
>sorts the vectors according to match frequency every second.
>
>To monitor the effectiveness of the patch we have enhanced the ovs-appctl
>dpif-netdev/pmd-stats-show command with an extra line "avg. subtable
>lookups per hit" to report the average number of subtable lookups needed
>for a megaflow match. Ideally, this should be close to 1 and much smaller
>than N/2.
>
>I have benchmarked a cloud L3 overlay pipeline with a VXLAN overlay mesh.
>With pure L3 tenant traffic between VMs on different nodes the resulting
>netdev dpcls contains N=4 subtables.
>
>Disabling the EMC, I have measured a baseline performance (in+out) of ~1.32
>Mpps (64 bytes, 1000 L4 flows). The average number of subtable lookups per
>dpcls match is 2.5.
>
>With the patch the average number of subtable lookups per dpcls match goes
>down to 1.25 (apparently there are still two ports of different nature
>hashed to the same vector, otherwise it should be exactly one). Even so the
>forwarding performance grows by ~30% to 1.72 Mpps.

I ran some benchmarks and observed that the patch improves performance even
with multiple subtables around. The EMC was disabled and I had 5 VMs doing
packet forwarding. The flow rules were set up so that 8 subtables were
created, and a performance improvement of 16% was observed in this case. I
would like to try some more complex test scenarios when I get time.

Regards,
Bhanu Prakash.

>
>As the number of subtables will often be higher in reality, we can assume
>that this is at the lower end of the speed-up one can expect from this
>optimization. Just running a parallel ping between the VXLAN tunnel
>endpoints increases the number of subtables and hence the average number
>of subtable lookups from 2.5 to 3.5, with a corresponding decrease of
>throughput to 1.14 Mpps. With the patch the parallel ping has no impact on
>the average number of subtable lookups and performance. The performance
>gain is then ~50%.
>
>Signed-off-by: Jan Scheurich <jan.scheur...@ericsson.com>
>
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev