On 18/04/17 23:44, Bodireddy, Bhanuprakash wrote:
Hi Bhanuprakash,
I was doing some Physical to Virtual tests, and whenever the number of flows
reached the rx batch size, performance dropped a lot. I created an
experimental patch where I added an intermediate queue and flushed it at the
end of the rx batch.
When I found your patch I decided to give it a try to see how it behaves.
I also modified your patch so that it flushes the queue after every call to
dp_netdev_process_rxq_port().
I presume you were doing something like the below in the pmd_thread_main receive loop?
    for (i = 0; i < poll_cnt; i++) {
        dp_netdev_process_rxq_port(pmd, poll_list[i].rx,
                                   poll_list[i].port_no);
        dp_netdev_drain_txq_ports(pmd);
    }
Yes, this is exactly what I did. It would be interesting to see what IXIA
thinks of this change ;)
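For reference, here is a minimal sketch of what such a drain helper could
look like. This is only an illustration of the idea, not the actual patch:
the 'output_pkts' intermediate queue and the flush helper it calls are
hypothetical names, and the pmd/tx_port field names are best-effort
assumptions.

    /* Rough sketch only, not the actual patch: at the end of an rx batch,
     * flush any packets that were buffered on the pmd thread's cached tx
     * ports instead of being sent immediately.  'output_pkts' and
     * dp_netdev_flush_txq_port() are hypothetical names used purely to
     * illustrate flushing once per rx batch. */
    static void
    dp_netdev_drain_txq_ports(struct dp_netdev_pmd_thread *pmd)
    {
        struct tx_port *tx_port;

        HMAP_FOR_EACH (tx_port, node, &pmd->send_port_cache) {
            if (!dp_packet_batch_is_empty(&tx_port->output_pkts)) {
                /* Hand the buffered packets to the netdev and reset the
                 * intermediate queue so the next rx batch starts empty. */
                dp_netdev_flush_txq_port(pmd, tx_port);
            }
        }
    }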
Here are some pkt forwarding stats for the Physical to Physical scenario, for
two 82599ES 10G ports with 64 byte packets being sent at wire speed:
Number     plain                   patch +
of flows   git clone   patch       flush
========   =========   =========   =========
      10    10727283    13527752    13393844
      32     7042253    11285572    11228799
      50     7515491     9642650     9607791
     100     5838699     9461239     9430730
     500     5285066     7859123     7845807
    1000     5226477     7146404     7135601
Thanks for sharing the numbers; I agree with your findings and saw very
similar results with our v3 patch.
In any case we see significant throughput improvement with the patch.
I do not have an IXIA to do the latency tests you performed; however, I do
have a XENA tester, which has a basic latency measurement feature.
I used the following script to get the latency numbers:
https://github.com/chaudron/XenaPythonLib/blob/latency/examples/latency.py
Thanks for pointing this out; it could be useful for users with no IXIA setup.
As you can see in the numbers below, the default queue introduces quite
some latency; however, doing the flush every rx batch brings the latency down
to almost the original values. The results mimic your test case 2, sending 10G
traffic @ wire speed:
===== GIT CLONE
Pkt size min(ns) avg(ns) max(ns)
512 4,631 5,022 309,914
1024 5,545 5,749 104,294
1280 5,978 6,159 45,306
1518 6,419 6,774 946,850
===== PATCH
Pkt size min(ns) avg(ns) max(ns)
512 4,928 492,228 1,995,026
1024 5,761 499,206 2,006,628
1280 6,186 497,975 1,986,175
1518 6,579 494,434 2,005,947
===== PATCH + FLUSH
Pkt size min(ns) avg(ns) max(ns)
512 4,711 5,064 182,477
1024 5,601 5,888 701,654
1280 6,018 6,491 533,037
1518 6,467 6,734 312,471
The latency numbers above are very encouraging indeed. However, with RFC2544
tests, especially on IXIA, we have a lot of parameters to tune.
I see that the latency stats fluctuate a lot with changes in the acceptable 'Frame
Loss'. I am not an IXIA expert myself, but I am trying to figure out acceptable
settings and to measure latency/throughput.
I just figured out that XENA also has RFC2544 tests, and I decided
to give it a shot.
I also noticed that if packets get dropped the results get really skewed.
In the end I did the tests at 99% of 10G wire speed; no packets were lost and
the results are stable.
Here are the results for test 2, 30 flows, 512 byte packets:
        Avg      Min      Max
PLAIN   15.397   5.288    880.598
PATCH   28.521   11.358   925.001
FLUSH   15.958   5.352    917.889
Maybe it would be good to re-run your latency tests with the flush for every rx
batch. This might get rid of your huge latency while still increasing the
performance in the case where the rx batch shares the same egress port.
The overall patchset looks fine to me, see some comments inline.
Thanks for reviewing the patch.
+#define MAX_LOOP_TO_DRAIN 128
Is defining this inline OK?
I see that this convention is used in OVS.
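Presumably this constant bounds the number of tx retries when draining the
intermediate queue. Below is a small sketch under that assumption; the
surrounding variables (port_id, queue_id, pkts, cnt) are placeholders, not
names taken from the patch.

    /* Illustrative only: retry rte_eth_tx_burst() until the cached burst is
     * fully sent or MAX_LOOP_TO_DRAIN attempts have been made, then drop
     * whatever could not be transmitted. */
    uint16_t sent = 0;
    int attempts = 0;

    while (sent < cnt && attempts++ < MAX_LOOP_TO_DRAIN) {
        sent += rte_eth_tx_burst(port_id, queue_id, pkts + sent, cnt - sent);
    }
    for (uint16_t i = sent; i < cnt; i++) {
        rte_pktmbuf_free(pkts[i]);    /* Free anything that did not go out. */
    }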
NULL,
NULL,
netdev_dpdk_vhost_reconfigure,
- netdev_dpdk_vhost_rxq_recv);
+ netdev_dpdk_vhost_rxq_recv,
+ NULL);
We need this patch even more in the vhost case, as there is an even bigger
drop in performance when we exceed the rx batch size. I measured around
40% when reducing the rx batch size to 4 and using 1 vs 5 flows (single PMD).
Completely agree. In fact, we did a quick patch doing batching for vhost ports as
well and found a significant performance improvement (though it's not thoroughly
tested for all corner cases).
We have that in our backlog and we will try posting that patch as an RFC at
least to get feedback from the community.
Thanks! Looking forward to it. Will definitely review and test it!
-Bhanuprakash.