On Mon, Aug 28, 2017 at 8:53 AM, Maciek Konstantynowicz (mkonstan) <mkons...@cisco.com> wrote:
> + csit-dev
>
> Billy,
>
> Per last week's CSIT project call, from the CSIT perspective we
> classified your reported issue as a test coverage escape.
>
> Summary
> =======
> CSIT test coverage got fixed, see more detail below. The CSIT tests
> uncovered a regression for L2BD with MAC learning with a higher total
> number of MACs in L2FIB (>>10k MACs) in multi-threaded configurations.
> Single-threaded configurations seem not to be impacted.
>
> Billy, Karl, can you confirm this aligns with your findings?

When you say "multi-threaded configuration", I assume you mean multiple
worker threads? Karl's tests had 4 workers, one for each NIC (physical
and vhost-user). He only tested multi-threaded, so we cannot confirm
that single-threaded configurations are not impacted.

Our numbers are a little different from yours, but we are both seeing
drops between releases. We had a bigger drop-off with 10k flows, but the
million-flow results seem to be similar. I was a little disappointed
that the MAC limit change by John Lo on 8/23 didn't improve the master
numbers somewhat.

Thanks for all the hard work and for adding these additional test cases.

Billy

> More detail
> ===========
> MAC scale tests have now been added to the L2BD and L2BD+vhost CSIT
> suites, as a simple extension of the existing L2 testing suites. Some
> known issues with the TG prevented CSIT from adding those tests in the
> past, but now that the TG issues have been addressed, the tests could
> be added swiftly. The complete list of added tests is in [1] - thanks
> to Peter Mikus for great work there!
>
> Results from running those tests multiple times within the FD.io CSIT
> lab infra can be glanced over by checking the dedicated test trigger
> commits [2][3][4] and the summary graphs in the linked xlsx [5]. The
> results confirm there is a regression in VPP l2fib code affecting all
> scaled-up MAC tests in multi-thread configurations. Single-thread
> configurations seem not to be impacted.
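As a side note on mechanism, here is a minimal, purely illustrative sketch (not VPP code; integer IDs stand in for MACs and ports) of why hitting a learned-MAC cap in an L2 bridge domain translates directly into flooding:

```python
# Toy model of L2 bridge-domain MAC learning: once the learn limit is
# reached, MACs above the limit can never be learned, so traffic to
# them is flooded on every pass instead of being switched.
class BridgeDomain:
    def __init__(self, fib_limit):
        self.fib = {}              # dst "MAC" -> output port
        self.fib_limit = fib_limit
        self.flooded = 0
        self.switched = 0

    def forward(self, src_mac, dst_mac, in_port):
        # Learn the source MAC, subject to the table limit.
        if src_mac not in self.fib and len(self.fib) < self.fib_limit:
            self.fib[src_mac] = in_port
        # Switch if the destination is known, otherwise flood.
        if dst_mac in self.fib:
            self.switched += 1
        else:
            self.flooded += 1

# 20k MACs against a 10k-entry limit: half the MACs can never be
# learned, so half of all traffic floods forever.
bd = BridgeDomain(fib_limit=10_000)
for _ in range(3):                       # a few "passes" of traffic
    for i in range(20_000):
        bd.forward(src_mac=i, dst_mac=(i + 10_000) % 20_000, in_port=i % 2)
print(bd.flooded, bd.switched)           # half the packets flood
```

This is only the qualitative shape of the problem; VPP's l2fib is a bihash with per-thread learning paths, which is where the multi-thread regression lives.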
> The tests in commit [1] are not merged yet, as they are waiting for the
> TG/TRex team to fix a TRex issue with mis-calculating the Ethernet FCS
> with a large number of L2 MAC flows (>10k MAC flows). The issue is
> tracked by [6]; TRex v2.29 with the fix has an ETA of w/e 1-Sep, i.e.
> this week. The reported CSIT test results use Ethernet frames with UDP
> headers, which masks the TRex issue.
>
> We have also git-bisected vpp for the problem between v17.04 (good) and
> v17.07 (bad) in a separate IXIA-based lab in SJC, and found the culprit
> vpp patch [7]. Awaiting a fix from vpp-dev; jira ticket raised [8].
>
> Many thanks for reporting this regression and working with CSIT to plug
> this hole in testing.
>
> -Maciek
>
> [1] CSIT-786 L2FIB scale testing [https://gerrit.fd.io/r/#/c/8145/
>     ge8145] [https://jira.fd.io/browse/CSIT-786 CSIT-786];
>     L2FIB scale testing for 10k, 100k, 1M FIB entries
>     ./l2:
>     10ge2p1x520-eth-l2bdscale10kmaclrn-ndrpdrdisc.robot
>     10ge2p1x520-eth-l2bdscale100kmaclrn-ndrpdrdisc.robot
>     10ge2p1x520-eth-l2bdscale1mmaclrn-ndrpdrdisc.robot
>     10ge2p1x520-eth-l2bdscale10kmaclrn-eth-2vhostvr1024-1vm-cfsrr1-ndrpdrdisc
>     10ge2p1x520-eth-l2bdscale100kmaclrn-eth-2vhostvr1024-1vm-cfsrr1-ndrpdrdisc
>     10ge2p1x520-eth-l2bdscale1mmaclrn-eth-2vhostvr1024-1vm-cfsrr1-ndrpdrdisc
> [2] VPP master branch [https://gerrit.fd.io/r/#/c/8173/ ge8173];
> [3] VPP stable/1707 [https://gerrit.fd.io/r/#/c/8167/ ge8167];
> [4] VPP stable/1704 [https://gerrit.fd.io/r/#/c/8172/ ge8172];
> [5] CSIT-794 VPP v17.07 L2BD yields lower NDR and PDR performance vs.
>     v17.04, 20170825_l2fib_regression_10k_100k_1M.xlsx,
>     [https://jira.fd.io/browse/CSIT-794 CSIT-794];
> [6] TRex v2.28 Ethernet FCS mis-calculation issue
>     [https://jira.fd.io/browse/CSIT-793 CSIT-793];
> [7] commit 25ff2ea3a31e422094f6d91eab46222a29a77c4b;
> [8] VPP v17.07 L2BD NDR and PDR multi-thread performance broken
>     [https://jira.fd.io/browse/VPP-963 VPP-963];
>
> On 14 Aug 2017, at 23:40, Billy McFall <bmcf...@redhat.com> wrote:
>
> In the last VPP call, I reported that some internal Red Hat performance
> testing was showing a significant drop in performance between releases
> 17.04 and 17.07. This is with l2-bridge testing, PVP, 0.002% drop rate:
>
>     VPP 17.04: 256 Flow 7.8 Mpps, 10k Flow 7.3 Mpps, 1m Flow 5.2 Mpps
>     VPP 17.07: 256 Flow 7.7 Mpps, 10k Flow 2.7 Mpps, 1m Flow 1.8 Mpps
>
> The performance team re-ran some of the tests for me with some
> additional data collected. It looks like the size of the L2 FIB table
> was reduced in 17.07. Below are the numbers of entries in the MAC
> table after the tests are run:
>
>     17.04:
>         show l2fib
>         4000008 l2fib entries
>     17.07:
>         show l2fib
>         1067053 l2fib entries with 1048576 learned (or non-static) entries
>
> This caused more packets to be flooded (see the output of 'show node
> counters' below). I looked but couldn't find anything. Is the size of
> the L2 FIB table configurable?
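Worth noting about the figures just quoted: the 17.07 "learned" count is exactly 2^20, consistent with the learned-entry capacity having been capped at 1M in 17.07, versus the 4M+ entries the same test left in the 17.04 table. A quick back-of-the-envelope script (numbers copied verbatim from the thread):

```python
# The 17.07 learned-entry count is a power of two -- it looks like a cap.
learned_17_07 = 1_048_576
assert learned_17_07 == 2**20

# Throughput per release (Mpps), from the PVP table above.
rates = {"256": (7.8, 7.7), "10k": (7.3, 2.7), "1m": (5.2, 1.8)}
for flows, (v1704, v1707) in rates.items():
    drop = 100 * (v1704 - v1707) / v1704
    print(f"{flows:>4} flows: {drop:.0f}% drop from 17.04 to 17.07")
```

So the 256-flow case (well under any cap) loses about 1%, while the 10k- and million-flow cases lose roughly 63-65%, matching the flooding explanation below.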
> Thanks,
> Billy McFall
>
> 17.04:
>
> show node counters
>     Count          Node          Reason
>     :
>     313035313      l2-input      L2 input packets
>     555726         l2-flood      L2 flood packets
>     :
>     310115490      l2-input      L2 input packets
>     824859         l2-flood      L2 flood packets
>     :
>     313508376      l2-input      L2 input packets
>     1041961        l2-flood      L2 flood packets
>     :
>     313691024      l2-input      L2 input packets
>     698968         l2-flood      L2 flood packets
>
> 17.07:
>
> show node counters
>     Count          Node          Reason
>     :
>     97810569       l2-input      L2 input packets
>     72557612       l2-flood      L2 flood packets
>     :
>     97830674       l2-input      L2 input packets
>     72478802       l2-flood      L2 flood packets
>     :
>     97714888       l2-input      L2 input packets
>     71655987       l2-flood      L2 flood packets
>     :
>     97710374       l2-input      L2 input packets
>     70058006       l2-flood      L2 flood packets
>
> --
> Billy McFall
> SDN Group
> Office of Technology
> Red Hat
> _______________________________________________
> vpp-dev mailing list
> vpp-dev@lists.fd.io
> https://lists.fd.io/mailman/listinfo/vpp-dev

--
Billy McFall
SDN Group
Office of Technology
Red Hat
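To put the 'show node counters' output quoted above in perspective, here is the same data reduced to flood ratios (counter values copied verbatim from the thread; the four samples per release are simply pooled):

```python
# (l2-input, l2-flood) counter pairs per worker, from the thread above.
counters_1704 = [(313035313, 555726), (310115490, 824859),
                 (313508376, 1041961), (313691024, 698968)]
counters_1707 = [(97810569, 72557612), (97830674, 72478802),
                 (97714888, 71655987), (97710374, 70058006)]

def flood_pct(samples):
    """Percentage of l2-input packets that were flooded, pooled."""
    total_in = sum(i for i, _ in samples)
    total_flood = sum(f for _, f in samples)
    return 100 * total_flood / total_in

print(f"17.04: {flood_pct(counters_1704):.2f}% flooded")   # well under 1%
print(f"17.07: {flood_pct(counters_1707):.2f}% flooded")   # roughly 3 in 4
```

Flooding going from a fraction of a percent to roughly three quarters of all bridged traffic is more than enough to explain the throughput drop in the scaled tests.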
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev