On Thu, May 3, 2018 at 10:49 AM, Mark Michelson <mmich...@redhat.com> wrote:
> On 05/03/2018 12:58 PM, Mark Michelson wrote:
>
>> Hi Han,
>>
>> (cc'ed ovs-discuss)
>>
>> I have some test results here for your incremental branch[0].
>>
>> Browbeat was the test orchestrator, and it uses ovn-scale-test[1] to
>> configure the test parameters and run the test.
>>
>> The test uses one central node on which ovn-northd runs. There are three
>> farm nodes on which sandboxes are run for ovn-controller. Each farm node
>> runs 24 sandboxes, for a total of 72 sandboxes (and thus 72
>> ovn-controller processes).
>>
>> The test uses the datapath context[2] to set up 72 logical switches and
>> one logical router in advance. Then, during each test iteration, a
>> logical switch port is added to one of the logical switches and bound on
>> one of the sandboxes. The next iteration does not start until the
>> previous iteration is 100% complete (i.e., until the logical switch port
>> shows as "up" in the northbound DB). The total number of logical switch
>> ports added during the test is 3312.
>>
>> During the test, I ran `perf record` on one of the ovn-controller
>> processes and then created a flame graph[3] from the results. I have
>> attached the flame graph to this e-mail. I think this can give us a good
>> jumping-off point for determining further optimizations to make to
>> ovn-controller.
>>
>> [0] https://github.com/hzhou8/ovs/tree/ip7
>> [1] https://github.com/openvswitch/ovn-scale-test
>> [2] https://github.com/openvswitch/ovn-scale-test/pull/165
>> [3] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs
>
> From the IRC meeting, it was requested to see a flame graph of
> performance on OVS master. I am attaching that to this e-mail.
>
> One difference in this test run is that the number of switch ports was
> fewer (I'm not 100% sure of the exact number), so the number of samples
> in perf record is lower than in the flame graph I previously sent.
>
> The vast majority of the time is spent in lflow_run().
> Based on this flame graph, our initial take on the matter was that we
> could get improved performance by reducing the number of logical flows
> to process. The incremental branch seemed like a good testing target to
> that end.

Thanks, Mark, for sharing the results!

It seems you have sent the wrong attachment perf-master.svg in your second
email; it is the same one as in the first email. Would you mind sending the
right one? Also, could you please share the total CPU cost for incremental
vs. master when you have the data?

From your text description, performance improved as expected, since the
bottleneck moved from lflow_run() to ofctrl_put().

The new bottleneck, ofctrl_put(), is a good finding, and I think I have
some ideas to improve it, too. Basically, once we are able to do
incremental computing, we don't need to go through and compare the
installed vs. desired flows every time; instead, we can maintain a list of
the changes made during the iteration and then send just those to OVS.
This would eliminate most of the ovn_flow_lookup() calls that currently
show up as the hotspot in ofctrl_put(). I will try this as soon as I fix
some corner cases and am confident about the correctness.

Thanks,
Han
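To illustrate the change-tracking idea, here is a minimal C sketch. All
names (change_list, flow_change, etc.) are hypothetical and do not reflect
the actual ofctrl.c data structures; the point is only that the incremental
engine records each add/mod/del as it happens, so flushing to OVS costs
O(number of changes) rather than a full installed-vs-desired comparison
with a hash lookup per flow.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch, not OVN code: rather than walking the whole
 * desired-flow table and looking each flow up in the installed table,
 * we record changes as they are made and later flush only those. */

enum flow_change_type { FLOW_ADD, FLOW_MOD, FLOW_DEL };

struct flow_change {
    enum flow_change_type type;
    char match[64];              /* stand-in for a real OpenFlow match */
    struct flow_change *next;
};

struct change_list {
    struct flow_change *head;
    size_t n;
};

/* Called by the (hypothetical) incremental engine whenever it changes
 * a flow, instead of deferring everything to a full-table compare. */
static void
change_list_record(struct change_list *list, enum flow_change_type type,
                   const char *match)
{
    struct flow_change *c = malloc(sizeof *c);
    c->type = type;
    snprintf(c->match, sizeof c->match, "%s", match);
    c->next = list->head;
    list->head = c;
    list->n++;
}

/* Flush pending changes; the real code would encode an OpenFlow
 * flow_mod per entry here.  Returns the number of changes sent. */
static size_t
change_list_flush(struct change_list *list)
{
    size_t sent = 0;
    for (struct flow_change *c = list->head; c; ) {
        struct flow_change *next = c->next;
        sent++;                  /* pretend we sent a flow_mod */
        free(c);
        c = next;
    }
    list->head = NULL;
    list->n = 0;
    return sent;
}
```

One caveat with this approach: a fallback to full recomputation (e.g. after
reconnecting to the switch) would still need the existing compare path, so
the change list would complement rather than replace it.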
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss