On Sun, Feb 4, 2024 at 9:26 PM Numan Siddique <num...@ovn.org> wrote:
>
> On Sun, Feb 4, 2024 at 9:53 PM Han Zhou <hz...@ovn.org> wrote:
> >
> > On Sun, Feb 4, 2024 at 5:46 AM Ilya Maximets <i.maxim...@ovn.org> wrote:
> > >
> > > >>> > 35 files changed, 9681 insertions(+), 4645 deletions(-)
> > > >>>
> > > >>> I had another look at this series and acked the remaining patches.
> > > >>> I just had some minor comments that can be easily fixed when
> > > >>> applying the patches to the main branch.
> > > >>>
> > > >>> Thanks for all the work on this! It was a very large change, but
> > > >>> it improves northd performance significantly. I just hope we don't
> > > >>> introduce too many bugs. Hopefully the time we have until the
> > > >>> release will allow us to further test this change on the 24.03
> > > >>> branch.
> > > >>>
> > > >>> Regards,
> > > >>> Dumitru
> > > >>
> > > >> Thanks a lot, Dumitru and Han, for the reviews and patience.
> > > >>
> > > >> I addressed the comments and applied the patches to main and also
> > > >> to branch-24.03.
> > > >>
> > > >> @Han - I know you wanted to take another look at v6. I didn't want
> > > >> to delay further as branch-24.03 was created. I'm more than happy
> > > >> to submit follow-up patches if you have any comments to address.
> > > >> Please let me know.
> > > >
> > > > Hi Numan,
> > > >
> > > > I was writing the reply and saw your email just now. Thanks a lot
> > > > for taking on this huge effort to achieve the great optimization.
> > > > I only left one comment, on the implicit en_lrnat -> en_lflow
> > > > dependency. Feel free to address it with a follow-up; there is no
> > > > need to block the branching. And take my ack for the series with
> > > > that addressed.
> > > >
> > > > Acked-by: Han Zhou <hzhou at ovn.org>
> > >
> > > Hi, Numan, Dumitru and Han.
> > >
> > > I see a huge negative performance impact, most likely from this set,
> > > on ovn-heater's cluster-density tests. The memory consumption of
> > > northd jumped about 4x, and it constantly recomputes due to failures
> > > of the port_group handler:
> > >
> > > 2024-02-03T11:09:12.441Z|01680|inc_proc_eng|INFO|node: lflow, recompute (failed handler for input port_group) took 9762ms
> > > 2024-02-03T11:09:12.444Z|01681|timeval|WARN|Unreasonably long 9898ms poll interval (5969ms user, 1786ms system)
> > > ...
> > > 2024-02-03T11:09:23.770Z|01690|inc_proc_eng|INFO|node: lflow, recompute (failed handler for input port_group) took 9014ms
> > > 2024-02-03T11:09:23.773Z|01691|timeval|WARN|Unreasonably long 9118ms poll interval (5376ms user, 1515ms system)
> > > ...
> > > 2024-02-03T11:09:36.692Z|01699|inc_proc_eng|INFO|node: lflow, recompute (failed handler for input port_group) took 10695ms
> > > 2024-02-03T11:09:36.696Z|01700|timeval|WARN|Unreasonably long 10890ms poll interval (6085ms user, 2745ms system)
> > > ...
> > > 2024-02-03T11:09:49.133Z|01708|inc_proc_eng|INFO|node: lflow, recompute (failed handler for input port_group) took 9985ms
> > > 2024-02-03T11:09:49.137Z|01709|timeval|WARN|Unreasonably long 10108ms poll interval (5521ms user, 2440ms system)
> > >
> > > That increases the 95th percentile of ovn-installed latency in the
> > > 500-node cluster-density test from 3.6 seconds last week to 21.5
> > > seconds this week.
> > >
> > > I think this should be a release blocker.
> > >
> > > Memory usage is also very concerning. Unfortunately, it is not tied
> > > to the cluster-density test. The same 4-5x RSS jump is also seen in
> > > other tests, like density-heavy. Last week the RSS of ovn-northd in
> > > 500-node cluster-density was between 1.5 and 2.5 GB; this week it
> > > ranges between 5.5 and 8.5 GB.
> > >
> > > I would consider this a release blocker as well.
> > >
> > > I don't have direct evidence that this particular series is the
> > > culprit, but it looks like the most likely candidate. I can dig more
> > > into the investigation on Monday.
> > >
> > > Best regards, Ilya Maximets.
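For context, the "failed handler for input port_group" messages above come
from the incremental processing engine's fallback path: when the change
handler for one of a node's inputs cannot process a change incrementally,
the engine recomputes the whole node. A simplified sketch of that logic
(for illustration only, not the actual inc-proc-eng.c code):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    struct engine_node;

    struct engine_input {
        struct engine_node *node;                      /* Producer node. */
        bool (*change_handler)(struct engine_node *);  /* May be NULL. */
    };

    struct engine_node {
        const char *name;
        bool changed;                        /* Producer's data changed? */
        size_t n_inputs;
        struct engine_input *inputs;
        void (*run)(struct engine_node *);   /* Full recompute. */
    };

    static void
    engine_run_node(struct engine_node *node)
    {
        for (size_t i = 0; i < node->n_inputs; i++) {
            struct engine_input *in = &node->inputs[i];
            if (!in->node->changed) {
                continue;
            }
            /* No handler, or the handler could not process the change
             * incrementally: fall back to recomputing the whole node. */
            if (!in->change_handler || !in->change_handler(node)) {
                printf("node: %s, recompute (failed handler for input %s)\n",
                       node->name, in->node->name);
                node->run(node);
                return;
            }
        }
    }

So every port group change in this test pays the full lflow recompute cost,
and the per-recompute times in the logs above multiply across the whole run.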
> > Thanks, Ilya, for reporting this. The 95th percentile latency and 4x
> > RSS increases are a little surprising to me. I did test this series
> > with my scale test scripts for recompute performance regression, and
> > it showed a 10+% increase in latency. I even dug into it a little and
> > noticed a ~5% increase caused by the hmap used to maintain the lflows
> > in each lflow_ref. This was discussed in the code review for an
> > earlier version (v2/v3). Overall it didn't look too bad, given that we
> > now handle the most common scenarios incrementally, and it is
> > reasonable to pay some cost for maintaining the references/index
> > needed for incremental processing. I wondered if my test scenario was
> > too simple (it didn't include LBs) to find the problems, so today I
> > did another test that included an LB group with 1k LBs applied to 100
> > node-LS & GRs, plus another 1k LBs per node-LS & GR (101k LBs in
> > total). I did see more performance penalty, but still within ~20%,
> > and for memory I didn't notice a significant increase (<10%). I
> > believe I am missing the specific scenario that had the big impact in
> > the ovn-heater tests. Please share if you dig out more clues.
>
> Hi Ilya,
>
> Thanks for reporting these details.
>
> I had a look at this regression. There is a significant increase in the
> lflow recompute time (around 4x) in my local testing, and this is
> definitely not acceptable.
>
> In this particular cluster-density test, every port group creation
> results in a full recompute, and now that the recompute time has
> increased, it has a cumulative effect on the latency of the test.
>
> The dp reference counting [1] added in v4 of this series introduced
> this regression (both CPU and memory). I'm working on a fix and I think
> I should be able to address it soon.
>
> [1] - https://github.com/ovn-org/ovn/blob/main/northd/lflow-mgr.c#L696

Indeed, this is the same concern I mentioned when reviewing v5:
https://mail.openvswitch.org/pipermail/ovs-dev/2024-January/411194.html
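To make the cost concrete, here is a rough sketch of the bookkeeping in
question (my simplified reconstruction for illustration, not the exact
lflow-mgr.c code; names are approximate):

    #include <stddef.h>
    #include "openvswitch/hmap.h"

    /* The per-consumer index mentioned above: each lflow_ref keeps an
     * hmap of references to the lflows generated on its behalf, so they
     * can be updated incrementally later. */
    struct lflow_ref {
        struct hmap lflow_ref_nodes;  /* Contains "struct lflow_ref_node"s. */
    };

    struct lflow_ref_node {
        struct hmap_node ref_node;    /* In lflow_ref->lflow_ref_nodes. */
        struct ovn_lflow *lflow;      /* The referenced logical flow. */
        size_t dp_index;              /* Datapath this reference is for. */
    };

    /* The dp reference counting from [1]: a logical flow shared by many
     * datapaths tracks, per datapath, how many references keep it alive
     * there.  With one LB group applied to ~1k datapaths, every shared
     * lflow carries ~1k such entries, so the bookkeeping grows as
     * O(n_lflows * n_datapaths) in both memory and recompute work. */
    struct dp_refcnt {
        struct hmap_node key_node;    /* In ovn_lflow->dp_refcnts. */
        size_t dp_index;
        size_t refcnt;
    };

    struct ovn_lflow {
        struct hmap dp_refcnts;       /* Contains "struct dp_refcnt"s. */
        unsigned long *dpg_bitmap;    /* Datapaths this flow applies to. */
        /* ... match, actions, etc. ... */
    };

As a back-of-the-envelope check: at, say, 100k shared lflows and 1k
datapaths, even 16 bytes per (lflow, datapath) pair would already be
~1.6 GB, which is the same ballpark as the RSS jump reported above.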
I just realized that my earlier test had too many individual LBs, which
hid this bottleneck. With a small change to my script (using just the
single LB group that contains 1k LBs, applied to 1k node-LS & GRs), I did
reproduce the performance regression, with a >50% increase in both latency
and memory during recompute. The ovn-heater test you ran may have more LB
groups, which may have made it even worse.

Looking forward to the solution.

Thanks,
Han

> Thanks
> Numan
>
> > Thanks,
> > Han

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev