On Sun, Feb 4, 2024 at 9:26 PM Numan Siddique <num...@ovn.org> wrote:
>
> On Sun, Feb 4, 2024 at 9:53 PM Han Zhou <hz...@ovn.org> wrote:
> >
> > On Sun, Feb 4, 2024 at 5:46 AM Ilya Maximets <i.maxim...@ovn.org> wrote:
> > >
> > > >>>
> > > >>> >  35 files changed, 9681 insertions(+), 4645 deletions(-)
> > > >>>
> > > >>> I had another look at this series and acked the remaining
> > > >>> patches.  I just had some minor comments that can be easily fixed
> > > >>> when applying the patches to the main branch.
> > > >>>
> > > >>> Thanks for all the work on this!  It was a very large change but it
> > > >>> improves northd performance significantly.  I just hope we don't
> > > >>> introduce too many bugs.  Hopefully the time we have until release
> > > >>> will allow us to further test this change on the 24.03 branch.
> > > >>>
> > > >>> Regards,
> > > >>> Dumitru
> > > >>
> > > >>
> > > >>
> > > >> Thanks a lot Dumitru and Han for the reviews and patience.
> > > >>
> > > >> I addressed the comments and applied the patches to main and also
> > > >> to branch-24.03.
> > > >>
> > > >> @Han - I know you wanted to take another look into v6.  I didn't
> > > >> want to delay further as branch-24.03 was created.  I'm more than
> > > >> happy to submit follow-up patches if you have any comments to
> > > >> address.  Please let me know.
> > > >>
> > > >
> > > > Hi Numan,
> > > >
> > > > I was writing my reply and saw your email just now. Thanks a lot
> > > > for the huge effort to achieve this great optimization. I only left
> > > > one comment, on the implicit dependency left for en_lrnat ->
> > > > en_lflow. Feel free to address it with a follow-up; there is no need
> > > > to block the branching. And take my Ack for the series with that
> > > > addressed.
> > > >
> > > > Acked-by: Han Zhou <hzhou at ovn.org>
> > >
> > >
> > > Hi, Numan, Dumitru and Han.
> > >
> > > I see a huge negative performance impact, most likely from this set,
> > > on ovn-heater's cluster-density tests.  The memory consumption of
> > > northd jumped about 4x, and it constantly recomputes due to failures
> > > of the port_group handler:
> > >
> > > 2024-02-03T11:09:12.441Z|01680|inc_proc_eng|INFO|node: lflow, recompute (failed handler for input port_group) took 9762ms
> > > 2024-02-03T11:09:12.444Z|01681|timeval|WARN|Unreasonably long 9898ms poll interval (5969ms user, 1786ms system)
> > > ...
> > > 2024-02-03T11:09:23.770Z|01690|inc_proc_eng|INFO|node: lflow, recompute (failed handler for input port_group) took 9014ms
> > > 2024-02-03T11:09:23.773Z|01691|timeval|WARN|Unreasonably long 9118ms poll interval (5376ms user, 1515ms system)
> > > ...
> > > 2024-02-03T11:09:36.692Z|01699|inc_proc_eng|INFO|node: lflow, recompute (failed handler for input port_group) took 10695ms
> > > 2024-02-03T11:09:36.696Z|01700|timeval|WARN|Unreasonably long 10890ms poll interval (6085ms user, 2745ms system)
> > > ...
> > > 2024-02-03T11:09:49.133Z|01708|inc_proc_eng|INFO|node: lflow, recompute (failed handler for input port_group) took 9985ms
> > > 2024-02-03T11:09:49.137Z|01709|timeval|WARN|Unreasonably long 10108ms poll interval (5521ms user, 2440ms system)
> > >
> > > That increases the 95th percentile ovn-installed latency in the
> > > 500-node cluster-density test from 3.6 seconds last week to 21.5
> > > seconds this week.
> > >
> > > I think this should be a release blocker.
> > >
> > > Memory usage is also very concerning.  Unfortunately, it is not tied
> > > to the cluster-density test.  The same 4-5x RSS jump is also seen in
> > > other tests like density-heavy.  Last week the RSS of ovn-northd in
> > > the 500-node cluster-density test was between 1.5 and 2.5 GB; this
> > > week we have a range between 5.5 and 8.5 GB.
> > >
> > > I would consider this a release blocker as well.
> > >
> > > I don't have direct evidence that this particular series is the
> > > culprit, but it looks like the most likely candidate.  I can dig more
> > > into the investigation on Monday.
> > >
> > > Best regards, Ilya Maximets.
> >
> > Thanks, Ilya, for reporting this. The 95th percentile latency and 4x
> > RSS increases are a little surprising to me. I did test this series
> > with my scale test scripts for recompute performance regression, and it
> > showed a 10+% increase in latency. I even dug into it a little and
> > noticed a ~5% increase caused by the hmap used to maintain the lflows
> > in each lflow_ref. This was discussed in the code review for an earlier
> > version (v2/v3). Overall that didn't look too bad, given that we now
> > handle the most common scenarios incrementally and it is reasonable to
> > pay some cost for maintaining the references/index used by incremental
> > processing. I wondered if my test scenario was too simple (it didn't
> > include LBs) to expose the problems, so today I ran another test that
> > included an LB group with 1k LBs applied to 100 node-LS & GRs, plus
> > another 1k LBs per node-LS & GR (101k LBs in total). I did see more of
> > a performance penalty, but still within ~20%, and for memory I didn't
> > notice a significant increase (<10%). I believe I am missing the
> > specific scenario that had the big impact in the ovn-heater tests.
> > Please share if you dig up more clues.
>
> Hi Ilya,
>
> Thanks for reporting these details.
>
> I had a look at this regression.  There is a significant increase in the
> lflow recompute time (around 4x) in my local testing, and this is
> definitely not acceptable.
>
> In this particular cluster-density test, whenever a port group is created
> it results in a full recompute, and now that the recompute time has
> increased, it has a cumulative effect on the latency of the test.
>
> The dp reference counting [1] added in v4 of this series has introduced
> this regression (both CPU and memory).  I'm working on a fix, and I think
> I should be able to address this soon.
>
> [1] - https://github.com/ovn-org/ovn/blob/main/northd/lflow-mgr.c#L696
>
>
Indeed, this is the same concern I mentioned when reviewing v5:
https://mail.openvswitch.org/pipermail/ovs-dev/2024-January/411194.html

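To make the pattern concrete for anyone following the thread, here is a
minimal, self-contained sketch in plain C of the per-lflow datapath
reference counting being discussed. The names (dp_refcnt, lflow_sketch,
lflow_ref_datapath) and the linked list are hypothetical simplifications,
not the actual northd/lflow-mgr.c code, which keeps these records in an
hmap; the point is only that a flow shared by N datapaths carries N small
heap records that are created, looked up, and freed again on every full
recompute.

/* Toy model of per-lflow datapath reference counting.  Hypothetical
 * names and data layout; the real code keeps these records in an hmap
 * rather than a linked list. */
#include <stdio.h>
#include <stdlib.h>

struct dp_refcnt {              /* One record per (lflow, datapath) pair. */
    struct dp_refcnt *next;
    unsigned int dp_index;      /* Index of the referencing datapath. */
    unsigned int refs;
};

struct lflow_sketch {
    struct dp_refcnt *dp_refs;  /* Records for every referencing datapath. */
};

/* Record that datapath 'dp_index' references 'lflow', allocating a new
 * refcount record the first time that datapath is seen. */
static void
lflow_ref_datapath(struct lflow_sketch *lflow, unsigned int dp_index)
{
    for (struct dp_refcnt *r = lflow->dp_refs; r; r = r->next) {
        if (r->dp_index == dp_index) {
            r->refs++;
            return;
        }
    }

    struct dp_refcnt *r = malloc(sizeof *r);
    r->next = lflow->dp_refs;
    r->dp_index = dp_index;
    r->refs = 1;
    lflow->dp_refs = r;
}

int
main(void)
{
    /* A single LB flow applied, e.g. via a load balancer group, to 1000
     * datapaths ends up with 1000 separate refcount records. */
    struct lflow_sketch lb_flow = { NULL };
    unsigned int records = 0;

    for (unsigned int dp = 0; dp < 1000; dp++) {
        lflow_ref_datapath(&lb_flow, dp);
    }

    for (struct dp_refcnt *r = lb_flow.dp_refs; r; ) {
        struct dp_refcnt *next = r->next;
        free(r);                /* All of this is redone on a recompute. */
        r = next;
        records++;
    }
    printf("refcount records for one shared flow: %u\n", records);
    return 0;
}

Multiply that by every shared LB flow, and the allocation churn alone can
slow a full recompute noticeably.
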
I just realized that my earlier test had too many individual LBs, which hid
this bottleneck. With a small change to my script, using just a single LB
group that contains 1k LBs applied to 1k node-LS & GRs, I did reproduce the
performance regression, with a >50% increase in both latency and memory
during recompute. The ovn-heater test you ran may have more LB groups,
which may have made it even worse.
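
For a rough sense of why such a topology is so much heavier, here is a
back-of-envelope estimate; every number below (flows per LB, record size,
datapath count) is an assumption chosen for illustration, not a measurement
from the ovn-heater or scale-test runs.

/* Back-of-envelope estimate of the per-datapath reference records implied
 * by a shared LB group.  All figures are illustrative assumptions. */
#include <stdio.h>

int
main(void)
{
    double n_lbs = 1000.0;           /* LBs in the shared LB group. */
    double lflows_per_lb = 5.0;      /* Assumed logical flows per LB. */
    double n_datapaths = 2000.0;     /* 1k node-LS + 1k GRs referencing it. */
    double bytes_per_record = 64.0;  /* Assumed record + hash overhead. */

    double records = n_lbs * lflows_per_lb * n_datapaths;
    printf("~%.0f million records, ~%.0f MiB, rebuilt on each recompute\n",
           records / 1e6, records * bytes_per_record / (1024 * 1024));
    return 0;
}

Even under these assumed numbers, memory is not the whole story: allocating,
hashing, and freeing millions of records on every recompute would also line
up with the CPU regression reported above.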

Looking forward to the solution.

Thanks,
Han

> Thanks
> Numan
>
> >
> > Thanks,
> > Han