On Fri, Aug 18, 2023 at 2:33 PM Numan Siddique <num...@ovn.org> wrote:
>
> On Fri, Aug 18, 2023 at 11:38 AM Han Zhou <hz...@ovn.org> wrote:
> >
> > On Wed, Aug 9, 2023 at 9:38 AM <num...@ovn.org> wrote:
> > >
> > > From: Numan Siddique <num...@ovn.org>
> > >
> > Thanks Numan for the enhancement.
> >
> > > Instead of maintaing the lport to lflow references using
> > > 'struct lflow_ref_list', this patch makes use of the existing
> > > objdep APIs.  Since objdep_mgr is not thread safe, an instance
> > > of objdep_mgr is maintained for each 'ovn_port'.
> > >
> > > Once we add the thread safe support to objdep APIs we can move
> > > the objdep_mgr to 'lflow' engine date.
> > >
> > > Using objdep APIs does come with a cost in memory.  We now
> > > maintain an additional hmap of ovn_lflow's to look up using
> > > uuid.  But this will be useful to handle datapath and load
> > > balancer changes in the lflow engine node.
> > >
> >
> > I understand that the 'struct lflow_ref_list' may not satisfy the needs of
> > more complex I-P, but I wonder if objdep_mgr (without modification) is the
> > solution. It seems to me more complex at least for this patch without
> > adding obvious benefits. Since I am still reviewing, maybe I missed some
> > points which would show the value from the rest of the patches, and please
> > forgive me for my slow review.
>
> Sure.   No worries.
>
> Can you please continue reviewing from the latest v6 ?
>
> http://patchwork.ozlabs.org/project/ovn/list/?series=369384
>
> I think we can discuss more once you go through all the patches in the
> series (or until patch 14).
> I think that would give a better idea of how it is used and we can
> discuss if it's worth using obj dep mgr.
>
> Also I think the patches P1 - P8 make one set and can be considered
> first.  I submitted all these patches
> as one series mainly because
>    -  all the patches can be applied easily
>    -  the soft freeze was near.
>
> The patches can be found here too -
> https://github.com/numansiddique/ovn/tree/northd_ip_lb_ip_v6
>
> Thanks
> Numan
>
> >
> > While I am still reviewing the patches, I did a performance test with
> > https://github.com/hzhou8/ovn-test-script for the 200 chassis x 50 lsp
> > scale of simulated ovn-k8s topology.
> >
> > - Before this patch, a recompute takes 1293ms
> > - After this patch, a recompute takes 1536ms, which is 20% more time.
> > - I also did the same test with all the rest of the patches of this series,
> > which takes 1690ms.
> >

I did some testing using the OVN dbs of  the ovn-heater density heavy run.
A triggered forced recompute with the present main of ovn-northd takes
around 4389ms
where as with the patch 9 of this series,  it is taking around 4838ms.

I hacked the 'engine_compute_log_timeout_msec' [1] value from 500ms to
50ms so that
inc-eng can log the engine nodes taking more than 50ms in the recompute.

With main
-------
2023-08-18T11:28:20.714Z|00117|inc_proc_eng|INFO|User triggered force recompute.
2023-08-18T11:28:23.189Z|00118|inc_proc_eng|INFO|node: northd,
recompute (forced) took 2475ms
2023-08-18T11:28:25.102Z|00119|inc_proc_eng|INFO|node: lflow,
recompute (forced) took 1887ms
2023-08-18T11:28:25.102Z|00120|timeval|WARN|Unreasonably long 4389ms
poll interval (4373ms user, 3ms system)


And with patch 9
---------------------
2023-08-18T11:28:10.558Z|00220|inc_proc_eng|INFO|User triggered force recompute.
2023-08-18T11:28:10.883Z|00221|inc_proc_eng|INFO|node: lb_data,
recompute (forced) took 325ms
2023-08-18T11:28:13.163Z|00222|inc_proc_eng|INFO|node: northd,
recompute (forced) took 2280ms
2023-08-18T11:28:13.431Z|00223|inc_proc_eng|INFO|node: sync_to_sb_lb,
recompute (forced) took 241ms
2023-08-18T11:28:15.396Z|00224|inc_proc_eng|INFO|node: lflow,
recompute (forced) took 1956ms
2023-08-18T11:28:15.397Z|00225|timeval|WARN|Unreasonably long 4838ms
poll interval (4819ms user, 3ms system)


As you can see the lflow recompute time is not very significant
(1887 ms vs 1956 ms)

It is the lb_data engine node which is taking 325ms more.  But if you
see the lb_data engine code, the lb_data handlers never return false.
So the lb_data full engine recompute cost can be ignored as it will be
triggered when engine aborts or when the user triggers a recompute.

I don't think the lflow engine recompute cost due to objdep is
significant if lflow recomputes are less.

Thanks
Numan

> > So the major increase of the recompute time is from this patch. I didn't
> > debug what's the exact cause of this increase, but I think it might be
> > related to the objdep usage.
> >
> > > This patch does few more changes which are significant:
> > >   -  Logical flows are now hashed without the logical
> > >      datapaths.  If a logical flow is referenced by just one
> > >      datapath, we don't rehash it.
> > >
> > >   -  The synthetic 'hash' column of sbrec_logical_flow now
> > >      doesn't use the logical datapath too.  This means that
> > >      when ovn-northd is updated/upgraded and has this commit,
> > >      all the logical flows with 'logical_datapath' column
> > >      set will get deleted and re-added causing some disruptions.
> > >
> > >   -  With the commit [1] which added I-P support for logical
> > >      port changes, multiple logical flows with same match 'M'
> > >      and actions 'A' are generated and stored without the
> > >      dp groups, which was not the case prior to
> > >      that patch.
> > >      One example to generate these lflows is:
> > >              ovn-nbctl lsp-set-addresses sw0p1 "MAC1 IP1"
> > >              ovn-nbctl lsp-set-addresses sw1p1 "MAC1 IP1"
> > >              ovn-nbctl lsp-set-addresses sw2p1 "MAC1 IP1"
> > >
> > >      Now with this patch we go back to the earlier way.  i.e
> > >      one logical flow with logical_dp_groups set.
> > >
> > Thanks for taking the effort to achieve this. The approach is indeed very
> > smart. I was aware of such scenarios, but it just didn''t seem to impact
> > scalability because the number of such lflows should be small, so it wasn't
> > considered a priority.
> >
> > >   -  With this patch any updates to a logical port which
> > >      doesn't result in new logical flows will not result in
> > >      deletion and addition of same logical flows.
> > >      Eg.
> > >      ovn-nbctl set logical_switch_port sw0p1 external_ids:foo=bar
> > >      will be a no-op to the SB logical flow table.
> > >
> > Similar to the above case, this doesn't seem to be critical for
> > scalability, but it is good to see the enhancement.
> > I will post more feedback after finishing the full review.
> >
> > Thanks,
> > Han
> >
> > > [1] - 8bbd67891f68("northd: Incremental processing of VIF additions in
> > 'lflow' node.")
> > >
> > > Suggested-by: Dumitru Ceara <dce...@redhat.com>
> > > Signed-off-by: Numan Siddique <num...@ovn.org>
> > _______________________________________________
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to