On Fri, Aug 18, 2023 at 2:33 PM Numan Siddique <num...@ovn.org> wrote: > > On Fri, Aug 18, 2023 at 11:38 AM Han Zhou <hz...@ovn.org> wrote: > > > > On Wed, Aug 9, 2023 at 9:38 AM <num...@ovn.org> wrote: > > > > > > From: Numan Siddique <num...@ovn.org> > > > > > Thanks Numan for the enhancement. > > > > > Instead of maintaing the lport to lflow references using > > > 'struct lflow_ref_list', this patch makes use of the existing > > > objdep APIs. Since objdep_mgr is not thread safe, an instance > > > of objdep_mgr is maintained for each 'ovn_port'. > > > > > > Once we add the thread safe support to objdep APIs we can move > > > the objdep_mgr to 'lflow' engine date. > > > > > > Using objdep APIs does come with a cost in memory. We now > > > maintain an additional hmap of ovn_lflow's to look up using > > > uuid. But this will be useful to handle datapath and load > > > balancer changes in the lflow engine node. > > > > > > > I understand that the 'struct lflow_ref_list' may not satisfy the needs of > > more complex I-P, but I wonder if objdep_mgr (without modification) is the > > solution. It seems to me more complex at least for this patch without > > adding obvious benefits. Since I am still reviewing, maybe I missed some > > points which would show the value from the rest of the patches, and please > > forgive me for my slow review. > > Sure. No worries. > > Can you please continue reviewing from the latest v6 ? > > http://patchwork.ozlabs.org/project/ovn/list/?series=369384 > > I think we can discuss more once you go through all the patches in the > series (or until patch 14). > I think that would give a better idea of how it is used and we can > discuss if it's worth using obj dep mgr. > > Also I think the patches P1 - P8 make one set and can be considered > first. I submitted all these patches > as one series mainly because > - all the patches can be applied easily > - the soft freeze was near. > > The patches can be found here too - > https://github.com/numansiddique/ovn/tree/northd_ip_lb_ip_v6 > > Thanks > Numan > > > > > While I am still reviewing the patches, I did a performance test with > > https://github.com/hzhou8/ovn-test-script for the 200 chassis x 50 lsp > > scale of simulated ovn-k8s topology. > > > > - Before this patch, a recompute takes 1293ms > > - After this patch, a recompute takes 1536ms, which is 20% more time. > > - I also did the same test with all the rest of the patches of this series, > > which takes 1690ms. > >
I did some testing using the OVN dbs of the ovn-heater density heavy run. A triggered forced recompute with the present main of ovn-northd takes around 4389ms where as with the patch 9 of this series, it is taking around 4838ms. I hacked the 'engine_compute_log_timeout_msec' [1] value from 500ms to 50ms so that inc-eng can log the engine nodes taking more than 50ms in the recompute. With main ------- 2023-08-18T11:28:20.714Z|00117|inc_proc_eng|INFO|User triggered force recompute. 2023-08-18T11:28:23.189Z|00118|inc_proc_eng|INFO|node: northd, recompute (forced) took 2475ms 2023-08-18T11:28:25.102Z|00119|inc_proc_eng|INFO|node: lflow, recompute (forced) took 1887ms 2023-08-18T11:28:25.102Z|00120|timeval|WARN|Unreasonably long 4389ms poll interval (4373ms user, 3ms system) And with patch 9 --------------------- 2023-08-18T11:28:10.558Z|00220|inc_proc_eng|INFO|User triggered force recompute. 2023-08-18T11:28:10.883Z|00221|inc_proc_eng|INFO|node: lb_data, recompute (forced) took 325ms 2023-08-18T11:28:13.163Z|00222|inc_proc_eng|INFO|node: northd, recompute (forced) took 2280ms 2023-08-18T11:28:13.431Z|00223|inc_proc_eng|INFO|node: sync_to_sb_lb, recompute (forced) took 241ms 2023-08-18T11:28:15.396Z|00224|inc_proc_eng|INFO|node: lflow, recompute (forced) took 1956ms 2023-08-18T11:28:15.397Z|00225|timeval|WARN|Unreasonably long 4838ms poll interval (4819ms user, 3ms system) As you can see the lflow recompute time is not very significant (1887 ms vs 1956 ms) It is the lb_data engine node which is taking 325ms more. But if you see the lb_data engine code, the lb_data handlers never return false. So the lb_data full engine recompute cost can be ignored as it will be triggered when engine aborts or when the user triggers a recompute. I don't think the lflow engine recompute cost due to objdep is significant if lflow recomputes are less. Thanks Numan > > So the major increase of the recompute time is from this patch. I didn't > > debug what's the exact cause of this increase, but I think it might be > > related to the objdep usage. > > > > > This patch does few more changes which are significant: > > > - Logical flows are now hashed without the logical > > > datapaths. If a logical flow is referenced by just one > > > datapath, we don't rehash it. > > > > > > - The synthetic 'hash' column of sbrec_logical_flow now > > > doesn't use the logical datapath too. This means that > > > when ovn-northd is updated/upgraded and has this commit, > > > all the logical flows with 'logical_datapath' column > > > set will get deleted and re-added causing some disruptions. > > > > > > - With the commit [1] which added I-P support for logical > > > port changes, multiple logical flows with same match 'M' > > > and actions 'A' are generated and stored without the > > > dp groups, which was not the case prior to > > > that patch. > > > One example to generate these lflows is: > > > ovn-nbctl lsp-set-addresses sw0p1 "MAC1 IP1" > > > ovn-nbctl lsp-set-addresses sw1p1 "MAC1 IP1" > > > ovn-nbctl lsp-set-addresses sw2p1 "MAC1 IP1" > > > > > > Now with this patch we go back to the earlier way. i.e > > > one logical flow with logical_dp_groups set. > > > > > Thanks for taking the effort to achieve this. The approach is indeed very > > smart. I was aware of such scenarios, but it just didn''t seem to impact > > scalability because the number of such lflows should be small, so it wasn't > > considered a priority. > > > > > - With this patch any updates to a logical port which > > > doesn't result in new logical flows will not result in > > > deletion and addition of same logical flows. > > > Eg. > > > ovn-nbctl set logical_switch_port sw0p1 external_ids:foo=bar > > > will be a no-op to the SB logical flow table. > > > > > Similar to the above case, this doesn't seem to be critical for > > scalability, but it is good to see the enhancement. > > I will post more feedback after finishing the full review. > > > > Thanks, > > Han > > > > > [1] - 8bbd67891f68("northd: Incremental processing of VIF additions in > > 'lflow' node.") > > > > > > Suggested-by: Dumitru Ceara <dce...@redhat.com> > > > Signed-off-by: Numan Siddique <num...@ovn.org> > > _______________________________________________ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev