Please see below the updated cover letter which has more details. (Apologies for sending this as an html instead of plain text).
This patch series adds incremental processing in the lflow engine node to handle changes to northd and other engine nodes. Changed related to load balancers and NAT are mainly handled in this patch series. This patch series can also be found here - https://github.com/numansiddique/ovn/tree/northd_lflow_ip/v1 Prior to this patch series, most of the changes to northd engine resulted in full recomputation of logical flows. This series aims to improve the performance of ovn-northd by adding the I-P support. In order to add this support, some of the northd engine node data (from struct ovn_datapath) is split and moved over to new engine nodes - mainly related to load balancers, NAT and ACLs. Below are the scale testing results done with these patches applied using ovn-heater. The test ran the scenario - ocp-500-density-heavy.yml [1]. With all the lflow I-P patches applied, the resuts are: ------------------------------------------------------------------------------------------------------------------------------------------------------- Min (s) Median (s) 90%ile (s) 99%ile (s) Max (s) Mean (s) Total (s) Count Failed ------------------------------------------------------------------------------------------------------------------------------------------------------- Iteration Total 0.136883 1.129016 1.192001 1.204167 1.212728 0.665017 83.127099 125 0 Namespace.add_ports 0.005216 0.005736 0.007034 0.015486 0.018978 0.006211 0.776373 125 0 WorkerNode.bind_port 0.035030 0.046082 0.052469 0.058293 0.060311 0.045973 11.493259 250 0 WorkerNode.ping_port 0.005057 0.006727 1.047692 1.069253 1.071336 0.266896 66.724094 250 0 ------------------------------------------------------------------------------------------------------------------------------------------------------- The results with the present main are: ------------------------------------------------------------------------------------------------------------------------------------------------------- Min (s) Median (s) 90%ile (s) 99%ile (s) Max (s) Mean (s) Total (s) Count Failed ------------------------------------------------------------------------------------------------------------------------------------------------------- Iteration Total 0.135491 2.223805 3.311270 3.339078 3.345346 1.729172 216.146495 125 0 Namespace.add_ports 0.005380 0.005744 0.006819 0.018773 0.020800 0.006292 0.786532 125 0 WorkerNode.bind_port 0.034179 0.046055 0.053488 0.058801 0.071043 0.046117 11.529311 250 0 WorkerNode.ping_port 0.004956 0.006952 3.086952 3.191743 3.192807 0.791544 197.886026 250 0 ------------------------------------------------------------------------------------------------------------------------------------------------------- [1] - https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml v2 -> v3 ------- * Addressed some of the review comments from Han and Dumitru. There are still a few pending review comments which needs to be addressed or discussed. * Renamed the engine node from "lr_lbnat_data" to "lr_stateful" (v3 patch 5). * Renamed the engine node from "ls_lbacls" to "ls_stateful" (v3 patch 8). * Removed v2 patch 2 from the series (northd: Track ovn_datapaths in northd engine track data."). This patch is now part of v3 patch 7 (northd: Add a new node 'ls_stateful'). * Squashed v2 patch 8 (northd: Don't commit dhcp response flows in the conntrack.) into v3 patch 7 (northd: Add a new node 'ls_stateful'.) v1 -> v2 -------- * Now also maintaing array indexes for ls_lbacls, lr_nat and lr_lb_nat_data tables (similar to ovn_datapaths->array) to make the lookup effecient. The same ovn_datapath->index is reused. * Made some signficant changes to 'struct lflow_ref' in lflow-mgr.c. In v2 we don't use objdep_mgr to maintain the resource to lflow references. Instead we maintain the 'struct lflow' pointer. With this we don't need to maintain additional hmap of lflows. Below is the brief description of this patch series. (This could help the reviewers). This patch series adds 3 new engine nodes 1. lr_nat: It maintains the NAT related data for each logical router which was earlier maintained by northd engine node in the 'struct ovn_datapath' It has only one input node - "northd" engine_add_input(&en_lr_nat, &en_northd, lr_nat_northd_handler); Below NAT related data is stored for each logical router struct lr_nat_record { ... const struct ovn_datapath *od; struct ovn_nat *nat_entries; size_t n_nat_entries; bool has_distributed_nat; /* Set of nat external ips on the router. */ struct sset external_ips; /* Set of nat external macs on the router. */ struct sset external_macs; /* SNAT IPs owned by the router (shash of 'struct ovn_snat_ip'). */ struct shash snat_ips; struct lport_addresses dnat_force_snat_addrs; struct lport_addresses lb_force_snat_addrs; bool lb_force_snat_router_ip; ... }; 2. lr_stateful: It maintains the stateful data for each logical router. Stateful data consists of Load balancers and NAT. The name 'stateful' is chosen since OVN uses conntrack for Load balancers and NAT. It has 3 inputs - northd - lr_nat - lb_data engine_add_input(&en_lr_stateful, &en_northd, lr_stateful_northd_handler); engine_add_input(&en_lr_stateful, &en_lr_nat, lr_stateful_lr_nat_handler); engine_add_input(&en_lr_stateful, &en_lb_data, lr_stateful_lb_data_handler); The below 'stateful' related data is stored for each logical router struct lr_stateful_record { ... const struct lr_nat_record *lrnat_rec; bool has_lb_vip; /* Load Balancer vIPs relevant for this datapath. */ struct ovn_lb_ip_set *lb_ips; /* sset of vips which are also part of lr nats. */ struct sset vip_nats; /* 'lflow_ref' is used to reference logical flows generated for * this lr_stateful record. */ struct lflow_ref *lflow_ref; ... }; "lr_nat" engine node data could have been part of "lr_stateful" engine node. But it had to be separated out in order to generate "vip_nats" i.e set of vips which are also part of logical router NAT external ips. 'lr_stateful' becomes an input to 'lflow' engine node. engine_add_input(&en_lflow, &en_lr_stateful, lflow_lr_stateful_handler); References to the stateful logical flows generated for each logical router are stored in the lr_stateful_record->lflow_ref similar to the lflow references stored for each VIF logical port. 3. ls_stateful It maintains the stateful data for each logical switch. Stateful data consists of Load balancers and ACLs associated with a logical switch. The name 'stateful' is chosen since OVN uses conntrack for Load balancers and ACls. It has 2 inputs - northd - port_group engine_add_input(&en_ls_stateful, &en_northd, ls_stateful_northd_handler); engine_add_input(&en_ls_stateful, &en_port_group, ls_stateful_port_group_handler); The below 'stateful' related data is stored for each logical switch struct ls_stateful_record { ... const struct ovn_datapath *od; bool has_stateful_acl; bool has_lb_vip; bool has_acls; uint64_t max_acl_tier; /* 'lflow_ref' is used to reference logical flows generated for * this ls_stateful record. */ struct lflow_ref *lflow_ref; ... }; 'ls_stateful' becomes an input to 'lflow' engine node. engine_add_input(&en_lflow, &en_ls_stateful, lflow_ls_stateful_handler); References to the stateful logical flows generated for each logical switch are stored in the ls_stateful_record->lflow_ref. Lflow references ---------------- Prior to this patch series, lflow I-P support was added for VIF logical ports and all the logical flows related to a VIF logical port were stored in the ovs_list ovn_port->lflows. 'struct lflow_ref_node' was added to store the logical port to logical flow reference. ----- /* A node that maintains link between an object (such as an ovn_port) and * a lflow. */ struct lflow_ref_node { /* This list follows different lflows referenced by the same object. List * head is, for example, ovn_port->lflows. */ struct ovs_list lflow_list_node; /* This list follows different objects that reference the same lflow. List * head is ovn_lflow->referenced_by. */ struct ovs_list ref_list_node; /* The lflow. */ struct ovn_lflow *lflow; }; ----- In this patch series, 'struct lflow_ref_node' is modified and the list of lflow references for a given resource (logical port, load balancer, etc) is stored in 'struct lflow_ref' ---- /* lflow ref mgr */ struct lflow_ref { char *res_name; /* head of the list 'struct lflow_ref_node'. */ struct ovs_list lflows_ref_list; /* hmapx_node is 'struct lflow *'. This is used to ensure * that there are no duplicates in 'lflow_ref_list' above. */ struct hmapx lflows; }; struct lflow_ref_node { struct ovs_list ref_list_node; struct ovn_lflow *lflow; size_t dp_index; }; ---- A logical flow can be added to the lflow table using lflow_table_add_lflow() and 'struct lflow_ref' instance can be passed to this function in which case the lflow is added to this lflow_ref. All the logical flows pertaining to a given 'ovn_port' are referenced in ovn_port->lflow_ref. Similarly all the logical flows pertaining to 'lr_stateful', 'ls_stateful' and 'lb_datapaths' are stored in their respective lflow_refs. This becomes easier to clear the logical flows for a given resource and rebuild them when the resource gets modified. I-P for the lflows ------------------- Below is how this patch series handles changes to NAT incrementally starting from northd engine node to logical flow generation. Changes to NAT ------------ | | NAT changes (ovn-nbctl lr-nat-add lr0 ...) v +-------------------------------+ | | | en_northd | | (northd_handle_lr_changes) | +-------------------------------+ | northd_data->trk_data.lr_with_changed_nats = [lr0] | v +-------------------------------+ | | | en_lr_nat | | (lr_nat_northd_handler) | +-------------------------------+ | lr_nat_data->trk_data.crupdated = [lr0] | v +-------------------------------+ | | | en_lr_stateful | | (lr_stateful_lr_nat_handler) | +-------------------------------+ |lr_stateful_data->trk_data.crupdated = [lr0] | v +-------------------------------+ | | | en_lflow | | (lflow_lr_stateful_handler) | +-------------------------------+ | | v lflow_lr_stateful_handler() { for each lr_stateful record 'lr_sful_rec' in the lr_stateful_data->trk_data.crupdated do lflow_ref_clear_lflows(lr_sful_rec->lflow_ref); /* Clear the old lflows. */ build_lr_stateful_flows(..., lr_sful_rec->lflow_ref) lflow_ref_sync_lflows_to_sb(lr_sful_rec->lflow_ref) /* Sync the lflows to sb. */ done } Below is how this patch series handles changes to LBs incrementally starting from lb_data engine node to logical flow generation. Let's say there is a load balancer 'lb0' associated with a logical switch 'sw0' and logical router 'lr0' and a new VIP is added to it. ovn-nbctl set load_balancer lb0 vips:10.0.0.10=20.0.0.20 | | LB changes (ovn-nbctl set load_balancer lb0 ....) v +------------------------------------+ | | | en_lb_data | | (lb_data_load_balancer_handler) | | (lb_data_logical_switch_handler) | | (lb_data_logical_router_handler) | +------------------------------------+ | tracked_lb_data->crupdated_lbs = [lb0] | tracked_lb_data->crupdated_ls_lbs = [ {od:sw0}, {assoc_lbs: [lb0]} ] | tracked_lb_data->crupdated_lr_lbs = [ {od:lr0}, {assoc_lbs: [lb0]} ] v +---------------------------------------------------------------+ | | | | v | +-------------------------------+ | | | | | en_northd | | | (northd_lb_data_handler) | | +-------------------------------+ | | | | tracked_northd_data->trk_lbs.crupdated = [lb0] | | tracked_northd_data->ls_with_changed_lbs = [sw0] | | | +------- | | | | | | tracked_northd_data->ls_with_changed_lbs = [sw0] | v | +-------------------------------+ | | | | | en_ls_stateful | | | (ls_stateful_northd_handler) | | +-------------------------------+ | | | | | v | ls_stateful_northd_handler() { | for each lswitch od 'ls' in tracked_northd_data->ls_with_changed_lbs | do | ls_sful_rec = ls_stateful_table_find('ls') | ls_stateful_record_reinit(ls_sful_rec) | Add ls_sful_rec to tracked data -> ls_stateful_data->trk_data.crupdated | done | } | | | | | | | +---------------------------------- + | | | tracked_lb_data->crupdated_lbs = [lb0] | | tracked_lb_data->crupdated_lr_lbs = [ {od:lr0}, {assoc_lbs: [lb0]} ] | v | +------------------------------------+ | | | | | en_lr_stateful | | | (lr_stateful_lb_data_handler) | | +------------------------------------+ | | | | | v | lr_stateful_lb_data_handler() { | for each 'lr_lb' in tracked_lb_data->crupdated_lr_lbs | do | lr = lr_lb['od'] | lr_sful_rec = lr_sful_table_find(lr) | | for each 'lb' in lr_lb['assoc_lbs'] | do | build_lrouter_lb_ips(lr, lb) | build_lrouter_lb_reachable_ips(lr, lb) | done | | Add lr_sful_rec to tracked data -> ls_stateful_data->trk_data.crupdated | done | } | | | +--------------------------------+ | | | | | | | | | | | | | | | | |ls_sful_data->trk_data.crupdated = [sw0] | +--------+ | | | | lr_sful_data->trk_data.crupdated = [lr0] | | +--------------------------------------------+ | | v v +------------------------------------+ | | | en_lflow | | (lflow_ls_stateful_handler) | | (lflow_lr_stateful_handler) | tracked_northd_data->trk_lbs.crupdated = [lb0] | (lflow_northd_handler) <-------|----- (from en_northd) +------------------------------------+ | | | v lflow_ls_stateful_handler() { for each lr_sful_rec in ls_sful_data->trk_data.crupdated do lflow_ref_clear_lflows(lr_sful_rec->lflow_ref); /* Clear the old lflows. */ build_ls_stateful_flows(..., ls_sful_rec->lflow_ref) lflow_ref_sync_lflows_to_sb(ls_sful_rec->lflow_ref) /* Sync the lflows to sb. */ done } lflow_lr_stateful_handler() { for each lr_sful_rec in the lr_sful_data->trk_data.crupdated do lflow_ref_clear_lflows(lr_sful_rec->lflow_ref); /* Clear the old lflows. */ build_lr_stateful_flows(..., lr_sful_rec->lflow_ref) lflow_ref_sync_lflows_to_sb(lr_sful_rec->lflow_ref) /* Sync the lflows to sb. */ done } lflow_northd_handler() { for each 'lb_dps' in tracked_northd_data->trk_lbs.crupdated do lflow_ref_clear_lflows(lb_dps->lflow_ref); /* Clear the old lflows. */ build_lflows_for_lb_dps(..., lb_dps->lflow_ref); lflow_ref_sync_lflows_to_sb(lb_dps->lflow_ref) /* Sync the lflows to sb. */ done } Thanks Numan On Mon, Nov 27, 2023 at 9:39 PM Numan Siddique <num...@ovn.org> wrote: > > On Mon, Nov 27, 2023 at 9:34 PM <num...@ovn.org> wrote: > > > > From: Numan Siddique <num...@ovn.org> > > > > This patch series adds incremental processing in the lflow engine > > node to handle changes to northd and other engine nodes. > > Changed related to load balancers and NAT are mainly handled in > > this patch series. > > > > This patch series can also be found here - https://github.com/numansiddique/ovn/tree/northd_lflow_ip/v1 > > Correction - https://github.com/numansiddique/ovn/tree/northd_lbnatacl_lflow/v3 > > Thanks > Numan > > > > > Prior to this patch series, most of the changes to northd engine > > resulted in full recomputation of logical flows. This series > > aims to improve the performance of ovn-northd by adding the I-P > > support. In order to add this support, some of the northd engine > > node data (from struct ovn_datapath) is split and moved over to > > new engine nodes - mainly related to load balancers, NAT and ACLs. > > > > Below are the scale testing results done with these patches applied > > using ovn-heater. The test ran the scenario - > > ocp-500-density-heavy.yml [1]. > > > > With all the lflow I-P patches applied, the resuts are: > > > > ------------------------------------------------------------------------------------------------------------------------------------------------------- > > Min (s) Median (s) 90%ile (s) 99%ile (s) Max (s) Mean (s) Total (s) Count Failed > > ------------------------------------------------------------------------------------------------------------------------------------------------------- > > Iteration Total 0.136883 1.129016 1.192001 1.204167 1.212728 0.665017 83.127099 125 0 > > Namespace.add_ports 0.005216 0.005736 0.007034 0.015486 0.018978 0.006211 0.776373 125 0 > > WorkerNode.bind_port 0.035030 0.046082 0.052469 0.058293 0.060311 0.045973 11.493259 250 0 > > WorkerNode.ping_port 0.005057 0.006727 1.047692 1.069253 1.071336 0.266896 66.724094 250 0 > > ------------------------------------------------------------------------------------------------------------------------------------------------------- > > > > The results with the present main are: > > > > ------------------------------------------------------------------------------------------------------------------------------------------------------- > > Min (s) Median (s) 90%ile (s) 99%ile (s) Max (s) Mean (s) Total (s) Count Failed > > ------------------------------------------------------------------------------------------------------------------------------------------------------- > > Iteration Total 0.135491 2.223805 3.311270 3.339078 3.345346 1.729172 216.146495 125 0 > > Namespace.add_ports 0.005380 0.005744 0.006819 0.018773 0.020800 0.006292 0.786532 125 0 > > WorkerNode.bind_port 0.034179 0.046055 0.053488 0.058801 0.071043 0.046117 11.529311 250 0 > > WorkerNode.ping_port 0.004956 0.006952 3.086952 3.191743 3.192807 0.791544 197.886026 250 0 > > ------------------------------------------------------------------------------------------------------------------------------------------------------- > > > > [1] - https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml > > > > > > Note: This cover letter will be enhanced with more details in v4. > > > > v2 -> v3 > > ------- > > * Addressed some of the review comments from Han and Dumitru. There > > are still a few pending review comments which needs to be addressed > > or discussed. > > > > * Renamed the engine node from "lr_lbnat_data" to "lr_stateful" > > (v3 patch 5). > > > > * Renamed the engine node from "ls_lbacls" to "ls_stateful" (v3 patch 8). > > > > * Removed v2 patch 2 from the series (northd: Track ovn_datapaths in > > northd engine track data."). This patch is now part of v3 patch 7 > > (northd: Add a new node 'ls_stateful'). > > > > * Squashed v2 patch 8 (northd: Don't commit dhcp response flows in > > the conntrack.) into v3 patch 7 (northd: Add a new node > > 'ls_stateful'.) > > > > > > v1 -> v2 > > -------- > > * Now also maintaing array indexes for ls_lbacls, lr_nat and > > lr_lb_nat_data tables (similar to ovn_datapaths->array) to > > make the lookup effecient. The same ovn_datapath->index > > is reused. > > > > * Made some signficant changes to 'struct lflow_ref' in lflow-mgr.c. > > In v2 we don't use objdep_mgr to maintain the resource to lflow > > references. Instead we maintain the 'struct lflow' pointer. > > With this we don't need to maintain additional hmap of lflows. > > > > > > > > Numan Siddique (16): > > northd: Refactor the northd change tracking. > > tests: Add a couple of tests in ovn-northd for I-P. > > northd: Move router ports SB PB options sync to sync_to_sb_pb node. > > northd: Add a new engine 'lr_nat' to manage lr NAT data. > > northd: Add a new engine 'lr_stateful' to manage lr's stateful data. > > northd: Generate router's stateful flows using lr_stateful data. > > northd: Add a new node 'ls_stateful'. > > northd: Refactor lflow management into a separate module. > > northd: Use lflow_ref when adding all logical flows. > > northd: Move ovn_lb_datapaths from lib to northd module. > > northd: Handle lb changes in lflow engine. > > northd: Add lr_stateful handler for lflow engine node. > > northd: Add ls_stateful handler for lflow engine node. > > northd: Add I-P for NB_Global and SB_Global. > > northd: Add a noop handler for northd SB mac binding. > > northd: Add northd change handler for sync_to_sb_lb node. > > > > lib/lb.c | 96 - > > lib/lb.h | 57 - > > lib/ovn-util.c | 17 +- > > lib/ovn-util.h | 2 +- > > lib/stopwatch-names.h | 3 + > > northd/aging.c | 21 +- > > northd/automake.mk | 12 +- > > northd/en-global-config.c | 588 ++++ > > northd/en-global-config.h | 65 + > > northd/en-lflow.c | 122 +- > > northd/en-lflow.h | 8 + > > northd/en-lr-nat.c | 423 +++ > > northd/en-lr-nat.h | 130 + > > northd/en-lr-stateful.c | 678 ++++ > > northd/en-lr-stateful.h | 136 + > > northd/en-ls-stateful.c | 448 +++ > > northd/en-ls-stateful.h | 104 + > > northd/en-northd.c | 70 +- > > northd/en-northd.h | 2 +- > > northd/en-port-group.h | 3 + > > northd/en-sync-from-sb.c | 2 +- > > northd/en-sync-sb.c | 510 ++- > > northd/inc-proc-northd.c | 71 +- > > northd/lflow-mgr.c | 1078 ++++++ > > northd/lflow-mgr.h | 192 ++ > > northd/northd.c | 6510 ++++++++++++++++--------------------- > > northd/northd.h | 593 +++- > > northd/ovn-northd.c | 7 + > > tests/ovn-macros.at | 44 + > > tests/ovn-northd.at | 953 +++++- > > 30 files changed, 8825 insertions(+), 4120 deletions(-) > > create mode 100644 northd/en-global-config.c > > create mode 100644 northd/en-global-config.h > > create mode 100644 northd/en-lr-nat.c > > create mode 100644 northd/en-lr-nat.h > > create mode 100644 northd/en-lr-stateful.c > > create mode 100644 northd/en-lr-stateful.h > > create mode 100644 northd/en-ls-stateful.c > > create mode 100644 northd/en-ls-stateful.h > > create mode 100644 northd/lflow-mgr.c > > create mode 100644 northd/lflow-mgr.h > > > > -- > > 2.41.0 > > > > _______________________________________________ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev