Re: [ovs-dev] [PATCH ovn] northd: Use datapath groups for southbound load balancers.
Thanks Ilya and Ales. I removed the paragraph about ovn-northd from the
commit message and replaced it with the suggested bullet point, then
pushed the change to the main branch.

On 8/9/22 04:31, Ales Musil wrote:
On Wed, Aug 3, 2022 at 4:44 PM Ilya Maximets wrote:

Some CMSes, like ovn-kubernetes, tend to create load balancers attached
to a big number of logical switches or routers.  For each of these load
balancers northd creates a record in the Load_Balancer table of the
Southbound database with all the logical datapaths (switches, routers)
listed in the 'datapaths' column.  All these logical datapaths are
references to datapath bindings.  With a large number of load balancers
like these applied to the same set of logical datapaths, the size of the
Southbound database can grow significantly, and these references can
account for up to 90% of all the traffic between the Sb DB and
ovn-controllers.

For example, while creating 3 load balancers (one each for tcp, udp and
sctp) in a 250-node cluster in the ovn-heater cluster-density test, the
database update sent to every ovn-controller is about 40 KB in size, out
of which 36 KB are just the UUIDs of 250 logical datapaths repeated 3
times.

Introduce a new column 'datapath_group' in the Load_Balancer table, so
that a group can be created once and simply referenced from each load
balancer.  This saves a lot of CPU time, memory and network bandwidth.
The already existing Logical_DP_Group table is re-used to store these
groups.

In a 250-node cluster-density test with ovn-heater that creates 30K load
balancers applied to all 250 logical switches, the following improvements
were observed:

 - Southbound database size decreased from 310 MB to 118 MB.
 - CPU time on the Southbound ovsdb-server process decreased by a
   factor of 7, from ~35 minutes per server to ~5.
 - Memory consumption on the Southbound ovsdb-server process decreased
   from 12 GB RSS (peaked at ~14) down to ~1 GB (peaked at ~2).
 - Memory consumption on ovn-controller processes decreased by 400 MB
   on each node, from 1300 MB to 900 MB.  A similar decrease was
   observed for ovn-northd as well.

We're adding some extra work to the ovn-northd process with this change
to find/create datapath groups.  CPU time in the test above increased by
~10%, but the overall round-trip time for changes in OVN to be propagated
to OVN controllers is still noticeably lower due to the improvements in
other components like the Sb DB.

Implementation details:
 - Groups are not shared between logical flows and load balancers, so
   there could be duplicated groups this way, but that should not be a
   big problem.  Sharing groups would require some code re-structuring
   with unclear benefits.
 - ovn-controller and ovn-sbctl parse both the 'datapaths' column and
   the 'datapath_group' column, to keep them backward compatible.
 - The Load_Balancer table is not conditionally monitored by
   ovn-controller right now, so that behavior is not changed.  If
   conditional monitoring is introduced later, the same considerations
   as for logical flows should be applied.
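As a rough illustration of the backward-compatibility bullet above, a
consumer such as ovn-controller or ovn-sbctl could collect a load
balancer's datapaths from both encodings along the lines below.  This is
a minimal sketch, not the patch's actual code: the helper name and the
callback shape are made up, and it assumes the generated Southbound IDL
exposes the existing 'datapaths' column and the new 'datapath_group'
reference with the usual ovsdb-idl field names.

/* Minimal sketch (not the patch's code): visit every datapath a load
 * balancer applies to, whether it is listed directly in the legacy
 * 'datapaths' column or indirectly via the new 'datapath_group' column.
 * Helper name and callback are illustrative; the IDL field names and the
 * header path are assumptions. */
#include <stddef.h>
#include "lib/ovn-sb-idl.h"   /* Generated Southbound IDL (assumed path). */

static void
lb_for_each_datapath(const struct sbrec_load_balancer *lb,
                     void (*cb)(const struct sbrec_datapath_binding *, void *),
                     void *aux)
{
    /* Legacy encoding: every datapath referenced directly on the LB row. */
    for (size_t i = 0; i < lb->n_datapaths; i++) {
        cb(lb->datapaths[i], aux);
    }

    /* New encoding: a single reference to a shared Logical_DP_Group row. */
    if (lb->datapath_group) {
        for (size_t i = 0; i < lb->datapath_group->n_datapaths; i++) {
            cb(lb->datapath_group->datapaths[i], aux);
        }
    }
}

During a rolling upgrade a consumer may see rows written by either an old
or a new ovn-northd, which is presumably why both columns are read.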
Signed-off-by: Ilya Maximets
---
 controller/lflow.c          |  44 --
 controller/ovn-controller.c |  54
 northd/northd.c             | 168
 ovn-sb.ovsschema            |   8 +-
 ovn-sb.xml                  |   5 ++
 tests/ovn-northd.at         |  68 ---
 utilities/ovn-sbctl.c       |  40 +
 7 files changed, 278 insertions(+), 109 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index 6055097b5..eef44389f 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -2063,6 +2063,20 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
     ofpbuf_uninit(&ofpacts);
 }

+static void
+add_lb_ct_snat_hairpin_for_dp(const struct ovn_controller_lb *lb,
+                              const struct sbrec_datapath_binding *datapath,
+                              struct match *dp_match,
+                              struct ofpbuf *dp_acts,
+                              struct ovn_desired_flow_table *flow_table)
+{
+    match_set_metadata(dp_match, htonll(datapath->tunnel_key));
+    ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, 200,
+                              lb->slb->header_.uuid.parts[0],
+                              dp_match, dp_acts, &lb->slb->header_.uuid,
+                              NX_CTLR_NO_METER, NULL);
+}
+
 static void
 add_lb_ct_snat_hairpin_dp_flows(struct ovn_controller_lb *lb,
                                 uint32_t id,
@@ -2088,12 +2102,15 @@ add_lb_ct_snat_hairpin_dp_flows(struct ovn_controller_lb *lb,
     struct match dp_match = MATCH_CATCHALL_INITIALIZER;

     for (size_t i = 0; i < lb->slb->n_datapaths; i++) {
-        match_set_metadata(&dp_match,
-                           htonll(lb->slb->datapaths[i]->tunnel_key));
-        ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, 200,
-                                  lb->slb->header_.uuid.parts[0],
-
Re: [ovs-dev] [PATCH ovn] northd: Use datapath groups for southbound load balancers.
On Wed, Aug 3, 2022 at 4:44 PM Ilya Maximets wrote:
> Some CMSes, like ovn-kubernetes, tend to create load balancers attached
> to a big number of logical switches or routers.  For each of these load
> balancers northd creates a record in the Load_Balancer table of the
> Southbound database with all the logical datapaths (switches, routers)
> listed in the 'datapaths' column.  All these logical datapaths are
> references to datapath bindings.  With a large number of load balancers
> like these applied to the same set of logical datapaths, the size of the
> Southbound database can grow significantly, and these references can
> account for up to 90% of all the traffic between the Sb DB and
> ovn-controllers.
>
> For example, while creating 3 load balancers (one each for tcp, udp and
> sctp) in a 250-node cluster in the ovn-heater cluster-density test, the
> database update sent to every ovn-controller is about 40 KB in size, out
> of which 36 KB are just the UUIDs of 250 logical datapaths repeated 3
> times.
>
> Introduce a new column 'datapath_group' in the Load_Balancer table, so
> that a group can be created once and simply referenced from each load
> balancer.  This saves a lot of CPU time, memory and network bandwidth.
> The already existing Logical_DP_Group table is re-used to store these
> groups.
>
> In a 250-node cluster-density test with ovn-heater that creates 30K load
> balancers applied to all 250 logical switches, the following improvements
> were observed:
>
>  - Southbound database size decreased from 310 MB to 118 MB.
>  - CPU time on the Southbound ovsdb-server process decreased by a
>    factor of 7, from ~35 minutes per server to ~5.
>  - Memory consumption on the Southbound ovsdb-server process decreased
>    from 12 GB RSS (peaked at ~14) down to ~1 GB (peaked at ~2).
>  - Memory consumption on ovn-controller processes decreased by 400 MB
>    on each node, from 1300 MB to 900 MB.  A similar decrease was
>    observed for ovn-northd as well.
>
> We're adding some extra work to the ovn-northd process with this change
> to find/create datapath groups.  CPU time in the test above increased by
> ~10%, but the overall round-trip time for changes in OVN to be propagated
> to OVN controllers is still noticeably lower due to the improvements in
> other components like the Sb DB.
>
> Implementation details:
>  - Groups are not shared between logical flows and load balancers, so
>    there could be duplicated groups this way, but that should not be a
>    big problem.  Sharing groups would require some code re-structuring
>    with unclear benefits.
>  - ovn-controller and ovn-sbctl parse both the 'datapaths' column and
>    the 'datapath_group' column, to keep them backward compatible.
>  - The Load_Balancer table is not conditionally monitored by
>    ovn-controller right now, so that behavior is not changed.  If
>    conditional monitoring is introduced later, the same considerations
>    as for logical flows should be applied.
>
> Signed-off-by: Ilya Maximets
> ---
>  controller/lflow.c          |  44 --
>  controller/ovn-controller.c |  54
>  northd/northd.c             | 168
>  ovn-sb.ovsschema            |   8 +-
>  ovn-sb.xml                  |   5 ++
>  tests/ovn-northd.at         |  68 ---
>  utilities/ovn-sbctl.c       |  40 +
>  7 files changed, 278 insertions(+), 109 deletions(-)
>
> diff --git a/controller/lflow.c b/controller/lflow.c
> index 6055097b5..eef44389f 100644
> --- a/controller/lflow.c
> +++ b/controller/lflow.c
> @@ -2063,6 +2063,20 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
>      ofpbuf_uninit(&ofpacts);
>  }
>
> +static void
> +add_lb_ct_snat_hairpin_for_dp(const struct ovn_controller_lb *lb,
> +                              const struct sbrec_datapath_binding *datapath,
> +                              struct match *dp_match,
> +                              struct ofpbuf *dp_acts,
> +                              struct ovn_desired_flow_table *flow_table)
> +{
> +    match_set_metadata(dp_match, htonll(datapath->tunnel_key));
> +    ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, 200,
> +                              lb->slb->header_.uuid.parts[0],
> +                              dp_match, dp_acts, &lb->slb->header_.uuid,
> +                              NX_CTLR_NO_METER, NULL);
> +}
> +
>  static void
>  add_lb_ct_snat_hairpin_dp_flows(struct ovn_controller_lb *lb,
>                                  uint32_t id,
> @@ -2088,12 +2102,15 @@ add_lb_ct_snat_hairpin_dp_flows(struct ovn_controller_lb *lb,
>      struct match dp_match = MATCH_CATCHALL_INITIALIZER;
>
>      for (size_t i = 0; i < lb->slb->n_datapaths; i++) {
> -        match_set_metadata(&dp_match,
> -                           htonll(lb->slb->datapaths[i]->tunnel_key));
> -        ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, 200,
> -                                  lb->slb->header_.uuid.parts[0],
> -                                  &dp_match, &dp_acts, &lb->slb->
Re: [ovs-dev] [PATCH ovn] northd: Use datapath groups for southbound load balancers.
On 8/3/22 16:43, Ilya Maximets wrote:
> Some CMSes, like ovn-kubernetes, tend to create load balancers attached
> to a big number of logical switches or routers.  For each of these load
> balancers northd creates a record in the Load_Balancer table of the
> Southbound database with all the logical datapaths (switches, routers)
> listed in the 'datapaths' column.  All these logical datapaths are
> references to datapath bindings.  With a large number of load balancers
> like these applied to the same set of logical datapaths, the size of the
> Southbound database can grow significantly, and these references can
> account for up to 90% of all the traffic between the Sb DB and
> ovn-controllers.
>
> For example, while creating 3 load balancers (one each for tcp, udp and
> sctp) in a 250-node cluster in the ovn-heater cluster-density test, the
> database update sent to every ovn-controller is about 40 KB in size, out
> of which 36 KB are just the UUIDs of 250 logical datapaths repeated 3
> times.
>
> Introduce a new column 'datapath_group' in the Load_Balancer table, so
> that a group can be created once and simply referenced from each load
> balancer.  This saves a lot of CPU time, memory and network bandwidth.
> The already existing Logical_DP_Group table is re-used to store these
> groups.
>
> In a 250-node cluster-density test with ovn-heater that creates 30K load
> balancers applied to all 250 logical switches, the following improvements
> were observed:
>
>  - Southbound database size decreased from 310 MB to 118 MB.
>  - CPU time on the Southbound ovsdb-server process decreased by a
>    factor of 7, from ~35 minutes per server to ~5.
>  - Memory consumption on the Southbound ovsdb-server process decreased
>    from 12 GB RSS (peaked at ~14) down to ~1 GB (peaked at ~2).
>  - Memory consumption on ovn-controller processes decreased by 400 MB
>    on each node, from 1300 MB to 900 MB.  A similar decrease was
>    observed for ovn-northd as well.
>
> We're adding some extra work to the ovn-northd process with this change
> to find/create datapath groups.  CPU time in the test above increased by
> ~10%, but the overall round-trip time for changes in OVN to be propagated
> to OVN controllers is still noticeably lower due to the improvements in
> other components like the Sb DB.

Actually, I was using the data from a preliminary run for this commit
message and also looked into the wrong place while checking the northd
performance numbers :/ .  A re-run with the version of the code in this
patch shows a 12% performance improvement for ovn-northd, looking at the
actual length of poll intervals.

So, the northd-related paragraph above should be thrown away and replaced
with one more bullet in the observed improvements:

 - Poll intervals on ovn-northd reduced by 12%.

Best regards, Ilya Maximets.
[ovs-dev] [PATCH ovn] northd: Use datapath groups for southbound load balancers.
Some CMSes, like ovn-kubernetes, tend to create load balancers attached
to a big number of logical switches or routers.  For each of these load
balancers northd creates a record in the Load_Balancer table of the
Southbound database with all the logical datapaths (switches, routers)
listed in the 'datapaths' column.  All these logical datapaths are
references to datapath bindings.  With a large number of load balancers
like these applied to the same set of logical datapaths, the size of the
Southbound database can grow significantly, and these references can
account for up to 90% of all the traffic between the Sb DB and
ovn-controllers.

For example, while creating 3 load balancers (one each for tcp, udp and
sctp) in a 250-node cluster in the ovn-heater cluster-density test, the
database update sent to every ovn-controller is about 40 KB in size, out
of which 36 KB are just the UUIDs of 250 logical datapaths repeated 3
times.

Introduce a new column 'datapath_group' in the Load_Balancer table, so
that a group can be created once and simply referenced from each load
balancer.  This saves a lot of CPU time, memory and network bandwidth.
The already existing Logical_DP_Group table is re-used to store these
groups.

In a 250-node cluster-density test with ovn-heater that creates 30K load
balancers applied to all 250 logical switches, the following improvements
were observed:

 - Southbound database size decreased from 310 MB to 118 MB.
 - CPU time on the Southbound ovsdb-server process decreased by a
   factor of 7, from ~35 minutes per server to ~5.
 - Memory consumption on the Southbound ovsdb-server process decreased
   from 12 GB RSS (peaked at ~14) down to ~1 GB (peaked at ~2).
 - Memory consumption on ovn-controller processes decreased by 400 MB
   on each node, from 1300 MB to 900 MB.  A similar decrease was
   observed for ovn-northd as well.

We're adding some extra work to the ovn-northd process with this change
to find/create datapath groups.  CPU time in the test above increased by
~10%, but the overall round-trip time for changes in OVN to be propagated
to OVN controllers is still noticeably lower due to the improvements in
other components like the Sb DB.

Implementation details:
 - Groups are not shared between logical flows and load balancers, so
   there could be duplicated groups this way, but that should not be a
   big problem.  Sharing groups would require some code re-structuring
   with unclear benefits.
 - ovn-controller and ovn-sbctl parse both the 'datapaths' column and
   the 'datapath_group' column, to keep them backward compatible.
 - The Load_Balancer table is not conditionally monitored by
   ovn-controller right now, so that behavior is not changed.  If
   conditional monitoring is introduced later, the same considerations
   as for logical flows should be applied.
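To make the "create a group once and just refer to it" idea concrete,
here is a minimal sketch of what the northd-side write path could look
like under the new schema.  It is not the code from this patch: the
helper names are hypothetical, and the setter for the new column
(sbrec_load_balancer_set_datapath_group()) is assumed to follow the usual
generated sbrec_<table>_set_<column>() pattern.

/* Hypothetical sketch, not the patch's implementation: build one
 * Logical_DP_Group row for a set of datapaths and point every load
 * balancer that spans that same set at it, instead of copying all the
 * datapath references into each Load_Balancer row. */
#include <stddef.h>
#include "lib/ovn-sb-idl.h"   /* Generated Southbound IDL (assumed path). */

static const struct sbrec_logical_dp_group *
create_shared_dp_group(struct ovsdb_idl_txn *ovnsb_txn,
                       struct sbrec_datapath_binding **dps, size_t n_dps)
{
    struct sbrec_logical_dp_group *group;

    group = sbrec_logical_dp_group_insert(ovnsb_txn);
    sbrec_logical_dp_group_set_datapaths(group, dps, n_dps);
    return group;
}

static void
set_lbs_datapath_group(const struct sbrec_logical_dp_group *group,
                       const struct sbrec_load_balancer **lbs, size_t n_lbs)
{
    for (size_t i = 0; i < n_lbs; i++) {
        /* One group reference per LB instead of n_dps datapath refs. */
        sbrec_load_balancer_set_datapath_group(lbs[i], group);
    }
}

With 30K load balancers over 250 switches, this turns roughly
30,000 x 250 datapath references into 30,000 group references plus a
handful of group rows, which is where the database-size and traffic
savings quoted above come from.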
Signed-off-by: Ilya Maximets
---
 controller/lflow.c          |  44 --
 controller/ovn-controller.c |  54
 northd/northd.c             | 168
 ovn-sb.ovsschema            |   8 +-
 ovn-sb.xml                  |   5 ++
 tests/ovn-northd.at         |  68 ---
 utilities/ovn-sbctl.c       |  40 +
 7 files changed, 278 insertions(+), 109 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index 6055097b5..eef44389f 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -2063,6 +2063,20 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
     ofpbuf_uninit(&ofpacts);
 }

+static void
+add_lb_ct_snat_hairpin_for_dp(const struct ovn_controller_lb *lb,
+                              const struct sbrec_datapath_binding *datapath,
+                              struct match *dp_match,
+                              struct ofpbuf *dp_acts,
+                              struct ovn_desired_flow_table *flow_table)
+{
+    match_set_metadata(dp_match, htonll(datapath->tunnel_key));
+    ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, 200,
+                              lb->slb->header_.uuid.parts[0],
+                              dp_match, dp_acts, &lb->slb->header_.uuid,
+                              NX_CTLR_NO_METER, NULL);
+}
+
 static void
 add_lb_ct_snat_hairpin_dp_flows(struct ovn_controller_lb *lb,
                                 uint32_t id,
@@ -2088,12 +2102,15 @@ add_lb_ct_snat_hairpin_dp_flows(struct ovn_controller_lb *lb,
     struct match dp_match = MATCH_CATCHALL_INITIALIZER;

     for (size_t i = 0; i < lb->slb->n_datapaths; i++) {
-        match_set_metadata(&dp_match,
-                           htonll(lb->slb->datapaths[i]->tunnel_key));
-        ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, 200,
-                                  lb->slb->header_.uuid.parts[0],
-                                  &dp_match, &dp_acts, &lb->slb->header_.uuid,
-                                  NX_CTLR_NO_METER, NULL);
+        add_lb_ct_snat_hairpin_for_dp(lb, lb->slb->datapaths[i],
+                                      &dp_match, &dp_acts, flow_table);
+    }
+    if (lb->slb->datap
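The quoted diff is cut off above.  Purely as an illustration of where the
refactoring is headed, and not the literal remainder of the truncated
hunk, the same helper can be applied to the members of a datapath group.
The sketch below assumes the new column is exposed by the IDL as
lb->slb->datapath_group and that it lives alongside the existing code in
controller/lflow.c.

/* Illustrative sketch only -- not the missing part of the hunk above.
 * After handling the legacy 'datapaths' column, install the CT SNAT
 * hairpin flow for every member of the (possibly absent) datapath group,
 * reusing the add_lb_ct_snat_hairpin_for_dp() helper factored out above. */
static void
add_lb_ct_snat_hairpin_group_flows(const struct ovn_controller_lb *lb,
                                   struct match *dp_match,
                                   struct ofpbuf *dp_acts,
                                   struct ovn_desired_flow_table *flow_table)
{
    const struct sbrec_logical_dp_group *dp_group = lb->slb->datapath_group;

    if (!dp_group) {
        return;
    }
    for (size_t i = 0; i < dp_group->n_datapaths; i++) {
        add_lb_ct_snat_hairpin_for_dp(lb, dp_group->datapaths[i],
                                      dp_match, dp_acts, flow_table);
    }
}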