On 10/5/22 08:19, Han Zhou wrote:
> On Fri, Sep 30, 2022 at 7:01 AM Dumitru Ceara <dce...@redhat.com> wrote:
>>
>> Sometimes network components are compute node-specific. Sometimes such
>> components are replicated, almost identically, for multiple nodes in
>> the cluster.
>>
>> One such example is the case of Kubernetes NodePort services, which
>> translate (in the ovn-kubernetes case) to Load_Balancer objects being
>> applied to each and every node's logical gateway router. These load
>> balancers are almost identical, the main difference being that they
>> use different VIPs (the node's IP).
>>
>> With the current OVN load balancer design, this becomes a problem at
>> scale because the number of load balancers that must be configured is
>> N x M (N nodes times M services).
>>
>> This series proposes a new concept in OVN: virtual network component
>> templates. The goal of the templates is to help reduce resource
>> consumption in the OVN central components in specific cases like the
>> one described above.
>>
>> To achieve that, the CMS will instead configure a "templated" load
>> balancer for every service and apply that single template record to
>> the cluster-wide load balancer group. This template is then
>> instantiated differently on different compute nodes. This translation
>> is controlled through per-chassis "template variables" configured by
>> the CMS in the new NB.Template_Var table.
>>
> Thanks Dumitru for the great improvement!
>
Thanks for reviewing this!

>> A synthetic benchmark simulating what an OpenShift router (using
>> NodePort services) scale test would do shows the following
>> preliminary results:
>>
>> A. 120 nodes, 2K NodePort services:
>>    - before:
>>      - Southbound DB size on disk (compacted): ~385MB
>>      - Southbound DB memory usage (RSS): ~3GB
>>      - Southbound DB logical flows: 720K
>>
>>    - after:
>>      - Southbound DB size on disk (compacted): ~100MB
>>      - Southbound DB memory usage (RSS): ~250MB
>>      - Southbound DB logical flows: 6K
>>
>> B. 250 nodes, 2K NodePort services:
>>    - after (didn't run the "before" test as it was taking way too long):
>>      - Southbound DB size on disk (compacted): ~155MB
>>      - Southbound DB memory usage (RSS): ~760MB
>>      - Southbound DB logical flows: 6K

I'll add the (hacky) benchmark script below just for clarity.

> A quick question about the test: how many LSPs per node? I am just
> wondering how the number of lflows could be the same (6K) when the
> number of nodes increased from 120 to 250. For some of my scale tests,
> the number of lflows is far more than this even if I don't create any
> LBs. (Also consider that an ovn-k8s deployment has at least an ext-LS
> and a GR per node.)

I really only focused on logical flows (and SB.Load_Balancers) created
due to NB.Load_Balancers provisioned like ovn-k8s provisions them
today, so the test doesn't add a lot of LSPs.

However, in the "OpenShift router" scenario I was trying to fix, the
load due to LSPs is also minimal. It's exactly the huge number of
(very similar) load balancers that causes issues.

> I have no doubt about the effectiveness of this improvement, but I
> just need to understand the numbers better since I am also doing scale
> tests and measurements on top of this patch series.

Sure, makes complete sense. And if we can find even more use cases for
component templates, even better!
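To make the mechanism a bit more concrete before getting to the script:
with this series applied, a single NodePort-like service is configured
roughly as sketched below. The service index, port, and variable names
here are made up for illustration; the exact commands the benchmark
runs are in the script at the end of this mail.

  # One template LB, shared by all nodes via the cluster-wide LB group;
  # "^vip" and "^backends1" are references to per-chassis template
  # variables, not literal addresses.
  ovn-nbctl --template lb-add lb-svc-1 "^vip:30001" "^backends1" tcp

  # Per-chassis values for those variables; every node instantiates
  # the same template record with its own node IP and backends.
  ovn-nbctl create chassis_template_var name=vip \
      value=42.42.42.1 chassis_name=chassis-1
  ovn-nbctl create chassis_template_var name=backends1 \
      value="42.1.0.1:30001,42.2.0.1:30001" chassis_name=chassis-1

So the NB database holds one load balancer record per service instead
of one per service per node.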
>
> Thanks,
> Han

Thanks,
Dumitru

---
diff --git a/tutorial/node-template-lb-stress.sh b/tutorial/node-template-lb-stress.sh
new file mode 100755
index 0000000000..e1a051182a
--- /dev/null
+++ b/tutorial/node-template-lb-stress.sh
@@ -0,0 +1,57 @@
+#!/bin/bash
+
+nrtr=$1
+nlb=$2
+nbackends=$3
+
+echo "ROUTERS        : $nrtr"
+echo "LBS            : $nlb"
+echo "BACKENDS PER LB: $nbackends"
+
+export OVN_NB_DAEMON=$(ovn-nbctl --detach)
+export OVN_SB_DAEMON=$(ovn-sbctl --detach)
+trap "killall -9 ovn-nbctl; killall -9 ovn-sbctl" EXIT
+
+lbg=$(ovn-nbctl create load_balancer_group name=lbg)
+for i in $(seq $nrtr); do
+    r=lr-$i
+    lrp=lrp-$i
+    echo Router $r
+    ovn-nbctl lr-add $r -- set logical_router $r load_balancer_group=$lbg
+    ovn-nbctl lrp-add $r $lrp 00:00:00:00:01:00 88.88.88.88
+    s=ls-$i
+    echo Switch $s
+    ovn-nbctl ls-add $s -- set logical_switch $s load_balancer_group=$lbg
+    lsp=lsp-$i
+    echo LSP $lsp
+    ovn-nbctl lsp-add $s $lsp
+    ovs-vsctl add-port br-int $lsp -- set interface $lsp external_ids:iface-id=$lsp
+done
+
+for l in $(seq $nlb); do
+    lb=lb-$l
+    ovn-nbctl --template lb-add $lb "^vip:$l" "^backends$l" tcp
+    lb_uuid=$(ovn-nbctl --columns _uuid --bare find load_balancer name=$lb)
+    ovn-nbctl add load_balancer_group $lbg load_balancer $lb_uuid
+done
+
+for i in $(seq $nrtr); do
+    ovn-nbctl create chassis_template_var name=vip value=42.42.42.$i chassis_name="chassis-$i"
+
+    cmd=
+    for j in $(seq $nlb); do
+        echo "CREATING TEMPLATE VARS for RTR $i LB $j"
+        backends=""
+        for k in $(seq $nbackends); do
+            j1=$(expr $j / 250)
+            j2=$(expr $j % 250)
+            backends="42.$k.$j1.$j2:$j,$backends"
+        done
+        cmd="$cmd -- create chassis_template_var name=backends$j value=\"$backends\" chassis_name=\"chassis-$i\""
+        if [ $(expr $j % 1000) -eq "0" ]; then
+            ovn-nbctl $cmd
+            cmd=
+        fi
+    done
+    ovn-nbctl $cmd
+done
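For completeness, the script's arguments are the number of node
routers, the number of template load balancers, and the number of
backends per load balancer. Assuming an OVN sandbox (with br-int) is
already running, a run matching scenario A above would look something
like this; the per-LB backend count here is just an example:

  ./node-template-lb-stress.sh 120 2000 2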