On Fri, Mar 22, 2024 at 7:51 AM Roberto Bartzen Acosta
<roberto.aco...@luizalabs.com> wrote:
>
> Hi Mark,
>
> Thanks for your feedback.
>
> On Mon, Mar 18, 2024 at 11:53 AM Mark Michelson <mmich...@redhat.com>
> wrote:
>
> > Hi Roberto,
> >
> > I have some concerns about this patch. Let's use the test case you added
> > as an example network. Let's bind the vms and DGPs to hypervisors:
> >
> > * vm1 and lr1-ts1 are bound to hypervisor hv1
> > * vm2 and lr1-ts2 are bound to hypervisor hv2
> >
> > Now imagine a packet arrives on lr1-ts1. The packet gets load balanced
> > and sent to vm2. In this case, since lr1-ts1 is on hv1, the ct_lb_mark()
> > action runs there, creating a conntrack entry on hv1. However, the
> > packet is eventually tunneled to hv2 so that it can be output to vm2.
> >
> > Now vm2 replies to the packet. Is there anything that ensures that the
> > reply from vm2 gets sent to hv1 for the egress pipeline of lr1? If so,
> > then this should work fine since the packet will be unDNATted as
> > expected on hv1. However, if the packet runs all of lr1's pipelines on
> > hv2, then there is no conntrack entry present, and the attempt to unDNAT
> > will not succeed. The packet will either be dropped because of invalid
> > CT, or the packet will have an incorrect source IP and port. Either way,
> > this isn't what is desired.
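> >
> > For what it's worth, a quick way to check where the conntrack entry
> > actually lives (just a sketch, assuming shell access on both
> > hypervisors and that the load-balancer VIP is 30.0.0.1 as in the test)
> > is:
> >
> >   # Run on hv1 and hv2; only the chassis that executed ct_lb_mark()
> >   # should show an entry for the VIP.
> >   ovs-appctl dpctl/dump-conntrack | grep 30.0.0.1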
> >
>
> Yep, you're right, that makes sense.
> If the reply packet comes back through a chassis that does not have the
> LB conntrack entry created for the initial request, it will trigger a
> miss in the pipeline.
>
> I tried to disable conntrack tracking for the LB by configuring the
> parameter on the router, but I still wasn't successful. E.g.:
> options:lb_force_snat_ip="172.16.200.201"
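>
> (I set this with something like the following, assuming for illustration
> that the router is named lr1:)
>
>   ovn-nbctl set logical_router lr1 options:lb_force_snat_ip="172.16.200.201"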
>
> Even after changing lb_force_snat_ip on the router, or creating a
> "stateless" SNAT rule, I still see this action in the egress pipeline,
> in the highest-priority flows of table=1 (lr_out_undnat):
>
>   table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-01" && is_chassis_resident("cr-incus-net40-lr-lrp-01")), action=(ct_dnat_in_czone;)
>   table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-02" && is_chassis_resident("cr-incus-net40-lr-lrp-02")), action=(ct_dnat_in_czone;)
>   table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-03" && is_chassis_resident("cr-incus-net40-lr-lrp-03")), action=(ct_dnat_in_czone;)
>   table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-04" && is_chassis_resident("cr-incus-net40-lr-lrp-04")), action=(ct_dnat_in_czone;)
>   table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-ext" && is_chassis_resident("cr-incus-net40-lr-lrp-ext")), action=(ct_dnat_in_czone;)
>
>
> Considering that the return traffic with multiple DGPs is probably
> distributed via ECMP, do you have any idea how to change the rules so
> that the LB egress rules are "stateless" (not conntrack-dependent) in
> this multiple-DGP case? I imagine this would solve the problem and
> guarantee that the reply can return through any chassis.
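>
> For context, the "stateless" rule I experimented with was along these
> lines (a sketch; the external and backend addresses here are
> illustrative, taken from my lab):
>
>   ovn-nbctl --stateless lr-nat-add lr1 dnat_and_snat \
>       172.16.200.201 10.195.185.2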

I think the only way to solve your problem is to add NAT support for a
router having multiple DGPs.
I'm not sure how easy that is or if it is even possible to support.
We need to explore this.
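
For reference, non-LB NAT rules can already be pinned to a specific DGP
via the NAT table's gateway_port column, e.g. (a sketch reusing the names
from your test case, assuming an ovn-nbctl recent enough to have the
--gateway-port option):

  # Pin a dnat_and_snat rule to the lr1-ts1 DGP; the external IP and
  # backend IP below are only illustrative.
  ovn-nbctl --gateway-port=lr1-ts1 lr-nat-add lr1 dnat_and_snat \
      172.16.10.100 172.16.0.101

The open question is whether the LB DNAT/undnat stages can be made to
work per-DGP in a similar way.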

Numan

>
> Thanks,
> Roberto
>
> PS: I have a complex load-balancer scenario for the use case with multiple
> DGPs, 'internal' VIP addresses, and OVN interconnect. I can explain the
> general context if you are interested :)
>
>
> > On 2/19/24 16:33, Roberto Bartzen Acosta wrote:
> > > This commit fixes the build_distr_lrouter_nat_flows_for_lb function to
> > > include one NAT flow entry for each DGP in use. Since we have added
> > > support for creating multiple gateway ports per logical router, it is
> > > necessary to include a specific entry for each attached DGP in the LR
> > > NAT rules pipeline. Otherwise, ingress traffic is only redirected when
> > > the incoming LRP matches the chassis_resident field.
> > >
> > > Considering that DNAT rules for DGPs were implemented with the need to
> > > configure the DGP-related gateway_port column, the load-balancer NAT
> > > rule configuration can use a similar idea. In this case, we don't know
> > > which LRP is responsible for the incoming traffic, and therefore we
> > > must apply the automatically created load-balancer NAT rule on all
> > > DGPs to allow the incoming traffic.
> > >
> > > Reported-at: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/2054322
> > > Fixes: 15348b7b806f ("ovn-northd: Multiple distributed gateway port support.")
> > > Signed-off-by: Roberto Bartzen Acosta <roberto.aco...@luizalabs.com>
> > > ---
> > >   northd/en-lr-stateful.c | 12 ------
> > >   northd/northd.c         | 14 ++++---
> > >   tests/ovn-northd.at     | 92 +++++++++++++++++++++++++++++++++++++++++
> > >   3 files changed, 100 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/northd/en-lr-stateful.c b/northd/en-lr-stateful.c
> > > index 6d0192487..7ffa4a690 100644
> > > --- a/northd/en-lr-stateful.c
> > > +++ b/northd/en-lr-stateful.c
> > > @@ -537,18 +537,6 @@ lr_stateful_record_create(struct lr_stateful_table *table,
> > >
> > >       table->array[od->index] = lr_stateful_rec;
> > >
> > > -    /* Load balancers are not supported (yet) if a logical router has multiple
> > > -     * distributed gateway port.  Log a warning. */
> > > -    if (lr_stateful_rec->has_lb_vip && lr_has_multiple_gw_ports(od)) {
> > > -        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> > > -        VLOG_WARN_RL(&rl, "Load-balancers are configured on logical "
> > > -                     "router %s, which has %"PRIuSIZE" distributed "
> > > -                     "gateway ports. Load-balancer is not supported "
> > > -                     "yet when there is more than one distributed "
> > > -                     "gateway port on the router.",
> > > -                     od->nbr->name, od->n_l3dgw_ports);
> > > -    }
> > > -
> > >       return lr_stateful_rec;
> > >   }
> > >
> > > diff --git a/northd/northd.c b/northd/northd.c
> > > index 2c3560ce2..7eb943d2f 100644
> > > --- a/northd/northd.c
> > > +++ b/northd/northd.c
> > > @@ -10919,10 +10919,9 @@ static void
> > >   build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
> > >                                        enum lrouter_nat_lb_flow_type type,
> > >                                        struct ovn_datapath *od,
> > > -                                     struct lflow_ref *lflow_ref)
> > > +                                     struct lflow_ref *lflow_ref,
> > > +                                     struct ovn_port *dgp)
> > >   {
> > > -    struct ovn_port *dgp = od->l3dgw_ports[0];
> > > -
> > >       const char *undnat_action;
> > >
> > >       switch (type) {
> > > @@ -10953,7 +10952,7 @@ build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
> > >
> > >       if (ctx->lb_vip->n_backends || !ctx->lb_vip->empty_backend_rej) {
> > >           ds_put_format(ctx->new_match, " && is_chassis_resident(%s)",
> > > -                      od->l3dgw_ports[0]->cr_port->json_key);
> > > +                      dgp->cr_port->json_key);
> > >       }
> > >
> > >       ovn_lflow_add_with_hint__(ctx->lflows, od, S_ROUTER_IN_DNAT, ctx->prio,
> > > @@ -11164,8 +11163,11 @@ build_lrouter_nat_flows_for_lb(
> > >           if (!od->n_l3dgw_ports) {
> > >               bitmap_set1(gw_dp_bitmap[type], index);
> > >           } else {
> > > -            build_distr_lrouter_nat_flows_for_lb(&ctx, type, od,
> > > -                                                 lb_dps->lflow_ref);
> > > +            for (size_t i = 0; i < od->n_l3dgw_ports; i++) {
> > > +                struct ovn_port *dgp = od->l3dgw_ports[i];
> > > +                build_distr_lrouter_nat_flows_for_lb(&ctx, type, od,
> > > +                                                     lb_dps->lflow_ref, dgp);
> > > +            }
> > >           }
> > >
> > >           if (lb->affinity_timeout) {
> > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > > index 6fdd761da..fa24935e1 100644
> > > --- a/tests/ovn-northd.at
> > > +++ b/tests/ovn-northd.at
> > > @@ -12313,3 +12313,95 @@ check_engine_stats northd recompute nocompute
> > >   check_engine_stats lflow recompute nocompute
> > >
> > >   AT_CLEANUP
> > > +
> > > +OVN_FOR_EACH_NORTHD_NO_HV([
> > > +AT_SETUP([Load balancer with Distributed Gateway Ports (DGP)])
> > > +ovn_start
> > > +
> > > +check ovn-nbctl ls-add public
> > > +check ovn-nbctl lr-add lr1
> > > +
> > > +# lr1 DGP ts1
> > > +check ovn-nbctl ls-add ts1
> > > +check ovn-nbctl lrp-add lr1 lr1-ts1 00:00:01:02:03:04 172.16.10.1/24
> > > +check ovn-nbctl lrp-set-gateway-chassis lr1-ts1 chassis-2
> > > +
> > > +# lr1 DGP ts2
> > > +check ovn-nbctl ls-add ts2
> > > +check ovn-nbctl lrp-add lr1 lr1-ts2 00:00:01:02:03:05 172.16.20.1/24
> > > +check ovn-nbctl lrp-set-gateway-chassis lr1-ts2 chassis-3
> > > +
> > > +# lr1 DGP public
> > > +check ovn-nbctl lrp-add lr1 lr1_public 00:de:ad:ff:00:01 173.16.0.1/16
> > > +check ovn-nbctl lrp-add lr1 lr1_s1 00:de:ad:fe:00:02 172.16.0.1/24
> > > +check ovn-nbctl lrp-set-gateway-chassis lr1_public chassis-1
> > > +
> > > +check ovn-nbctl ls-add s1
> > > +# s1 - lr1
> > > +check ovn-nbctl lsp-add s1 s1_lr1
> > > +check ovn-nbctl lsp-set-type s1_lr1 router
> > > +check ovn-nbctl lsp-set-addresses s1_lr1 "00:de:ad:fe:00:02 172.16.0.1"
> > > +check ovn-nbctl lsp-set-options s1_lr1 router-port=lr1_s1
> > > +
> > > +# s1 - backend vm1
> > > +check ovn-nbctl lsp-add s1 vm1
> > > +check ovn-nbctl lsp-set-addresses vm1 "00:de:ad:01:00:01 172.16.0.101"
> > > +
> > > +# s1 - backend vm2
> > > +check ovn-nbctl lsp-add s1 vm2
> > > +check ovn-nbctl lsp-set-addresses vm2 "00:de:ad:01:00:02 172.16.0.102"
> > > +
> > > +# s1 - backend vm3
> > > +check ovn-nbctl lsp-add s1 vm3
> > > +check ovn-nbctl lsp-set-addresses vm3 "00:de:ad:01:00:03 172.16.0.103"
> > > +
> > > +# Add the lr1 DGP ts1 to the public switch
> > > +check ovn-nbctl lsp-add public public_lr1_ts1
> > > +check ovn-nbctl lsp-set-type public_lr1_ts1 router
> > > +check ovn-nbctl lsp-set-addresses public_lr1_ts1 router
> > > +check ovn-nbctl lsp-set-options public_lr1_ts1 router-port=lr1-ts1 nat-addresses=router
> > > +
> > > +# Add the lr1 DGP ts2 to the public switch
> > > +check ovn-nbctl lsp-add public public_lr1_ts2
> > > +check ovn-nbctl lsp-set-type public_lr1_ts2 router
> > > +check ovn-nbctl lsp-set-addresses public_lr1_ts2 router
> > > +check ovn-nbctl lsp-set-options public_lr1_ts2 router-port=lr1-ts2 nat-addresses=router
> > > +
> > > +# Add the lr1 DGP public to the public switch
> > > +check ovn-nbctl lsp-add public public_lr1
> > > +check ovn-nbctl lsp-set-type public_lr1 router
> > > +check ovn-nbctl lsp-set-addresses public_lr1 router
> > > +check ovn-nbctl lsp-set-options public_lr1 router-port=lr1_public nat-addresses=router
> > > +
> > > +# Create the Load Balancer lb1
> > > +check ovn-nbctl --wait=sb lb-add lb1 "30.0.0.1" "172.16.0.103,172.16.0.102,172.16.0.101"
> > > +
> > > +# Associate load balancer to s1
> > > +check ovn-nbctl ls-lb-add s1 lb1
> > > +check ovn-nbctl --wait=sb sync
> > > +
> > > +ovn-sbctl dump-flows s1 > s1flows
> > > +AT_CAPTURE_FILE([s1flows])
> > > +
> > > +AT_CHECK([grep "ls_in_pre_stateful" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
> > > +  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && ip4.dst == 30.0.0.1), action=(reg1 = 30.0.0.1; ct_lb_mark;)
> > > +])
> > > +AT_CHECK([grep "ls_in_lb" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
> > > +  table=??(ls_in_lb           ), priority=110  , match=(ct.new && ip4.dst == 30.0.0.1), action=(reg0[[1]] = 0; ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
> > > +])
> > > +
> > > +# Associate load balancer to lr1 with DGP
> > > +check ovn-nbctl lr-lb-add lr1 lb1
> > > +check ovn-nbctl --wait=sb sync
> > > +
> > > +ovn-sbctl dump-flows lr1 > lr1flows
> > > +AT_CAPTURE_FILE([lr1flows])
> > > +
> > > +AT_CHECK([grep "lr_in_dnat" lr1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
> > > +  table=??(lr_in_dnat         ), priority=110  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts1")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
> > > +  table=??(lr_in_dnat         ), priority=110  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts2")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
> > > +  table=??(lr_in_dnat         ), priority=110  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1_public")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
> > > +])
> > > +
> > > +AT_CLEANUP
> > > +])
> >
> >
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
