On Thu, Apr 6, 2023 at 5:31 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
> On 4/6/23 16:53, Ales Musil wrote: > > > > > > On Thu, Apr 6, 2023 at 4:48 PM Ilya Maximets <i.maxim...@ovn.org > <mailto:i.maxim...@ovn.org>> wrote: > > > > On 4/6/23 16:37, Ales Musil wrote: > > > There are essentially three problems with the current > > > combination of DGP + SNAT + LB: > > > > > > 1) The first packet is being SNATed in common zone due > > > to a problem with pinctrl not preserving ct_mark/ct_label. > > > The commit would create a SNAT entry within the same with DNAT > > > entry. > > > > > > 2) The unSNAT for reply always happened in common zone because of > > > the loopback check which would be triggered only when we loop > > > the packet through the LR. Now there are two possibilities how > > > the reply packet would be handled: > > > > > > a) If the entry for SNAT in common zone did not time out yet, the > > > packet would be passed through unSNAT in common zone which would > > > be fine and continue on. However, the unDNAT wouldn't work due to > > > the limitation of CT not capable of doing unSNAT/unDNAT on the same > > > packet twice in the same zone. So the reply would arrive to > > > the original interface, but with wrong source address. > > > > > > b) If the entry for SNAT timed out it would loop and do unSNAT > correctly > > > in separate zone and then also unDNAT. This is not possible > anymore with > > > a recent change 8c341b9d (northd: Drop packets destined to router > owned NAT IP for DGP). > > > The reply would be dropped before looping after that change co the > traffic > > > would never arrive to the original interface. > > > > > > 3) The unDNAT was happening only if the DGP was outport meaning > > > the reply traffic was routed out, but in the opposite case > > > the unDNAT was happening only because of the looping which made > > > outport=inport. That's why it worked before introduction of > explicit drop. > > > > > > In order to fix all those issues do two changes: > > > > > > 1) Include inport in the unDNAT match, so that we have both > > > routing directions covered e.g. (inport == "dgp_port" || outport > == "dpg_port"). > > > > > > 2) Always use the separate zone for SNAT and DNAT. As the common > > > zone was needed for HWOL make the common zone optional with > > > configuration option called "use_common_zone". This option is > > > by default "false" and can be specified per LR. Use of separate > > > zones also eliminates the need for the flag propagation > > > in "lr_out_chk_dnat_local" stage, removing the match on > ct_mark/ct_label. > > > > > > The "SNAT in separate zone from DNAT" system test is moved to run > only > > > with kernel datapath. The reason is that this test doesn't work > with > > > userspace datapath due to recirculation limit, currently set to 6 > [0]. > > > > > > [0] > https://github.com/openvswitch/ovs/blob/9d840923d32124fe427de76e8234c49d64e4bb77/lib/dpif-netdev.c#L102 > < > https://github.com/openvswitch/ovs/blob/9d840923d32124fe427de76e8234c49d64e4bb77/lib/dpif-netdev.c#L102 > > > > > Reported-at: https://bugzilla.redhat.com/2161281 < > https://bugzilla.redhat.com/2161281> > > > Signed-off-by: Ales Musil <amu...@redhat.com <mailto: > amu...@redhat.com>> > > > --- > > > v2: Fix flaky system test. > > > v3: Rebase on top of current main. > > > v4: Rebase on top of current main. > > > Move the system test to system-ovn-kmod > > > to unblock the failures with userspace. > > > v5: Rebase on top of current main. > > > v6: Rebase on top of current main. > > > Change the config to a global option instead. > > > v7: Address comments from Dumitru: > > > Rename the stateless helper to better reflect > > > that it applies only to snat_and_dnat type. > > > Call the common zone and separate zone functions > > > separately based on condition. > > > --- > > > northd/northd.c | 536 > +++++++++++++++++++++------------------ > > > northd/ovn-northd.8.xml | 90 +------ > > > ovn-nb.xml | 9 + > > > tests/ovn-northd.at <http://ovn-northd.at> | 213 > +++++++++++----- > > > tests/ovn.at <http://ovn.at> | 3 + > > > tests/system-ovn-kmod.at <http://system-ovn-kmod.at> | 166 > ++++++++++++ > > > tests/system-ovn.at <http://system-ovn.at> | 117 --------- > > > 7 files changed, 634 insertions(+), 500 deletions(-) > > > > > > diff --git a/northd/northd.c b/northd/northd.c > > > index 7a3886de0..1d13c3d8e 100644 > > > --- a/northd/northd.c > > > +++ b/northd/northd.c > > > @@ -66,6 +66,9 @@ static bool check_lsp_is_up; > > > > > > static bool install_ls_lb_from_router; > > > > > > +/* Use common zone for SNAT and DNAT if this option is set to > "true". */ > > > +static bool use_common_zone = false; > > > + > > > /* MAC allocated for service monitor usage. Just one mac is > allocated > > > * for this purpose and ovn-controller's on each chassis will > make use > > > * of this mac when sending out the packets to monitor the > services > > > @@ -10662,6 +10665,8 @@ > build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx, > > > enum > lrouter_nat_lb_flow_type type, > > > struct ovn_datapath *od) > > > { > > > + struct ovn_port *dgp = od->l3dgw_ports[0]; > > > + > > > const char *undnat_action; > > > > > > switch (type) { > > > @@ -10673,9 +10678,12 @@ > build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx, > > > break; > > > case LROUTER_NAT_LB_FLOW_NORMAL: > > > case LROUTER_NAT_LB_FLOW_MAX: > > > - undnat_action = od->is_gw_router ? "ct_dnat;" : > "ct_dnat_in_czone;"; > > > + undnat_action = !od->is_gw_router && use_common_zone > > > + ? "ct_dnat_in_czone;" > > > + : "ct_dnat;"; > > > break; > > > } > > > + > > > /* Store the match lengths, so we can reuse the ds buffer. */ > > > size_t new_match_len = ctx->new_match->length; > > > size_t undnat_match_len = ctx->undnat_match->length; > > > @@ -10702,10 +10710,9 @@ > build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx, > > > return; > > > } > > > > > > - ds_put_format(ctx->undnat_match, > > > - ") && outport == %s && is_chassis_resident(%s)", > > > - od->l3dgw_ports[0]->json_key, > > > - od->l3dgw_ports[0]->cr_port->json_key); > > > + ds_put_format(ctx->undnat_match, ") && (inport == %s || > outport == %s)" > > > + " && is_chassis_resident(%s)", dgp->json_key, > dgp->json_key, > > > + dgp->cr_port->json_key); > > > ovn_lflow_add_with_hint(ctx->lflows, od, S_ROUTER_OUT_UNDNAT, > 120, > > > ds_cstr(ctx->undnat_match), > undnat_action, > > > &ctx->lb->nlb->header_); > > > @@ -11157,6 +11164,14 @@ lrouter_dnat_and_snat_is_stateless(const > struct nbrec_nat *nat) > > > !strcmp(nat->type, "dnat_and_snat"); > > > } > > > > > > +static inline bool > > > +lrouter_use_common_zone_in_nat(const struct ovn_datapath *od, > > > + const struct nbrec_nat *nat) > > > +{ > > > + return !od->is_gw_router && use_common_zone && > > > + !lrouter_dnat_and_snat_is_stateless(nat); > > > +} > > > + > > > /* Handles the match criteria and actions in logical flow > > > * based on external ip based NAT rule filter. > > > * > > > @@ -13648,85 +13663,90 @@ build_lrouter_ipv4_ip_input(struct > ovn_port *op, > > > } > > > } > > > > > > +static void > > > +build_lrouter_in_unsnat_in_czone_flow(struct hmap *lflows, > > > + struct ovn_datapath *od, > > > + const struct nbrec_nat *nat, > > > + struct ds *match, bool > distributed, > > > + bool is_v6, struct ovn_port > *l3dgw_port) > > > +{ > > > + if (strcmp(nat->type, "snat") && strcmp(nat->type, > "dnat_and_snat")) { > > > + return; > > > + } > > > + > > > + ds_clear(match); > > > + > > > + struct ds zone_match = DS_EMPTY_INITIALIZER; > > > + > > > + ds_put_format(match, "ip && ip%c.dst == %s && inport == %s", > > > + is_v6 ? '6' : '4', nat->external_ip, > l3dgw_port->json_key); > > > + ds_clone(&zone_match, match); > > > > Not sure how big of a problem it is, but it would be better to avoid > > cloning. It's an extra memory allocation and copy. Instead, it > should > > be possible to save the length, create one flow, truncate, create a > > second one. We do that in many other places. > > > > > > > + > > > + ds_put_cstr(match, " && flags.loopback == 0"); > > > + > > > + /* Update common zone match for the hairpin traffic. */ > > > + ds_put_cstr(&zone_match, " && flags.loopback == 1" > > > + " && flags.use_snat_zone == 1"); > > > + > > > + > > > + if (!distributed && od->n_l3dgw_ports) { > > > + /* Flows for NAT rules that are centralized are only > > > + * programmed on the gateway chassis. */ > > > + ds_put_format(match, " && is_chassis_resident(%s)", > > > + l3dgw_port->cr_port->json_key); > > > + ds_put_format(&zone_match, " && is_chassis_resident(%s)", > > > + l3dgw_port->cr_port->json_key); > > > > If we add this before checking flags, we'll not need to update both > matches, > > i.e. we can do that before the clone / before saving the length. > > > > > > That was my original idea. There is one caveat that it affects a ton of > tests. > > Would you agree to have this change as next patch that will target > specifically > > this so the tests are easier to review? > > AFAICT, only 24 lines needs changing in tests for this. And you're > touching > more than 200 already. So, I'm not sure if it's worth a separate patch. > Yeah it wasn't that bad afterall, the ds_clone was removed in v8. > > > > > > > > > > + } > > > + > > > + ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_UNSNAT, > > > + 100, ds_cstr(match), > "ct_snat_in_czone;", > > > + &nat->header_); > > > + > > > + ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_UNSNAT, > > > + 100, ds_cstr(&zone_match), "ct_snat;", > > > + &nat->header_); > > > + > > > + ds_destroy(&zone_match); > > > +} > > > > Best regards, Ilya Maximets. > > > > > > > > Thanks, > > Ales > > > > -- > > > > Ales Musil > > > > Senior Software Engineer - OVN Core > > > > Red Hat EMEA <https://www.redhat.com> > > > > amu...@redhat.com <mailto:amu...@redhat.com> IM: amusil > > > > <https://red.ht/sig> > > > > Thanks, Ales -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> amu...@redhat.com IM: amusil <https://red.ht/sig> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev