v2 of patch #5 was submitted: https://patchwork.ozlabs.org/project/ovn/patch/20230530150328.1224972-1-odiv...@gmail.com/
> On 30 May 2023, at 13:12, Vladislav Odintsov <odiv...@gmail.com> wrote:
>
>
>
>> On 30 May 2023, at 13:10, Dumitru Ceara <dce...@redhat.com> wrote:
>>
>> On 5/30/23 11:59, Vladislav Odintsov wrote:
>>> Hi Dumitru,
>>>
>>> thanks for the review!
>>> My answers are inline.
>>>
>>>> On 30 May 2023, at 12:27, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>
>>>> On 5/19/23 20:18, Vladislav Odintsov wrote:
>>>>> This patch makes the following two changes:
>>>>>
>>>>> 1. Support create/update of MAC_Binding for GARP/ND from HW VTEP.
>>>>>
>>>>> Prior to this patch, MAC_Binding records were created only when an LRP
>>>>> issued an ARP request/Neighbor Solicitation. If the IP-to-MAC mapping
>>>>> in a network attached via a vtep lport changed, the stale MAC_Binding
>>>>> prevented normal communication. Now the router port (chassisredirect)
>>>>> in a logical switch with a vtep lport catches GARP or Neighbor
>>>>> Solicitation packets and updates the MAC_Binding.
>>>>>
>>>>> A new flow with maximum priority was added in ls_in_arp_nd_resp, and
>>>>> an additional OF rule in table 37 (OFTABLE_REMOTE_OUTPUT) to process
>>>>> multicast packets received from RAMP_SWITCH.
>>>>>
>>>>> 2. Answer ARP/ND requests coming from an HW VTEP in the lrouter
>>>>> pipeline.
>>>>>
>>>>> This is needed to reduce duplicated ARP responses, which were
>>>>> generated by OVN arp-responder flows on the node where the
>>>>> chassisredirect port is located, alongside responses coming directly
>>>>> from VM interfaces.
>>>>>
>>>>> As a result of this change, vifs located on the same chassis as the
>>>>> chassisredirect port now receive ARP/ND multicast packets that were
>>>>> previously suppressed by the arp-responder on the node.
>>>>>
>>>>> Tests and documentation were updated to conform to the new behaviour.
>>>>>
>>>>> Signed-off-by: Vladislav Odintsov <odiv...@gmail.com>
>>>>> ---
>>>>
>>>> Hi Vladislav,
>>>>
>>>> Thanks for the patch.
>>>>
>>>>> controller/binding.c    | 48 ++++++++++++++++++++
>>>>> controller/local_data.h |  3 ++
>>>>> controller/physical.c   | 46 +++++++++++++++++--
>>>>> northd/northd.c         | 10 +++++
>>>>> northd/ovn-northd.8.xml |  6 +++
>>>>> tests/ovn.at            | 99 ++++++++++++++++++++++++++++++++++++++++-
>>>>
>>>> Could you please add a NEWS entry for this behavior change?
>>>
>>> Sure, will do it in v2.
>>>
>>>>
>>>>> 6 files changed, 207 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/controller/binding.c b/controller/binding.c
>>>>> index c7a2b3f10..5932ad785 100644
>>>>> --- a/controller/binding.c
>>>>> +++ b/controller/binding.c
>>>>> @@ -509,6 +509,30 @@ update_ld_multichassis_ports(const struct sbrec_port_binding *binding_rec,
>>>>>     }
>>>>> }
>>>>>
>>>>> +static void
>>>>> +update_ld_vtep_port(const struct sbrec_port_binding *binding_rec,
>>>>> +                    struct hmap *local_datapaths)
>>>>> +{
>>>>> +    struct local_datapath *ld
>>>>> +        = get_local_datapath(local_datapaths,
>>>>> +                             binding_rec->datapath->tunnel_key);
>>>>> +    if (!ld) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (ld->vtep_port && strcmp(ld->vtep_port->logical_port,
>>>>> +                                binding_rec->logical_port)) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>>>>> +        VLOG_WARN_RL(&rl, "vtep port '%s' already set for datapath "
>>>>> +                     "'%"PRId64"', skipping the new port '%s'.",
>>>>> +                     ld->vtep_port->logical_port,
>>>>> +                     binding_rec->datapath->tunnel_key,
>>>>> +                     binding_rec->logical_port);
>>>>> +        return;
>>>>> +    }
>>>>> +    ld->vtep_port = binding_rec;
>>>>> +}
>>>>> +
>>>>> static void
>>>>> update_ld_localnet_port(const struct sbrec_port_binding *binding_rec,
>>>>>                         struct shash *bridge_mappings,
>>>>> @@ -1987,6 +2011,7 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct binding_ctx_out *b_ctx_out)
>>>>>     struct ovs_list external_lports = OVS_LIST_INITIALIZER(&external_lports);
>>>>>     struct ovs_list multichassis_ports = OVS_LIST_INITIALIZER(
>>>>>
>>>>> &multichassis_ports);
>>>>> +    struct ovs_list vtep_lports = OVS_LIST_INITIALIZER(&vtep_lports);
>>>>>
>>>>>     struct lport {
>>>>>         struct ovs_list list_node;
>>>>> @@ -2010,8 +2035,13 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct binding_ctx_out *b_ctx_out)
>>>>>
>>>>>         switch (lport_type) {
>>>>>         case LP_PATCH:
>>>>> +            update_related_lport(pb, b_ctx_out);
>>>>> +            break;
>>>>>         case LP_VTEP:
>>>>>             update_related_lport(pb, b_ctx_out);
>>>>> +            struct lport *vtep_lport = xmalloc(sizeof *vtep_lport);
>>>>> +            vtep_lport->pb = pb;
>>>>> +            ovs_list_push_back(&vtep_lports, &vtep_lport->list_node);
>>>>>             break;
>>>>>
>>>>>         case LP_LOCALPORT:
>>>>> @@ -2111,6 +2141,16 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct binding_ctx_out *b_ctx_out)
>>>>>         free(multichassis_lport);
>>>>>     }
>>>>>
>>>>> +    /* Run through vtep lport list to see if there are vtep ports
>>>>> +     * on local datapaths discovered from above loop, and update the
>>>>> +     * corresponding local datapath accordingly. */
>>>>> +    struct lport *vtep_lport;
>>>>> +    ovs_list_size(&vtep_lports);
>>>>
>>>> I guess this is a leftover.
>>>
>>> Oops, will fix it in v2.
>>>
>>>>
>>>>> +    LIST_FOR_EACH_POP (vtep_lport, list_node, &vtep_lports) {
>>>>> +        update_ld_vtep_port(vtep_lport->pb, b_ctx_out->local_datapaths);
>>>>> +        free(vtep_lport);
>>>>> +    }
>>>>> +
>>>>>     shash_destroy(&bridge_mappings);
>>>>>     /* Remove stale QoS entries.
>>>>> */
>>>>>     ovs_qos_entries_gc(b_ctx_in->ovs_idl_txn,
>>>>>                        b_ctx_in->ovsrec_port_by_qos,
>>>>> @@ -2175,6 +2215,11 @@ remove_pb_from_local_datapath(const struct sbrec_port_binding *pb,
>>>>>         }
>>>>>     } else if (!strcmp(pb->type, "external")) {
>>>>>         remove_local_datapath_external_port(ld, pb->logical_port);
>>>>> +    } else if (!strcmp(pb->type, "vtep")) {
>>>>> +        if (ld->vtep_port && !strcmp(ld->vtep_port->logical_port,
>>>>> +                                     pb->logical_port)) {
>>>>> +            ld->vtep_port = NULL;
>>>>> +        }
>>>>>     }
>>>>>     remove_local_datapath_multichassis_port(ld, pb->logical_port);
>>>>> }
>>>>> @@ -2836,6 +2881,7 @@ handle_updated_port(struct binding_ctx_in *b_ctx_in,
>>>>>
>>>>>     case LP_VTEP:
>>>>>         update_related_lport(pb, b_ctx_out);
>>>>> +        update_ld_vtep_port(pb, b_ctx_out->local_datapaths);
>>>>>         /* VTEP lports are claimed/released by ovn-controller-vteps.
>>>>>          * We are not sure what changed. */
>>>>>         b_ctx_out->non_vif_ports_changed = true;
>>>>> @@ -3091,6 +3137,8 @@ delete_done:
>>>>>     } else if (pb->n_additional_chassis) {
>>>>>         update_ld_multichassis_ports(pb,
>>>>>                                      b_ctx_out->local_datapaths);
>>>>> +    } else if (lport_type == LP_VTEP) {
>>>>> +        update_ld_vtep_port(pb, b_ctx_out->local_datapaths);
>>>>>     }
>>>>
>>>> Nit: I think I'd move this "else if (lport_type == LP_VTEP) {" just
>>>> under the LP_EXTERNAL case.
>>>
>>> I’ll move it, but I guess it’s not possible to have additional chassis
>>> for a vtep pb.
>>>
>>>>
>>>>>     }
>>>>>     sbrec_port_binding_index_destroy_row(target);
>>>>> diff --git a/controller/local_data.h b/controller/local_data.h
>>>>> index 748f009aa..19d13a558 100644
>>>>> --- a/controller/local_data.h
>>>>> +++ b/controller/local_data.h
>>>>> @@ -50,6 +50,9 @@ struct local_datapath {
>>>>>     /* The localnet port in this datapath, if any (at most one is allowed). */
>>>>>     const struct sbrec_port_binding *localnet_port;
>>>>>
>>>>> +    /* The vtep port in this datapath, if any (at most one is
>>>>> allowed).
>>>>> */
>>>>> +    const struct sbrec_port_binding *vtep_port;
>>>>> +
>>>>>     struct {
>>>>>         const struct sbrec_port_binding *local;
>>>>>         const struct sbrec_port_binding *remote;
>>>>> diff --git a/controller/physical.c b/controller/physical.c
>>>>> index 2925dcd1d..b75e120c7 100644
>>>>> --- a/controller/physical.c
>>>>> +++ b/controller/physical.c
>>>>> @@ -194,6 +194,15 @@ get_localnet_port(const struct hmap *local_datapaths, int64_t tunnel_key)
>>>>> }
>>>>>
>>>>>
>>>>> +static const struct sbrec_port_binding *
>>>>> +get_vtep_port(const struct hmap *local_datapaths, int64_t tunnel_key)
>>>>> +{
>>>>> +    const struct local_datapath *ld = get_local_datapath(local_datapaths,
>>>>> +                                                         tunnel_key);
>>>>> +    return ld ? ld->vtep_port : NULL;
>>>>> +}
>>>>> +
>>>>> +
>>>>> static struct zone_ids
>>>>> get_zone_ids(const struct sbrec_port_binding *binding,
>>>>>              const struct simap *ct_zones)
>>>>> @@ -1732,7 +1741,7 @@ consider_mc_group(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>>>>     struct sset remote_chassis = SSET_INITIALIZER(&remote_chassis);
>>>>>     struct sset vtep_chassis = SSET_INITIALIZER(&vtep_chassis);
>>>>>
>>>>> -    struct match match;
>>>>> +    struct match match, match_ramp;
>>>>>     match_outport_dp_and_port_keys(&match, dp_key, mc->tunnel_key);
>>>>>
>>>>>     /* Go through all of the ports in the multicast group:
>>>>> @@ -1755,10 +1764,10 @@ consider_mc_group(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>>>>      * set the output port to be the router patch port for which
>>>>>      * the redirect port was added.
>>>>> */
>>>>> -    struct ofpbuf ofpacts;
>>>>> +    struct ofpbuf ofpacts, remote_ofpacts, remote_ofpacts_ramp;
>>>>>     ofpbuf_init(&ofpacts, 0);
>>>>> -    struct ofpbuf remote_ofpacts;
>>>>>     ofpbuf_init(&remote_ofpacts, 0);
>>>>> +    ofpbuf_init(&remote_ofpacts_ramp, 0);
>>>>>     for (size_t i = 0; i < mc->n_ports; i++) {
>>>>>         struct sbrec_port_binding *port = mc->ports[i];
>>>>>
>>>>> @@ -1783,6 +1792,7 @@ consider_mc_group(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>>>>             local_output_pb(port->tunnel_key, &ofpacts);
>>>>>         } else {
>>>>>             local_output_pb(port->tunnel_key, &remote_ofpacts);
>>>>> +            local_output_pb(port->tunnel_key, &remote_ofpacts_ramp);
>>>>>         }
>>>>>     } else if (!strcmp(port->type, "remote")) {
>>>>>         if (port->chassis) {
>>>>> @@ -1866,6 +1876,35 @@ consider_mc_group(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>>>>>      * multicast group as the logical output port. */
>>>>>     put_load(mc->tunnel_key, MFF_LOG_OUTPORT, 0, 32,
>>>>>              &remote_ofpacts);
>>>>> +
>>>>> +    if (get_vtep_port(local_datapaths, mc->datapath->tunnel_key)) {
>>>>
>>>> I'd declare 'match_ramp' here as it's only used on this branch of the if
>>>> statement.
>>>
>>> Okay, will be fixed in v2.
>>>
>>>>
>>>>> +        match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0, 0,
>>>>> +                             MLF_RCV_FROM_RAMP);
>>>>> +
>>>>
>>>> Are we going to match less traffic now? We add this "flags &
>>>> MLF_RCV_FROM_RAMP" condition to the already existing priority-100
>>>> OFTABLE_REMOTE_OUTPUT flow. Does this break multicast traffic not
>>>> received from RAMP (vtep) ports?
>>>
>>> Yes, this limits the traffic matching this flow.
>>> Such a flow matches only traffic that was received from non-RAMP lports.
>>> We need to flood traffic to all multicast group ports only if it was
>>> received from non-RAMP ports.
>>> This is due to the fact that flooding in the "from vtep" direction is
>>> done on the vtep (ramp) switch: it floods BUM traffic to all hypervisors.
>>> If we don’t do it, we’ll duplicate BUM traffic: the first packet is
>>> originally sent from the VTEP switch to all hypervisors, and the others
>>> are replicated on all hypervisors and re-sent to other hypervisors.
>>>
>>> Please let me know if this makes sense.
>>>
>>
>> It does, sorry for the noise. I missed the value = 0 argument.
>>
>> Thanks for the clarification!
>>
>> I'm planning to apply the first 4 patches today and I'll look forward
>> to a v2 of this patch afterwards.
>
> Okay, thanks.
> I’ll submit v2 right after #1-#4 merge.
>
>>
>> Regards,
>> Dumitru
>>
>
>
> Regards,
> Vladislav Odintsov
>
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Regards,
Vladislav Odintsov