On Wed, May 28, 2025 at 8:59 AM Aaron Conole via dev <ovs-dev@openvswitch.org> wrote: > > > Hi byeongkeon-lee, > > This patch title doesn't look correct to me. This is introducing a new > flag to avoid flushing the dp flows. Whether a latency spike happens or > not isn't really part of the patch. > > byeongkeon-lee <byeongkeon....@nhn.com> writes: > > > The current ovs-vswitchd binary implementation clears up all the kernel > > datapath whenever the user types a command: 'ovs-vsctl del-br {br}'. > > The culprit is 'ofproto->ofproto_class->flush(ofproto)' in > > 'ofproto_flush__', which is called every time for bridge delete. > > It clears up all the kernel datapath's entries, resulting in a latency > > spike. > > > > In the multi-tenant environment, each bridge represents an individual > > tenant. However, when a specific tenant requests a bridge deletion, > > it affects all the other tenants due to the lack of isolation of the > > datapath flush. As a result, the flush clears up all datapaths, causing > > packet upcalls and rebuilding the datapath that were not targeted for > > deletion. > > There may be a better way to resolve this. Rather than flushing the > complete datapath, we can isolate those flows which are related to the > bridge in question and flush those. That would be preferable to leaving > the datapath with these flows.
In bridge_destroy(), first all ports are deleted, and then we call ofproto_destroy(). Aren't the calls to port_destroy() enough to clean up the relevant flows? The call to ofproto->ofproto_class->flush in ofproto_flush__() on bridge delete wipes all flows and also stops/restarts all handlers and revalidators . So I tried commenting it out, and indeed the port_destroy is enough to clean up all the relevant flows for only the correct bridge. I'm not as familiar with this area of code, but the value of this call isn't clear to me at all, except for perhaps the deletion of the last bridge. -M > > > The code below suggests the functionality that bypasses > > 'ofproto->ofproto_class->flush(ofproto)' operation. This is unnecessary > > since the port delete operation already removes corresponding datapaths > > selectively, and further, the datapath resident time is relatively short. > > This behavior is choosable by toggling the other-config option. > > I'm having trouble understanding what this is trying to convey. I > understand the port deletion part. But I don't understand the datapath > resident time. > > > Signed-off-by: byeongkeon-lee <byeongkeon....@nhn.com> > > --- > > ovs/ofproto/ofproto.c | 19 ++++++++++++++----- > > ovs/ofproto/ofproto.h | 3 ++- > > ovs/vswitchd/bridge.c | 18 +++++++++++------- > > ovs/vswitchd/vswitch.xml | 18 ++++++++++++++++++ > > 4 files changed, 45 insertions(+), 13 deletions(-) > > Not sure how others think about it. I think this would also need to be > documented in the NEWS file. > > > diff --git a/ovs/ofproto/ofproto.c b/ovs/ofproto/ofproto.c > > index ef615e5..4724733 100644 > > --- a/ovs/ofproto/ofproto.c > > +++ b/ovs/ofproto/ofproto.c > > @@ -315,6 +315,7 @@ unsigned ofproto_offloaded_stats_delay = > > OFPROTO_OFFLOADED_STATS_DELAY; > > bool ofproto_explicit_sampled_drops = > > OFPROTO_EXPLICIT_SAMPLED_DROPS_DEFAULT; > > > > uint32_t n_handlers, n_revalidators; > > +bool no_flush_on_bridge_delete; > > > > /* Map from datapath name to struct ofproto, for use by unixctl commands. > > */ > > static struct hmap all_ofprotos = HMAP_INITIALIZER(&all_ofprotos); > > @@ -821,6 +822,12 @@ ofproto_set_threads(int n_handlers_, int > > n_revalidators_) > > n_handlers = MAX(n_handlers_, 0); > > } > > > > +void > > +ofproto_set_no_flush_on_bridge_delete(bool no_flush) > > +{ > > + no_flush_on_bridge_delete = no_flush; > > +} > > + > > void > > ofproto_set_dp_desc(struct ofproto *p, const char *dp_desc) > > { > > @@ -1680,13 +1687,15 @@ ofproto_rule_delete(struct ofproto *ofproto, struct > > rule *rule) > > } > > > > static void > > -ofproto_flush__(struct ofproto *ofproto, bool del) > > +ofproto_flush__(struct ofproto *ofproto, bool del, bool force_flush) > > OVS_EXCLUDED(ofproto_mutex) > > { > > struct oftable *table; > > > > /* This will flush all datapath flows. */ > > - if (del && ofproto->ofproto_class->flush) { > > + if (del && > > + ofproto->ofproto_class->flush && > > + (force_flush || !no_flush_on_bridge_delete)) { > > ofproto->ofproto_class->flush(ofproto); > > } > > > > @@ -1829,7 +1838,7 @@ ofproto_unref(struct ofproto *ofproto) > > } > > > > void > > -ofproto_destroy(struct ofproto *p, bool del) > > +ofproto_destroy(struct ofproto *p, bool del, bool force_flush) > > OVS_EXCLUDED(ofproto_mutex) > > { > > struct ofport *ofport; > > @@ -1839,7 +1848,7 @@ ofproto_destroy(struct ofproto *p, bool del) > > return; > > } > > > > - ofproto_flush__(p, del); > > + ofproto_flush__(p, del, force_flush); > > HMAP_FOR_EACH_SAFE (ofport, hmap_node, &p->ports) { > > ofport_destroy(ofport, del); > > } > > @@ -2422,7 +2431,7 @@ void > > ofproto_flush_flows(struct ofproto *ofproto) > > { > > COVERAGE_INC(ofproto_flush); > > - ofproto_flush__(ofproto, false); > > + ofproto_flush__(ofproto, false, false); > > connmgr_flushed(ofproto->connmgr); > > } > > > > diff --git a/ovs/ofproto/ofproto.h b/ovs/ofproto/ofproto.h > > index 3f85509..96148b5 100644 > > --- a/ovs/ofproto/ofproto.h > > +++ b/ovs/ofproto/ofproto.h > > @@ -277,7 +277,7 @@ void ofproto_type_wait(const char *datapath_type); > > int ofproto_create(const char *datapath, const char *datapath_type, > > struct ofproto **ofprotop) > > OVS_EXCLUDED(ofproto_mutex); > > -void ofproto_destroy(struct ofproto *, bool del); > > +void ofproto_destroy(struct ofproto *, bool del, bool force_flush); > > int ofproto_delete(const char *name, const char *type); > > > > int ofproto_run(struct ofproto *); > > @@ -365,6 +365,7 @@ int ofproto_set_mcast_snooping(struct ofproto *ofproto, > > int ofproto_port_set_mcast_snooping(struct ofproto *ofproto, void *aux, > > const struct > > ofproto_mcast_snooping_port_settings *s); > > void ofproto_set_threads(int n_handlers, int n_revalidators); > > +void ofproto_set_no_flush_on_bridge_delete(bool no_flush); > > void ofproto_type_set_config(const char *type, > > const struct smap *other_config); > > void ofproto_set_dp_desc(struct ofproto *, const char *dp_desc); > > diff --git a/ovs/vswitchd/bridge.c b/ovs/vswitchd/bridge.c > > index 456b784..5137692 100644 > > --- a/ovs/vswitchd/bridge.c > > +++ b/ovs/vswitchd/bridge.c > > @@ -266,7 +266,7 @@ static uint64_t last_ifaces_changed; > > static void add_del_bridges(const struct ovsrec_open_vswitch *); > > static void bridge_run__(void); > > static void bridge_create(const struct ovsrec_bridge *); > > -static void bridge_destroy(struct bridge *, bool del); > > +static void bridge_destroy(struct bridge *, bool del, bool force_flush); > > static struct bridge *bridge_lookup(const char *name); > > static unixctl_cb_func bridge_unixctl_dump_flows; > > static unixctl_cb_func bridge_unixctl_reconnect; > > @@ -554,7 +554,7 @@ bridge_exit(bool delete_datapath) > > > > struct bridge *br; > > HMAP_FOR_EACH_SAFE (br, node, &all_bridges) { > > - bridge_destroy(br, delete_datapath); > > + bridge_destroy(br, delete_datapath, true); > > } > > > > ovsdb_idl_destroy(idl); > > @@ -890,6 +890,10 @@ bridge_reconfigure(const struct ovsrec_open_vswitch > > *ovs_cfg) > > smap_get_int(&ovs_cfg->other_config, "n-handler-threads", 0), > > smap_get_int(&ovs_cfg->other_config, "n-revalidator-threads", 0)); > > > > + ofproto_set_no_flush_on_bridge_delete( > > + smap_get_bool(&ovs_cfg->other_config, "no-flush-on-bridge-delete", > > + false)); > > + > > ofproto_set_explicit_sampled_drops( > > smap_get_bool(&ovs_cfg->other_config, "explicit-sampled-drops", > > OFPROTO_EXPLICIT_SAMPLED_DROPS_DEFAULT)); > > @@ -938,7 +942,7 @@ bridge_reconfigure(const struct ovsrec_open_vswitch > > *ovs_cfg) > > VLOG_ERR("failed to create bridge %s: %s", br->name, > > ovs_strerror(error)); > > shash_destroy(&br->wanted_ports); > > - bridge_destroy(br, true); > > + bridge_destroy(br, true, true); > > } else { > > /* Trigger storing datapath version. */ > > seq_change(connectivity_seq_get()); > > @@ -2137,7 +2141,7 @@ add_del_bridges(const struct ovsrec_open_vswitch *cfg) > > br->cfg = shash_find_data(&new_br, br->name); > > if (!br->cfg || strcmp(br->type, ofproto_normalize_type( > > br->cfg->datapath_type))) { > > - bridge_destroy(br, true); > > + bridge_destroy(br, true, false); > > } > > } > > > > @@ -3381,7 +3385,7 @@ bridge_run(void) > > (long int) getpid()); > > > > HMAP_FOR_EACH_SAFE (br, node, &all_bridges) { > > - bridge_destroy(br, false); > > + bridge_destroy(br, false, false); > > } > > /* Since we will not be running system_stats_run() in this process > > * with the current situation of multiple ovs-vswitchd daemons, > > @@ -3700,7 +3704,7 @@ bridge_create(const struct ovsrec_bridge *br_cfg) > > } > > > > static void > > -bridge_destroy(struct bridge *br, bool del) > > +bridge_destroy(struct bridge *br, bool del, bool force_flush) > > { > > if (br) { > > struct mirror *mirror; > > @@ -3714,7 +3718,7 @@ bridge_destroy(struct bridge *br, bool del) > > } > > > > hmap_remove(&all_bridges, &br->node); > > - ofproto_destroy(br->ofproto, del); > > + ofproto_destroy(br->ofproto, del, force_flush); > > hmap_destroy(&br->ifaces); > > hmap_destroy(&br->ports); > > hmap_destroy(&br->iface_by_name); > > diff --git a/ovs/vswitchd/vswitch.xml b/ovs/vswitchd/vswitch.xml > > index 76df9aa..3a82048 100644 > > --- a/ovs/vswitchd/vswitch.xml > > +++ b/ovs/vswitchd/vswitch.xml > > @@ -662,6 +662,24 @@ > > </p> > > </column> > > > > + <column name="other_config" key="no-flush-on-bridege-delete" > > + type='{"type": "boolean"}'> > > + <p> > > + Set this value to <code>true</code> to keep the datapath from > > being > > + flushed whenever a bridge is deleted. > > + </p> > > + <p> > > + The default value is <code>false</code>. > > + </p> > > + <p> > > + If this configuration is <code>true</code>, it protects the > > datapath, > > + which resides in the kernel when an arbitrary bridge is deleted. > > + Otherwise, the bridge delete operation causes a datapath flush, > > + requiring rebuilding the kernel datapath. The default value is > > + <code>false</code> as it is the default behavior until now. > > + </p> > > + </column> > > + > > <column name="other_config" key="emc-insert-inv-prob" > > type='{"type": "integer", "minInteger": 0, "maxInteger": > > 4294967295}'> > > <p> > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev