From: Han Zhou <hz...@ovn.org> Now in ofctrl we wait in S_WAIT_BEFORE_CLEAR for the initial flow compute to complete before clearing the existing flows, to reduce the data plane down time during ovn-controller restart. However, the flow installation takes a long time when the flow number is huge, which still leads to a long data plane interruption after we clearing the flows in S_CLEAR_FLOWS and before the new flow installation completes.
This patch moves the initial flow/group/meter deletion from run_S_CLEAR_FLOWS to ofctrl_put() the first time when we install flows to OVS after the transition from S_CLEAR_FLOWS state, and combine the initial flow/group deletion and the new flow/group installation to a bundle, so that OVS atomically switch from the old flows to the new ones without any gap. Ideally we should include meters in the bundle as well, but OVS doesn't support meter mod in the bundle yet. The new order of the operations in ofctrl_put becomes: - clear meters (if it is first run after S_CLEAR_FLOWS) - add new meters - bundle - clear flows (if it is first run after S_CLEAR_FLOWS) - clear groups (if it is first run after S_CLEAR_FLOWS) - add new groups - flow changes - delete old groups - delete old meters Tested with ~200k ovs flows generated by ovn-controller on a system with 8 Armv8 A72 cores, with ovn-ofctrl-wait-before-clear set to 8000. Without this patch, there were ~5 seconds ping failures during ovn-controller restart. With this patch, there is no ping failure observed with 100ms interval (ping -i 0.1). Signed-off-by: Han Zhou <hz...@ovn.org> --- controller/ofctrl.c | 78 ++++++++++++++++++++++++++------------------- 1 file changed, 46 insertions(+), 32 deletions(-) diff --git a/controller/ofctrl.c b/controller/ofctrl.c index ff5679d00..3fce4db27 100644 --- a/controller/ofctrl.c +++ b/controller/ofctrl.c @@ -366,9 +366,11 @@ static void ofctrl_meter_bands_clear(void); * S_CLEAR_FLOWS or S_UPDATE_FLOWS, this is really the option we have. */ static enum mf_field_id mff_ovn_geneve; -/* Indicates if flows need to be reinstalled for scenarios when ovs - * is restarted, even if there is no change in the desired flow table. */ -static bool need_reinstall_flows; +/* Indicates if we just went through the S_CLEAR_FLOWS state, which means we + * need to perform a one time deletion for all the existing flows, groups and + * meters. This can happen during initialization or OpenFlow reconnection + * (e.g. after OVS restart). */ +static bool ofctrl_initial_clear; static ovs_be32 queue_msg(struct ofpbuf *); @@ -634,25 +636,10 @@ run_S_CLEAR_FLOWS(void) { VLOG_DBG("clearing all flows"); - need_reinstall_flows = true; - /* Send a flow_mod to delete all flows. */ - struct ofputil_flow_mod fm = { - .table_id = OFPTT_ALL, - .command = OFPFC_DELETE, - }; - minimatch_init_catchall(&fm.match); - queue_msg(encode_flow_mod(&fm)); - minimatch_destroy(&fm.match); - - /* Send a group_mod to delete all groups. */ - struct ofputil_group_mod gm; - memset(&gm, 0, sizeof gm); - gm.command = OFPGC11_DELETE; - gm.group_id = OFPG_ALL; - gm.command_bucket_id = OFPG15_BUCKET_ALL; - ovs_list_init(&gm.buckets); - queue_msg(encode_group_mod(&gm)); - ofputil_uninit_group_mod(&gm); + /* Set the flag so that the ofctrl_run() can clear the existing flows, + * groups and meters. We clear them in ofctrl_run() right before the new + * ones are installed to avoid data plane downtime. */ + ofctrl_initial_clear = true; /* Clear installed_flows, to match the state of the switch. */ ovn_installed_flow_table_clear(); @@ -662,13 +649,6 @@ run_S_CLEAR_FLOWS(void) ovn_extend_table_clear(groups, true); } - /* Send a meter_mod to delete all meters. */ - struct ofputil_meter_mod mm; - memset(&mm, 0, sizeof mm); - mm.command = OFPMC13_DELETE; - mm.meter.meter_id = OFPM13_ALL; - queue_msg(encode_meter_mod(&mm)); - /* Clear existing meters, to match the state of the switch. */ if (meters) { ovn_extend_table_clear(meters, true); @@ -2405,7 +2385,7 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table, static uint64_t old_req_cfg = 0; bool need_put = false; if (lflows_changed || pflows_changed || skipped_last_time || - need_reinstall_flows) { + ofctrl_initial_clear) { need_put = true; old_req_cfg = req_cfg; } else if (req_cfg != old_req_cfg) { @@ -2434,8 +2414,6 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table, return; } - need_reinstall_flows = false; - /* OpenFlow messages to send to the switch to bring it up-to-date. */ struct ovs_list msgs = OVS_LIST_INITIALIZER(&msgs); @@ -2450,6 +2428,19 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table, } } + if (ofctrl_initial_clear) { + /* Send a meter_mod to delete all meters. + * XXX: Ideally, we should include the meter deletion and + * reinstallation in the same bundle just like for flows and groups, + * for minimum data plane interruption. However, OVS doesn't support + * METER_MOD in bundle yet. */ + struct ofputil_meter_mod mm; + memset(&mm, 0, sizeof mm); + mm.command = OFPMC13_DELETE; + mm.meter.meter_id = OFPM13_ALL; + add_meter_mod(&mm, &msgs); + } + /* Iterate through all the desired meters. If there are new ones, * add them to the switch. */ struct ovn_extend_table_info *m_desired; @@ -2482,6 +2473,29 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table, bundle_open = ofputil_encode_bundle_ctrl_request(OFP15_VERSION, &bc); ovs_list_push_back(&msgs, &bundle_open->list_node); + if (ofctrl_initial_clear) { + /* Send a flow_mod to delete all flows. */ + struct ofputil_flow_mod fm = { + .table_id = OFPTT_ALL, + .command = OFPFC_DELETE, + }; + minimatch_init_catchall(&fm.match); + add_flow_mod(&fm, &bc, &msgs); + minimatch_destroy(&fm.match); + + /* Send a group_mod to delete all groups. */ + struct ofputil_group_mod gm; + memset(&gm, 0, sizeof gm); + gm.command = OFPGC11_DELETE; + gm.group_id = OFPG_ALL; + gm.command_bucket_id = OFPG15_BUCKET_ALL; + ovs_list_init(&gm.buckets); + add_group_mod(&gm, &bc, &msgs); + ofputil_uninit_group_mod(&gm); + + ofctrl_initial_clear = false; + } + /* Iterate through all the desired groups. If there are new ones, * add them to the switch. */ struct ovn_extend_table_info *desired; -- 2.37.1 _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev