On Mon, Oct 6, 2025 at 11:52 AM Ilya Maximets <[email protected]> wrote:
> On 10/3/25 10:06 PM, Mark Michelson via dev wrote: > > Thanks Lucas and Numan. > > > > I merged this to main. > > > > On Thu, Sep 25, 2025 at 10:25 AM Numan Siddique <[email protected]> wrote: > >> > >> On Thu, Sep 25, 2025 at 8:16 AM Lucas Vargas Dias via dev > >> <[email protected]> wrote: > >>> > >>> Change the logic to save sbflow uuid and just update if > >>> the lflow is reused. Otherwise, it's removed. > >>> Also, reduce sbflow searching with uuidset instead of > >>> searching through all lflow table. > >>> Add lflow states: > >>> LFLOW_STALE - Lflow is not relevant > >>> LFLOW_TO_SYNC - Lflow needs to be synced with SB DB > >>> LFLOW_SYNCED - Lflow is synced with SB SB > >>> > >>> It generates the following results in a scenario with: > >>> - LSPs: 56548 > >>> - LRPs: 27304 > >>> - LRs: 7922 > >>> - LSs: 28602 > >>> > >>> Without the commit: > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1 > >>> Time spent on processing nb_cfg 275438: > >>> ovn-northd delay before processing: 16069ms > >>> ovn-northd completion: 32828ms > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1 > >>> Time spent on processing nb_cfg 275439: > >>> ovn-northd delay before processing: 15019ms > >>> ovn-northd completion: 33207ms > >>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1 > rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64 > >>> Time spent on processing nb_cfg 275440: > >>> ovn-northd delay before processing: 14784ms > >>> ovn-northd completion: 33577ms > >>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add bar9-1 > bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router > options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router > >>> Time spent on processing nb_cfg 275441: > >>> ovn-northd delay before processing: 14598ms > >>> ovn-northd completion: 31942ms > >>> > >>> With the commit: > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1 > >>> Time spent on processing nb_cfg 275401: > >>> ovn-northd delay before processing: 12602ms > >>> ovn-northd completion: 26103ms > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1 > >>> Time spent on processing nb_cfg 275402: > >>> ovn-northd delay before processing: 12639ms > >>> ovn-northd completion: 26759ms > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1 > rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64 > >>> Time spent on processing nb_cfg 275403: > >>> ovn-northd delay before processing: 11874ms > >>> ovn-northd completion: 29733ms > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add bar9-1 > bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router > options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router > >>> Time spent on processing nb_cfg 275404: > >>> ovn-northd delay before processing: 4058ms > >>> ovn-northd completion: 17323ms > >>> > >>> Signed-off-by: Lucas Vargas Dias <[email protected]> > >> > >> Thanks for addressing the comments and for the great improvements. > >> > >> Acked-by: Numan Siddique <[email protected]> > >> > >> Numan > >> > >>> --- > >>> northd/en-lflow.c | 2 +- > >>> northd/lflow-mgr.c | 108 ++++++++++++++++++++++++-------------------- > >>> northd/lflow-mgr.h | 10 +++- > >>> northd/northd.c | 8 ++-- > >>> tests/ovn-northd.at | 32 +++++++++++++ > >>> 5 files changed, 106 insertions(+), 54 deletions(-) > >>> > >>> diff --git a/northd/en-lflow.c b/northd/en-lflow.c > >>> index 50570b611..13c5e3119 100644 > >>> --- a/northd/en-lflow.c > >>> +++ b/northd/en-lflow.c > >>> @@ -122,7 +122,7 @@ en_lflow_run(struct engine_node *node, void *data) > >>> stopwatch_start(BUILD_LFLOWS_STOPWATCH_NAME, time_msec()); > >>> > >>> struct lflow_data *lflow_data = data; > >>> - lflow_table_clear(lflow_data->lflow_table); > >>> + lflow_table_clear(lflow_data->lflow_table, false); > >>> lflow_reset_northd_refs(&lflow_input); > >>> lflow_ref_clear(lflow_input.igmp_lflow_ref); > >>> > >>> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c > >>> index 6a66a9718..5a989fe19 100644 > >>> --- a/northd/lflow-mgr.c > >>> +++ b/northd/lflow-mgr.c > >>> @@ -26,6 +26,7 @@ > >>> #include "lflow-mgr.h" > >>> #include "lib/ovn-parallel-hmap.h" > >>> #include "lib/ovn-util.h" > >>> +#include "lib/uuidset.h" > >>> > >>> VLOG_DEFINE_THIS_MODULE(lflow_mgr); > >>> > >>> @@ -37,7 +38,8 @@ static void ovn_lflow_init(struct ovn_lflow *, > struct ovn_datapath *od, > >>> uint16_t priority, char *match, > >>> char *actions, char *io_port, > >>> char *ctrl_meter, char *stage_hint, > >>> - const char *where, const char *flow_desc); > >>> + const char *where, const char *flow_desc, > >>> + struct uuid sbuuid); > >>> static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows, > >>> enum ovn_stage stage, > >>> uint16_t priority, const char > *match, > >>> @@ -147,6 +149,13 @@ static struct ovs_mutex > lflow_hash_locks[LFLOW_HASH_LOCK_MASK + 1]; > >>> */ > >>> extern struct ovs_mutex fake_hash_mutex; > >>> > >>> + > >>> +enum ovn_lflow_state { > >>> + LFLOW_STALE, > >>> + LFLOW_TO_SYNC, > >>> + LFLOW_SYNCED, > >>> +}; > >>> + > >>> /* Represents a logical ovn flow (lflow). > >>> * > >>> * A logical flow with match 'M' and actions 'A' - L(M, A) is created > >>> @@ -181,14 +190,7 @@ struct ovn_lflow { > >>> struct hmap dp_refcnts_map; /* Maintains the number of times this > ovn_lflow > >>> * is referenced by a given datapath. > >>> * Contains 'struct dp_refcnt' in the > map. */ > >>> -}; > >>> - > >>> -/* Logical flow table. */ > >>> -struct lflow_table { > >>> - struct hmap entries; /* hmap of lflows. */ > >>> - struct hmap ls_dp_groups; /* hmap of logical switch dp groups. */ > >>> - struct hmap lr_dp_groups; /* hmap of logical router dp groups. */ > >>> - ssize_t max_seen_lflow_size; > >>> + enum ovn_lflow_state sync_state; > >>> }; > >>> > >>> struct lflow_table * > >>> @@ -210,10 +212,14 @@ lflow_table_init(struct lflow_table *lflow_table) > >>> } > >>> > >>> void > >>> -lflow_table_clear(struct lflow_table *lflow_table) > >>> +lflow_table_clear(struct lflow_table *lflow_table, bool destroy_all) > >>> { > >>> struct ovn_lflow *lflow; > >>> HMAP_FOR_EACH_SAFE (lflow, hmap_node, &lflow_table->entries) { > >>> + if (!destroy_all) { > >>> + lflow->sync_state = LFLOW_STALE; > >>> + continue; > >>> + } > >>> ovn_lflow_destroy(lflow_table, lflow); > >>> } > >>> > >>> @@ -224,7 +230,7 @@ lflow_table_clear(struct lflow_table *lflow_table) > >>> void > >>> lflow_table_destroy(struct lflow_table *lflow_table) > >>> { > >>> - lflow_table_clear(lflow_table); > >>> + lflow_table_clear(lflow_table, true); > >>> hmap_destroy(&lflow_table->entries); > >>> ovn_dp_groups_destroy(&lflow_table->ls_dp_groups); > >>> ovn_dp_groups_destroy(&lflow_table->lr_dp_groups); > >>> @@ -257,16 +263,42 @@ lflow_table_sync_to_sb(struct lflow_table > *lflow_table, > >>> const struct sbrec_logical_flow_table > *sb_flow_table, > >>> const struct sbrec_logical_dp_group_table > *dpgrp_table) > >>> { > >>> + struct uuidset sb_uuid_set = UUIDSET_INITIALIZER(&sb_uuid_set); > >>> struct hmap lflows_temp = HMAP_INITIALIZER(&lflows_temp); > >>> struct hmap *lflows = &lflow_table->entries; > >>> struct ovn_lflow *lflow; > >>> + const struct sbrec_logical_flow *sbflow; > >>> > >>> fast_hmap_size_for(&lflows_temp, > >>> lflow_table->max_seen_lflow_size); > >>> > >>> + HMAP_FOR_EACH_SAFE (lflow, hmap_node, lflows) { > >>> + if (lflow->sync_state == LFLOW_STALE) { > >>> + ovn_lflow_destroy(lflow_table, lflow); > >>> + continue; > >>> + } > >>> + sbflow = NULL; > >>> + if (!uuid_is_zero(&lflow->sb_uuid)) { > >>> + sbflow = > sbrec_logical_flow_table_get_for_uuid(sb_flow_table, > >>> + > &lflow->sb_uuid); > >>> + } > >>> + sync_lflow_to_sb(lflow, ovnsb_txn, lflow_table, ls_datapaths, > >>> + lr_datapaths, ovn_internal_version_changed, > >>> + sbflow, dpgrp_table); > >>> + uuidset_insert(&sb_uuid_set, &lflow->sb_uuid); > >>> + hmap_remove(lflows, &lflow->hmap_node); > >>> + hmap_insert(&lflows_temp, &lflow->hmap_node, > >>> + hmap_node_hash(&lflow->hmap_node)); > >>> + } > > Hi, Lucas, Numan, Mark, others. > > This change introduces a significant performance issue when northd > is restarted or when one northd goes on standby and the other becomes > active. > > The problem is that the code above is using lflow->sb_uuid that is > always zero for logical flows not created by the currently active > northd process. So, if the flows existed in the database before the > current process became active, it will remove all of these flows and > re-create anew. The issue can be easily reproduced in the sanbdbox > by adding some resources, then stopping northd and restarting it. > Checking the database file we can see all the logical flows re-created. > > In ovn-heater setup this behavior is causing a significant iteration > time spike for up to 140 seconds in 500-node cluster-density test > every time SB leader changes and a new northd becomes active. It > overwrites about 200 MB of logical flows, making Sb ovsdb-server send > 200 x 150-ish (clients) MB of data momentarily consuming 30 GB of RAM. > All ovn-controllers also spend 20-30 seconds processing the new update. > > We still need a way to search for logical flows in the southbound > database directly, as not every lflow was created by the current process. > We need to find a way to do that on top of this change or revert it. > I assume, revert might be easier in the short term as serching Sb is > exacly what this change was trying to avoid. > > Best regards, Ilya Maximets. > > Hi, based on the above I have posted a revert for now. I think it will be best to unblock CI and focus on the development outside for now. Thanks, Ales _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
