Hi, I'll work on this issue reported when northd is restarted or when one northd goes on standby and the other becomes active.
Best, Lucas Em seg., 6 de out. de 2025 às 07:19, Ales Musil via dev < [email protected]> escreveu: > On Mon, Oct 6, 2025 at 11:52 AM Ilya Maximets <[email protected]> wrote: > > > On 10/3/25 10:06 PM, Mark Michelson via dev wrote: > > > Thanks Lucas and Numan. > > > > > > I merged this to main. > > > > > > On Thu, Sep 25, 2025 at 10:25 AM Numan Siddique <[email protected]> > wrote: > > >> > > >> On Thu, Sep 25, 2025 at 8:16 AM Lucas Vargas Dias via dev > > >> <[email protected]> wrote: > > >>> > > >>> Change the logic to save sbflow uuid and just update if > > >>> the lflow is reused. Otherwise, it's removed. > > >>> Also, reduce sbflow searching with uuidset instead of > > >>> searching through all lflow table. > > >>> Add lflow states: > > >>> LFLOW_STALE - Lflow is not relevant > > >>> LFLOW_TO_SYNC - Lflow needs to be synced with SB DB > > >>> LFLOW_SYNCED - Lflow is synced with SB SB > > >>> > > >>> It generates the following results in a scenario with: > > >>> - LSPs: 56548 > > >>> - LRPs: 27304 > > >>> - LRs: 7922 > > >>> - LSs: 28602 > > >>> > > >>> Without the commit: > > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1 > > >>> Time spent on processing nb_cfg 275438: > > >>> ovn-northd delay before processing: 16069ms > > >>> ovn-northd completion: 32828ms > > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1 > > >>> Time spent on processing nb_cfg 275439: > > >>> ovn-northd delay before processing: 15019ms > > >>> ovn-northd completion: 33207ms > > >>> > > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1 > > rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64 > > >>> Time spent on processing nb_cfg 275440: > > >>> ovn-northd delay before processing: 14784ms > > >>> ovn-northd completion: 33577ms > > >>> > > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add bar9-1 > > bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router > > options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router > > >>> Time spent on processing nb_cfg 275441: > > >>> ovn-northd delay before processing: 14598ms > > >>> ovn-northd completion: 31942ms > > >>> > > >>> With the commit: > > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1 > > >>> Time spent on processing nb_cfg 275401: > > >>> ovn-northd delay before processing: 12602ms > > >>> ovn-northd completion: 26103ms > > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1 > > >>> Time spent on processing nb_cfg 275402: > > >>> ovn-northd delay before processing: 12639ms > > >>> ovn-northd completion: 26759ms > > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1 > > rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64 > > >>> Time spent on processing nb_cfg 275403: > > >>> ovn-northd delay before processing: 11874ms > > >>> ovn-northd completion: 29733ms > > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add bar9-1 > > bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router > > options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router > > >>> Time spent on processing nb_cfg 275404: > > >>> ovn-northd delay before processing: 4058ms > > >>> ovn-northd completion: 17323ms > > >>> > > >>> Signed-off-by: Lucas Vargas Dias <[email protected]> > > >> > > >> Thanks for addressing the comments and for the great improvements. > > >> > > >> Acked-by: Numan Siddique <[email protected]> > > >> > > >> Numan > > >> > > >>> --- > > >>> northd/en-lflow.c | 2 +- > > >>> northd/lflow-mgr.c | 108 > ++++++++++++++++++++++++-------------------- > > >>> northd/lflow-mgr.h | 10 +++- > > >>> northd/northd.c | 8 ++-- > > >>> tests/ovn-northd.at | 32 +++++++++++++ > > >>> 5 files changed, 106 insertions(+), 54 deletions(-) > > >>> > > >>> diff --git a/northd/en-lflow.c b/northd/en-lflow.c > > >>> index 50570b611..13c5e3119 100644 > > >>> --- a/northd/en-lflow.c > > >>> +++ b/northd/en-lflow.c > > >>> @@ -122,7 +122,7 @@ en_lflow_run(struct engine_node *node, void > *data) > > >>> stopwatch_start(BUILD_LFLOWS_STOPWATCH_NAME, time_msec()); > > >>> > > >>> struct lflow_data *lflow_data = data; > > >>> - lflow_table_clear(lflow_data->lflow_table); > > >>> + lflow_table_clear(lflow_data->lflow_table, false); > > >>> lflow_reset_northd_refs(&lflow_input); > > >>> lflow_ref_clear(lflow_input.igmp_lflow_ref); > > >>> > > >>> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c > > >>> index 6a66a9718..5a989fe19 100644 > > >>> --- a/northd/lflow-mgr.c > > >>> +++ b/northd/lflow-mgr.c > > >>> @@ -26,6 +26,7 @@ > > >>> #include "lflow-mgr.h" > > >>> #include "lib/ovn-parallel-hmap.h" > > >>> #include "lib/ovn-util.h" > > >>> +#include "lib/uuidset.h" > > >>> > > >>> VLOG_DEFINE_THIS_MODULE(lflow_mgr); > > >>> > > >>> @@ -37,7 +38,8 @@ static void ovn_lflow_init(struct ovn_lflow *, > > struct ovn_datapath *od, > > >>> uint16_t priority, char *match, > > >>> char *actions, char *io_port, > > >>> char *ctrl_meter, char *stage_hint, > > >>> - const char *where, const char > *flow_desc); > > >>> + const char *where, const char *flow_desc, > > >>> + struct uuid sbuuid); > > >>> static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows, > > >>> enum ovn_stage stage, > > >>> uint16_t priority, const > char > > *match, > > >>> @@ -147,6 +149,13 @@ static struct ovs_mutex > > lflow_hash_locks[LFLOW_HASH_LOCK_MASK + 1]; > > >>> */ > > >>> extern struct ovs_mutex fake_hash_mutex; > > >>> > > >>> + > > >>> +enum ovn_lflow_state { > > >>> + LFLOW_STALE, > > >>> + LFLOW_TO_SYNC, > > >>> + LFLOW_SYNCED, > > >>> +}; > > >>> + > > >>> /* Represents a logical ovn flow (lflow). > > >>> * > > >>> * A logical flow with match 'M' and actions 'A' - L(M, A) is > created > > >>> @@ -181,14 +190,7 @@ struct ovn_lflow { > > >>> struct hmap dp_refcnts_map; /* Maintains the number of times > this > > ovn_lflow > > >>> * is referenced by a given > datapath. > > >>> * Contains 'struct dp_refcnt' in > the > > map. */ > > >>> -}; > > >>> - > > >>> -/* Logical flow table. */ > > >>> -struct lflow_table { > > >>> - struct hmap entries; /* hmap of lflows. */ > > >>> - struct hmap ls_dp_groups; /* hmap of logical switch dp groups. > */ > > >>> - struct hmap lr_dp_groups; /* hmap of logical router dp groups. > */ > > >>> - ssize_t max_seen_lflow_size; > > >>> + enum ovn_lflow_state sync_state; > > >>> }; > > >>> > > >>> struct lflow_table * > > >>> @@ -210,10 +212,14 @@ lflow_table_init(struct lflow_table > *lflow_table) > > >>> } > > >>> > > >>> void > > >>> -lflow_table_clear(struct lflow_table *lflow_table) > > >>> +lflow_table_clear(struct lflow_table *lflow_table, bool destroy_all) > > >>> { > > >>> struct ovn_lflow *lflow; > > >>> HMAP_FOR_EACH_SAFE (lflow, hmap_node, &lflow_table->entries) { > > >>> + if (!destroy_all) { > > >>> + lflow->sync_state = LFLOW_STALE; > > >>> + continue; > > >>> + } > > >>> ovn_lflow_destroy(lflow_table, lflow); > > >>> } > > >>> > > >>> @@ -224,7 +230,7 @@ lflow_table_clear(struct lflow_table > *lflow_table) > > >>> void > > >>> lflow_table_destroy(struct lflow_table *lflow_table) > > >>> { > > >>> - lflow_table_clear(lflow_table); > > >>> + lflow_table_clear(lflow_table, true); > > >>> hmap_destroy(&lflow_table->entries); > > >>> ovn_dp_groups_destroy(&lflow_table->ls_dp_groups); > > >>> ovn_dp_groups_destroy(&lflow_table->lr_dp_groups); > > >>> @@ -257,16 +263,42 @@ lflow_table_sync_to_sb(struct lflow_table > > *lflow_table, > > >>> const struct sbrec_logical_flow_table > > *sb_flow_table, > > >>> const struct sbrec_logical_dp_group_table > > *dpgrp_table) > > >>> { > > >>> + struct uuidset sb_uuid_set = UUIDSET_INITIALIZER(&sb_uuid_set); > > >>> struct hmap lflows_temp = HMAP_INITIALIZER(&lflows_temp); > > >>> struct hmap *lflows = &lflow_table->entries; > > >>> struct ovn_lflow *lflow; > > >>> + const struct sbrec_logical_flow *sbflow; > > >>> > > >>> fast_hmap_size_for(&lflows_temp, > > >>> lflow_table->max_seen_lflow_size); > > >>> > > >>> + HMAP_FOR_EACH_SAFE (lflow, hmap_node, lflows) { > > >>> + if (lflow->sync_state == LFLOW_STALE) { > > >>> + ovn_lflow_destroy(lflow_table, lflow); > > >>> + continue; > > >>> + } > > >>> + sbflow = NULL; > > >>> + if (!uuid_is_zero(&lflow->sb_uuid)) { > > >>> + sbflow = > > sbrec_logical_flow_table_get_for_uuid(sb_flow_table, > > >>> + > > &lflow->sb_uuid); > > >>> + } > > >>> + sync_lflow_to_sb(lflow, ovnsb_txn, lflow_table, > ls_datapaths, > > >>> + lr_datapaths, ovn_internal_version_changed, > > >>> + sbflow, dpgrp_table); > > >>> + uuidset_insert(&sb_uuid_set, &lflow->sb_uuid); > > >>> + hmap_remove(lflows, &lflow->hmap_node); > > >>> + hmap_insert(&lflows_temp, &lflow->hmap_node, > > >>> + hmap_node_hash(&lflow->hmap_node)); > > >>> + } > > > > Hi, Lucas, Numan, Mark, others. > > > > This change introduces a significant performance issue when northd > > is restarted or when one northd goes on standby and the other becomes > > active. > > > > The problem is that the code above is using lflow->sb_uuid that is > > always zero for logical flows not created by the currently active > > northd process. So, if the flows existed in the database before the > > current process became active, it will remove all of these flows and > > re-create anew. The issue can be easily reproduced in the sanbdbox > > by adding some resources, then stopping northd and restarting it. > > Checking the database file we can see all the logical flows re-created. > > > > In ovn-heater setup this behavior is causing a significant iteration > > time spike for up to 140 seconds in 500-node cluster-density test > > every time SB leader changes and a new northd becomes active. It > > overwrites about 200 MB of logical flows, making Sb ovsdb-server send > > 200 x 150-ish (clients) MB of data momentarily consuming 30 GB of RAM. > > All ovn-controllers also spend 20-30 seconds processing the new update. > > > > We still need a way to search for logical flows in the southbound > > database directly, as not every lflow was created by the current process. > > We need to find a way to do that on top of this change or revert it. > > I assume, revert might be easier in the short term as serching Sb is > > exacly what this change was trying to avoid. > > > > Best regards, Ilya Maximets. > > > > > Hi, > > based on the above I have posted a revert for now. I think it will be best > to unblock CI and focus on the development outside for now. > > Thanks, > Ales > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > -- _‘Esta mensagem é direcionada apenas para os endereços constantes no cabeçalho inicial. Se você não está listado nos endereços constantes no cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão imediatamente anuladas e proibidas’._ * **‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não poderá aceitar a responsabilidade por quaisquer perdas ou danos causados por esse e-mail ou por seus anexos’.* _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
