Hi all,

I sent a v4 that fixed these conditions.


Best,
Lucas


Em seg., 6 de out. de 2025 às 08:49, Lucas Vargas Dias <
[email protected]> escreveu:

> Hi,
>
> I'll work on this issue reported when northd is restarted or when one
> northd goes on standby and the other becomes
> active.
>
> Best,
> Lucas
>
> Em seg., 6 de out. de 2025 às 07:19, Ales Musil via dev <
> [email protected]> escreveu:
>
>> On Mon, Oct 6, 2025 at 11:52 AM Ilya Maximets <[email protected]> wrote:
>>
>> > On 10/3/25 10:06 PM, Mark Michelson via dev wrote:
>> > > Thanks Lucas and Numan.
>> > >
>> > > I merged this to main.
>> > >
>> > > On Thu, Sep 25, 2025 at 10:25 AM Numan Siddique <[email protected]>
>> wrote:
>> > >>
>> > >> On Thu, Sep 25, 2025 at 8:16 AM Lucas Vargas Dias via dev
>> > >> <[email protected]> wrote:
>> > >>>
>> > >>> Change the logic to save sbflow uuid and just update if
>> > >>> the lflow is reused. Otherwise, it's removed.
>> > >>> Also, reduce sbflow searching with uuidset instead of
>> > >>> searching through all lflow table.
>> > >>> Add lflow states:
>> > >>> LFLOW_STALE - Lflow is not relevant
>> > >>> LFLOW_TO_SYNC - Lflow needs to be synced with SB DB
>> > >>> LFLOW_SYNCED - Lflow is synced with SB SB
>> > >>>
>> > >>> It generates the following results in a scenario with:
>> > >>> - LSPs: 56548
>> > >>> - LRPs: 27304
>> > >>> - LRs: 7922
>> > >>> - LSs: 28602
>> > >>>
>> > >>> Without the commit:
>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1
>> > >>> Time spent on processing nb_cfg 275438:
>> > >>>         ovn-northd delay before processing:     16069ms
>> > >>>         ovn-northd completion:                  32828ms
>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1
>> > >>> Time spent on processing nb_cfg 275439:
>> > >>>         ovn-northd delay before processing:     15019ms
>> > >>>         ovn-northd completion:                  33207ms
>> > >>>
>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1
>> > rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64
>> > >>> Time spent on processing nb_cfg 275440:
>> > >>>         ovn-northd delay before processing:     14784ms
>> > >>>         ovn-northd completion:                  33577ms
>> > >>>
>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add
>> bar9-1
>> > bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router
>> > options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router
>> > >>> Time spent on processing nb_cfg 275441:
>> > >>>         ovn-northd delay before processing:     14598ms
>> > >>>         ovn-northd completion:                  31942ms
>> > >>>
>> > >>> With the commit:
>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1
>> > >>> Time spent on processing nb_cfg 275401:
>> > >>>         ovn-northd delay before processing:     12602ms
>> > >>>         ovn-northd completion:                  26103ms
>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1
>> > >>> Time spent on processing nb_cfg 275402:
>> > >>>         ovn-northd delay before processing:     12639ms
>> > >>>         ovn-northd completion:                  26759ms
>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1
>> > rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64
>> > >>> Time spent on processing nb_cfg 275403:
>> > >>>         ovn-northd delay before processing:     11874ms
>> > >>>         ovn-northd completion:                  29733ms
>> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add
>> bar9-1
>> > bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router
>> > options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router
>> > >>> Time spent on processing nb_cfg 275404:
>> > >>>         ovn-northd delay before processing:     4058ms
>> > >>>         ovn-northd completion:                  17323ms
>> > >>>
>> > >>> Signed-off-by: Lucas Vargas Dias <[email protected]>
>> > >>
>> > >> Thanks for addressing the comments and for the great improvements.
>> > >>
>> > >> Acked-by: Numan Siddique <[email protected]>
>> > >>
>> > >> Numan
>> > >>
>> > >>> ---
>> > >>>  northd/en-lflow.c   |   2 +-
>> > >>>  northd/lflow-mgr.c  | 108
>> ++++++++++++++++++++++++--------------------
>> > >>>  northd/lflow-mgr.h  |  10 +++-
>> > >>>  northd/northd.c     |   8 ++--
>> > >>>  tests/ovn-northd.at |  32 +++++++++++++
>> > >>>  5 files changed, 106 insertions(+), 54 deletions(-)
>> > >>>
>> > >>> diff --git a/northd/en-lflow.c b/northd/en-lflow.c
>> > >>> index 50570b611..13c5e3119 100644
>> > >>> --- a/northd/en-lflow.c
>> > >>> +++ b/northd/en-lflow.c
>> > >>> @@ -122,7 +122,7 @@ en_lflow_run(struct engine_node *node, void
>> *data)
>> > >>>      stopwatch_start(BUILD_LFLOWS_STOPWATCH_NAME, time_msec());
>> > >>>
>> > >>>      struct lflow_data *lflow_data = data;
>> > >>> -    lflow_table_clear(lflow_data->lflow_table);
>> > >>> +    lflow_table_clear(lflow_data->lflow_table, false);
>> > >>>      lflow_reset_northd_refs(&lflow_input);
>> > >>>      lflow_ref_clear(lflow_input.igmp_lflow_ref);
>> > >>>
>> > >>> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c
>> > >>> index 6a66a9718..5a989fe19 100644
>> > >>> --- a/northd/lflow-mgr.c
>> > >>> +++ b/northd/lflow-mgr.c
>> > >>> @@ -26,6 +26,7 @@
>> > >>>  #include "lflow-mgr.h"
>> > >>>  #include "lib/ovn-parallel-hmap.h"
>> > >>>  #include "lib/ovn-util.h"
>> > >>> +#include "lib/uuidset.h"
>> > >>>
>> > >>>  VLOG_DEFINE_THIS_MODULE(lflow_mgr);
>> > >>>
>> > >>> @@ -37,7 +38,8 @@ static void ovn_lflow_init(struct ovn_lflow *,
>> > struct ovn_datapath *od,
>> > >>>                             uint16_t priority, char *match,
>> > >>>                             char *actions, char *io_port,
>> > >>>                             char *ctrl_meter, char *stage_hint,
>> > >>> -                           const char *where, const char
>> *flow_desc);
>> > >>> +                           const char *where, const char
>> *flow_desc,
>> > >>> +                           struct uuid sbuuid);
>> > >>>  static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows,
>> > >>>                                          enum ovn_stage stage,
>> > >>>                                          uint16_t priority, const
>> char
>> > *match,
>> > >>> @@ -147,6 +149,13 @@ static struct ovs_mutex
>> > lflow_hash_locks[LFLOW_HASH_LOCK_MASK + 1];
>> > >>>   */
>> > >>>  extern struct ovs_mutex fake_hash_mutex;
>> > >>>
>> > >>> +
>> > >>> +enum ovn_lflow_state {
>> > >>> +    LFLOW_STALE,
>> > >>> +    LFLOW_TO_SYNC,
>> > >>> +    LFLOW_SYNCED,
>> > >>> +};
>> > >>> +
>> > >>>  /* Represents a logical ovn flow (lflow).
>> > >>>   *
>> > >>>   * A logical flow with match 'M' and actions 'A' - L(M, A) is
>> created
>> > >>> @@ -181,14 +190,7 @@ struct ovn_lflow {
>> > >>>      struct hmap dp_refcnts_map; /* Maintains the number of times
>> this
>> > ovn_lflow
>> > >>>                                   * is referenced by a given
>> datapath.
>> > >>>                                   * Contains 'struct dp_refcnt' in
>> the
>> > map. */
>> > >>> -};
>> > >>> -
>> > >>> -/* Logical flow table. */
>> > >>> -struct lflow_table {
>> > >>> -    struct hmap entries; /* hmap of lflows. */
>> > >>> -    struct hmap ls_dp_groups; /* hmap of logical switch dp groups.
>> */
>> > >>> -    struct hmap lr_dp_groups; /* hmap of logical router dp groups.
>> */
>> > >>> -    ssize_t max_seen_lflow_size;
>> > >>> +    enum ovn_lflow_state sync_state;
>> > >>>  };
>> > >>>
>> > >>>  struct lflow_table *
>> > >>> @@ -210,10 +212,14 @@ lflow_table_init(struct lflow_table
>> *lflow_table)
>> > >>>  }
>> > >>>
>> > >>>  void
>> > >>> -lflow_table_clear(struct lflow_table *lflow_table)
>> > >>> +lflow_table_clear(struct lflow_table *lflow_table, bool
>> destroy_all)
>> > >>>  {
>> > >>>      struct ovn_lflow *lflow;
>> > >>>      HMAP_FOR_EACH_SAFE (lflow, hmap_node, &lflow_table->entries) {
>> > >>> +        if (!destroy_all) {
>> > >>> +            lflow->sync_state = LFLOW_STALE;
>> > >>> +            continue;
>> > >>> +        }
>> > >>>          ovn_lflow_destroy(lflow_table, lflow);
>> > >>>      }
>> > >>>
>> > >>> @@ -224,7 +230,7 @@ lflow_table_clear(struct lflow_table
>> *lflow_table)
>> > >>>  void
>> > >>>  lflow_table_destroy(struct lflow_table *lflow_table)
>> > >>>  {
>> > >>> -    lflow_table_clear(lflow_table);
>> > >>> +    lflow_table_clear(lflow_table, true);
>> > >>>      hmap_destroy(&lflow_table->entries);
>> > >>>      ovn_dp_groups_destroy(&lflow_table->ls_dp_groups);
>> > >>>      ovn_dp_groups_destroy(&lflow_table->lr_dp_groups);
>> > >>> @@ -257,16 +263,42 @@ lflow_table_sync_to_sb(struct lflow_table
>> > *lflow_table,
>> > >>>                         const struct sbrec_logical_flow_table
>> > *sb_flow_table,
>> > >>>                         const struct sbrec_logical_dp_group_table
>> > *dpgrp_table)
>> > >>>  {
>> > >>> +    struct uuidset sb_uuid_set = UUIDSET_INITIALIZER(&sb_uuid_set);
>> > >>>      struct hmap lflows_temp = HMAP_INITIALIZER(&lflows_temp);
>> > >>>      struct hmap *lflows = &lflow_table->entries;
>> > >>>      struct ovn_lflow *lflow;
>> > >>> +    const struct sbrec_logical_flow *sbflow;
>> > >>>
>> > >>>      fast_hmap_size_for(&lflows_temp,
>> > >>>                         lflow_table->max_seen_lflow_size);
>> > >>>
>> > >>> +    HMAP_FOR_EACH_SAFE (lflow, hmap_node, lflows) {
>> > >>> +        if (lflow->sync_state == LFLOW_STALE) {
>> > >>> +            ovn_lflow_destroy(lflow_table, lflow);
>> > >>> +            continue;
>> > >>> +        }
>> > >>> +        sbflow = NULL;
>> > >>> +        if (!uuid_is_zero(&lflow->sb_uuid)) {
>> > >>> +            sbflow =
>> > sbrec_logical_flow_table_get_for_uuid(sb_flow_table,
>> > >>> +
>> >  &lflow->sb_uuid);
>> > >>> +        }
>> > >>> +        sync_lflow_to_sb(lflow, ovnsb_txn, lflow_table,
>> ls_datapaths,
>> > >>> +                         lr_datapaths,
>> ovn_internal_version_changed,
>> > >>> +                         sbflow, dpgrp_table);
>> > >>> +        uuidset_insert(&sb_uuid_set, &lflow->sb_uuid);
>> > >>> +        hmap_remove(lflows, &lflow->hmap_node);
>> > >>> +        hmap_insert(&lflows_temp, &lflow->hmap_node,
>> > >>> +                    hmap_node_hash(&lflow->hmap_node));
>> > >>> +    }
>> >
>> > Hi, Lucas, Numan, Mark, others.
>> >
>> > This change introduces a significant performance issue when northd
>> > is restarted or when one northd goes on standby and the other becomes
>> > active.
>> >
>> > The problem is that the code above is using lflow->sb_uuid that is
>> > always zero for logical flows not created by the currently active
>> > northd process.  So, if the flows existed in the database before the
>> > current process became active, it will remove all of these flows and
>> > re-create anew.  The issue can be easily reproduced in the sanbdbox
>> > by adding some resources, then stopping northd and restarting it.
>> > Checking the database file we can see all the logical flows re-created.
>> >
>> > In ovn-heater setup this behavior is causing a significant iteration
>> > time spike for up to 140 seconds in 500-node cluster-density test
>> > every time SB leader changes and a new northd becomes active.  It
>> > overwrites about 200 MB of logical flows, making Sb ovsdb-server send
>> > 200 x 150-ish (clients) MB of data momentarily consuming 30 GB of RAM.
>> > All ovn-controllers also spend 20-30 seconds processing the new update.
>> >
>> > We still need a way to search for logical flows in the southbound
>> > database directly, as not every lflow was created by the current
>> process.
>> > We need to find a way to do that on top of this change or revert it.
>> > I assume, revert might be easier in the short term as serching Sb is
>> > exacly what this change was trying to avoid.
>> >
>> > Best regards, Ilya Maximets.
>> >
>> >
>> Hi,
>>
>> based on the above I have posted a revert for now. I think it will be best
>> to unblock CI and focus on the development outside for now.
>>
>> Thanks,
>> Ales
>> _______________________________________________
>> dev mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>

-- 




_‘Esta mensagem é direcionada apenas para os endereços constantes no 
cabeçalho inicial. Se você não está listado nos endereços constantes no 
cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa 
mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão 
imediatamente anuladas e proibidas’._


* **‘Apesar do Magazine Luiza tomar 
todas as precauções razoáveis para assegurar que nenhum vírus esteja 
presente nesse e-mail, a empresa não poderá aceitar a responsabilidade por 
quaisquer perdas ou danos causados por esse e-mail ou por seus anexos’.*



_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to