Hi Mark, On 7/3/25 2:57 PM, Mark Michelson wrote: > On 7/3/25 7:54 AM, Dumitru Ceara wrote: >> On 7/2/25 6:26 PM, Mark Michelson via dev wrote: >>> This is similar to a recent change that refactored datapath syncing. >>> This works in a very similar way. >>> >>> * Input nodes create type-agnostic ovn_unpaired_port_bindings. These are >>> fed into the en-port-binding-pair node. >>> * The en-port-binding-pair node ensures that a southbound Port_Binding >>> is created for each unpaired port binding. Any remaining soutbound >>> Port_Bindings are deleted. >>> * Type-specific nodes then convert the paired port bindings into >>> type-specific paired port bindings that can be consumed by other >>> nodes. >>> >>> However, there are some important differences to note between this and >>> the datapath sync patch: >>> >>> * This patch opts for the word "pair" instead of "sync". This is because >>> en-port-binding-pair ensures that all southbound Port_Bindings are >>> paired with some northbound structure. However, the columns in the >>> southobund Port_Binding are not all synced with their northd >>> counterpart. Other engine nodes are responsible for fully syncing the >>> data. >>> * Not all southbound Port_Bindings have a corresponding northbound row. >>> Therefore, the engine nodes that create unpaired port bindings >>> pass an >>> opaque cookie pointer to the pairing node instead of a database row. >>> * Port bindings tend to be identified by name. This means the code has >>> to have several safeguards in place to catch scenarios such as a port >>> "moving" from one datapath to another, or a port being deleted and >>> re-added quickly. >>> * Unlike with the datapath syncing code, this required changes to >>> northd's incremental processing since northd was already >>> incrementally >>> processing some northbound logical switch port changes. >>> >>> Signed-off-by: Mark Michelson <mmich...@redhat.com> >>> --- >> >> Hi Mark, >> >> Thanks for this new revision! This is not the actual review though, >> just an >> update with some scale testing results we got with ovn-heater. >> >>> v9 -> v10: >>> * Rebased. >>> * Used designated initializers after allocating structures. >>> * Added a new "free_cookie" callback to the >>> ovn_unsynced_port_binding_map callbacks. All port binding engine >>> nodes now provide this callback. This means there is no longer a >>> fallback to the default callbacks if none are provided. >>> * Some formatting fixes. >>> >>> v8 -> v9: >>> * Rebased. >>> * Changed some file names to be in alignment with others in the >>> directory. >>> * Fixed memory leak of port tunnel ids. >>> * Added a missing "static" to a structure in >>> en-port-binding-logical-switch-port.c. >>> >>> v7 -> v8: >>> * Rebased. >>> >>> v6 -> v7: >>> * Rebased. >>> * Made some recommended changes to formatting (combining if >>> statements, >>> fixing indentation, etc.) >>> * Switched from an array of router DPGs in the chassisredirect code >>> to a vector. >>> * Fixed several memory leaks in the chassisredirect code. >>> * Fixed a crash in northd.c in the incremental case where a northbound >>> logical switch port did not have a corresponding paired logical >>> switch port to find due to tunnel key issues. >>> * Ensured that we only compare a requested tunnel key against the >>> VXLAN multicast minimum when VXLAN is actually enabled. >>> >>> v5 -> v6: >>> * Rebased. >>> * All calls to xzalloc() in this patch have been changed to xmalloc() >>> or xcalloc(). >>> * Candidate paired port bindings use a vector instead of a list now. >>> * Chassisredirect ports have been simplified a great deal. >>> * This version introduces an enum for port binding types to make type >>> checking more efficient. >>> * This patch abandons the attempt to blindly pull in unpaired port >>> binding maps based on their positions in the engine node inputs >>> array. Instead, we now explicitly pull them in by input name and >>> assign them to array positions based on their port binding type. >>> >>> v4 -> v5: >>> * Rebased. >>> * Fixed some formatting anomalies in mirror port syncing. >>> * Fixed checkpatch warnings. >>> >>> v3 -> v4: >>> * Rebased. >>> * Addressed newly-added mirror port functionality. >>> * Fixed many memory leaks. >>> >>> v3: This is the first version with this patch present. >>> --- >> >> [...] >> >>> >>> diff --git a/TODO.rst b/TODO.rst >>> index 78962bb92..60ae155c5 100644 >>> --- a/TODO.rst >>> +++ b/TODO.rst >>> @@ -168,3 +168,15 @@ OVN To-do List >>> ovn\_synced\_logical_router and ovn\_synced\_logical\_switch. >>> This will >>> allow for the eventual removal of the ovn\_datapath structure >>> from the >>> codebase. >>> + >>> +* Port Binding sync nodes >>> + >>> + * Southbound Port bindings are synced across three engine nodes: >>> + - en_port_binding_pair >>> + - en_northd >>> + - en_sync_to_sb >>> + It would be easier to work with if these were combined into a >>> + single node instead. >>> + >>> + * Add incremental processing to the en-port-binding-pair node, as >>> + well as derivative nodes. >> >> Unfortunately, it seems that the lack of incremental processing for these >> northd nodes introduces a measurable performance regression. >> With ovn-heater, I ran a 500 node density-heavy (simulating realistic >> ovn-kubernetes workloads) test. That showed an increase of ~50% in >> time it takes to run the simulated workloads. >>
I just realized, I had tested with v9 but the same performance regression would happen with v10. Also, v10 seems to have a small issue which causes compilation to fail: https://mail.openvswitch.org/pipermail/ovs-build/2025-July/047274.html northd/northd.c:589:16: error: Using plain integer as NULL pointer northd/northd.c:597:16: error: Using plain integer as NULL pointer >> Most of that seems to be due to these new nodes performing recomputation >> on every input change. >> >> For ease of debugging I uploaded the NB database extracted from that >> test here: >> https://people.redhat.com/~dceara/ovn-northd-datapath-port-binding- >> refactor-v10/ovnnb_db.db.gz >> >> To make it easier to debug directly on my machine I hacked the ovn- >> sandbox >> script in the following way: >> >> diff --git a/tutorial/ovn-sandbox b/tutorial/ovn-sandbox >> index ed334d1c31..a64bb930dd 100755 >> --- a/tutorial/ovn-sandbox >> +++ b/tutorial/ovn-sandbox >> @@ -661,14 +661,6 @@ else >> run ovs-vsctl set open . external-ids:ovn-remote=$OVN_SB_DB >> OVN_CTRLR_PKI="" >> fi >> -for i in $(seq $n_ics); do >> - if [ $i -eq 1 ]; then inst=""; else inst=$i; fi >> - rungdb $gdb_ovn_ic $gdb_ovn_ic_ex ovn-ic --detach \ >> - --no-chdir --pidfile=ovn-ic${inst}.pid -vconsole:off \ >> - --log-file=ovn-ic${inst}.log -vsyslog:off \ >> - --ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB" \ >> - --ic-sb-db="$OVN_IC_SB_DB" --ic-nb-db="$OVN_IC_NB_DB" >> -done >> northd_args= >> OVN_NORTHD=ovn-northd >> @@ -680,17 +672,11 @@ for i in $(seq $n_northds); do >> --log-file=$OVN_NORTHD$inst.log -vsyslog:off \ >> --ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB" >> $northd_args >> done >> -for i in $(seq $n_controllers); do >> - if [ $i -eq 1 ]; then inst=""; else inst=$i; fi >> - rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \ >> - $OVN_CTRLR_PKI --detach --no-chdir -vsyslog:off \ >> - --log-file=ovn-controller${inst}.log \ >> - --pidfile=ovn-controller${inst}.pid -vconsole:off >> -done >> -rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex \ >> - ovn-controller-vtep --detach --no-chdir --pidfile -vconsole:off \ >> - $OVN_CTRLR_PKI --log-file -vsyslog:off \ >> - --ovnsb-db="$OVN_SB_DB" >> + >> +run sleep 10 >> +run ovn-sbctl set connection . inactivity_probe=180000 >> +run ovn-nbctl set connection . inactivity_probe=180000 >> +run ovn-nbctl set NB_GLOBAL . options:northd_probe_interval=180000 >> cat <<EOF >> --- >> >> Then ran a sandbox with the test NB database: >> >> $ make -j4 sandbox SANDBOXFLAGS="--nbdb-source=/tmp/ovnnb_db.db" >> >> Decreased the threshold for logging I-P node compute times: >> $ ovn-appctl -t ovn-northd inc-engine/compute-log-timeout 10 >> >> I waited for things to settle down and then I added a new logical switch >> port: >> $ ovn-nbctl lsp-add lswitch-ovn-scale-99 lsp-foo \ >> -- lsp-set-addresses lsp-foo "00:00:00:00:00:01 42.42.42.42" >> >> ovn-northd.log shows: >> 2025-07-03T11:49:59.869Z|00113|inc_proc_eng|INFO|node: >> port_binding_logical_switch_port, recompute (missing handler for input >> datapath_synced_logical_switch) took 101ms >> 2025-07-03T11:49:59.895Z|00114|inc_proc_eng|INFO|node: >> port_binding_mirror, recompute (missing handler for input >> datapath_synced_logical_switch) took 15ms >> 2025-07-03T11:50:00.059Z|00115|inc_proc_eng|INFO|node: >> port_binding_pair, recompute (missing handler for input >> port_binding_logical_switch_port) took 164ms >> 2025-07-03T11:50:00.126Z|00116|inc_proc_eng|INFO|node: >> port_binding_paired_logical_switch_port, recompute (missing handler >> for input port_binding_pair) took 62ms >> 2025-07-03T11:50:00.255Z|00117|inc_proc_eng|INFO|node: >> port_binding_logical_switch_port, recompute (missing handler for input >> datapath_synced_logical_switch) took 99ms >> 2025-07-03T11:50:00.290Z|00118|inc_proc_eng|INFO|node: >> port_binding_mirror, recompute (missing handler for input >> datapath_synced_logical_switch) took 24ms >> 2025-07-03T11:50:00.480Z|00119|inc_proc_eng|INFO|node: >> port_binding_pair, recompute (missing handler for input >> port_binding_logical_switch_port) took 190ms >> 2025-07-03T11:50:00.573Z|00120|inc_proc_eng|INFO|node: >> port_binding_paired_logical_switch_port, recompute (missing handler >> for input port_binding_pair) took 85ms >> 2025-07-03T11:50:00.609Z|00121|inc_proc_eng|INFO|node: sync_from_sb, >> recompute (missing handler for input SB_port_binding) took 27ms >> >> While on main most of this is done incrementally (only the sync_from_sb >> node takes some time). >> >> I plan to still review the series but I'm afraid that we cannot merge >> it until we figure out how to reduce performance impact due to these >> these new nodes. > > Thanks for looking into this Dumitru. I was aware there could be some > performance regression due to the lack of incremental processing in the > new engine nodes, but I wasn't aware it was going to be so severe. I > think this particular test shows the issues in the port binding nodes, > but other tests could be crafted to show a similar issue in the datapath > nodes. > I guess we'd have to test to confirm. > I also agree that merging this with such a performance regression would > be ill-advised, especially since we don't have much time to fix it up > before soft freeze for ovn 25.09 (18 July). I think I have two choices > here: try to get incremental patches developed and posted before the > soft freeze, or defer this series until ovn 26.03 and take the time to > develop incremental patches during that period. > It might be easier to implement incremental processing for the datapath nodes added in patch 3/4. We could also consider accepting the first 3 patches in 25.09 in that case. Regards, Dumitru _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev