Hi Mark,

On 7/3/25 2:57 PM, Mark Michelson wrote:
> On 7/3/25 7:54 AM, Dumitru Ceara wrote:
>> On 7/2/25 6:26 PM, Mark Michelson via dev wrote:
>>> This is similar to a recent change that refactored datapath syncing.
>>> This works in a very similar way.
>>>
>>> * Input nodes create type-agnostic ovn_unpaired_port_bindings. These are
>>>    fed into the en-port-binding-pair node.
>>> * The en-port-binding-pair node ensures that a southbound Port_Binding
>>>    is created for each unpaired port binding. Any remaining soutbound
>>>    Port_Bindings are deleted.
>>> * Type-specific nodes then convert the paired port bindings into
>>>    type-specific paired port bindings that can be consumed by other
>>>    nodes.
>>>
>>> However, there are some important differences to note between this and
>>> the datapath sync patch:
>>>
>>> * This patch opts for the word "pair" instead of "sync". This is because
>>>    en-port-binding-pair ensures that all southbound Port_Bindings are
>>>    paired with some northbound structure. However, the columns in the
>>>    southobund Port_Binding are not all synced with their northd
>>>    counterpart. Other engine nodes are responsible for fully syncing the
>>>    data.
>>> * Not all southbound Port_Bindings have a corresponding northbound row.
>>>    Therefore, the engine nodes that create unpaired port bindings
>>> pass an
>>>    opaque cookie pointer to the pairing node instead of a database row.
>>> * Port bindings tend to be identified by name. This means the code has
>>>    to have several safeguards in place to catch scenarios such as a port
>>>    "moving" from one datapath to another, or a port being deleted and
>>>    re-added quickly.
>>> * Unlike with the datapath syncing code, this required changes to
>>>    northd's incremental processing since northd was already
>>> incrementally
>>>    processing some northbound logical switch port changes.
>>>
>>> Signed-off-by: Mark Michelson <mmich...@redhat.com>
>>> ---
>>
>> Hi Mark,
>>
>> Thanks for this new revision!  This is not the actual review though,
>> just an
>> update with some scale testing results we got with ovn-heater.
>>
>>> v9 -> v10:
>>>   * Rebased.
>>>   * Used designated initializers after allocating structures.
>>>   * Added a new "free_cookie" callback to the
>>>     ovn_unsynced_port_binding_map callbacks. All port binding engine
>>>     nodes now provide this callback. This means there is no longer a
>>>     fallback to the default callbacks if none are provided.
>>>   * Some formatting fixes.
>>>
>>> v8 -> v9:
>>>   * Rebased.
>>>   * Changed some file names to be in alignment with others in the
>>>     directory.
>>>   * Fixed memory leak of port tunnel ids.
>>>   * Added a missing "static" to a structure in
>>>     en-port-binding-logical-switch-port.c.
>>>
>>> v7 -> v8:
>>>   * Rebased.
>>>
>>> v6 -> v7:
>>>   * Rebased.
>>>   * Made some recommended changes to formatting (combining if
>>> statements,
>>>     fixing indentation, etc.)
>>>   * Switched from an array of router DPGs in the chassisredirect code
>>>     to a vector.
>>>   * Fixed several memory leaks in the chassisredirect code.
>>>   * Fixed a crash in northd.c in the incremental case where a northbound
>>>     logical switch port did not have a corresponding paired logical
>>>     switch port to find due to tunnel key issues.
>>>   * Ensured that we only compare a requested tunnel key against the
>>>     VXLAN multicast minimum when VXLAN is actually enabled.
>>>
>>> v5 -> v6:
>>>   * Rebased.
>>>   * All calls to xzalloc() in this patch have been changed to xmalloc()
>>>     or xcalloc().
>>>   * Candidate paired port bindings use a vector instead of a list now.
>>>   * Chassisredirect ports have been simplified a great deal.
>>>   * This version introduces an enum for port binding types to make type
>>>     checking more efficient.
>>>   * This patch abandons the attempt to blindly pull in unpaired port
>>>     binding maps based on their positions in the engine node inputs
>>>     array. Instead, we now explicitly pull them in by input name and
>>>     assign them to array positions based on their port binding type.
>>>
>>> v4 -> v5:
>>>   * Rebased.
>>>   * Fixed some formatting anomalies in mirror port syncing.
>>>   * Fixed checkpatch warnings.
>>>
>>> v3 -> v4:
>>>   * Rebased.
>>>   * Addressed newly-added mirror port functionality.
>>>   * Fixed many memory leaks.
>>>
>>> v3: This is the first version with this patch present.
>>> ---
>>
>> [...]
>>
>>>
>>> diff --git a/TODO.rst b/TODO.rst
>>> index 78962bb92..60ae155c5 100644
>>> --- a/TODO.rst
>>> +++ b/TODO.rst
>>> @@ -168,3 +168,15 @@ OVN To-do List
>>>       ovn\_synced\_logical_router and ovn\_synced\_logical\_switch.
>>> This will
>>>       allow for the eventual removal of the ovn\_datapath structure
>>> from the
>>>       codebase.
>>> +
>>> +* Port Binding sync nodes
>>> +
>>> +  * Southbound Port bindings are synced across three engine nodes:
>>> +    - en_port_binding_pair
>>> +    - en_northd
>>> +    - en_sync_to_sb
>>> +    It would be easier to work with if these were combined into a
>>> +    single node instead.
>>> +
>>> +  * Add incremental processing to the en-port-binding-pair node, as
>>> +    well as derivative nodes.
>>
>> Unfortunately, it seems that the lack of incremental processing for these
>> northd nodes introduces a measurable performance regression.
>> With ovn-heater, I ran a 500 node density-heavy (simulating realistic
>> ovn-kubernetes workloads) test.  That showed an increase of ~50% in
>> time it takes to run the simulated workloads.
>>

I just realized, I had tested with v9 but the same performance
regression would happen with v10.

Also, v10 seems to have a small issue which causes compilation to fail:

https://mail.openvswitch.org/pipermail/ovs-build/2025-July/047274.html

northd/northd.c:589:16: error: Using plain integer as NULL pointer
northd/northd.c:597:16: error: Using plain integer as NULL pointer

>> Most of that seems to be due to these new nodes performing recomputation
>> on every input change.
>>
>> For ease of debugging I uploaded the NB database extracted from that
>> test here:
>> https://people.redhat.com/~dceara/ovn-northd-datapath-port-binding-
>> refactor-v10/ovnnb_db.db.gz
>>
>> To make it easier to debug directly on my machine I hacked the ovn-
>> sandbox
>> script in the following way:
>>
>> diff --git a/tutorial/ovn-sandbox b/tutorial/ovn-sandbox
>> index ed334d1c31..a64bb930dd 100755
>> --- a/tutorial/ovn-sandbox
>> +++ b/tutorial/ovn-sandbox
>> @@ -661,14 +661,6 @@ else
>>       run ovs-vsctl set open . external-ids:ovn-remote=$OVN_SB_DB
>>       OVN_CTRLR_PKI=""
>>   fi
>> -for i in $(seq $n_ics); do
>> -    if [ $i -eq 1 ]; then inst=""; else inst=$i; fi
>> -    rungdb $gdb_ovn_ic $gdb_ovn_ic_ex ovn-ic --detach \
>> -            --no-chdir --pidfile=ovn-ic${inst}.pid -vconsole:off \
>> -            --log-file=ovn-ic${inst}.log -vsyslog:off \
>> -            --ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB" \
>> -            --ic-sb-db="$OVN_IC_SB_DB" --ic-nb-db="$OVN_IC_NB_DB"
>> -done
>>     northd_args=
>>   OVN_NORTHD=ovn-northd
>> @@ -680,17 +672,11 @@ for i in $(seq $n_northds); do
>>               --log-file=$OVN_NORTHD$inst.log -vsyslog:off \
>>               --ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB"
>> $northd_args
>>   done
>> -for i in $(seq $n_controllers); do
>> -    if [ $i -eq 1 ]; then inst=""; else inst=$i; fi
>> -    rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \
>> -            $OVN_CTRLR_PKI --detach --no-chdir -vsyslog:off \
>> -            --log-file=ovn-controller${inst}.log \
>> -            --pidfile=ovn-controller${inst}.pid -vconsole:off
>> -done
>> -rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex \
>> -    ovn-controller-vtep --detach --no-chdir --pidfile -vconsole:off \
>> -    $OVN_CTRLR_PKI --log-file -vsyslog:off \
>> -    --ovnsb-db="$OVN_SB_DB"
>> +
>> +run sleep 10
>> +run ovn-sbctl set connection . inactivity_probe=180000
>> +run ovn-nbctl set connection . inactivity_probe=180000
>> +run ovn-nbctl set NB_GLOBAL . options:northd_probe_interval=180000
>>     cat <<EOF
>>   ---
>>
>> Then ran a sandbox with the test NB database:
>>
>> $ make -j4 sandbox SANDBOXFLAGS="--nbdb-source=/tmp/ovnnb_db.db"
>>
>> Decreased the threshold for logging I-P node compute times:
>> $ ovn-appctl -t ovn-northd inc-engine/compute-log-timeout 10
>>
>> I waited for things to settle down and then I added a new logical switch
>> port:
>> $ ovn-nbctl lsp-add lswitch-ovn-scale-99 lsp-foo \
>>    -- lsp-set-addresses lsp-foo "00:00:00:00:00:01 42.42.42.42"
>>
>> ovn-northd.log shows:
>> 2025-07-03T11:49:59.869Z|00113|inc_proc_eng|INFO|node:
>> port_binding_logical_switch_port, recompute (missing handler for input
>> datapath_synced_logical_switch) took 101ms
>> 2025-07-03T11:49:59.895Z|00114|inc_proc_eng|INFO|node:
>> port_binding_mirror, recompute (missing handler for input
>> datapath_synced_logical_switch) took 15ms
>> 2025-07-03T11:50:00.059Z|00115|inc_proc_eng|INFO|node:
>> port_binding_pair, recompute (missing handler for input
>> port_binding_logical_switch_port) took 164ms
>> 2025-07-03T11:50:00.126Z|00116|inc_proc_eng|INFO|node:
>> port_binding_paired_logical_switch_port, recompute (missing handler
>> for input port_binding_pair) took 62ms
>> 2025-07-03T11:50:00.255Z|00117|inc_proc_eng|INFO|node:
>> port_binding_logical_switch_port, recompute (missing handler for input
>> datapath_synced_logical_switch) took 99ms
>> 2025-07-03T11:50:00.290Z|00118|inc_proc_eng|INFO|node:
>> port_binding_mirror, recompute (missing handler for input
>> datapath_synced_logical_switch) took 24ms
>> 2025-07-03T11:50:00.480Z|00119|inc_proc_eng|INFO|node:
>> port_binding_pair, recompute (missing handler for input
>> port_binding_logical_switch_port) took 190ms
>> 2025-07-03T11:50:00.573Z|00120|inc_proc_eng|INFO|node:
>> port_binding_paired_logical_switch_port, recompute (missing handler
>> for input port_binding_pair) took 85ms
>> 2025-07-03T11:50:00.609Z|00121|inc_proc_eng|INFO|node: sync_from_sb,
>> recompute (missing handler for input SB_port_binding) took 27ms
>>
>> While on main most of this is done incrementally (only the sync_from_sb
>> node takes some time).
>>
>> I plan to still review the series but I'm afraid that we cannot merge
>> it until we figure out how to reduce performance impact due to these
>> these new nodes.
> 
> Thanks for looking into this Dumitru. I was aware there could be some
> performance regression due to the lack of incremental processing in the
> new engine nodes, but I wasn't aware it was going to be so severe. I
> think this particular test shows the issues in the port binding nodes,
> but other tests could be crafted to show a similar issue in the datapath
> nodes.
> 

I guess we'd have to test to confirm.

> I also agree that merging this with such a performance regression would
> be ill-advised, especially since we don't have much time to fix it up
> before soft freeze for ovn 25.09 (18 July). I think I have two choices
> here: try to get incremental patches developed and posted before the
> soft freeze, or defer this series until ovn 26.03 and take the time to
> develop incremental patches during that period.
> 

It might be easier to implement incremental processing for the datapath
nodes added in patch 3/4.  We could also consider accepting the first 3
patches in 25.09 in that case.

Regards,
Dumitru

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to