Hi Ales,

+Lorenzo & Numan in CC

On 5/19/26 11:21 PM, Dumitru Ceara wrote:
> On 5/14/26 10:13 AM, Ales Musil wrote:
>> On Wed, May 13, 2026 at 11:15 AM Dumitru Ceara via dev <
>> [email protected]> wrote:
>>
>>> The pflow_output SB_port_binding handler triggers a full
>>> recompute when the type column is updated on a port binding.
>>> However, for newly created port bindings, the OVSDB IDL
>>> marks all non-default columns as "updated", even though no
>>> actual update occurred.  This caused every new port binding
>>> with a non-default type (e.g., remote, patch, localnet,
>>> router) to unnecessarily trigger a full pflow_output
>>> recompute, severely impacting ovn-controller performance
>>> at scale.
>>>
>>> This is particularly problematic in deployments that use
>>> remote LSPs, such as ovn-kubernetes with L2 UDNs, where
>>> frequent creation of remote port bindings leads to
>>> continuous full recomputes and high CPU usage.
>>>
>>> Guard the type-update check with sbrec_port_binding_is_new()
>>> and sbrec_port_binding_is_deleted() so that only genuine
>>> type changes on existing port bindings trigger a recompute.
>>> This matches the pattern already used in binding.c for the
>>> tunnel_key column.
>>>
>>> Also fix a typo in the test name ("path" -> "patch").
>>>
>>> Fixes: 73a10345a29c ("controller: Update physical flows for peer port when
>>> the patch port is removed.")
>>> Reported-at: https://redhat.atlassian.net/browse/FDP-3819
>>> Reported-by: Patryk Diak <[email protected]>
>>> Assisted-by: Claude Opus 4.6, Claude Code
>>> Signed-off-by: Dumitru Ceara <[email protected]>
>>> ---

[...]
>>>
>> Thank you Dumitru,
>>
>> applied to main and backported down to 25.09.
>>
> 
> Hi Ales,
> 
> Thanks for that!  I also tried to cherry-pick this to 25.03 today,
> because Red Hat's ovn-kubernetes needs it there too, but I'm
> hitting a test failure:
> 
> rcv_n=0 exp_n=1
> ovn-macros.at:12: wait failed after 30 seconds
> Expected:
> f0f00000001100000101020708004500001c000000003e111726ac1f0064ac1f000a0035111100080000
> Received:
> Diff:
> --- vif-north.expected.sorted 2026-05-19 09:38:11.881677567 +0000
> +++ hv4/vif-north-tx.packets.sorted   2026-05-19 09:38:11.883375243 +0000
> @@ -1 +0,0 @@
> -f0f00000001100000101020708004500001c000000003e111726ac1f0064ac1f000a0035111100080000
> ../../../tests/ovs-macros.at:256: hard failure
> 327. ovn.at:26002: 327. Stateless Floating IP -- parallelization=yes --
> ovn_monitor_all=no (ovn.at:26002): FAILED (ovs-macros.at:256)
> 
> After quite some debugging, I think this might be an actual incremental
> processing bug that we now hit more often.  There's a dependency between
> localnet ports and LP_CHASSISREDIRECT ports when redirect-type=bridged.
> 
> If the localnet port binding creation is received in a later iteration
> of ovn-controller, after the chassis-redirect port was created, we now
> fail to create the redirect flows for the CR port (i.e., we don't call
> put_remote_port_redirect_bridged() anymore).
> 
> I'll try to come up with a simpler test that proves the bug (I think it
> was present before 73a10345a29c ("controller: Update physical flows for
> peer port when the patch port is removed.")) and a fix for it.
> 

I went ahead and posted a fix for this:
https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/
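
For anyone following along, the guard described in the original commit
message can be sketched roughly as below. This is only a minimal,
self-contained illustration and not the actual ovn-controller code:
struct pb_row and pb_type_change_needs_recompute() are hypothetical
stand-ins for the IDL row state that the real handler queries through
sbrec_port_binding_is_new(), sbrec_port_binding_is_deleted() and the
"updated" flag of the type column. It also assumes that new and deleted
rows are handled by other code paths in the handler.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-row tracking state exposed by the
 * OVSDB IDL.  In ovn-controller this state would come from the sbrec_*
 * accessors, not from a struct like this one. */
struct pb_row {
    bool is_new;        /* Row was created in this IDL iteration. */
    bool is_deleted;    /* Row was deleted in this IDL iteration. */
    bool type_updated;  /* IDL reports the "type" column as updated. */
};

/* Returns true only for a genuine type change on an existing port
 * binding.  For a newly created row the IDL marks every non-default
 * column as "updated", so the flag alone is not enough to conclude
 * that the type really changed. */
bool
pb_type_change_needs_recompute(const struct pb_row *pb)
{
    if (pb->is_new || pb->is_deleted) {
        /* Assumption: creation and deletion are processed elsewhere,
         * so they must not force a full recompute from here. */
        return false;
    }
    return pb->type_updated;
}
```

With this guard, a freshly created remote port binding (is_new set,
type_updated set) no longer requests a full recompute, while an existing
port binding whose type really changes still does.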

Regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
