Hi Ales,

+Lorenzo & Numan in CC

On 5/19/26 11:21 PM, Dumitru Ceara wrote:
> On 5/14/26 10:13 AM, Ales Musil wrote:
>> On Wed, May 13, 2026 at 11:15 AM Dumitru Ceara via dev <
>> [email protected]> wrote:
>>
>>> The pflow_output SB_port_binding handler triggers a full
>>> recompute when the type column is updated on a port binding.
>>> However, for newly created port bindings, the OVSDB IDL
>>> marks all non-default columns as "updated", even though no
>>> actual update occurred. This caused every new port binding
>>> with a non-default type (e.g., remote, patch, localnet,
>>> router) to unnecessarily trigger a full pflow_output
>>> recompute, severely impacting ovn-controller performance
>>> at scale.
>>>
>>> This is particularly problematic in deployments that use
>>> remote LSPs, such as ovn-kubernetes with L2 UDNs, where
>>> frequent creation of remote port bindings leads to
>>> continuous full recomputes and high CPU usage.
>>>
>>> Guard the type-update check with sbrec_port_binding_is_new()
>>> and sbrec_port_binding_is_deleted() so that only genuine
>>> type changes on existing port bindings trigger a recompute.
>>> This matches the pattern already used in binding.c for the
>>> tunnel_key column.
>>>
>>> Also fix a typo in the test name ("path" -> "patch").
>>>
>>> Fixes: 73a10345a29c ("controller: Update physical flows for peer port when
>>> the patch port is removed.")
>>> Reported-at: https://redhat.atlassian.net/browse/FDP-3819
>>> Reported-by: Patryk Diak <[email protected]>
>>> Assisted-by: Claude Opus 4.6, Claude Code
>>> Signed-off-by: Dumitru Ceara <[email protected]>
>>> ---
[...]
>>>
>> Thank you Dumitru,
>>
>> applied to main and backported down to 25.09.
>>
>
> Hi Ales,
>
> Thanks for that!  I actually tried to cherry pick this on 25.03 too
> today because Red Hat's ovn-kubernetes needs it there too but I'm
> hitting a test failure:
>
> rcv_n=0 exp_n=1
> ovn-macros.at:12: wait failed after 30 seconds
> Expected:
> f0f00000001100000101020708004500001c000000003e111726ac1f0064ac1f000a0035111100080000
> Received:
> Diff:
> --- vif-north.expected.sorted 2026-05-19 09:38:11.881677567 +0000
> +++ hv4/vif-north-tx.packets.sorted 2026-05-19 09:38:11.883375243 +0000
> @@ -1 +0,0 @@
> -f0f00000001100000101020708004500001c000000003e111726ac1f0064ac1f000a0035111100080000
> ../../../tests/ovs-macros.at:256: hard failure
> 327. ovn.at:26002: 327. Stateless Floating IP -- parallelization=yes --
> ovn_monitor_all=no (ovn.at:26002): FAILED (ovs-macros.at:256)
>
> After quite some debugging, I think this might be an actual incremental
> processing bug that we now hit more often.  There's a dependency between
> localnet ports and LP_CHASSISREDIRECT ports when redirect-type=bridged.
>
> If the localnet port binding creation is received in a later iteration
> of ovn-controller, after the chassis-redirect port was created, we now
> fail to create the redirect flows for the CR port (i.e., we don't call
> put_remote_port_redirect_bridged() anymore).
>
> I'll try to come up with a simpler test that proves the bug (I think it
> was present before 73a10345a29c ("controller: Update physical flows for
> peer port when the patch port is removed.")) and a fix for it.
>

I went ahead and posted a fix for this:

https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/

Regards,
Dumitru
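
For reference, the guard described in the quoted commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual ovn-controller code: `struct mock_port_binding` and `type_change_needs_recompute()` are hypothetical names, and the boolean fields stand in for what the real handler would learn from sbrec_port_binding_is_new(), sbrec_port_binding_is_deleted() and the IDL's report that the type column was updated. The sketch also assumes that newly created and deleted rows are handled elsewhere in the handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an SB Port_Binding row as tracked by the IDL.
 * In the real handler these facts would come from the IDL, not from
 * plain struct fields. */
struct mock_port_binding {
    const char *type;   /* e.g. "", "remote", "patch", "localnet". */
    bool is_new;        /* Row was created in this IDL run. */
    bool is_deleted;    /* Row was deleted in this IDL run. */
    bool type_updated;  /* IDL reports the type column as "updated". */
};

/* Returns true only for a genuine type change on an existing row.
 *
 * A newly created row with a non-default type has its type column
 * reported as "updated" although nothing changed, so new (and deleted)
 * rows are excluded before the column check is consulted. */
static bool
type_change_needs_recompute(const struct mock_port_binding *pb)
{
    assert(pb);

    if (pb->is_new || pb->is_deleted) {
        return false;
    }
    return pb->type_updated;
}
```

With this shape, a freshly created "remote" port binding (is_new and type_updated both set) does not request a recompute, while an existing row whose type really changed still does.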
