Hi Tiago,

Thank you for your quick reply. We have actually just tested this in one of our clusters and it seems that the load issue is gone. Are you aware of any possible drawbacks of setting this?
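For reference, what we ran in the test cluster is essentially the command from your mail, plus a read-back of other_config to confirm it was applied (the logical switch ID below is just a placeholder for our provider network's switch):

  ovn-nbctl --no-leader-only set logical_switch <provider-ls-id> other_config:broadcast-arps-to-all-routers=false
  ovn-nbctl --no-leader-only get logical_switch <provider-ls-id> other_config

After that, the ovs-vswitchd CPU usage on that cluster went back to normal.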
Thank you again,
Cristi

On Tue, Dec 9, 2025 at 1:07 PM Tiago Pires <[email protected]> wrote:
> Hi Cristian,
>
> On Tue, Dec 9, 2025 at 6:42 AM Cristian Contescu via discuss
> <[email protected]> wrote:
> >
> > Hello everyone
> >
> > We wanted to check with the community about a strange issue which we saw happening in the following scenario.
> >
> > In order to scale out one of our environments we decided to increase the IPv4 provider network from a /22 to a /20 on Openstack. After doing so, we noticed that OVS started using 100% of CPU (also visible in the logs):
> >
> > 2025-12-04T14:18:07Z|26750|poll_loop|INFO|wakeup due to [POLLIN] on fd 425 (/var/run/openvswitch/br-int.mgmt<->) at ../lib/stream-fd.c:157 (101% CPU usage)
> >
> > When the CPU spiked to 100% (the ovs-vswitchd main process, while the handler and revalidator threads are not as heavily used) we started having packet loss regardless of traffic type (IPv4 / IPv6).
> >
> > After that we reverted the change and saw that the same issue happens when we gradually increase the number of virtual routers on another environment (with less traffic) with another /20 provider network.
> >
> > Do you know of any recent fixes related to the following, or has anyone experienced a similar issue and can point us to some options to evaluate?
> >
> > Our setup:
> >
> > - dual-stack external network (VLAN type) with a /20 (increased from a /22 before) IPv4 subnet and a /64 IPv6 subnet
> > - virtual routers are connected to the external network
> > - dual-stack tenant networks are possible
> > - for IPv4 we use distributed floating IPs and SNAT(+DNAT)
> > - for IPv6, tenant networks are public and advertised via the ovn-bgp-agent to the physical routers, with the next hop being on the external network
> > - our Openstack setup is based on openstack-helm deployed on physical nodes
> >
> > So as of now our current findings are:
> > - Correlation between the number of virtual routers and the CPU usage increase
> > - Potential correlation with the provider network being a /20 instead of a /22 (increase in broadcast domain / traffic)
>
> Did you set broadcast-arps-to-all-routers=false in the provider
> network's logical switch?
> Ex: ovn-nbctl --no-leader-only set logical_switch <ID-Logical-Switch> other_config:broadcast-arps-to-all-routers=false
>
> Checking your scenario, I think it will probably decrease this load
> spike in ovs-vswitchd.
>
> > - Potential IPv6 RA / multicast flood when the issue happens, based on investigating tcpdumps
> >
> > What you did that made the problem appear:
> > In order to replicate this issue we increased the provider network from /22 to /20 in Openstack and on the physical routers connecting the external network.
> >
> > Another way to replicate the issue is to just increase the number of virtual routers on an existing /20 provider network in a different Openstack environment.
> >
> > What you expected to happen:
> > - OVS has the same load as before and doesn't reach 100% CPU usage, thus no packet loss
> > - OVS is able to sustain the /20 provider network from Openstack
> >
> > What actually happened:
> > - As soon as the OVS main process reached 100% CPU usage (as seen in the log lines), we started detecting packet loss
> >
> > - Other errors detected are "|WARN|over 4096 resubmit actions on bridge":
> >
> > neutron openvswitch-zg86k openvswitch-vswitchd 2025-12-08T14:33:28Z|00016|ofproto_dpif_xlate(handler35)|WARN|over 4096 resubmit actions on bridge br-int while processing icmp6,in_port=1,dl_vlan=101,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=00:00:5e:00:02:65,dl_dst=33:33:00:00:00:01,ipv6_src=fe80::200:5eff:fe00:265,ipv6_dst=ff02::1,ipv6_label=0x00000,nw_tos=224,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=134,icmp_code=0
> >
> > Versions of various components:
> >
> > ovs-vswitchd --version
> > ovs-vswitchd (Open vSwitch) 3.3.4
> >
> > ovn-controller --version
> > ovn-controller 24.03.6
> > Open vSwitch Library 3.3.4
> > OpenFlow versions 0x6:0x6
> > SB DB Schema 20.33.0
> >
> > No local patches
> >
> > Kernel:
> > # cat /proc/version
> > Linux version 6.8.0-52-generic (buildd@lcy02-amd64-046) (x86_64-linux-gnu-gcc-13 (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, GNU ld (GNU Binutils for Ubuntu) 2.42) #53-Ubuntu SMP PREEMPT_DYNAMIC Sat Jan 11 00:06:25 UTC 2025
> >
> > # ovs-dpctl show
> > system@ovs-system:
> >   lookups: hit:25791010986 missed:954723789 lost:3319115
> >   flows: 1185
> >   masks: hit:96181163550 total:35 hit/pkt:3.60
> >   cache: hit:19162151183 hit-rate:71.65%
> >   caches:
> >     masks-cache: size:256
> >   port 0: ovs-system (internal)
> >   port 1: br-ex (internal)
> >   port 2: bond0
> >   port 3: br-int (internal)
> >   port 4: genev_sys_6081 (geneve: packet_type=ptap)
> >   port 5: tap196a9595-b2
> >   port 6: tap72f307a7-37
> >   ..
> >   ..
> >
> > The only workaround seems to be decreasing the number of virtual routers, which is not sustainable in an actively used environment.
> >
> > We checked the flows and it seems they grow from ~70k up to ~120-130k when the issue happens:
> > # ovs-ofctl dump-flows br-int | wc -l
> > 78413
> >
> > Thank you for your help,
> >
> > Cristi
>
> Regards,
>
> Tiago Pires
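P.S. One more question, in case it helps narrow this down: would it make sense to feed the flow from the "over 4096 resubmit actions" warning into ofproto/trace on br-int to see where the resubmit chain blows up? Something along these lines (a trimmed-down version of the flow string from our warning above, so some fields may need adjusting):

  ovs-appctl ofproto/trace br-int 'icmp6,in_port=1,dl_vlan=101,dl_src=00:00:5e:00:02:65,dl_dst=33:33:00:00:00:01,ipv6_src=fe80::200:5eff:fe00:265,ipv6_dst=ff02::1,icmp_type=134,icmp_code=0'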
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
