On Mon, Sep 15, 2025 at 05:16:03PM +0700, Shawn Ming via discuss wrote:
> Hello all,
>
> I am running OpenStack (deployed via Kolla-Ansible) with Neutron using
> OVN as the networking backend. The `distributed_floating_ip` option is
> not enabled.
> I have encountered an issue related to large provider networks (CIDR
> /21) and would like to seek advice from the community.
Hi Shawn, I'll note below what I saw; maybe it is helpful to you.

>
> I./ Environment / Steps to reproduce:
> - OpenStack Caracal (2024.1) deployed with Kolla-Ansible.
> - Neutron backend: OVN version 24.03.2 (not setting distributed_floating_ip).
> - Create a provider network with CIDR /21.
> - Deploy some VMs directly attached to this network.
> - Observe traffic and system behavior.
>
> II./ Observed behavior:
> Note: The actual gateway IP address in the logs has been replaced for
> some reasons
> 1. VMs attached to the /21 network frequently have latency spikes and packet
> loss
> root@vm4:~# ping 192.168.1.254
> PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data.
> 64 bytes from 192.168.1.254: icmp_seq=4 ttl=64 time=6.86 ms
> 64 bytes from 192.168.1.254: icmp_seq=5 ttl=64 time=49.1 ms
> 64 bytes from 192.168.1.254: icmp_seq=6 ttl=64 time=7.74 ms
> 64 bytes from 192.168.1.254: icmp_seq=7 ttl=64 time=7.68 ms
> 64 bytes from 192.168.1.254: icmp_seq=9 ttl=64 time=0.850 ms
> 64 bytes from 192.168.1.254: icmp_seq=10 ttl=64 time=1.40 ms
> 64 bytes from 192.168.1.254: icmp_seq=8 ttl=64 time=2317 ms
> 64 bytes from 192.168.1.254: icmp_seq=11 ttl=64 time=5.31 ms
> 64 bytes from 192.168.1.254: icmp_seq=13 ttl=64 time=0.749 ms
> 64 bytes from 192.168.1.254: icmp_seq=14 ttl=64 time=4.06 ms
> 64 bytes from 192.168.1.254: icmp_seq=15 ttl=64 time=1.67 ms
> 64 bytes from 192.168.1.254: icmp_seq=16 ttl=64 time=8.24 ms
> 64 bytes from 192.168.1.254: icmp_seq=17 ttl=64 time=9.61 ms
> 64 bytes from 192.168.1.254: icmp_seq=18 ttl=64 time=5.71 ms
> ^C
> --- 192.168.1.254 ping statistics ---
> 18 packets transmitted, 14 received, 22.2222% packet loss, time 17148ms
> rtt min/avg/max/mdev = 0.749/173.252/2316.610/594.574 ms, pipe 3

Not only is the latency spike strange, but so is the packet reordering.

>
> Meanwhile, VMs attached to /23 network do not
> root@vm5:~# ping 192.168.2.254
> PING 192.168.2.254 (192.168.2.254) 56(84) bytes of data.
> 64 bytes from 192.168.2.254: icmp_seq=1 ttl=64 time=1.04 ms
> 64 bytes from 192.168.2.254: icmp_seq=2 ttl=64 time=25.9 ms
> 64 bytes from 192.168.2.254: icmp_seq=3 ttl=64 time=5.05 ms
> 64 bytes from 192.168.2.254: icmp_seq=4 ttl=64 time=2.05 ms
> 64 bytes from 192.168.2.254: icmp_seq=5 ttl=64 time=0.523 ms
> 64 bytes from 192.168.2.254: icmp_seq=6 ttl=64 time=4.16 ms
> 64 bytes from 192.168.2.254: icmp_seq=7 ttl=64 time=0.798 ms
> 64 bytes from 192.168.2.254: icmp_seq=8 ttl=64 time=70.9 ms
> 64 bytes from 192.168.2.254: icmp_seq=9 ttl=64 time=1.54 ms
> 64 bytes from 192.168.2.254: icmp_seq=10 ttl=64 time=4.14 ms
> 64 bytes from 192.168.2.254: icmp_seq=11 ttl=64 time=6.88 ms
> 64 bytes from 192.168.2.254: icmp_seq=12 ttl=64 time=0.733 ms
> 64 bytes from 192.168.2.254: icmp_seq=13 ttl=64 time=1.01 ms
> 64 bytes from 192.168.2.254: icmp_seq=14 ttl=64 time=2.70 ms
> 64 bytes from 192.168.2.254: icmp_seq=15 ttl=64 time=26.5 ms
> ^C
> --- 192.168.2.254 ping statistics ---
> 15 packets transmitted, 15 received, 0% packet loss, time 14056ms
> rtt min/avg/max/mdev = 0.523/10.263/70.898/18.157 ms

While the latency spikes are better, this is still a high amount of
variation. That still does not feel healthy.

>
> 2. We dedicated compute nodes hosting only one VM each for comparison
> (hosting VM in /21 network and in /23 network):
> 2.1. Compute node has VM in /21 network
> - OVS shows high CPU usage
> CONTAINER ID   NAME                   CPU %     MEM %   NET I/O   BLOCK I/O    PIDS
> d28dc099cc43   openvswitch_vswitchd   210.81%   0.12%   0B / 0B   0B / 401kB   126
>
> - OVS shows ARP flow records changing quickly and frequently
> (in second n)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> 1184
> (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> 628
> (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> 1256
> (in second n+3)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> 962

I just compared this to what we see on our nodes. There we mostly have one
flow for newly sent ARP requests. Since the peer sending the ARP request
should cache the response, there should be no reason to send them
regularly.

I would propose you look deeper into these different flows for a single IP;
it would probably be interesting to see what the differences between them
are. You might also see something interesting if you do a tcpdump that
filters on ARP requests and that IP address. If you see a lot of these
requests you can maybe find their cause. I put a small example of what I
mean below.
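Something like this might be a starting point (a sketch only: the interface
name bond1, the temporary file names and the awk field position are
assumptions on my side and may need adjusting for your node and tcpdump
version):

  # Dump the full ARP datapath flows for the gateway IP twice and diff the
  # snapshots, to see which match fields actually differ between the entries.
  ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | sort > /tmp/arp-flows-1
  sleep 1
  ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | sort > /tmp/arp-flows-2
  diff /tmp/arp-flows-1 /tmp/arp-flows-2 | less

  # Capture a couple of thousand ARP packets for that IP on whatever uplink
  # or tap interface carries the provider network on this node, then count
  # which senders ask for the gateway most often.
  tcpdump -lni bond1 -c 2000 'arp host 192.168.1.254' > /tmp/arp-requests.txt
  grep 'Request who-has 192.168.1.254' /tmp/arp-requests.txt \
      | awk '{print $(NF-2)}' | sort | uniq -c | sort -rn | head

If a handful of sources dominate that list, that is probably where I would
start looking.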
>
> - The number of OVS flows fluctuates, and packet drops occur even when
> no traffic is generated by the VM.
> (in second n)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> system@ovs-system:
>   lookups: hit:54725926968 missed:525971260 lost:69166
>   flows: 1962
>   masks: hit:57009090988 total:19 hit/pkt:1.03
>   cache: hit:54183648664 hit-rate:98.07%
>   caches:
>     masks-cache: size:256
> (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> system@ovs-system:
>   lookups: hit:54725931065 missed:525972068 lost:69509
>   flows: 2474
>   masks: hit:57009110492 total:19 hit/pkt:1.03
>   cache: hit:54183652139 hit-rate:98.07%
>   caches:
>     masks-cache: size:256
> (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> system@ovs-system:
>   lookups: hit:54725936481 missed:525972862 lost:69509
>   flows: 225
>   masks: hit:57009126369 total:12 hit/pkt:1.03
>   cache: hit:54183657403 hit-rate:98.07%
>   caches:
>     masks-cache: size:256

What might be interesting here is "ovs-appctl upcall/show". It shows how
many flows are installed over time and what the current flow limit and dump
duration are.
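For example (nothing node-specific assumed here; watch is just a
convenience and may not be installed in the container, in which case
re-running the plain command works too):

  # Watch the revalidator statistics for a few seconds while the problem is
  # happening; compare the current flow count against the flow limit and keep
  # an eye on the dump duration, ideally side by side with the /23 node.
  watch -n 1 'ovs-appctl upcall/show'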
>
> 2.2. Compute node has VM in /23 network shows better results:
> - ARP flow count is stable.
> (in second n)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.2.254" | wc -l
> 403
> (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.2.254" | wc -l
> 403
> (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.2.254" | wc -l
> 402
> (in second n+3)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.2.254" | wc -l
> 397
>
> - Flow entries are stable.
> (in second n)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> system@ovs-system:
>   lookups: hit:54763442917 missed:539025268 lost:4603675
>   flows: 2666
>   masks: hit:60577538636 total:30 hit/pkt:1.10
>   cache: hit:54123911742 hit-rate:97.87%
>   caches:
>     masks-cache: size:256
> (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> system@ovs-system:
>   lookups: hit:54763450196 missed:539025306 lost:4603675
>   flows: 2670
>   masks: hit:60577547869 total:31 hit/pkt:1.10
>   cache: hit:54123918904 hit-rate:97.87%
>   caches:
>     masks-cache: size:256
> (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> system@ovs-system:
>   lookups: hit:54763458923 missed:539025355 lost:4603675
>   flows: 2669
>   masks: hit:60577558873 total:31 hit/pkt:1.10
>   cache: hit:54123927487 hit-rate:97.87%
>   caches:
>     masks-cache: size:256
>   port 0: ovs-system (internal)
>   port 1: br-ex (internal)
>   port 2: bond1
>   port 3: br-int (internal)
>   port 4: genev_sys_6081 (geneve: packet_type=ptap)
>   port 5: tap19510eb6-89
>   port 6: tap6ff45ca7-64
>   port 7: tap6b851650-70
>   port 8: tap0d5f11f9-80
>
> - OVS CPU usage remains normal (almost no spike).
> CONTAINER ID   NAME                   CPU %    MEM %   NET I/O   BLOCK I/O     PIDS
> 6262f4bc6ab1   openvswitch_vswitchd   11.16%   0.16%   0B / 0B   0B / 45.7MB   127
>
> 3. Workaround
> - As a temporary workaround, increasing `max-idle` to 1000000 and
> `max-revalidator` to 10000 appears to reduce the problem for
> the VM in CIDR /21 (default values: `max-idle=10000ms~10s`,
> `max-revalidator=500ms` per documentation).
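In case someone else wants to reproduce that tuning: as far as I know these
two settings are other_config keys on the Open_vSwitch table (both in
milliseconds), so roughly the following should correspond to what you
describe (take the exact values from your own change, of course):

  # Values in milliseconds, as mentioned above.
  ovs-vsctl set Open_vSwitch . other_config:max-idle=1000000
  ovs-vsctl set Open_vSwitch . other_config:max-revalidator=10000

  # Check what is currently configured.
  ovs-vsctl get Open_vSwitch . other_config

That said, to me this looks like it mostly treats the symptom; more on that
further down.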
> 3.1. OVS ARP flow count remains stable (no fluctuation).
> (in second n)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> 1902
> (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> 1902
> (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> 1903
>
> 3.2. Flow entries fluctuate less.
> (in second n)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> system@ovs-system:
>   lookups: hit:53647897445 missed:569251988 lost:8108302
>   flows: 2697
>   masks: hit:56660348845 total:31 hit/pkt:1.05
>   cache: hit:52974334763 hit-rate:97.71%
>   caches:
>     masks-cache: size:256
> (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> system@ovs-system:
>   lookups: hit:53647898935 missed:569251992 lost:8108302
>   flows: 2701
>   masks: hit:56660350555 total:31 hit/pkt:1.05
>   cache: hit:52974336237 hit-rate:97.71%
>   caches:
>     masks-cache: size:256
> (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> system@ovs-system:
>   lookups: hit:53647900325 missed:569251995 lost:8108302
>   flows: 2704
>   masks: hit:56660352110 total:31 hit/pkt:1.05
>   cache: hit:52974337617 hit-rate:97.71%
>   caches:
>     masks-cache: size:256
>
> 3.3. But OVS CPU usage remains high.
> CONTAINER ID   NAME                   CPU %     MEM %   NET I/O   BLOCK I/O    PIDS
> 67fea86bbb86   openvswitch_vswitchd   487.15%   0.20%   0B / 0B   0B / 333MB   137
>
> 3.4. Ping to the gateway (from inside the VM) shows improved results.
> root@vm4:~# ping 192.168.1.254
> PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data.
> 64 bytes from 192.168.1.254: icmp_seq=1 ttl=64 time=7.33 ms
> 64 bytes from 192.168.1.254: icmp_seq=2 ttl=64 time=1.78 ms
> 64 bytes from 192.168.1.254: icmp_seq=3 ttl=64 time=0.624 ms
> 64 bytes from 192.168.1.254: icmp_seq=4 ttl=64 time=24.2 ms
> 64 bytes from 192.168.1.254: icmp_seq=5 ttl=64 time=1.81 ms
> 64 bytes from 192.168.1.254: icmp_seq=6 ttl=64 time=4.98 ms
> 64 bytes from 192.168.1.254: icmp_seq=7 ttl=64 time=5.89 ms
> 64 bytes from 192.168.1.254: icmp_seq=8 ttl=64 time=6.57 ms
> 64 bytes from 192.168.1.254: icmp_seq=9 ttl=64 time=5.26 ms
> 64 bytes from 192.168.1.254: icmp_seq=10 ttl=64 time=1.07 ms
> 64 bytes from 192.168.1.254: icmp_seq=11 ttl=64 time=3.34 ms
> 64 bytes from 192.168.1.254: icmp_seq=12 ttl=64 time=2.53 ms
> 64 bytes from 192.168.1.254: icmp_seq=13 ttl=64 time=14.8 ms
> 64 bytes from 192.168.1.254: icmp_seq=14 ttl=64 time=29.4 ms
> 64 bytes from 192.168.1.254: icmp_seq=15 ttl=64 time=2.11 ms
> ^C
> --- 192.168.1.254 ping statistics ---
> 15 packets transmitted, 15 received, 0% packet loss, time 14039ms
> rtt min/avg/max/mdev = 0.624/7.442/29.369/8.363 ms
>
> Has anyone encountered a similar problem before? If so, could you
> share what the root cause was in your case, and what you found to be
> the most effective solution?

The workaround above seems to work mostly by just keeping flows installed
in the datapath for longer. If it works, that means there seem to be a lot
of different clients (probably around those 1902) that regularly send ARP
requests, but not often enough for the flows to stay in the datapath. So I
would propose investigating where these ARP requests come from.

Hope it helps in some way.

Felix

> Thanks in advance!

_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
