On Mon, Sep 22, 2025 at 09:18:28AM +0700, Shawn Ming wrote:
> Hi Felix, all,
Hi Shawn,

>
> Thanks Felix for your detailed notes and suggestions. I did some
> additional investigation as you recommended, and here’s what I found:
>
> 1. Using tcpdump, I observed that roughly ~2,000 ARP requests are
> being sent from the router to all physical nodes - including those
> without any VMs in the /21 network.
> Most ARP requests are for IPs that appear to be invalid (either
> unallocated or currently down), while valid IPs do not generate nearly
> as many requests.

I would guess that this /21 network is a publicly routable one that is
internet accessible? If so, this is in my experience quite normal.
Random scanners on the internet will scan your /21 range and send
requests there. Upstream routers do not seem to cache ARP misses and
will therefore send an ARP request for each packet they receive for an
unused IP.

>
> 2. Interestingly, only the compute nodes that host VMs attached to the
> /21 CIDR show latency spikes and high CPU usage.
> Other nodes receive the same ARP flood but don’t seem to be affected
> in the same way.

Could you try setting other_config:broadcast-arps-to-all-routers=false
on the Logical_Switch that represents this /21 network (if the
implications are acceptable)?

If a Logical_Switch gets an ARP request for an IP it does not know, it
will flood it to all attached Logical_Switch_Ports and potentially to
routers. If you have a lot of such requests, that can be quite
inefficient. We built the setting above for exactly that purpose.
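For example, something along these lines should do it (a sketch only;
the switch name is a placeholder for the Logical_Switch that Neutron
created for your /21 provider network):

  # run against the OVN northbound database
  ovn-nbctl set Logical_Switch <provider-21-switch> \
      other_config:broadcast-arps-to-all-routers=false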
Please note that if you set this, only ARP requests with a known
destination IP will be processed. If you rely on GARPs from your
upstream router for some kind of failover mechanism, that would no
longer work.

Thanks a lot,
Felix

>
> 3. From what I’ve gathered, this behavior might be linked somehow to
> how megaflow handling works in OVS/OVN. I’m still digging into the
> details.
>
> I’d appreciate any further insights from you and everyone in the community!
>
> Best regards,
> Shawn
>
> On Tue, Sep 16, 2025 at 9:19 PM Felix Huettner
> <[email protected]> wrote:
> >
> > On Mon, Sep 15, 2025 at 05:16:03PM +0700, Shawn Ming via discuss wrote:
> > > Hello all,
> > >
> > > I am running OpenStack (deployed via Kolla-Ansible) with Neutron using
> > > OVN as the networking backend. The `distributed_floating_ip` option is
> > > not enabled.
> > > I have encountered an issue related to large provider networks (CIDR
> > > /21) and would like to seek advice from the community.
> >
> > Hi Shawn,
> >
> > I'll note below what I saw; maybe it is helpful to you.
> >
> > >
> > > I./ Environment / Steps to reproduce:
> > > - OpenStack Caracal (2024.1) deployed with Kolla-Ansible.
> > > - Neutron backend: OVN version 24.03.2 (not setting
> > > distributed_floating_ip).
> > > - Create a provider network with CIDR /21.
> > > - Deploy some VMs directly attached to this network.
> > > - Observe traffic and system behavior.
> > >
> > > II./ Observed behavior:
> > > Note: The actual gateway IP address in the logs has been replaced for
> > > certain reasons.
> > > 1. VMs attached to the /21 network frequently have latency spikes and
> > > packet loss:
> > > root@vm4:~# ping 192.168.1.254
> > > PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data.
> > > 64 bytes from 192.168.1.254: icmp_seq=4 ttl=64 time=6.86 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=5 ttl=64 time=49.1 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=6 ttl=64 time=7.74 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=7 ttl=64 time=7.68 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=9 ttl=64 time=0.850 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=10 ttl=64 time=1.40 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=8 ttl=64 time=2317 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=11 ttl=64 time=5.31 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=13 ttl=64 time=0.749 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=14 ttl=64 time=4.06 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=15 ttl=64 time=1.67 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=16 ttl=64 time=8.24 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=17 ttl=64 time=9.61 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=18 ttl=64 time=5.71 ms
> > > ^C
> > > --- 192.168.1.254 ping statistics ---
> > > 18 packets transmitted, 14 received, 22.2222% packet loss, time 17148ms
> > > rtt min/avg/max/mdev = 0.749/173.252/2316.610/594.574 ms, pipe 3
> >
> > Not only is the latency spike strange, but so is the packet reordering.
> >
> > >
> > > Meanwhile, VMs attached to the /23 network do not:
> > > root@vm5:~# ping 192.168.2.254
> > > PING 192.168.2.254 (192.168.2.254) 56(84) bytes of data.
> > > 64 bytes from 192.168.2.254: icmp_seq=1 ttl=64 time=1.04 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=2 ttl=64 time=25.9 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=3 ttl=64 time=5.05 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=4 ttl=64 time=2.05 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=5 ttl=64 time=0.523 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=6 ttl=64 time=4.16 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=7 ttl=64 time=0.798 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=8 ttl=64 time=70.9 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=9 ttl=64 time=1.54 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=10 ttl=64 time=4.14 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=11 ttl=64 time=6.88 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=12 ttl=64 time=0.733 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=13 ttl=64 time=1.01 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=14 ttl=64 time=2.70 ms
> > > 64 bytes from 192.168.2.254: icmp_seq=15 ttl=64 time=26.5 ms
> > > ^C
> > > --- 192.168.2.254 ping statistics ---
> > > 15 packets transmitted, 15 received, 0% packet loss, time 14056ms
> > > rtt min/avg/max/mdev = 0.523/10.263/70.898/18.157 ms
> >
> > While the latency spikes are better, this is still a high amount of
> > variation. That still does not feel healthy.
> >
> > > 2. We dedicated compute nodes hosting only one VM each for comparison
> > > (one hosting a VM in the /21 network and one in the /23 network):
> > > 2.1. Compute node with a VM in the /21 network:
> > > - OVS shows high CPU usage
> > > CONTAINER ID   NAME                   CPU %     MEM %   NET I/O   BLOCK I/O    PIDS
> > > d28dc099cc43   openvswitch_vswitchd   210.81%   0.12%   0B / 0B   0B / 401kB   126
> > >
> > > - OVS shows ARP flow records changing quickly and frequently
> > > (in second n)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> > > 1184
> > > (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> > > 628
> > > (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> > > 1256
> > > (in second n+3)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> > > 962
> >
> > I just compared this to what we see on our nodes.
> > There we mostly have 1 flow for newly sent ARP requests. Since the peer
> > sending the ARP request should cache the response, there should be no
> > reason to send them regularly.
> >
> > I would propose you look deeper into these different flows for a single
> > IP. It would probably be interesting to see what the differences
> > between them are.
> > Maybe you also see something interesting if you do a tcpdump that
> > filters on ARP requests and that IP address. If you see a lot of these
> > requests, you can maybe find the cause of them.
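As a rough sketch of such a capture (bond1 is only an example here; use
whatever uplink interface carries the provider network on the affected
node):

  # ARP packets whose sender or target address is the gateway IP,
  # with link-level headers so you can see who is sending them
  tcpdump -nei bond1 'arp host 192.168.1.254'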
> >
> > >
> > > - The number of OVS flows fluctuates, and packet drops occur even when
> > > no traffic is generated by the VM.
> > > (in second n)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> > > system@ovs-system:
> > > lookups: hit:54725926968 missed:525971260 lost:69166
> > > flows: 1962
> > > masks: hit:57009090988 total:19 hit/pkt:1.03
> > > cache: hit:54183648664 hit-rate:98.07%
> > > caches:
> > > masks-cache: size:256
> > > (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> > > system@ovs-system:
> > > lookups: hit:54725931065 missed:525972068 lost:69509
> > > flows: 2474
> > > masks: hit:57009110492 total:19 hit/pkt:1.03
> > > cache: hit:54183652139 hit-rate:98.07%
> > > caches:
> > > masks-cache: size:256
> > > (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> > > system@ovs-system:
> > > lookups: hit:54725936481 missed:525972862 lost:69509
> > > flows: 225
> > > masks: hit:57009126369 total:12 hit/pkt:1.03
> > > cache: hit:54183657403 hit-rate:98.07%
> > > caches:
> > > masks-cache: size:256
> >
> > What might be interesting here would be "ovs-appctl upcall/show".
> > It shows how many flows are installed over time and what the current
> > flow limit and dump duration is.
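E.g. leaving something like the following running on the affected node
for a while (just a sketch) should show whether the flow count keeps
bouncing against the flow limit:

  watch -n1 'ovs-appctl upcall/show'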
> >
> > >
> > > 2.2. Compute node with a VM in the /23 network shows better results:
> > > - ARP flow count is stable.
> > > (in second n)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.2.254" | wc -l
> > > 403
> > > (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.2.254" | wc -l
> > > 403
> > > (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.2.254" | wc -l
> > > 402
> > > (in second n+3)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.2.254" | wc -l
> > > 397
> > >
> > > - Flow entries are stable.
> > > (in second n)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> > > system@ovs-system:
> > > lookups: hit:54763442917 missed:539025268 lost:4603675
> > > flows: 2666
> > > masks: hit:60577538636 total:30 hit/pkt:1.10
> > > cache: hit:54123911742 hit-rate:97.87%
> > > caches:
> > > masks-cache: size:256
> > > (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> > > system@ovs-system:
> > > lookups: hit:54763450196 missed:539025306 lost:4603675
> > > flows: 2670
> > > masks: hit:60577547869 total:31 hit/pkt:1.10
> > > cache: hit:54123918904 hit-rate:97.87%
> > > caches:
> > > masks-cache: size:256
> > > (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> > > system@ovs-system:
> > > lookups: hit:54763458923 missed:539025355 lost:4603675
> > > flows: 2669
> > > masks: hit:60577558873 total:31 hit/pkt:1.10
> > > cache: hit:54123927487 hit-rate:97.87%
> > > caches:
> > > masks-cache: size:256
> > > port 0: ovs-system (internal)
> > > port 1: br-ex (internal)
> > > port 2: bond1
> > > port 3: br-int (internal)
> > > port 4: genev_sys_6081 (geneve: packet_type=ptap)
> > > port 5: tap19510eb6-89
> > > port 6: tap6ff45ca7-64
> > > port 7: tap6b851650-70
> > > port 8: tap0d5f11f9-80
> > >
> > > - OVS CPU usage remains normal (almost no spike).
> > > CONTAINER ID   NAME                   CPU %    MEM %   NET I/O   BLOCK I/O     PIDS
> > > 6262f4bc6ab1   openvswitch_vswitchd   11.16%   0.16%   0B / 0B   0B / 45.7MB   127
> > >
> > > 3. Workaround
> > > - As a temporary workaround, increasing `max-idle` to 1000000 and
> > > `max-revalidator` to 10000 appears to reduce the problem for
> > > the VM in the /21 CIDR (default values: `max-idle=10000ms~10s`,
> > > `max-revalidator=500ms` per documentation).
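For reference, I assume these were set as the usual Open_vSwitch
other_config keys, roughly like this (both values are in milliseconds):

  ovs-vsctl set Open_vSwitch . other_config:max-idle=1000000
  ovs-vsctl set Open_vSwitch . other_config:max-revalidator=10000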
> > > 3.1. OVS ARP flow count remains stable (no fluctuation).
> > > (in second n)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> > > 1902
> > > (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> > > 1902
> > > (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-dpctl dump-flows | grep arp | grep "192.168.1.254" | wc -l
> > > 1903
> > >
> > > 3.2. Flow entries fluctuate less.
> > > (in second n)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> > > system@ovs-system:
> > > lookups: hit:53647897445 missed:569251988 lost:8108302
> > > flows: 2697
> > > masks: hit:56660348845 total:31 hit/pkt:1.05
> > > cache: hit:52974334763 hit-rate:97.71%
> > > caches:
> > > masks-cache: size:256
> > > (in second n+1)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> > > system@ovs-system:
> > > lookups: hit:53647898935 missed:569251992 lost:8108302
> > > flows: 2701
> > > masks: hit:56660350555 total:31 hit/pkt:1.05
> > > cache: hit:52974336237 hit-rate:97.71%
> > > caches:
> > > masks-cache: size:256
> > > (in second n+2)(openvswitch-vswitchd)[compute-node]# ovs-appctl dpctl/show
> > > system@ovs-system:
> > > lookups: hit:53647900325 missed:569251995 lost:8108302
> > > flows: 2704
> > > masks: hit:56660352110 total:31 hit/pkt:1.05
> > > cache: hit:52974337617 hit-rate:97.71%
> > > caches:
> > > masks-cache: size:256
> > >
> > > 3.3. But OVS CPU usage remains high.
> > > CONTAINER ID   NAME                   CPU %     MEM %   NET I/O   BLOCK I/O    PIDS
> > > 67fea86bbb86   openvswitch_vswitchd   487.15%   0.20%   0B / 0B   0B / 333MB   137
> > >
> > > 3.4. Ping to the gateway (from inside the VM) shows improved results.
> > > root@vm4:~# ping 192.168.1.254
> > > PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data.
> > > 64 bytes from 192.168.1.254: icmp_seq=1 ttl=64 time=7.33 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=2 ttl=64 time=1.78 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=3 ttl=64 time=0.624 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=4 ttl=64 time=24.2 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=5 ttl=64 time=1.81 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=6 ttl=64 time=4.98 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=7 ttl=64 time=5.89 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=8 ttl=64 time=6.57 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=9 ttl=64 time=5.26 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=10 ttl=64 time=1.07 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=11 ttl=64 time=3.34 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=12 ttl=64 time=2.53 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=13 ttl=64 time=14.8 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=14 ttl=64 time=29.4 ms
> > > 64 bytes from 192.168.1.254: icmp_seq=15 ttl=64 time=2.11 ms
> > > ^C
> > > --- 192.168.1.254 ping statistics ---
> > > 15 packets transmitted, 15 received, 0% packet loss, time 14039ms
> > > rtt min/avg/max/mdev = 0.624/7.442/29.369/8.363 ms
> > >
> > > Has anyone encountered a similar problem before? If so, could you
> > > share what the root cause was in your case, and what you found to be
> > > the most effective solution?
> >
> > The workaround above seems to work mostly by keeping flows installed in
> > the datapath for longer. If it works, that means there seem to be a lot
> > of different clients (probably around these 1902) that regularly send
> > ARP requests, but they do not send them often enough for the flows to
> > stay in the datapath.
> >
> > So I would propose to investigate where these ARP requests come from.
> >
> > Hope it helps in some way.
> > Felix
> >
> > > Thanks in advance!

_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
