On 11/1/22 10:50, Eli Britstein wrote:
>
>> -----Original Message-----
>> From: Ilya Maximets <[email protected]>
>> Sent: Monday, 31 October 2022 23:54
>> To: Donald Sharp <[email protected]>; [email protected];
>> [email protected]; Eli Britstein <[email protected]>
>> Cc: [email protected]
>> Subject: Re: [ovs-discuss] ovs-vswitchd running at 100% cpu
>>
>> On 10/31/22 17:25, Donald Sharp via discuss wrote:
>>> Hi!
>>>
>>> I work on the FRRouting project (https://frrouting.org) and have noticed
>>> that when I have a full BGP feed on a system that is also running
>>> ovs-vswitchd, ovs-vswitchd sits at 100% CPU:
>>>
>>> top - 09:43:12 up 4 days, 22:53,  3 users,  load average: 1.06, 1.08, 1.08
>>> Tasks: 188 total,   3 running, 185 sleeping,   0 stopped,   0 zombie
>>> %Cpu(s): 12.3 us, 14.7 sy,  0.0 ni, 72.8 id,  0.0 wa,  0.0 hi,  0.2 si,  0.0 st
>>> MiB Mem :   7859.3 total,   2756.5 free,   2467.2 used,   2635.6 buff/cache
>>> MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.   5101.9 avail Mem
>>>
>>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>>>     730 root      10 -10  146204 146048  11636 R  98.3   1.8   6998:13 ovs-vswitchd
>>>  169620 root      20   0       0      0      0 I   3.3   0.0   1:34.83 kworker/0:3-events
>>>      21 root      20   0       0      0      0 S   1.3   0.0  14:09.59 ksoftirqd/1
>>>  131734 frr       15  -5 2384292 609556   6612 S   1.0   7.6  21:57.51 zebra
>>>  131739 frr       15  -5 1301168   1.0g   7420 S   1.0  13.3  18:16.17 bgpd
>>>
>>> When I turn off FRR (or turn off the BGP feed), ovs-vswitchd stops
>>> running at 100%:
>>>
>>> top - 09:48:12 up 4 days, 22:58,  3 users,  load average: 0.08, 0.60, 0.89
>>> Tasks: 169 total,   1 running, 168 sleeping,   0 stopped,   0 zombie
>>> %Cpu(s):  0.2 us,  0.4 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
>>> MiB Mem :   7859.3 total,   4560.6 free,    663.1 used,   2635.6 buff/cache
>>> MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.   6906.1 avail Mem
>>>
>>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>>>  179064 sharpd    20   0   11852   3816   3172 R   1.0   0.0   0:00.09 top
>>>    1037 zerotie+  20   0  291852 113180   7408 S   0.7   1.4  19:09.17 zerotier-one
>>>    1043 Debian-+  20   0   34356  21988   7588 S   0.3   0.3  22:04.42 snmpd
>>>  178480 root      20   0       0      0      0 I   0.3   0.0   0:01.21 kworker/1:2-events
>>>  178622 sharpd    20   0   14020   6364   4872 S   0.3   0.1   0:00.10 sshd
>>>       1 root      20   0  169872  13140   8272 S   0.0   0.2   2:33.26 systemd
>>>       2 root      20   0       0      0      0 S   0.0   0.0   0:00.60 kthreadd
>>>
>>> I do not have any particular OVS configuration on this box:
>>>
>>> sharpd@janelle:~$ sudo ovs-vsctl show
>>> c72d327c-61eb-4877-b4e7-dcf7e07e24fc
>>>     ovs_version: "2.13.8"
>>>
>>> sharpd@janelle:~$ sudo ovs-vsctl list o .
>>> _uuid               : c72d327c-61eb-4877-b4e7-dcf7e07e24fc
>>> bridges             : []
>>> cur_cfg             : 0
>>> datapath_types      : [netdev, system]
>>> datapaths           : {}
>>> db_version          : "8.2.0"
>>> dpdk_initialized    : false
>>> dpdk_version        : none
>>> external_ids        : {hostname=janelle, rundir="/var/run/openvswitch", system-id="a1031fcf-8acc-40a9-9fd6-521716b0faaa"}
>>> iface_types         : [erspan, geneve, gre, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
>>> manager_options     : []
>>> next_cfg            : 0
>>> other_config        : {}
>>> ovs_version         : "2.13.8"
>>> ssl                 : []
>>> statistics          : {}
>>> system_type         : ubuntu
>>> system_version      : "20.04"
>>>
>>> sharpd@janelle:~$ sudo ovs-appctl dpctl/dump-flows -m
>>> ovs-vswitchd: no datapaths exist
>>> ovs-vswitchd: datapath not found (Invalid argument)
>>> ovs-appctl: ovs-vswitchd: server returned an error
>>>
>>> Eli Britstein suggested I update openvswitch to the latest version; I
>>> did, and saw the same behavior.  When I pulled up the running code in a
>>> debugger, I saw that ovs-vswitchd spends essentially 100% of its time in
>>> the loop below:
>>>
>>> (gdb) f 4
>>> #4  0x0000559498b4e476 in route_table_run () at lib/route-table.c:133
>>> 133             nln_run(nln);
>>> (gdb) l
>>> 128         OVS_EXCLUDED(route_table_mutex)
>>> 129     {
>>> 130         ovs_mutex_lock(&route_table_mutex);
>>> 131         if (nln) {
>>> 132             rtnetlink_run();
>>> 133             nln_run(nln);
>>> 134
>>> 135             if (!route_table_valid) {
>>> 136                 route_table_reset();
>>> 137             }
>>> (gdb) l
>>> 138         }
>>> 139         ovs_mutex_unlock(&route_table_mutex);
>>> 140     }
>>>
>>> I pulled up where route_table_valid is set:
>>>
>>> 298 static void
>>> 299 route_table_change(const struct route_table_msg *change OVS_UNUSED,
>>> 300                    void *aux OVS_UNUSED)
>>> 301 {
>>> 302     route_table_valid = false;
>>> 303 }
>>>
>>> If I am reading the code correctly, every RTM_NEWROUTE netlink message
>>> that ovs-vswitchd receives sets the route_table_valid global variable to
>>> false and causes route_table_reset() to be run.
>>> This makes sense in the context of what FRR is doing.  A full BGP feed
>>> *always* has churn.  So ovs-vswitchd receives an RTM_NEWROUTE message,
>>> parses it, decides in route_table_change() that the route table is no
>>> longer valid, and calls route_table_reset(), which re-dumps the entire
>>> routing table to ovs-vswitchd.  In this case there are ~115k IPv6 routes
>>> in the Linux FIB.
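>>>
>>> For scale, the cost of a single full dump is easy to measure outside of
>>> OVS, since a dump is just one RTM_GETROUTE request with NLM_F_DUMP.  A
>>> rough standalone dumper, error handling omitted; run it under time(1):
>>>
>>> #include <stdio.h>
>>> #include <string.h>
>>> #include <unistd.h>
>>> #include <sys/socket.h>
>>> #include <linux/netlink.h>
>>> #include <linux/rtnetlink.h>
>>>
>>> int main(void)
>>> {
>>>     int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
>>>     struct sockaddr_nl kernel = { .nl_family = AF_NETLINK };
>>>     struct {
>>>         struct nlmsghdr nlh;
>>>         struct rtgenmsg g;
>>>     } req;
>>>
>>>     memset(&req, 0, sizeof req);
>>>     req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof req.g);
>>>     req.nlh.nlmsg_type = RTM_GETROUTE;
>>>     req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
>>>     req.g.rtgen_family = AF_UNSPEC;    /* All address families. */
>>>     sendto(fd, &req, req.nlh.nlmsg_len, 0,
>>>            (struct sockaddr *) &kernel, sizeof kernel);
>>>
>>>     /* The kernel answers with a multipart stream of RTM_NEWROUTE
>>>      * messages, terminated by NLMSG_DONE. */
>>>     long routes = 0;
>>>     char buf[65536];
>>>     for (;;) {
>>>         ssize_t n = recv(fd, buf, sizeof buf, 0);
>>>         if (n <= 0) {
>>>             break;
>>>         }
>>>         for (struct nlmsghdr *h = (struct nlmsghdr *) buf;
>>>              NLMSG_OK(h, n); h = NLMSG_NEXT(h, n)) {
>>>             if (h->nlmsg_type == NLMSG_DONE) {
>>>                 printf("%ld routes\n", routes);
>>>                 close(fd);
>>>                 return 0;
>>>             }
>>>             if (h->nlmsg_type == RTM_NEWROUTE) {
>>>                 routes++;
>>>             }
>>>         }
>>>     }
>>>     close(fd);
>>>     return 1;
>>> }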
>>>
>>> I hesitate to make any changes here since I don't really understand
>>> what the end goal is.  ovs-vswitchd receives a route change from the
>>> kernel but in turn re-dumps the entire routing table.  What should the
>>> correct behavior be from ovs-vswitchd's perspective here?
>>
>> Hi, Donald.
>>
>> Your analysis is correct.  On each netlink notification about a route
>> change, OVS invalidates the cached routing table and re-dumps it in full
>> on the next access.
>>
>> Looking back into the commit history, OVS used to maintain the cache and
>> only added/removed what was in the netlink message incrementally.
>> But that changed in 2011 with the following commit:
>>
>> commit f0e167f0dbadbe2a8d684f63ad9faf68d8cb9884
>> Author: Ethan J. Jackson <[email protected]>
>> Date:   Thu Jan 13 16:29:31 2011 -0800
>>
>>     route-table: Handle route updates more robustly.
>>
>>     The kernel does not broadcast rtnetlink route messages in all cases
>>     one would expect.  This can cause stale entries to end up in the
>>     route table which may cause incorrect results for
>>     route_table_get_ifindex() queries.  This commit causes rtnetlink
>>     route messages to dump the entire route table on the next
>>     route_table_get_ifindex() query.
>>
>> And indeed, looking at the history of attempts by different projects to
>> use route notifications, they all face issues, and it seems that none of
>> them actually handles all the notifications fully correctly, simply
>> because these notifications are notoriously bad.  In certain cases it is
>> impossible to tell what exactly changed and how.  There can be duplicate
>> or missing notifications.  And the code of projects that try to maintain
>> a route cache in userspace is insanely complex and still doesn't handle
>> 100% of the cases.
>>
>> There were attempts to convince kernel developers to add unique
>> identifiers to routes, so userspace could tell them apart, but all of
>> them seem to have died, leaving the problem unresolved.
>>
>> These are some discussions/bugs that I found:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1337855
>> https://bugzilla.redhat.com/show_bug.cgi?id=1722728
>> https://github.com/thom311/libnl/issues/226
>> https://github.com/thom311/libnl/issues/224
>>
>> None of the bugs seem to be resolved.  Most are closed for non-technical
>> reasons.
>>
>> I suppose Ethan just decided not to deal with that horribly unreliable
>> kernel interface and to simply re-dump the route table on changes.
>>
>>
>> As for your actual problem here, I'm not sure we can fix it that easily.
>>
>> Is it necessary for OVS to know about these routes?
>> If not, it might be possible to isolate them in a separate network
>> namespace, so that OVS does not receive all the route updates.
>>
>> Do you know how long it takes to dump the route table once?
>> It may be worth limiting that process to one dump per second, or per few
>> seconds.  That should alleviate the load if the actual dump is
>> relatively fast.
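>>
>> A minimal sketch of that kind of rate limiting, assuming OVS's
>> time_msec() from lib/timeval.h and an arbitrary one-second interval
>> (untested, just to illustrate the idea):
>>
>> #define ROUTE_TABLE_DUMP_INTERVAL_MS 1000
>>
>> static long long int last_reset_ms;   /* Time of the last full re-dump. */
>>
>> void
>> route_table_run(void)
>>     OVS_EXCLUDED(route_table_mutex)
>> {
>>     ovs_mutex_lock(&route_table_mutex);
>>     if (nln) {
>>         rtnetlink_run();
>>         nln_run(nln);
>>
>>         /* Re-dump at most once per interval.  If we skip, the table
>>          * stays marked invalid, so a later call will retry. */
>>         if (!route_table_valid
>>             && time_msec() - last_reset_ms >= ROUTE_TABLE_DUMP_INTERVAL_MS) {
>>             route_table_reset();
>>             last_reset_ms = time_msec();
>>         }
>>     }
>>     ovs_mutex_unlock(&route_table_mutex);
>> }
>>
>> The tradeoff would be up to a second of stale cached routes after a
>> change, which may or may not be acceptable for tunnel endpoint routing.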
> In this setup OVS just runs without any use.  There is no datapath (no
> bridges/ports) configured, so it is useless to run this mechanism at all.
> We could tie this mechanism to having at least one datapath configured
> (or even only to having at least one tunnel configured); see the rough
> sketch below.
> What do you think?
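>
> The sketch -- route_table_ref()/route_table_unref() would be new hooks,
> called from datapath (or tunnel port) setup and teardown; nothing like
> this exists in the tree today, and a real version would need proper
> locking:
>
> static int route_table_users;
>
> void
> route_table_ref(void)
> {
>     route_table_users++;
> }
>
> void
> route_table_unref(void)
> {
>     ovs_assert(route_table_users > 0);
>     route_table_users--;
> }
>
> void
> route_table_run(void)
>     OVS_EXCLUDED(route_table_mutex)
> {
>     if (!route_table_users) {
>         return;   /* No datapath configured: ignore kernel route churn. */
>     }
>
>     ovs_mutex_lock(&route_table_mutex);
>     if (nln) {
>         rtnetlink_run();
>         nln_run(nln);
>         if (!route_table_valid) {
>             route_table_reset();
>         }
>     }
>     ovs_mutex_unlock(&route_table_mutex);
> }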
Hmm.  Why don't you just stop/disable the service then?

>>
>> Best regards, Ilya Maximets.
