On 3/20/24 19:47, Ilya Maximets wrote:
> Currently, ovs-vswitchd is subscribed to all the routing changes in the
> kernel.  On each change, it marks the internal routing table cache as
> invalid, then resets it and dumps all the routes from the kernel from
> scratch.  The reason for that is kernel routing updates not being
> reliable in a sense that it's hard to tell which route is getting
> removed or modified.  Userspace application has to track the order in
> which route entries are dumped from the kernel.  Updates can get lost
> or even duplicated and the kernel doesn't provide a good mechanism to
> distinguish one route from another.  To my knowledge, dumping all the
> routes from a kernel after each change is the only way to keep the
> cache consistent.  Some more info can be found in the following never
> addressed issues:
>   https://bugzilla.redhat.com/1337860
>   https://bugzilla.redhat.com/1337855
> 
> It seems to be believed that NetworkManager "mostly" does incremental
> updates right.  But it is still not completely correct, will re-dump
> the whole table in certain cases, and it takes a huge amount of very
> complicated code to do the accounting and route comparisons.
> 
> Going back to ovs-vswitchd, it currently dumps routes from all the
> routing tables.  If it will get conflicting routes from multiple
> tables, the cache will not be useful.  The routing cache in userspace
> is primarily used for checking the egress port for tunneled traffic
> and this way also detecting link state changes for a tunnel port.
> For userspace datapath it is used for actual routing of the packet
> after sending to a native tunnel.
> With kernel datapath we don't really have a mechanism to know which
> routing table will actually be used by the kernel after encapsulation,
> so our lookups on a cache may be incorrect because of this as well.
> 
> So, unless all the relevant routes are in the standard tables, the
> lookup in userspace route cache is unreliable.
> 
> Luckily, most setups are not using any complicated routing in
> non-standard tables that OVS has to be aware of.
> 
> It is possible, but unlikely, that standard routing tables are
> completely empty while some other custom table is not, and all the OVS
> tunnel traffic is directed to that table.  That would be the only
> scenario where dumping non-standard tables would make sense.  But it
> seems like this kind of setup will likely need a way to tell OVS from
> which table the routes should be taken, or we'll need to dump routing
> rules and keep a separate cache for each table, so we can first match
> on rules and then lookup correct routes in a specific table.  I'm not
> sure if trying to implement all that is justified.
> 
> For now, stop considering routes from non-standard tables to avoid
> mixing different tables together and also wasting CPU resources.
> 
> This fixes a high CPU usage in ovs-vswitchd in case a BGP daemon is
> running on a same host and in a same network namespace with OVS using
> its own custom routing table.
> 
> Unfortunately, there seems to be no way to tell the kernel to send
> updates only for particular tables.  So, we'll still receive and parse
> all of them.  But they will not result in a full cache invalidation in
> most cases.
> 
> Linux kernel v4.20 introduced filtering support for RTM_GETROUTE dumps.
> So, we can make use of it and dump only standard tables when we get a
> relevant route update.  NETLINK_GET_STRICT_CHK has to be enabled on
> the socket for filtering to work.  There is no reason to not enable it
> by default, if supported.  It is not used outside of NETLINK_ROUTE.
> 
> Fixes: f0e167f0dbad ("route-table: Handle route updates more robustly.")
> Fixes: ea83a2fcd0d3 ("lib: Show tunnel egress interface in ovsdb")
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/185
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-October/052091.html
> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
> ---
> 
> Version 2:
>   * Changed log level for the NETLINK_GET_STRICT_CHK failure to
>     DBG if not supported and WARN if there is an actual failure.
>   * While at it added rate limiting.  It is not necessary, but
>     doesn't hurt.
> 
> Note: GitHub actions will likely fail due to ongoing major outage
>       of ppa.launchpad.net from where it installs 32bit toolchains.
>       Will need to request a re-check once the incident is over.

Should be resolved now.

Recheck-request: github-robot
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to