On Tue, Dec 01, 2020 at 08:48:56PM -0500, Ihar Hrachyshka wrote: > Is there a good reason why netdev-dummy/ip4addr doesn't set local > routing entries for the new IP? Or is it just an oversight in > userspace routing implementation?
Oh my. If I ever knew this, I don't now. I thought it was Pravin who originally wrote this code, although the history suggests that Ethan was also involved. Pravin, Ethan, do you have any thoughts here? Lots of context below. > Hi Ben, > > I was trying to make local_ip work for my needs and hit some > unexpected connectivity loss between tunnel endpoints when source IPs > enforced in port config. Perhaps there's something I am missing about > using local_ip with ovs userspace routing. Perhaps you'll be able to > validate my thinking below. > > So, for OVN purposes, I would like to have two tunnel ports that > differ in local_ip only (same remote_ip). In OVN test suite, the > following configuration is set up: > > (br-int<->br-phys) - <n1> - (br-phys<->br-int) > > Each (...) block is a separate (emulated) host with its own OVS/OVN > stack, interconnected through a n1 global bridge that emulates fabric > (everything runs on the same physical host using three sets of OVS > software). Each br-int has a geneve port that has both remote_ip and > local_ip set (point to each other). When br-phys bridges are created, > the following commands are executed to set up userspace IP/routing > stack inside OVS: > > ovs-appctl netdev-dummy/ip4addr $bridge $ip/$masklen >/dev/null || return 1 > ovs-appctl ovs/route/add $ip/$masklen $bridge >/dev/null || return 1 > > where $ip on two sides belong to the same /$masklen subnet, > $bridge==br-phys. By doing this, OVS route entries are created for the > subnet. > > Without local_ip set for tunnel ports, traffic gets tagged on br-int, > and, as per created route entries, sent over br-phys->n1->br-phys > chain. When local_ip are set on tunnel ports, then packets are lost > (they don't reach br-phys). > > After some debugging, I realized this happens because > ovs_router_lookup returns false when looking for a proper routing > entry. > > This happens because a code path added by > 8e4e45887ec3eb5f5833fd7b415b63ff47fc9642 ("ofproto-dpif-xlate: makes > OVS native tunneling honor tunnel-specified source addresses") is > triggered, where an attempt is made to check that the source IP indeed > belongs to the host. For that, the entry added by the ovs/route/add > command above is extracted and then its corresponding route entry > 'local' property is checked. Since all route entries added by the > command are set to local==false, the code assumes it's not a local IP > address and it denies to steer the packet forward. > > There's probably a reason why ovs/route/add doesn't support setting > local==true. My understanding is that in real life, local routing > table entries are created *implicitly* when a new IP address is added > to an interface. If one were to try to emulate the same behavior in > OVS routing userspace, one would probably have to add corresponding > routing entry points when adding a IP address to the bridge via > netdev-dummy/ip[46]addr commands. > > When I modify netdev_dummy_ip4addr to do the following after setting > the IP address for the bridge, the test in question now works: > > diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c > index 71df29184..ee01381d2 100644 > --- a/lib/netdev-dummy.c > +++ b/lib/netdev-dummy.c > @@ -39,6 +39,7 @@ > #include "pcap-file.h" > #include "openvswitch/poll-loop.h" > #include "openvswitch/shash.h" > +#include "ovs-router.h" > #include "sset.h" > #include "stream.h" > #include "unaligned.h" > @@ -1941,6 +1942,12 @@ netdev_dummy_ip4addr(struct unixctl_conn *conn, > int argc OVS_UNUSED, > error = ip_parse_masked(argv[2], &ip.s_addr, &mask.s_addr); > if (!error) { > netdev_dummy_set_in4(netdev, ip, mask); > + > + // insert local route entry for the new address > + struct in6_addr ip6; > + in6_addr_set_mapped_ipv4(&ip6, ip.s_addr); > + ovs_router_insert(0, &ip6, 128, true, argv[1], &in6addr_any); > + > unixctl_command_reply(conn, "OK"); > } else { > unixctl_command_reply_error(conn, error); > @@ -1970,6 +1977,10 @@ netdev_dummy_ip6addr(struct unixctl_conn *conn, > int argc OVS_UNUSED, > > mask = ipv6_create_mask(plen); > netdev_dummy_set_in6(netdev, &ip6, &mask); > + > + // insert local route entry for the new address > + ovs_router_insert(0, &ip6, 128, true, argv[1], &in6addr_any); > + > unixctl_command_reply(conn, "OK"); > } else { > unixctl_command_reply_error(conn, error); > > (I believe the code should also remove the local entries on IP removed.) > > Is there a good reason why netdev-dummy/ip4addr doesn't set local > routing entries for the new IP? Or is it just an oversight in > userspace routing implementation? > > Thanks for any suggestions, > Ihar > > On Fri, Nov 20, 2020 at 2:32 PM Ben Pfaff <b...@ovn.org> wrote: > > > > On Fri, Nov 20, 2020 at 02:23:04PM -0500, Ihar Hrachyshka wrote: > > > On Fri, Nov 20, 2020 at 1:16 PM Ben Pfaff <b...@ovn.org> wrote: > > > > > > > > On Fri, Nov 20, 2020 at 12:44:30PM -0500, Ihar Hrachyshka wrote: > > > > > On Thu, Nov 19, 2020 at 12:19 PM Ben Pfaff <b...@ovn.org> wrote: > > > > > > > > > > > > On Thu, Nov 19, 2020 at 08:31:44AM -0500, Ihar Hrachyshka wrote: > > > > > > > On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <b...@ovn.org> wrote: > > > > > > > > > > > > > > > > On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote: > > > > > > > > > It's a legal setup where tunnel ports with the same config > > > > > > > > > are created > > > > > > > > > on different bridges served by Open vSwitch. Specifically, > > > > > > > > > multiple > > > > > > > > > OVN controllers may emulate multiple chassis running on the > > > > > > > > > same > > > > > > > > > physical host, in which case they may need to create separate > > > > > > > > > tunnel > > > > > > > > > ports to connect to the same remote chassis on their > > > > > > > > > respective > > > > > > > > > bridges. > > > > > > > > > > > > > > > > > > Signed-off-by: Ihar Hrachyshka <ihrac...@redhat.com> > > > > > > > > > > > > > > > > That makes sense for sending packets, but which one is supposed > > > > > > > > to > > > > > > > > receive a packet when one arrives for that tunnel? > > > > > > > > > > > > > > > > > > > > > > If we are talking in OVN context, each virtual ovn-controller > > > > > > > chassis > > > > > > > on the same host has to have a different IP / port to distinguish > > > > > > > between chassis. I don't think it's a useful configuration to > > > > > > > have two > > > > > > > two incoming tunnel ports with the same config for the same IP > > > > > > > that > > > > > > > are not served by separate DST IP addresses. > > > > > > > > > > > > I think that this code rejects tunnel ports with exactly the same > > > > > > configuration, though. If the two ports were configured with > > > > > > different > > > > > > local IP addresses, then this code would not flag a conflict. Do I > > > > > > misunderstand? > > > > > > > > > > > > > > > > Local IP addresses are not part of tunnel interface options map. Only > > > > > remote_ip and dst_port are. Does it address your concern? > > > > > > > > I don't see where you're getting that. Tunnel configuration allows both > > > > local and remote IPs to be specified. The match structure has slots for > > > > both IP addresses, and both of them may be specified. > > > > > > > > If the remote (or local) IP addresses are different, then this warning > > > > won't come up. And you said that the remote IP addresses are different: > > > > "I don't think it's a useful configuration to have two two incoming > > > > tunnel ports with the same config for the same IP that are not served by > > > > separate DST IP addresses." > > > > > > > > tnl_port_receive() looks up the tunnel that should receive a packet. It > > > > calls into tnl_find(). This calls tnl_find_exact() for each of several > > > > lookup tables. In the case where a warning would be given here, we know > > > > that tnl_find_exact() would find two matching elements in the map, > > > > because that's the function that we use to know to issue the warning. > > > > If we disable the warning and insert the second match anyway, there will > > > > be two elements in the map with exactly the same key, tnl_find_exact() > > > > will only ever return one of them, thus the other will never receive any > > > > packets. > > > > > > > > What am I missing? > > > > > > > > > > Oh I missed that local_ip *may* be configured for a tunnel endpoint, > > > it's just not configured by OVN controller *right now*. We can deal > > > with it in OVN controller instead of OVS then. Sorry for the noise, > > > this patch can be dropped. > > > > Oh, wonderful, we understand each other now, and there's nothing > > mysterious going on in the code that I didn't know about. Cool! > > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev