On Tue, Dec 01, 2020 at 08:48:56PM -0500, Ihar Hrachyshka wrote:
> Is there a good reason why netdev-dummy/ip4addr doesn't set local
> routing entries for the new IP? Or is it just an oversight in
> userspace routing implementation?

Oh my.  If I ever knew this, I don't now.  I thought it was Pravin who
originally wrote this code, although the history suggests that Ethan was
also involved.

Pravin, Ethan, do you have any thoughts here?  Lots of context below.

> Hi Ben,
> 
> I was trying to make local_ip work for my needs and hit some
> unexpected connectivity loss between tunnel endpoints when source IPs
> enforced in port config. Perhaps there's something I am missing about
> using local_ip with ovs userspace routing. Perhaps you'll be able to
> validate my thinking below.
> 
> So, for OVN purposes, I would like to have two tunnel ports that
> differ in local_ip only (same remote_ip). In OVN test suite, the
> following configuration is set up:
> 
> (br-int<->br-phys) - <n1> - (br-phys<->br-int)
> 
> Each (...) block is a separate (emulated) host with its own OVS/OVN
> stack, interconnected through a n1 global bridge that emulates fabric
> (everything runs on the same physical host using three sets of OVS
> software). Each br-int has a geneve port that has both remote_ip and
> local_ip set (point to each other). When br-phys bridges are created,
> the following commands are executed to set up userspace IP/routing
> stack inside OVS:
> 
> ovs-appctl netdev-dummy/ip4addr $bridge $ip/$masklen >/dev/null || return 1
> ovs-appctl ovs/route/add $ip/$masklen $bridge >/dev/null || return 1
> 
> where $ip on two sides belong to the same /$masklen subnet,
> $bridge==br-phys. By doing this, OVS route entries are created for the
> subnet.
> 
> Without local_ip set for tunnel ports, traffic gets tagged on br-int,
> and, as per created route entries, sent over br-phys->n1->br-phys
> chain. When local_ip are set on tunnel ports, then packets are lost
> (they don't reach br-phys).
> 
> After some debugging, I realized this happens because
> ovs_router_lookup returns false when looking for a proper routing
> entry.
> 
> This happens because a code path added by
> 8e4e45887ec3eb5f5833fd7b415b63ff47fc9642 ("ofproto-dpif-xlate: makes
> OVS native tunneling honor tunnel-specified source addresses") is
> triggered, where an attempt is made to check that the source IP indeed
> belongs to the host. For that, the entry added by the ovs/route/add
> command above is extracted and then its corresponding route entry
> 'local' property is checked. Since all route entries added by the
> command are set to local==false, the code assumes it's not a local IP
> address and it denies to steer the packet forward.
> 
> There's probably a reason why ovs/route/add doesn't support setting
> local==true. My understanding is that in real life, local routing
> table entries are created *implicitly* when a new IP address is added
> to an interface. If one were to try to emulate the same behavior in
> OVS routing userspace, one would probably have to add corresponding
> routing entry points when adding a IP address to the bridge via
> netdev-dummy/ip[46]addr commands.
> 
> When I modify netdev_dummy_ip4addr to do the following after setting
> the IP address for the bridge, the test in question now works:
> 
> diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
> index 71df29184..ee01381d2 100644
> --- a/lib/netdev-dummy.c
> +++ b/lib/netdev-dummy.c
> @@ -39,6 +39,7 @@
>  #include "pcap-file.h"
>  #include "openvswitch/poll-loop.h"
>  #include "openvswitch/shash.h"
> +#include "ovs-router.h"
>  #include "sset.h"
>  #include "stream.h"
>  #include "unaligned.h"
> @@ -1941,6 +1942,12 @@ netdev_dummy_ip4addr(struct unixctl_conn *conn,
> int argc OVS_UNUSED,
>          error = ip_parse_masked(argv[2], &ip.s_addr, &mask.s_addr);
>          if (!error) {
>              netdev_dummy_set_in4(netdev, ip, mask);
> +
> +            // insert local route entry for the new address
> +            struct in6_addr ip6;
> +            in6_addr_set_mapped_ipv4(&ip6, ip.s_addr);
> +            ovs_router_insert(0, &ip6, 128, true, argv[1], &in6addr_any);
> +
>              unixctl_command_reply(conn, "OK");
>          } else {
>              unixctl_command_reply_error(conn, error);
> @@ -1970,6 +1977,10 @@ netdev_dummy_ip6addr(struct unixctl_conn *conn,
> int argc OVS_UNUSED,
> 
>              mask = ipv6_create_mask(plen);
>              netdev_dummy_set_in6(netdev, &ip6, &mask);
> +
> +            // insert local route entry for the new address
> +            ovs_router_insert(0, &ip6, 128, true, argv[1], &in6addr_any);
> +
>              unixctl_command_reply(conn, "OK");
>          } else {
>              unixctl_command_reply_error(conn, error);
> 
> (I believe the code should also remove the local entries on IP removed.)
> 
> Is there a good reason why netdev-dummy/ip4addr doesn't set local
> routing entries for the new IP? Or is it just an oversight in
> userspace routing implementation?
> 
> Thanks for any suggestions,
> Ihar
> 
> On Fri, Nov 20, 2020 at 2:32 PM Ben Pfaff <b...@ovn.org> wrote:
> >
> > On Fri, Nov 20, 2020 at 02:23:04PM -0500, Ihar Hrachyshka wrote:
> > > On Fri, Nov 20, 2020 at 1:16 PM Ben Pfaff <b...@ovn.org> wrote:
> > > >
> > > > On Fri, Nov 20, 2020 at 12:44:30PM -0500, Ihar Hrachyshka wrote:
> > > > > On Thu, Nov 19, 2020 at 12:19 PM Ben Pfaff <b...@ovn.org> wrote:
> > > > > >
> > > > > > On Thu, Nov 19, 2020 at 08:31:44AM -0500, Ihar Hrachyshka wrote:
> > > > > > > On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <b...@ovn.org> wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> > > > > > > > > It's a legal setup where tunnel ports with the same config 
> > > > > > > > > are created
> > > > > > > > > on different bridges served by Open vSwitch. Specifically, 
> > > > > > > > > multiple
> > > > > > > > > OVN controllers may emulate multiple chassis running on the 
> > > > > > > > > same
> > > > > > > > > physical host, in which case they may need to create separate 
> > > > > > > > > tunnel
> > > > > > > > > ports to connect to the same remote chassis on their 
> > > > > > > > > respective
> > > > > > > > > bridges.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Ihar Hrachyshka <ihrac...@redhat.com>
> > > > > > > >
> > > > > > > > That makes sense for sending packets, but which one is supposed 
> > > > > > > > to
> > > > > > > > receive a packet when one arrives for that tunnel?
> > > > > > > >
> > > > > > >
> > > > > > > If we are talking in OVN context, each virtual ovn-controller 
> > > > > > > chassis
> > > > > > > on the same host has to have a different IP / port to distinguish
> > > > > > > between chassis. I don't think it's a useful configuration to 
> > > > > > > have two
> > > > > > > two incoming tunnel ports with the same config for the same IP 
> > > > > > > that
> > > > > > > are not served by separate DST IP addresses.
> > > > > >
> > > > > > I think that this code rejects tunnel ports with exactly the same
> > > > > > configuration, though.  If the two ports were configured with 
> > > > > > different
> > > > > > local IP addresses, then this code would not flag a conflict.  Do I
> > > > > > misunderstand?
> > > > > >
> > > > >
> > > > > Local IP addresses are not part of tunnel interface options map. Only
> > > > > remote_ip and dst_port are. Does it address your concern?
> > > >
> > > > I don't see where you're getting that.  Tunnel configuration allows both
> > > > local and remote IPs to be specified.  The match structure has slots for
> > > > both IP addresses, and both of them may be specified.
> > > >
> > > > If the remote (or local) IP addresses are different, then this warning
> > > > won't come up.  And you said that the remote IP addresses are different:
> > > > "I don't think it's a useful configuration to have two two incoming
> > > > tunnel ports with the same config for the same IP that are not served by
> > > > separate DST IP addresses."
> > > >
> > > > tnl_port_receive() looks up the tunnel that should receive a packet.  It
> > > > calls into tnl_find().  This calls tnl_find_exact() for each of several
> > > > lookup tables.  In the case where a warning would be given here, we know
> > > > that tnl_find_exact() would find two matching elements in the map,
> > > > because that's the function that we use to know to issue the warning.
> > > > If we disable the warning and insert the second match anyway, there will
> > > > be two elements in the map with exactly the same key, tnl_find_exact()
> > > > will only ever return one of them, thus the other will never receive any
> > > > packets.
> > > >
> > > > What am I missing?
> > > >
> > >
> > > Oh I missed that local_ip *may* be configured for a tunnel endpoint,
> > > it's just not configured by OVN controller *right now*. We can deal
> > > with it in OVN controller instead of OVS then. Sorry for the noise,
> > > this patch can be dropped.
> >
> > Oh, wonderful, we understand each other now, and there's nothing
> > mysterious going on in the code that I didn't know about.  Cool!
> >
> 
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to