On Fri, Sep 21, 2018 at 11:06 PM Guru Shetty <g...@ovn.org> wrote:

> >
> >
> > I have tried to make sense of this patch series a few times. I think
> > adding increasing complications like this will make gateway code
> > unmaintainable. The whole gateway redirect chassis already makes it
> > un-understandable and now this will mean that no one will be able to
> > understand it as we go forward. [/rant off]
> >
>
> Sorry about the "rant" if it came out as rude - was not my intention. I
> will try to offer something more constructive. When you initially looked at
> the gateway redirect code, how long did it take for you to understand it?
> My main concern is not with your code, but the underlying code. I initially
> reviewed the underlying redirect code. And it took me a good 2 days to
> understand it. Back then, we wanted something working and it looked like
> something that can eventually be simplified. But, we have been adding more
> things to it. And right now, if this patch series is added, it will likely
> get very very hard to understand it as we go forward. Do you reckon that a
> year from now, you will be able to debug a corner case in the code?
>
> If there are others who have understood the "redirect" gateway code easily,
> please do chime in.
>
> My suggestion is to re-look at the redirect gateway design and see whether
> the code can be simplified. Or whether we can write detailed documentation
> about it to make it easy to understand. Only once we are satisfied with it,
> we can attempt to add more stuff to it.
>
>
I always forget the gateway code and the design. (May be because I haven't
worked
on this part of the code much) and I have to refresh my memory everytime.
I agree that we need to have a detailed documentation. May be "a journey of
a packet "
kind of documentation in ovn-architecture would also help.

The proposed solution in this patch series, If I understand it correctly
runs the router pipeline
in the source chassis and then resumes it on the gateway chassis. Recently
I was looking into
SR-IOV kind of scenarios, where the VM traffic by passes the host kernel,
and I think the
proposed solution will not work. In order to support N/S traffic for SR-IOV
instances
we have to find probably another way and that would also complicate the
code further.

However I think it is possible to have one solution to support gateway
traffic for
VLAN tenant networks which covers SR-IOV and the normal case. I think the
router pipeline
should be executed only on the gateway chassis. I will have a discussion
with
Anil on this, this monday and see if it's possible.

But unfortunately, supporting these use cases will add some amount of
complexity :(.

Thanks
Numan


>
> >
> >> ---
> >>
> >> v6->v7:
> >> * Added this patch
> >>
> >>
> >>  ovn/controller/physical.c   | 60
> >> ++++++++++++++++++++++++++++++++++++++++++---
> >>  ovn/northd/ovn-northd.8.xml | 10 ++++++++
> >>  ovn/northd/ovn-northd.c     | 29 ++++++++++++++++++++++
> >>  ovn/ovn-architecture.7.xml  |  4 ++-
> >>  4 files changed, 99 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> >> index f269a1d..1f41f59 100644
> >> --- a/ovn/controller/physical.c
> >> +++ b/ovn/controller/physical.c
> >> @@ -190,7 +190,9 @@ get_zone_ids(const struct sbrec_port_binding
> *binding,
> >>  static void
> >>  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
> >>                         bool nested_container, const struct zone_ids
> >> *zone_ids,
> >> -                       struct ofpbuf *ofpacts_p, struct hmap
> *flow_table)
> >> +                       struct ofpbuf *ofpacts_p, struct hmap
> *flow_table,
> >> +                       struct local_datapath *ld,
> >> +                       const struct hmap *local_datapaths)
> >>  {
> >>      struct match match;
> >>
> >> @@ -221,11 +223,63 @@ put_local_common_flows(uint32_t dp_key, uint32_t
> >> port_key,
> >>          }
> >>      }
> >>
> >> +    struct ofpbuf *clone = NULL;
> >> +    clone = ofpbuf_clone(ofpacts_p);
> >> +
> >>      /* Resubmit to table 34. */
> >>      put_resubmit(OFTABLE_CHECK_LOOPBACK, ofpacts_p);
> >>      ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 100, 0,
> >>                      &match, ofpacts_p);
> >>
> >> +    /* For a reply packet from gateway with VLAN switch port as
> >> destination
> >> +     * (excluding localnet_port and external VLAN networks), gateway
> >> router
> >> +     * will use gateway MAC address as source MAC instead of router
> >> internal
> >> +     * port MAC, so that external switches can learn gateway MAC
> address.
> >> +     * Here (before packet is given to the port) we replace router
> >> gateway
> >> +     * MAC address with router internal port MAC. */
> >> +    if (ld->localnet_port && (port_key !=
> >> ld->localnet_port->tunnel_key)) {
> >> +        for (int i = 0; i < ld->n_peer_dps; i++) {
> >> +            struct local_datapath *peer_ldp = get_local_datapath(
> >> +                local_datapaths, ld->peer_dps[i]->peer_dp->tunnel_key);
> >> +            const struct sbrec_port_binding *crp;
> >> +            crp = peer_ldp->chassisredirect_port;
> >> +            if (!crp) {
> >> +                continue;
> >> +            }
> >> +
> >> +            if (strcmp(smap_get(&crp->options, "distributed-port"),
> >> +                       ld->peer_dps[i]->peer->logical_port) &&
> >> +                (port_key != ld->peer_dps[i]->patch->tunnel_key)) {
> >> +                for (int j = 0; j < crp->n_mac; j++) {
> >> +                    struct lport_addresses laddrs;
> >> +                    if (!extract_lsp_addresses(crp->mac[j], &laddrs)) {
> >> +                        continue;
> >> +                    }
> >> +                    match_set_dl_src(&match, laddrs.ea);
> >> +                    destroy_lport_addresses(&laddrs);
> >> +                    break;
> >> +                }
> >> +                for (int j = 0; j < ld->peer_dps[i]->peer->n_mac; j++)
> {
> >> +                    struct lport_addresses laddrs;
> >> +                    uint64_t mac64;
> >> +                    if (!extract_lsp_addresses(
> >> +                        ld->peer_dps[i]->peer->mac[j], &laddrs)) {
> >> +                        continue;
> >> +                    }
> >> +                    mac64 = eth_addr_to_uint64(laddrs.ea);
> >> +                    put_load(mac64,
> >> +                             MFF_ETH_SRC, 0, 48, clone);
> >> +                    destroy_lport_addresses(&laddrs);
> >> +                    break;
> >> +                }
> >> +                put_resubmit(OFTABLE_CHECK_LOOPBACK, clone);
> >> +                ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 150,
> 0,
> >> +                    &match, clone);
> >> +            }
> >> +        }
> >> +    }
> >> +    ofpbuf_delete(clone);
> >> +
> >>      /* Table 34, Priority 100.
> >>       * =======================
> >>       *
> >> @@ -330,7 +384,7 @@ consider_port_binding(struct ovsdb_idl_index
> >> *sbrec_chassis_by_name,
> >>
> >>          struct zone_ids binding_zones = get_zone_ids(binding,
> ct_zones);
> >>          put_local_common_flows(dp_key, port_key, false, &binding_zones,
> >> -                               ofpacts_p, flow_table);
> >> +                               ofpacts_p, flow_table, ld,
> >> local_datapaths);
> >>
> >>          match_init_catchall(&match);
> >>          ofpbuf_clear(ofpacts_p);
> >> @@ -531,7 +585,7 @@ consider_port_binding(struct ovsdb_idl_index
> >> *sbrec_chassis_by_name,
> >>
> >>          struct zone_ids zone_ids = get_zone_ids(binding, ct_zones);
> >>          put_local_common_flows(dp_key, port_key, nested_container,
> >> &zone_ids,
> >> -                               ofpacts_p, flow_table);
> >> +                               ofpacts_p, flow_table, ld,
> >> local_datapaths);
> >>
> >>          /* Table 0, Priority 200, 150 and 100.
> >>           * ==============================
> >> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> >> index 8fa5272..876c121 100644
> >> --- a/ovn/northd/ovn-northd.8.xml
> >> +++ b/ovn/northd/ovn-northd.8.xml
> >> @@ -2013,6 +2013,16 @@ next;
> >>        </li>
> >>
> >>        <li>
> >> +        A priority-100 logical flow with match
> >> +        <code>inport == <var>GW</var> &amp;&amp;
> >> +        flags.rcv_from_vlan == 1</code> has actions
> >> +        <code>eth.dst = <var>E</var>; next;</code>, where
> >> +        <var>GW</var> is the logical router distributed gateway
> >> +        port and <var>E</var> is the MAC address of router
> >> +        distributed gateway port.
> >> +      </li>
> >> +
> >> +      <li>
> >>          For each NAT rule in the OVN Northbound database that can
> >>          be handled in a distributed manner, a priority-100 logical
> >>          flow with match <code>ip4.src == <var>B</var> &amp;&amp;
> >> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> >> index bcf0b66..d012bb8 100644
> >> --- a/ovn/northd/ovn-northd.c
> >> +++ b/ovn/northd/ovn-northd.c
> >> @@ -4419,6 +4419,15 @@ add_route(struct hmap *lflows, const struct
> >> ovn_port *op,
> >>      } else {
> >>          ds_put_format(&actions, "ip%s.dst", is_ipv4 ? "4" : "6");
> >>      }
> >> +
> >> +    if (op->peer && op->peer->od->localnet_port &&
> >> +        op->od->l3dgw_port && op->od->l3redirect_port &&
> >> +        (op != op->od->l3redirect_port) &&
> >> +        (op != op->od->l3dgw_port)) {
> >> +        ds_put_format(&match, " && is_chassis_resident(%s)",
> >> +                      op->od->l3redirect_port->json_key);
> >> +        ds_put_format(&actions, "; flags.rcv_from_vlan = 1");
> >> +    }
> >>      ds_put_format(&actions, "; "
> >>                    "%sreg1 = %s; "
> >>                    "eth.src = %s; "
> >> @@ -6131,6 +6140,26 @@ build_lrouter_flows(struct hmap *datapaths,
> struct
> >> hmap *ports,
> >>                        op->lrp_networks.ipv6_addrs[i].network_s,
> >>                        op->lrp_networks.ipv6_addrs[i].plen, NULL, NULL);
> >>          }
> >> +
> >> +        /* For a reply packet from gateway with VLAN switch port as
> >> +         * destination, replace router internal port MAC with router
> >> gateway
> >> +         * MAC address, so that external switches can learn gateway MAC
> >> +         * address. Later before delivering the packet to the port,
> >> +         * controller will replace the gateway MAC with router internal
> >> port
> >> +         * MAC in table 33. */
> >> +        if (op->od->l3dgw_port && (op == op->od->l3dgw_port) &&
> >> +            op->od->l3redirect_port) {
> >> +            ds_clear(&actions);
> >> +            ds_clear(&match);
> >> +            ds_put_format(&match, "inport == %s", op->json_key);
> >> +            ds_put_format(&match, " && flags.rcv_from_vlan == 1");
> >> +            ds_put_format(&match, " && is_chassis_resident(%s)",
> >> +                          op->od->l3redirect_port->json_key);
> >> +            ds_put_format(&actions,
> >> +                          "eth.src = %s; next;",
> op->lrp_networks.ea_s);
> >> +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_GW_REDIRECT, 100,
> >> +                          ds_cstr(&match), ds_cstr(&actions));
> >> +        }
> >>      }
> >>
> >>      /* Convert the static routes to flows. */
> >> diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> >> index ad2101c..0de41d2 100644
> >> --- a/ovn/ovn-architecture.7.xml
> >> +++ b/ovn/ovn-architecture.7.xml
> >> @@ -1067,7 +1067,9 @@
> >>
> >>        <p>
> >>          Flows in table 33 resemble those in table 32 but for logical
> >> ports that
> >> -        reside locally rather than remotely.  For unicast logical
> output
> >> ports
> >> +        reside locally rather than remotely.  If these are VLAN ports
> and
> >> +        packet has router gateway port MAC address as source, replace
> it
> >> with
> >> +        router internal port MAC address. For unicast logical output
> >> ports
> >>          on the local hypervisor, the actions just resubmit to table 34.
> >> For
> >>          multicast output ports that include one or more logical ports
> on
> >> the
> >>          local hypervisor, for each such logical port <var>P</var>, the
> >> actions
> >> --
> >> 1.8.3.1
> >>
> >> _______________________________________________
> >> dev mailing list
> >> d...@openvswitch.org
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to