Thanks Mark for the review. I will address the review comments. On Thu, Jul 19, 2018 at 6:54 PM, Mark Michelson <[email protected]> wrote:
> I've had a look through and have some notes in-line. I know some of them > will be a bit nit-picky, but...sometimes that's just how I am :) > > > On 06/25/2018 03:53 PM, [email protected] wrote: > >> From: venkata anil <[email protected]> >> >> When a vm on a vlan tenant network sends traffic to an external network, >> it is tunneled from host chassis to gateway chassis. In the earlier >> discussion [1], Russel (also in his doc [2]) suggested if we can figure >> out a way for OVN to do this redirect to the gateway host over a VLAN >> network. This patch implements his suggestion i.e will redirect to >> gateway chassis using incoming tenant vlan network. Gateway chassis are >> expected to be configured with tenant vlan networks. In this approach, >> new logical and physical flows introduced for packet processing in both >> host and gateway chassis. >> >> Packet processing in the host chassis: >> 1) A new ovs flow added in physical table 65, which sets MLF_RCV_FROM_VLAN >> flag for packets from vlan network entering into router pipeline >> 2) A new flow added in lr_in_ip_routing, for packets output through >> distributed gateway port and matching MLF_RCV_FROM_VLAN flag, >> set REGBIT_NAT_REDIRECT i.e >> table=7 (lr_in_ip_routing ), priority=2 , match=( >> ip4.dst == 0.0.0.0/0 && flags.rcv_from_vlan == 1 && >> !is_chassis_resident("cr-alice")), action=(reg9[0] = 1; next;) >> This flow will be set only on chassis not hosting chassisredirect >> port i.e compute node. >> When REGBIT_NAT_REDIRECT set, >> a) lr_in_arp_resolve, will set packet eth.dst to distibuted gateway >> port MAC >> b) lr_in_gw_redirect, will set chassisredirect port as outport >> 3) A new ovs flow added in physical table 32 will use source vlan tenant >> network tag as vlan ID for sending the packet to gateway chassis. >> As this vlan packet destination MAC is distibuted gateway port MAC, >> packet will only reach the gateway chassis. >> table=32,priority=150,reg10=0x20/0x20,reg14=0x3,reg15=0x6,me >> tadata=0x4 >> actions=mod_vlan_vid:2010,output:25,strip_vlan >> This flow will be set only on chassis not hosting chassisredirect >> port i.e compute node. >> >> Packet processing in the gateway chassis: >> 1) A new ovs flow added in physical table 0 for vlan traffic coming >> from localnet port with router distributed gateway port MAC as >> destination MAC address, resubmit to connected router ingress pipeline >> (i.e router attached to vlan tenant network). >> table=0,priority=150,in_port=67,dl_vlan=2010,dl_dst=00:00:02:01:02:03 >> actions=strip_vlan,load:0x4->OXM_OF_METADATA[],load:0x3->NXM >> _NX_REG14[], >> load:0x1->NXM_NX_REG10[5],resubmit(,8) >> This flow will be set only on chassis hosting chassisredirect >> port i.e gateway node. >> 2) A new flow added in lr_in_admission which checks MLF_RCV_FROM_VLAN >> and allows the packet. This flow will be set only on chassis hosting >> chassisredirect port i.e gateway node. >> table=0 (lr_in_admission ), priority=100 , match=( >> flags.rcv_from_vlan == 1 && inport == "lrp-44383893-613a-4bfe-b483- >> e7d0dc3055cd" && is_chassis_resident("cr-lrp-a6e3d2ab-313a-4ea3- >> 8ec4-c3c774a11f49")), action=(next;) >> Then packet will pass through router ingress and egress pipelines and >> then to external switch pipeline. >> >> [1] https://mail.openvswitch.org/pipermail/ovs-discuss/2018-Apri >> l/046557.html >> [2] Point 3 in section 3.3.1 - Future Enhancements >> https://docs.google.com/document/d/1JecGIXPH0RAqfGvD0nmtBdE >> U1zflHACp8WSRnKCFSgg/edit# >> >> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-Apri >> l/046543.html >> >> Signed-off-by: Venkata Anil <[email protected]> >> --- >> >> v5->v6: >> * Rebased >> >> v4->v5: >> * No changes in this patch >> >> v3->v4: >> * Previous v3 patch became this patch of v4 >> * Updated the newly added flow in physical table 0 on gateway chassis >> to check for distributed gateway port MAC and then resubmit to >> router ingress pipeline >> * Improved the test >> * Added more comments >> >> ovn/controller/bfd.c | 3 +- >> ovn/controller/binding.c | 10 +- >> ovn/controller/ovn-controller.c | 3 + >> ovn/controller/ovn-controller.h | 17 ++- >> ovn/controller/physical.c | 121 ++++++++++++++++- >> ovn/lib/logical-fields.c | 4 + >> ovn/lib/logical-fields.h | 2 + >> ovn/northd/ovn-northd.c | 35 +++++ >> tests/ovn.at | 278 ++++++++++++++++++++++++++++++ >> ++++++++++ >> 9 files changed, 465 insertions(+), 8 deletions(-) >> >> diff --git a/ovn/controller/bfd.c b/ovn/controller/bfd.c >> index 051781f..c696741 100644 >> --- a/ovn/controller/bfd.c >> +++ b/ovn/controller/bfd.c >> @@ -139,8 +139,9 @@ bfd_travel_gw_related_chassis( >> LIST_FOR_EACH_POP (dp_binding, node, &dp_list) { >> dp = dp_binding->dp; >> free(dp_binding); >> + const struct sbrec_datapath_binding *pdp; >> for (size_t i = 0; i < dp->n_peer_dps; i++) { >> - const struct sbrec_datapath_binding *pdp = dp->peer_dps[i]; >> + pdp = dp->peer_dps[i]->peer_dp; >> if (!pdp) { >> continue; >> } >> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c >> index 021ecdd..de4385f 100644 >> --- a/ovn/controller/binding.c >> +++ b/ovn/controller/binding.c >> @@ -145,10 +145,14 @@ add_local_datapath__(struct ovsdb_idl_index >> *sbrec_datapath_binding_by_key, >> const struct sbrec_port_binding *pb; >> SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, target, >> sbrec_port_binding_by_datapath) >> { >> + if (!strcmp(pb->type, "chassisredirect")) { >> + ld->chassisredirect_port = pb; >> + } >> if (!strcmp(pb->type, "patch")) { >> const char *peer_name = smap_get(&pb->options, "peer"); >> if (peer_name) { >> const struct sbrec_port_binding *peer; >> + struct peer_datapath *pdp; >> peer = lport_lookup_by_name(sbrec_por >> t_binding_by_name, >> peer_name); >> @@ -163,9 +167,13 @@ add_local_datapath__(struct ovsdb_idl_index >> *sbrec_datapath_binding_by_key, >> ld->peer_dps = xrealloc( >> ld->peer_dps, >> ld->n_peer_dps * sizeof *ld->peer_dps); >> - ld->peer_dps[ld->n_peer_dps - 1] = >> datapath_lookup_by_key( >> + pdp = xcalloc(1, sizeof(struct peer_datapath)); >> > > Use xmalloc here instead of xcalloc. > > sure > > + pdp->peer_dp = datapath_lookup_by_key( >> sbrec_datapath_binding_by_key, >> peer->datapath->tunnel_key); >> + pdp->patch = pb; >> + pdp->peer = peer; >> + ld->peer_dps[ld->n_peer_dps - 1] = pdp; >> } >> } >> } >> diff --git a/ovn/controller/ovn-controller.c >> b/ovn/controller/ovn-controller.c >> index 6ee72a9..8e04780 100644 >> --- a/ovn/controller/ovn-controller.c >> +++ b/ovn/controller/ovn-controller.c >> @@ -812,6 +812,9 @@ main(int argc, char *argv[]) >> struct local_datapath *cur_node, *next_node; >> HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node, >> &local_datapaths) { >> + for (int i = 0; i < cur_node->n_peer_dps; i++) { >> + free(cur_node->peer_dps[i]); >> + } >> free(cur_node->peer_dps); >> hmap_remove(&local_datapaths, &cur_node->hmap_node); >> free(cur_node); >> diff --git a/ovn/controller/ovn-controller.h >> b/ovn/controller/ovn-controller.h >> index 3b15620..6c1afd9 100644 >> --- a/ovn/controller/ovn-controller.h >> +++ b/ovn/controller/ovn-controller.h >> @@ -40,6 +40,17 @@ struct ct_zone_pending_entry { >> enum ct_zone_pending_state state; >> }; >> +/* Represents a peer datapath connected to a given datapath */ >> +struct peer_datapath { >> + const struct sbrec_datapath_binding *peer_dp; >> + >> + /* Patch port connected to local datapath */ >> + const struct sbrec_port_binding *patch; >> + >> + /* Peer patch port connected to peer datapath */ >> + const struct sbrec_port_binding *peer; >> +}; >> + >> /* A logical datapath that has some relevance to this hypervisor. A >> logical >> * datapath D is relevant to hypervisor H if: >> * >> @@ -57,10 +68,14 @@ struct local_datapath { >> /* The localnet port in this datapath, if any (at most one is >> allowed). */ >> const struct sbrec_port_binding *localnet_port; >> + /* The chassisredirect port in this datapath, if any >> + * (at most one is allowed). */ >> + const struct sbrec_port_binding *chassisredirect_port; >> + >> /* True if this datapath contains an l3gateway port located on this >> * hypervisor. */ >> bool has_local_l3gateway; >> - const struct sbrec_datapath_binding **peer_dps; >> + struct peer_datapath **peer_dps; >> size_t n_peer_dps; >> }; >> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c >> index dcf2183..c4cf419 100644 >> --- a/ovn/controller/physical.c >> +++ b/ovn/controller/physical.c >> @@ -304,7 +304,8 @@ consider_port_binding(struct ovsdb_idl_index >> *sbrec_chassis_by_name, >> { >> uint32_t dp_key = binding->datapath->tunnel_key; >> uint32_t port_key = binding->tunnel_key; >> - if (!get_local_datapath(local_datapaths, dp_key)) { >> + struct local_datapath *ld = get_local_datapath(local_datapaths, >> dp_key); >> + if (!ld) { >> return; >> } >> @@ -350,6 +351,16 @@ consider_port_binding(struct ovsdb_idl_index >> *sbrec_chassis_by_name, >> put_load(0, MFF_LOG_REG0 + i, 0, 32, ofpacts_p); >> } >> put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p); >> + >> + /* Set MLF_RCV_FROM_VLAN flag for vlan network */ >> + if (ld->localnet_port) { >> + int vlan_tag = (ld->localnet_port->n_tag ? >> + *ld->localnet_port->tag : 0); >> + if (vlan_tag) { >> + put_load(1, MFF_LOG_FLAGS, MLF_RCV_FROM_VLAN_BIT, 1, >> + ofpacts_p); >> + } >> > > Since this block doesn't actually use the vlan_tag for anything, it could > be simplified to: > > if (ld->localnet_port->n_tag) { > put_load(...); > } > > I think I'm right here since the "tag" field of the database is > constrained to being between 1 and 4095. > > Sure > > + } >> put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, ofpacts_p); >> clone = ofpbuf_at_assert(ofpacts_p, clone_ofs, sizeof *clone); >> ofpacts_p->header = clone; >> @@ -526,9 +537,15 @@ consider_port_binding(struct ovsdb_idl_index >> *sbrec_chassis_by_name, >> put_local_common_flows(dp_key, port_key, nested_container, >> &zone_ids, >> ofpacts_p, flow_table); >> - /* Table 0, Priority 150 and 100. >> + /* Table 0, Priority 200, 150 and 100. >> * ============================== >> * >> + * Priority 200 is for vlan traffic with distributed gateway >> port MAC >> + * as destination MAC address. For such traffic, set >> MLF_RCV_FROM_VLAN >> + * flag, MFF_LOG_DATAPATH to the router metadata and >> MFF_LOG_INPORT to >> + * the patch port connecting router and vlan network and >> resubmit into >> + * the logical router ingress pipeline. >> + * >> * Priority 150 is for tagged traffic. This may be containers >> in a >> * VM or a VLAN on a local network. For such traffic, match on >> the >> * tags and then strip the tag. >> @@ -540,6 +557,55 @@ consider_port_binding(struct ovsdb_idl_index >> *sbrec_chassis_by_name, >> * input port, MFF_LOG_DATAPATH to the logical datapath, and >> * resubmit into the logical ingress pipeline starting at table >> * 16. */ >> + >> + /* For packet from vlan network with distributed gateway port >> MAC as >> + * destination MAC address, submit it to router ingress pipeline >> */ >> + int vlan_tag = binding->n_tag ? *binding->tag : 0; >> + if (!strcmp(binding->type, "localnet") && vlan_tag) { >> + struct local_datapath *ldp = get_local_datapath( >> + local_datapaths, binding->datapath->tunnel_key); >> > > If I'm reading this correctly, ldp is the same as ld. So I think you can > skip looking up ldp and just ld throughout the rest of this block. Sure > > > + for (int i = 0; i < ldp->n_peer_dps; i++) { >> + struct local_datapath *peer_ldp = get_local_datapath( >> + local_datapaths, ldp->peer_dps[i]->peer_dp->tun >> nel_key); >> > > Is it possible for peer_ldp to be NULL? If so, you should check for it. If > not, then you should probably add an assertion. peer_ldp shoudn't be NULL. How to add assertion for checking not NULL? > > > + const struct sbrec_port_binding *crp; >> + crp = peer_ldp->chassisredirect_port; >> + if (crp && crp->chassis && >> + !strcmp(crp->chassis->name, chassis->name)) { >> + const char *gwp = smap_get(&crp->options, >> + "distributed-port"); >> > > Check for gwp to be NULL here. If there is no options:distributed-port > set, then we're gonna have a bad time. Sure. But gwp can never be NULL. Chassisredirect port will be created from distributed gateway port. > > > + if (strcmp(gwp, ldp->peer_dps[i]->peer->logical_port)) >> { >> + ofpbuf_clear(ofpacts_p); >> + match_init_catchall(&match); >> + >> + match_set_in_port(&match, ofport); >> + match_set_dl_vlan(&match, htons(vlan_tag)); >> + for (int j = 0; j < crp->n_mac; j++) { >> + struct lport_addresses laddrs; >> + if (!extract_lsp_addresses(crp->mac[j], >> &laddrs) >> + || !laddrs.n_ipv4_addrs) { >> + continue; >> > > A couple of notes here: > > * Is there a reason why IPv4 addresses are required? Is this not intended > to work in an IPv6-only scenario? > Thanks Mark. Its my mistake, shouldn't have checked for IPv4 addresses ( I have blindly copied from pinctrl.c). * In the case where extract_lsp_addresses() succeeds, but there are no IPv4 > addresses, this will leak laddrs. > > > + } >> + match_set_dl_dst(&match, laddrs.ea); >> + destroy_lport_addresses(&laddrs); >> + break; >> + } >> > > If execution reaches here and we never added destination ethernet address > to the match, should we continue creating the logical flow? Should we bail? Chassisredirect port will have MAC. But I will add a check and bail if no MAC present. > > > + >> + ofpact_put_STRIP_VLAN(ofpacts_p); >> + put_load(peer_ldp->datapath->tunnel_key, >> + MFF_LOG_DATAPATH, 0, 64, ofpacts_p); >> + put_load(ldp->peer_dps[i]->peer->tunnel_key, >> + MFF_LOG_INPORT, 0, 32, ofpacts_p); >> + put_load(1, MFF_LOG_FLAGS, >> + MLF_RCV_FROM_VLAN_BIT, 1, ofpacts_p); >> + put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, >> ofpacts_p); >> + >> + ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, >> + 200, 0, &match, ofpacts_p); >> + } >> + } >> + } >> + } >> + >> ofpbuf_clear(ofpacts_p); >> match_init_catchall(&match); >> match_set_in_port(&match, ofport); >> @@ -633,13 +699,58 @@ consider_port_binding(struct ovsdb_idl_index >> *sbrec_chassis_by_name, >> } else { >> /* Remote port connected by tunnel */ >> - /* Table 32, priority 100. >> - * ======================= >> + /* Table 32, priority 150 and 100. >> + * ============================== >> * >> * Handles traffic that needs to be sent to a remote >> hypervisor. Each >> * flow matches an output port that includes a logical port on >> a remote >> - * hypervisor, and tunnels the packet to that hypervisor. >> + * hypervisor, and tunnels the packet or send through vlan >> network to >> + * that hypervisor. >> */ >> + >> + /* For each vlan network connected to the router, add that >> network's >> + * vlan tag to the packet and output it through localnet port */ >> + struct local_datapath *ldp = get_local_datapath(local_datapaths, >> + dp_key); >> + for (int i = 0; i < ldp->n_peer_dps; i++) { >> + struct ofpact_vlan_vid *vlan_vid; >> + ofp_port_t port_ofport = 0; >> + struct peer_datapath *pdp = ldp->peer_dps[i]; >> + struct local_datapath *peer_ldp = get_local_datapath( >> + local_datapaths, pdp->peer_dp->tunnel_key); >> > > I have the same question about peer_ldp here that I did about the peer_ldp > lookup previously. > > > + if (peer_ldp->localnet_port && pdp->patch->tunnel_key) { >> + int64_t vlan_tag = (peer_ldp->localnet_port->n_tag ? >> + *peer_ldp->localnet_port->tag : 0); >> + if (!vlan_tag) { >> + continue; >> + } >> + port_ofport = u16_to_ofp(simap_get(&localvif_to_ofport, >> + peer_ldp->localnet_port->logical_port)); >> + if (!port_ofport) { >> + continue; >> + } >> + >> + match_init_catchall(&match); >> + ofpbuf_clear(ofpacts_p); >> + >> + match_set_metadata(&match, htonll(dp_key)); >> + match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0, >> + MLF_RCV_FROM_VLAN, >> MLF_RCV_FROM_VLAN); >> + match_set_reg(&match, MFF_LOG_INPORT - MFF_REG0, >> + pdp->patch->tunnel_key); >> + match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, >> port_key); >> + >> + vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p); >> + vlan_vid->vlan_vid = vlan_tag; >> + vlan_vid->push_vlan_if_needed = true; >> + ofpact_put_OUTPUT(ofpacts_p)->port = port_ofport; >> + ofpact_put_STRIP_VLAN(ofpacts_p); >> + >> + ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 150, >> 0, >> + &match, ofpacts_p); >> + } >> + } >> + >> match_init_catchall(&match); >> ofpbuf_clear(ofpacts_p); >> diff --git a/ovn/lib/logical-fields.c b/ovn/lib/logical-fields.c >> index a8b5e3c..b9efa02 100644 >> --- a/ovn/lib/logical-fields.c >> +++ b/ovn/lib/logical-fields.c >> @@ -105,6 +105,10 @@ ovn_init_symtab(struct shash *symtab) >> MLF_FORCE_SNAT_FOR_LB_BIT); >> expr_symtab_add_subfield(symtab, "flags.force_snat_for_lb", NULL, >> flags_str); >> + snprintf(flags_str, sizeof flags_str, "flags[%d]", >> + MLF_RCV_FROM_VLAN_BIT); >> + expr_symtab_add_subfield(symtab, "flags.rcv_from_vlan", NULL, >> + flags_str); >> /* Connection tracking state. */ >> expr_symtab_add_field(symtab, "ct_mark", MFF_CT_MARK, NULL, false); >> diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h >> index b1dbb03..96250fd 100644 >> --- a/ovn/lib/logical-fields.h >> +++ b/ovn/lib/logical-fields.h >> @@ -50,6 +50,7 @@ enum mff_log_flags_bits { >> MLF_FORCE_SNAT_FOR_DNAT_BIT = 2, >> MLF_FORCE_SNAT_FOR_LB_BIT = 3, >> MLF_LOCAL_ONLY_BIT = 4, >> + MLF_RCV_FROM_VLAN_BIT = 5, >> }; >> /* MFF_LOG_FLAGS_REG flag assignments */ >> @@ -75,6 +76,7 @@ enum mff_log_flags { >> * hypervisors should instead only be output to local targets >> */ >> MLF_LOCAL_ONLY = (1 << MLF_LOCAL_ONLY_BIT), >> + MLF_RCV_FROM_VLAN = (1 << MLF_RCV_FROM_VLAN_BIT), >> }; >> #endif /* ovn/lib/logical-fields.h */ >> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c >> index 72fe4e7..350f015 100644 >> --- a/ovn/northd/ovn-northd.c >> +++ b/ovn/northd/ovn-northd.c >> @@ -4411,6 +4411,28 @@ add_route(struct hmap *lflows, const struct >> ovn_port *op, >> * routing. */ >> ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, priority, >> ds_cstr(&match), ds_cstr(&actions)); >> + >> + /* When output port is distributed gateway port, check if the router >> + * input port is a patch port connected to vlan network. >> + * Traffic from VLAN network to external network should be redirected >> + * to "redirect-chassis" by setting REGBIT_NAT_REDIRECT flag. >> + * Later physical table 32 will output this traffic to gateway >> + * chassis using input network vlan tag */ >> + if (op == op->od->l3dgw_port) { >> + ds_clear(&match); >> + ds_clear(&actions); >> + >> + ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", >> + dir, network_s, plen); >> + ds_put_format(&match, " && flags.rcv_from_vlan == 1"); >> + ds_put_format(&match, " && !is_chassis_resident(%s)", >> + op->od->l3redirect_port->json_key); >> + >> + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, >> + priority + 1, ds_cstr(&match), >> + REGBIT_NAT_REDIRECT" = 1; next;"); >> + } >> + >> ds_destroy(&match); >> ds_destroy(&actions); >> } >> @@ -4822,6 +4844,19 @@ build_lrouter_flows(struct hmap *datapaths, struct >> hmap *ports, >> } >> ovn_lflow_add(lflows, op->od, S_ROUTER_IN_ADMISSION, 50, >> ds_cstr(&match), "next;"); >> + >> + /* VLAN traffic from localnet port should be allowed for >> + * router processing on the "redirect-chassis". */ >> + if (op->od->l3dgw_port && op->od->l3redirect_port && op->peer && >> + op->peer->od->localnet_port && (op != op->od->l3dgw_port)) { >> + ds_clear(&match); >> + ds_put_format(&match, "flags.rcv_from_vlan == 1"); >> + ds_put_format(&match, " && inport == %s", op->json_key); >> + ds_put_format(&match, " && is_chassis_resident(%s)", >> + op->od->l3redirect_port->json_key); >> + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_ADMISSION, 100, >> + ds_cstr(&match), "next;"); >> + } >> } >> /* Logical router ingress table 1: IP Input. */ >> diff --git a/tests/ovn.at b/tests/ovn.at >> index 6553d17..5ae767d 100644 >> --- a/tests/ovn.at >> +++ b/tests/ovn.at >> @@ -7713,6 +7713,284 @@ test_ip_packet gw2 gw1 >> OVN_CLEANUP([hv1],[gw1],[gw2],[ext1]) >> AT_CLEANUP >> +# VLAN traffic for external network redirected through distributed >> router gateway port >> +# should use vlans(i.e input network vlan tag) across hypervisors >> instead of tunneling. >> +AT_SETUP([ovn -- vlan traffic for external network with distributed >> router gateway port]) >> +AT_SKIP_IF([test $HAVE_PYTHON = no]) >> +ovn_start >> + >> +# Logical network: >> +# # One LR R1 that has switches foo (192.168.1.0/24) and >> +# # alice (172.16.1.0/24) connected to it. The logical port >> +# # between R1 and alice has a "redirect-chassis" specified, >> +# # i.e. it is the distributed router gateway port(172.16.1.6). >> +# # Switch alice also has a localnet port defined. >> +# # An additional switch outside has the same subnet as alice >> +# # (172.16.1.0/24), a localnet port and nexthop port(172.16.1.1) >> +# # which will receive the packet destined for external network >> +# # (i.e 8.8.8.8 as destination ip). >> + >> +# Physical network: >> +# # Three hypervisors hv[123]. >> +# # hv1 hosts vif foo1. >> +# # hv2 is the "redirect-chassis" that hosts the distributed router >> gateway port. >> +# # hv3 hosts nexthop port vif outside1. >> +# # All other tests connect hypervisors to network n1 through br-phys >> for tunneling. >> +# # But in this test, hv1 won't connect to n1(and no br-phys in hv1), and >> +# # in order to show vlans(instead of tunneling) used between hv1 and >> hv2, >> +# # a new network n2 created and hv1 and hv2 connected to this network >> through br-ex. >> +# # hv2 and hv3 are still connected to n1 network through br-phys. >> +net_add n1 >> + >> +# We are not calling ovn_attach for hv1, to avoid adding br-phys. >> +# Tunneling won't work in hv1 as ovn-encap-ip is not added to any bridge >> in hv1 >> +sim_add hv1 >> +as hv1 >> +ovs-vsctl \ >> + -- set Open_vSwitch . external-ids:system-id=hv1 \ >> + -- set Open_vSwitch . >> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock >> \ >> + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve,vxlan \ >> + -- set Open_vSwitch . external-ids:ovn-encap-ip=192.168.0.1 \ >> + -- add-br br-int \ >> + -- set bridge br-int fail-mode=secure other-config:disable-in-band=true >> \ >> + -- set Open_vSwitch . external-ids:ovn-bridge-mappings=public:br-ex >> + >> +start_daemon ovn-controller >> +ovs-vsctl -- add-port br-int hv1-vif1 -- \ >> + set interface hv1-vif1 external-ids:iface-id=foo1 \ >> + ofport-request=1 >> + >> +sim_add hv2 >> +as hv2 >> +ovs-vsctl add-br br-phys >> +ovn_attach n1 br-phys 192.168.0.2 >> +ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappin >> gs="public:br-ex,phys:br-phys" >> + >> +sim_add hv3 >> +as hv3 >> +ovs-vsctl add-br br-phys >> +ovn_attach n1 br-phys 192.168.0.3 >> +ovs-vsctl -- add-port br-int hv3-vif1 -- \ >> + set interface hv3-vif1 external-ids:iface-id=outside1 \ >> + options:tx_pcap=hv3/vif1-tx.pcap \ >> + options:rxq_pcap=hv3/vif1-rx.pcap \ >> + ofport-request=1 >> +ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappin >> gs="phys:br-phys" >> + >> +# Create network n2 for vlan connectivity between hv1 and hv2 >> +net_add n2 >> + >> +as hv1 >> +ovs-vsctl add-br br-ex >> +net_attach n2 br-ex >> + >> +as hv2 >> +ovs-vsctl add-br br-ex >> +net_attach n2 br-ex >> + >> +OVN_POPULATE_ARP >> + >> +ovn-nbctl create Logical_Router name=R1 >> + >> +ovn-nbctl ls-add foo >> +ovn-nbctl ls-add alice >> +ovn-nbctl ls-add outside >> + >> +# Connect foo to R1 >> +ovn-nbctl lrp-add R1 foo 00:00:01:01:02:03 192.168.1.1/24 >> +ovn-nbctl lsp-add foo rp-foo -- set Logical_Switch_Port rp-foo \ >> + type=router options:router-port=foo \ >> + -- lsp-set-addresses rp-foo router >> + >> +# Connect alice to R1 as distributed router gateway port (172.16.1.6) on >> hv2 >> +ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.6/24 \ >> + -- set Logical_Router_Port alice options:redirect-chassis="hv2" >> +ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port rp-alice \ >> + type=router options:router-port=alice \ >> + -- lsp-set-addresses rp-alice router \ >> + >> + >> +# Create logical port foo1 in foo >> +ovn-nbctl lsp-add foo foo1 \ >> +-- lsp-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2" >> + >> +# Create logical port outside1 in outside, which is a nexthop address >> +# for 172.16.1.0/24 >> +ovn-nbctl lsp-add outside outside1 \ >> +-- lsp-set-addresses outside1 "f0:00:00:01:02:04 172.16.1.1" >> + >> +# Set default gateway (nexthop) to 172.16.1.1 >> +ovn-nbctl lr-route-add R1 "0.0.0.0/0" 172.16.1.1 alice >> +AT_CHECK([ovn-nbctl lr-nat-add R1 snat 172.16.1.6 192.168.1.1/24]) >> +ovn-nbctl set Logical_Switch_Port rp-alice options:nat-addresses=router >> + >> +ovn-nbctl lsp-add foo ln-foo >> +ovn-nbctl lsp-set-addresses ln-foo unknown >> +ovn-nbctl lsp-set-options ln-foo network_name=public >> +ovn-nbctl lsp-set-type ln-foo localnet >> +AT_CHECK([ovn-nbctl set Logical_Switch_Port ln-foo tag=2]) >> + >> +# Create localnet port in alice >> +ovn-nbctl lsp-add alice ln-alice >> +ovn-nbctl lsp-set-addresses ln-alice unknown >> +ovn-nbctl lsp-set-type ln-alice localnet >> +ovn-nbctl lsp-set-options ln-alice network_name=phys >> + >> +# Create localnet port in outside >> +ovn-nbctl lsp-add outside ln-outside >> +ovn-nbctl lsp-set-addresses ln-outside unknown >> +ovn-nbctl lsp-set-type ln-outside localnet >> +ovn-nbctl lsp-set-options ln-outside network_name=phys >> + >> +# Allow some time for ovn-northd and ovn-controller to catch up. >> +# XXX This should be more systematic. >> +ovn-nbctl --wait=hv --timeout=3 sync >> + >> +echo "---------NB dump-----" >> +ovn-nbctl show >> +echo "---------------------" >> +ovn-nbctl list logical_router >> +echo "---------------------" >> +ovn-nbctl list nat >> +echo "---------------------" >> +ovn-nbctl list logical_router_port >> +echo "---------------------" >> + >> +echo "---------SB dump-----" >> +ovn-sbctl list datapath_binding >> +echo "---------------------" >> +ovn-sbctl list port_binding >> +echo "---------------------" >> +ovn-sbctl dump-flows >> +echo "---------------------" >> +ovn-sbctl list chassis >> +echo "---------------------" >> + >> +for chassis in hv1 hv2 hv3; do >> + as $chassis >> + echo "------ $chassis dump ----------" >> + ovs-vsctl show br-int >> + ovs-ofctl show br-int >> + ovs-ofctl dump-flows br-int >> + echo "--------------------------" >> +done >> + >> +ip_to_hex() { >> + printf "%02x%02x%02x%02x" "$@" >> +} >> + >> +foo1_ip=$(ip_to_hex 192 168 1 2) >> +gw_ip=$(ip_to_hex 172 16 1 6) >> +dst_ip=$(ip_to_hex 8 8 8 8) >> +nexthop_ip=$(ip_to_hex 172 16 1 1) >> + >> +foo1_mac="f00000010203" >> +foo_mac="000001010203" >> +gw_mac="000002010203" >> +nexthop_mac="f00000010204" >> + >> +# Send ip packet from foo1 to 8.8.8.8 >> +src_mac="f00000010203" >> +dst_mac="000001010203" >> +packet=${foo_mac}${foo1_mac}08004500001c0000000040110000${f >> oo1_ip}${dst_ip}0035111100080000 >> + >> +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet >> +sleep 2 >> + >> +# ARP request packet for nexthop_ip to expect at outside1 >> +arp_request=ffffffffffff${gw_mac}08060001080006040001${gw_m >> ac}${gw_ip}000000000000${nexthop_ip} >> +echo $arp_request >> hv3-vif1.expected >> +cat hv3-vif1.expected > expout >> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv3/vif1-tx.pcap | grep >> ${nexthop_ip} | uniq > hv3-vif1 >> +AT_CHECK([sort hv3-vif1], [0], [expout]) >> + >> +# Send ARP reply from outside1 back to the router >> +reply_mac="f00000010204" >> +arp_reply=${gw_mac}${nexthop_mac}08060001080006040002${next >> hop_mac}${nexthop_ip}${gw_mac}${gw_ip} >> + >> +as hv3 ovs-appctl netdev-dummy/receive hv3-vif1 $arp_reply >> +OVS_WAIT_UNTIL([ >> + test `as hv2 ovs-ofctl dump-flows br-int | grep table=66 | \ >> +grep actions=mod_dl_dst:f0:00:00:01:02:04 | wc -l` -eq 1 >> + ]) >> + >> +echo "ovn-sbctl list MAC_Binding" >> +ovn-sbctl list MAC_Binding >> +echo "============================" >> + >> +# VLAN tagged packet with distributed gateway port(172.16.1.6) MAC as >> destination MAC >> +# is expected on bridge connecting hv1 and hv2 >> +expected=${gw_mac}${foo1_mac}8100000208004500001c0000000040 >> 110000${foo1_ip}${dst_ip}0035111100080000 >> +echo $expected > hv1-br-ex_n2.expected >> + >> +# Packet to Expect at outside1 i.e nexthop(172.16.1.1) port. >> +# As connection tracking not enabled for this test, snat can't be done >> on the packet. >> +# We still see foo1 as the source ip address. But source mac(gateway >> MAC) and >> +# dest mac(nexthop mac) are properly configured. >> +expected=${nexthop_mac}${gw_mac}08004500001c000000003f11010 >> 0${foo1_ip}${dst_ip}0035111100080000 >> +echo $expected > hv3-vif1.expected >> + >> +reset_pcap_file() { >> + local iface=$1 >> + local pcap_file=$2 >> + ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \ >> +options:rxq_pcap=dummy-rx.pcap >> + rm -f ${pcap_file}*.pcap >> + ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap >> \ >> +options:rxq_pcap=${pcap_file}-rx.pcap >> +} >> + >> +as hv1 reset_pcap_file br-ex_n2 hv1/br-ex_n2 >> +as hv3 reset_pcap_file hv3-vif1 hv3/vif1 >> +sleep 2 >> +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet >> +sleep 2 >> + >> +# On hv1, table 65 for packets going from vlan switch pipleline to >> router pipleine >> +# set MLF_RCV_FROM_VLAN flag >> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=65 | grep >> "priority=100,reg15=0x1,metadata=0x2" \ >> +| grep actions=clone | grep "load:0x1->NXM_NX_REG10" | wc -l], [0], [[1 >> +]]) >> +# On hv1, because of snat rule in table 15, a higher priority(i.e 2) flow >> +# added for packets with MLF_RCV_FROM_VLAN flag with output as >> distributed >> +# gateway port, which sets REGBIT_NAT_REDIRECT flag >> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=15 | grep >> "priority=2,ip,reg10=0x20/0x20,metadata=0x1" \ >> +| grep "actions=load:0x1->OXM_OF_PKT_REG4" | wc -l], [0], [[1 >> +]]) >> + >> +# On hv1, table 32 flow which tags packet with source network vlan tag >> and sends it to hv2 >> +# through br-ex >> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 | grep >> "priority=150,reg10=0x20/0x20,reg14=0x1,reg15=0x3,metadata=0x1" \ >> +| grep "actions=mod_vlan_vid:2" | grep "n_packets=2," | wc -l], [0], [[1 >> +]]) >> + >> +# On hv2 table 0, vlan tagged packet is sent through router pipeline >> +# by setting MLF_RCV_FROM_VLAN flag (REG10) >> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep "table=0," | grep >> "priority=200" | grep "dl_vlan=2" | \ >> +grep "dl_dst=00:00:02:01:02:03" | grep >> "actions=strip_vlan,load:0x1->OXM_OF_METADATA" >> | grep "load:0x1->NXM_NX_REG14" | \ >> +grep "load:0x1->NXM_NX_REG10" | wc -l], [0], [[1 >> +]]) >> +# on hv2 table 8, allow packets with router metadata and with >> MLF_RCV_FROM_VLAN flag >> +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=8 | grep >> "priority=100,reg10=0x20/0x20,reg14=0x1,metadata=0x1" | wc -l], [0], [[1 >> +]]) >> + >> +ip_packet() { >> + grep "2010203f00000010203" >> +} >> + >> +# Check vlan tagged packet on the bridge connecting hv1 and hv2 >> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/br-ex_n2-tx.pcap | >> ip_packet | uniq > hv1-br-ex_n2 >> +cat hv1-br-ex_n2.expected > expout >> +AT_CHECK([sort hv1-br-ex_n2], [0], [expout]) >> + >> +# Check expected packet on nexthop interface >> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv3/vif1-tx.pcap | grep >> ${foo1_ip}${dst_ip} | uniq > hv3-vif1 >> +cat hv3-vif1.expected > expout >> +AT_CHECK([sort hv3-vif1], [0], [expout]) >> + >> +OVN_CLEANUP([hv1],[hv2],[hv3]) >> +AT_CLEANUP >> + >> AT_SETUP([ovn -- 1 LR with distributed router gateway port]) >> AT_SKIP_IF([test $HAVE_PYTHON = no]) >> ovn_start >> >> > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
