On Tue, Jul 12, 2016 at 11:38:54PM +0800, Xiao Liang wrote: > Flow key handleing changes: > - Add VLAN header array in struct flow, to record multiple 802.1q VLAN > headers. > - Add dpif multi-VLAN capability probing. If datapath supports multi-VLAN, > increase the maximum depth of nested OVS_KEY_ATTR_ENCAP. > > Refacter VLAN handling in dpif-xlate: > - Introduce 'xvlan' to track VLAN stack during flow processing. > - Input and output VLAN translation according to the xbundle type. > > Push VLAN action support: > - Allow ethertype 0x88a8 in VLAN headers and push_vlan action. > - Support push_vlan on dot1q packets. > > Add new port VLAN mode "dot1q-tunnel": > - Example: > ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 > Pushes another VLAN 100 header on packets (tagged and untagged) on ingress, > and pops it on egress. > - Customer VLAN check: > ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 cvlans=10,20 > Only customer VLAN of 10 and 20 are allowed. > > Signed-off-by: Xiao Liang <shaw.l...@gmail.com>
Thanks for working on this! I can see that it was a great deal of work. In addition to the "sparse" related fixes from a few days ago, I have some more detailed comments. It looks like this patch causes some tests to fail and then later patches fix them. This is not our practice: a patch should never make tests fails (at least not intentionally). Instead, if a patch requires changes to tests, then the same patch should update the tests. If the later patches are required to fix tests, they should be folded into this one. Our practice is to give arrays names that are plural, like 'vlans'. Let's talk about VLANs in the flow struct. The use of an array of TPID+TCI seems fine. I think, however, that we need to define and enforce an invariant: if vlan[i] matches on anything, for i > 0, then every vlan[j], for j < i, needs to match on (tci & VLAN_CFI) != 0. Otherwise we can end up with nonsensical matches like one that matches on vlan[0] being not present (i.e. match on (vlan[0] & VLAN_CFI) == 0) but vlan[1] being present (i.e. match on (vlan[1] & VLAN_CFI) != 0). We enforce a similar invariant for MPLS. I'm concerned about backward compatibility. Consider some application built on Open vSwitch using OpenFlow. Today, it can distinguish single-tagged and double-tagged packets (that use outer Ethertype 0x8100), as follows: - A single-tagged packet has vlan_tci != 0 and some non-VLAN dl_type. - A double-tagged packet has vlan_tci != 0 and a VLAN dl_type. With this patch, this won't work, because neither kind of packet has a VLAN dl_type. Instead, applications need to first match on the outer VLAN, then pop it off, then match on the inner VLAN. This difference could lead to security problems in applications. An application might, for example, want to pop an outer VLAN and forward the packet, but only if there is no inner VLAN. If it is implemented according to the previous rules, then it will not notice the inner VLAN. There are probably multiple ways to deal with this problem. Let me propose one. It is somewhat awkward, so there might be a more graceful way. Basically the idea is to retain the current view from an OpenFlow perspective: - Packet without tags: vlan_tci == 0, dl_type is non-VLAN - Packet with 1 tag: vlan_tci != 0, dl_type is non-VLAN - Packet with 2+ tags: vlan_tci != 0, dl_type is 2nd VLAN So, when a packet with 2 tags pops off the outermost one, dl_type becomes the inner Ethertype (such as IPv4) and vlan_tci becomes the single remaining VLAN. I think there's another backward compatibility risk. This patch adds support for TPID 0x88a8 without adding any way for OpenFlow applications to distinguish 88a8 from 8100. This means that applications that previously handled 0x8100 VLANs will now handle 0x88a8 VLANs whereas previously they were able to filter these out. I don't have a wonderful idea on how to handle this but I think we need some way. (The OpenFlow spec is totally unhelpful here.) The tests seem incomplete in that they do not seem to add much in the way of tests for OpenFlow handling of multiple VLANs. I'd feel more confident if it added some. In a few places it would be more convenient to make struct flow_vlan_hdr into a union, like this: union flow_vlan_hdr { ovs_be32 qtag; struct { ovs_be16 tpid; /* ETH_TYPE_VLAN_DOT1Q or ETH_TYPE_DOT1AD */ ovs_be16 tci; }; }; We could take advantage of it like this, for example: @@ -326,16 +326,12 @@ parse_mpls(const void **datap, size_t *sizep) } static inline bool -parse_vlan(const void **datap, size_t *sizep, struct flow_vlan_hdr *vlan_hdrs) +parse_vlan(const void **datap, size_t *sizep, union flow_vlan_hdr *vlan_hdrs) { int encaps; const ovs_be16 *eth_type; - struct qtag_prefix { - ovs_be16 eth_type; - ovs_be16 tci; - }; - memset(vlan_hdrs, 0, sizeof(struct flow_vlan_hdr) * FLOW_MAX_VLAN_HEADERS); + memset(vlan_hdrs, 0, sizeof(union flow_vlan_hdr) * FLOW_MAX_VLAN_HEADERS); data_pull(datap, sizep, ETH_ADDR_LEN * 2); eth_type = *datap; @@ -343,11 +339,11 @@ parse_vlan(const void **datap, size_t *sizep, struct flow_vlan_hdr *vlan_hdrs) for (encaps = 0; eth_type_vlan(*eth_type) && encaps < FLOW_MAX_VLAN_HEADERS; encaps++) { - if (OVS_LIKELY(*sizep - >= sizeof(struct qtag_prefix) + sizeof(ovs_be16))) { - const struct qtag_prefix *qp = data_pull(datap, sizep, sizeof *qp); - vlan_hdrs[encaps].tpid = qp->eth_type; - vlan_hdrs[encaps].tci = qp->tci | htons(VLAN_CFI); + if (OVS_LIKELY(*sizep >= sizeof(ovs_be32) + sizeof(ovs_be16))) { + const ovs_16aligned_be32 *qp = data_pull(datap, sizep, sizeof *qp); + vlan_hdrs[encaps].qtag = get_16aligned_be32(qp); + vlan_hdrs[encaps].tci |= htons(VLAN_CFI); + eth_type = *datap; } else { return false; diff --git a/lib/flow.h b/lib/flow.h index 4882e26..4851a32 100644 --- a/lib/flow.h +++ b/lib/flow.h @@ -735,8 +735,8 @@ static inline uint16_t miniflow_get_vid(const struct miniflow *flow, size_t n) { if (n < FLOW_MAX_VLAN_HEADERS) { - uint32_t u32 = MINIFLOW_GET_U32(flow, vlan[n]); - return vlan_tci_to_vid(((struct flow_vlan_hdr *)&u32)->tci); + union flow_vlan_hdr hdr = { .qtag = MINIFLOW_GET_BE32(flow, vlan[n]) }; + return vlan_tci_to_vid(hdr.tci); } return 0; } diff --git a/lib/odp-util.c b/lib/odp-util.c index 56a6145..6838d7d 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -5560,8 +5560,7 @@ commit_vlan_action(const struct flow* flow, struct flow *base, for (base_n--, flow_n--; base_n >= 0 && flow_n >= 0; base_n--, flow_n--) { - if (memcmp(&base->vlan[base_n], &flow->vlan[flow_n], - sizeof(struct flow_vlan_hdr))) { + if (base->vlan[base_n].qtag != flow->vlan[flow_n].qtag) { break; } } @@ -5569,7 +5568,7 @@ commit_vlan_action(const struct flow* flow, struct flow *base, /* Pop all mismatching vlan of base, push thoses of flow */ for (; base_n >= 0; base_n--) { nl_msg_put_flag(odp_actions, OVS_ACTION_ATTR_POP_VLAN); - memset(&wc->masks.vlan[base_n], 0xff, sizeof(wc->masks.vlan[base_n])); + wc->masks.vlan[base_n] = OVS_BE32_MAX; } for (; flow_n >= 0; flow_n--) { It seems risky to fail to add a way for match_format() to print out matches past the first VLAN. It is likely to cause great confusion at some point. This code uses the term "shift" for what is usually termed "push". A "shift" can go in either direction. I'd use "push". I suggest that flow_{shift,push}_vlan() should take an optional 'wc' argument like so many other functions that modify flows. This would make it more like the MPLS code and thus easier to understand. I'm not confident about the change to handle_packet_upcall(). struct xvlan xvlan might be easier to use as just a flow_vlan_hdr. The commit_vlan_action() function would be better if it was more like commit_mpls_action(), to make it easier to verify correctness. The main difference is that at the datapath level VLAN only has push and pop, not "set", but that's a straightforward change. I'd introduce functions to count and find common VLAN tags, like we have for MPLS. I'll probably have further comments on later versions of this patch. Thanks again for working on this! _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev