Thanks! I'm working on code changes according to your comments. I think we need more discussion about the ethertype matching. Please see inline.
Thanks, Xiao On Thu, Jul 28, 2016 at 2:40 AM, Ben Pfaff <b...@ovn.org> wrote: > On Tue, Jul 12, 2016 at 11:38:54PM +0800, Xiao Liang wrote: >> Flow key handleing changes: >> - Add VLAN header array in struct flow, to record multiple 802.1q VLAN >> headers. >> - Add dpif multi-VLAN capability probing. If datapath supports multi-VLAN, >> increase the maximum depth of nested OVS_KEY_ATTR_ENCAP. >> >> Refacter VLAN handling in dpif-xlate: >> - Introduce 'xvlan' to track VLAN stack during flow processing. >> - Input and output VLAN translation according to the xbundle type. >> >> Push VLAN action support: >> - Allow ethertype 0x88a8 in VLAN headers and push_vlan action. >> - Support push_vlan on dot1q packets. >> >> Add new port VLAN mode "dot1q-tunnel": >> - Example: >> ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 >> Pushes another VLAN 100 header on packets (tagged and untagged) on ingress, >> and pops it on egress. >> - Customer VLAN check: >> ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 cvlans=10,20 >> Only customer VLAN of 10 and 20 are allowed. >> >> Signed-off-by: Xiao Liang <shaw.l...@gmail.com> > > Thanks for working on this! I can see that it was a great deal of > work. In addition to the "sparse" related fixes from a few days ago, > I have some more detailed comments. > > It looks like this patch causes some tests to fail and then later > patches fix them. This is not our practice: a patch should never make > tests fails (at least not intentionally). Instead, if a patch > requires changes to tests, then the same patch should update the > tests. If the later patches are required to fix tests, they should be > folded into this one. I will squash the commits. > > Our practice is to give arrays names that are plural, like 'vlans'. > > Let's talk about VLANs in the flow struct. The use of an array of > TPID+TCI seems fine. I think, however, that we need to define and > enforce an invariant: if vlan[i] matches on anything, for i > 0, then > every vlan[j], for j < i, needs to match on (tci & VLAN_CFI) != 0. > Otherwise we can end up with nonsensical matches like one that matches > on vlan[0] being not present (i.e. match on (vlan[0] & VLAN_CFI) == 0) > but vlan[1] being present (i.e. match on (vlan[1] & VLAN_CFI) != 0). > We enforce a similar invariant for MPLS. > You are right, thanks. > I'm concerned about backward compatibility. Consider some application > built on Open vSwitch using OpenFlow. Today, it can distinguish > single-tagged and double-tagged packets (that use outer Ethertype > 0x8100), as follows: > > - A single-tagged packet has vlan_tci != 0 and some non-VLAN > dl_type. > > - A double-tagged packet has vlan_tci != 0 and a VLAN dl_type. > > With this patch, this won't work, because neither kind of packet has a > VLAN dl_type. Instead, applications need to first match on the outer > VLAN, then pop it off, then match on the inner VLAN. This difference > could lead to security problems in applications. An application > might, for example, want to pop an outer VLAN and forward the packet, > but only if there is no inner VLAN. If it is implemented according to > the previous rules, then it will not notice the inner VLAN. Maybe some applications are implemented this way, but they are probably wrong. OpenFlow says eth_type is "ethernet type of the OpenFlow packet payload, after VLAN tags", so it is the payload ethtype for a double-tagged packet. It's the same for single-tagged packet: application must explicitly match vlan_tci to decide whether it has VLAN tag. The problem is that there is currently no way to peek inner VLAN without popping the outer one. I heard from Tom that you have plan to support TPID matching. Is it possible to add extension match fields like TPID1, TPID2 to match both TPIDs? This may also provide a way to differentiate 0x8100 and 0x88a8. > > There are probably multiple ways to deal with this problem. Let me > propose one. It is somewhat awkward, so there might be a more > graceful way. Basically the idea is to retain the current view from > an OpenFlow perspective: > > - Packet without tags: vlan_tci == 0, dl_type is non-VLAN > - Packet with 1 tag: vlan_tci != 0, dl_type is non-VLAN > - Packet with 2+ tags: vlan_tci != 0, dl_type is 2nd VLAN > > So, when a packet with 2 tags pops off the outermost one, dl_type > becomes the inner Ethertype (such as IPv4) and vlan_tci becomes the > single remaining VLAN. > > I think there's another backward compatibility risk. This patch adds > support for TPID 0x88a8 without adding any way for OpenFlow > applications to distinguish 88a8 from 8100. This means that > applications that previously handled 0x8100 VLANs will now handle > 0x88a8 VLANs whereas previously they were able to filter these out. I > don't have a wonderful idea on how to handle this but I think we need > some way. (The OpenFlow spec is totally unhelpful here.) > > The tests seem incomplete in that they do not seem to add much in the > way of tests for OpenFlow handling of multiple VLANs. I'd feel more > confident if it added some. > I found some tests added in Tom's patches. I'm trying to include them and other tests. > In a few places it would be more convenient to make struct > flow_vlan_hdr into a union, like this: > > union flow_vlan_hdr { > ovs_be32 qtag; > struct { > ovs_be16 tpid; /* ETH_TYPE_VLAN_DOT1Q or ETH_TYPE_DOT1AD */ > ovs_be16 tci; > }; > }; That's a good idea. > > We could take advantage of it like this, for example: > > @@ -326,16 +326,12 @@ parse_mpls(const void **datap, size_t *sizep) > } > > static inline bool > -parse_vlan(const void **datap, size_t *sizep, struct flow_vlan_hdr > *vlan_hdrs) > +parse_vlan(const void **datap, size_t *sizep, union flow_vlan_hdr *vlan_hdrs) > { > int encaps; > const ovs_be16 *eth_type; > - struct qtag_prefix { > - ovs_be16 eth_type; > - ovs_be16 tci; > - }; > > - memset(vlan_hdrs, 0, sizeof(struct flow_vlan_hdr) * > FLOW_MAX_VLAN_HEADERS); > + memset(vlan_hdrs, 0, sizeof(union flow_vlan_hdr) * > FLOW_MAX_VLAN_HEADERS); > data_pull(datap, sizep, ETH_ADDR_LEN * 2); > > eth_type = *datap; > @@ -343,11 +339,11 @@ parse_vlan(const void **datap, size_t *sizep, struct > flow_vlan_hdr *vlan_hdrs) > for (encaps = 0; > eth_type_vlan(*eth_type) && encaps < FLOW_MAX_VLAN_HEADERS; > encaps++) { > - if (OVS_LIKELY(*sizep > - >= sizeof(struct qtag_prefix) + sizeof(ovs_be16))) { > - const struct qtag_prefix *qp = data_pull(datap, sizep, sizeof > *qp); > - vlan_hdrs[encaps].tpid = qp->eth_type; > - vlan_hdrs[encaps].tci = qp->tci | htons(VLAN_CFI); > + if (OVS_LIKELY(*sizep >= sizeof(ovs_be32) + sizeof(ovs_be16))) { > + const ovs_16aligned_be32 *qp = data_pull(datap, sizep, sizeof > *qp); > + vlan_hdrs[encaps].qtag = get_16aligned_be32(qp); > + vlan_hdrs[encaps].tci |= htons(VLAN_CFI); > + > eth_type = *datap; > } else { > return false; > diff --git a/lib/flow.h b/lib/flow.h > index 4882e26..4851a32 100644 > --- a/lib/flow.h > +++ b/lib/flow.h > @@ -735,8 +735,8 @@ static inline uint16_t > miniflow_get_vid(const struct miniflow *flow, size_t n) > { > if (n < FLOW_MAX_VLAN_HEADERS) { > - uint32_t u32 = MINIFLOW_GET_U32(flow, vlan[n]); > - return vlan_tci_to_vid(((struct flow_vlan_hdr *)&u32)->tci); > + union flow_vlan_hdr hdr = { .qtag = MINIFLOW_GET_BE32(flow, vlan[n]) > }; > + return vlan_tci_to_vid(hdr.tci); > } > return 0; > } > diff --git a/lib/odp-util.c b/lib/odp-util.c > index 56a6145..6838d7d 100644 > --- a/lib/odp-util.c > +++ b/lib/odp-util.c > @@ -5560,8 +5560,7 @@ commit_vlan_action(const struct flow* flow, struct flow > *base, > for (base_n--, flow_n--; > base_n >= 0 && flow_n >= 0; > base_n--, flow_n--) { > - if (memcmp(&base->vlan[base_n], &flow->vlan[flow_n], > - sizeof(struct flow_vlan_hdr))) { > + if (base->vlan[base_n].qtag != flow->vlan[flow_n].qtag) { > break; > } > } > @@ -5569,7 +5568,7 @@ commit_vlan_action(const struct flow* flow, struct flow > *base, > /* Pop all mismatching vlan of base, push thoses of flow */ > for (; base_n >= 0; base_n--) { > nl_msg_put_flag(odp_actions, OVS_ACTION_ATTR_POP_VLAN); > - memset(&wc->masks.vlan[base_n], 0xff, > sizeof(wc->masks.vlan[base_n])); > + wc->masks.vlan[base_n] = OVS_BE32_MAX; > } > > for (; flow_n >= 0; flow_n--) { > > It seems risky to fail to add a way for match_format() to print out > matches past the first VLAN. It is likely to cause great confusion at > some point. > Yes, I need to think carefully about this. > This code uses the term "shift" for what is usually termed "push". A > "shift" can go in either direction. I'd use "push". > Yes, "push" looks symmetric. I used "shift" because it leaves room for a header rather than push data. > I suggest that flow_{shift,push}_vlan() should take an optional 'wc' > argument like so many other functions that modify flows. This would > make it more like the MPLS code and thus easier to understand. > > I'm not confident about the change to handle_packet_upcall(). > > struct xvlan xvlan might be easier to use as just a flow_vlan_hdr. flow_vlan_hdr is in big-endian and in which vlan-id/pcp are encoded. I defined struct xvlan to avoid conversion each time. > > The commit_vlan_action() function would be better if it was more like > commit_mpls_action(), to make it easier to verify correctness. The > main difference is that at the datapath level VLAN only has push and > pop, not "set", but that's a straightforward change. I'd introduce > functions to count and find common VLAN tags, like we have for MPLS. Agree, extracting a function makes it more clear. > > I'll probably have further comments on later versions of this patch. > > Thanks again for working on this! _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev