On Tue, Jul 12, 2016 at 11:38:54PM +0800, Xiao Liang wrote:
> Flow key handleing changes:
> - Add VLAN header array in struct flow, to record multiple 802.1q VLAN
>   headers.
> - Add dpif multi-VLAN capability probing. If datapath supports multi-VLAN,
>   increase the maximum depth of nested OVS_KEY_ATTR_ENCAP.
> 
> Refacter VLAN handling in dpif-xlate:
> - Introduce 'xvlan' to track VLAN stack during flow processing.
> - Input and output VLAN translation according to the xbundle type.
> 
> Push VLAN action support:
> - Allow ethertype 0x88a8 in VLAN headers and push_vlan action.
> - Support push_vlan on dot1q packets.
> 
> Add new port VLAN mode "dot1q-tunnel":
> - Example:
>     ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100
>   Pushes another VLAN 100 header on packets (tagged and untagged) on ingress,
>   and pops it on egress.
> - Customer VLAN check:
>     ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 cvlans=10,20
>   Only customer VLAN of 10 and 20 are allowed.
> 
> Signed-off-by: Xiao Liang <shaw.l...@gmail.com>

Thanks for working on this!  I can see that it was a great deal of
work.  In addition to the "sparse" related fixes from a few days ago,
I have some more detailed comments.

It looks like this patch causes some tests to fail and then later
patches fix them.  This is not our practice: a patch should never make
tests fails (at least not intentionally).  Instead, if a patch
requires changes to tests, then the same patch should update the
tests.  If the later patches are required to fix tests, they should be
folded into this one.

Our practice is to give arrays names that are plural, like 'vlans'.

Let's talk about VLANs in the flow struct.  The use of an array of
TPID+TCI seems fine.  I think, however, that we need to define and
enforce an invariant: if vlan[i] matches on anything, for i > 0, then
every vlan[j], for j < i, needs to match on (tci & VLAN_CFI) != 0.
Otherwise we can end up with nonsensical matches like one that matches
on vlan[0] being not present (i.e. match on (vlan[0] & VLAN_CFI) == 0)
but vlan[1] being present (i.e. match on (vlan[1] & VLAN_CFI) != 0).
We enforce a similar invariant for MPLS.

I'm concerned about backward compatibility.  Consider some application
built on Open vSwitch using OpenFlow.  Today, it can distinguish
single-tagged and double-tagged packets (that use outer Ethertype
0x8100), as follows:

    - A single-tagged packet has vlan_tci != 0 and some non-VLAN
      dl_type.

    - A double-tagged packet has vlan_tci != 0 and a VLAN dl_type.

With this patch, this won't work, because neither kind of packet has a
VLAN dl_type.  Instead, applications need to first match on the outer
VLAN, then pop it off, then match on the inner VLAN.  This difference
could lead to security problems in applications.  An application
might, for example, want to pop an outer VLAN and forward the packet,
but only if there is no inner VLAN.  If it is implemented according to
the previous rules, then it will not notice the inner VLAN.

There are probably multiple ways to deal with this problem.  Let me
propose one.  It is somewhat awkward, so there might be a more
graceful way.  Basically the idea is to retain the current view from
an OpenFlow perspective:

    - Packet without tags: vlan_tci == 0, dl_type is non-VLAN
    - Packet with 1 tag:   vlan_tci != 0, dl_type is non-VLAN
    - Packet with 2+ tags: vlan_tci != 0, dl_type is 2nd VLAN

So, when a packet with 2 tags pops off the outermost one, dl_type
becomes the inner Ethertype (such as IPv4) and vlan_tci becomes the
single remaining VLAN.

I think there's another backward compatibility risk.  This patch adds
support for TPID 0x88a8 without adding any way for OpenFlow
applications to distinguish 88a8 from 8100.  This means that
applications that previously handled 0x8100 VLANs will now handle
0x88a8 VLANs whereas previously they were able to filter these out.  I
don't have a wonderful idea on how to handle this but I think we need
some way.  (The OpenFlow spec is totally unhelpful here.)

The tests seem incomplete in that they do not seem to add much in the
way of tests for OpenFlow handling of multiple VLANs.  I'd feel more
confident if it added some.

In a few places it would be more convenient to make struct
flow_vlan_hdr into a union, like this:

    union flow_vlan_hdr {
        ovs_be32 qtag;
        struct {
            ovs_be16 tpid;  /* ETH_TYPE_VLAN_DOT1Q or ETH_TYPE_DOT1AD */
            ovs_be16 tci;
        };
    };

We could take advantage of it like this, for example:

@@ -326,16 +326,12 @@ parse_mpls(const void **datap, size_t *sizep)
 }
 
 static inline bool
-parse_vlan(const void **datap, size_t *sizep, struct flow_vlan_hdr *vlan_hdrs)
+parse_vlan(const void **datap, size_t *sizep, union flow_vlan_hdr *vlan_hdrs)
 {
     int encaps;
     const ovs_be16 *eth_type;
-    struct qtag_prefix {
-        ovs_be16 eth_type;
-        ovs_be16 tci;
-    };
 
-    memset(vlan_hdrs, 0, sizeof(struct flow_vlan_hdr) * FLOW_MAX_VLAN_HEADERS);
+    memset(vlan_hdrs, 0, sizeof(union flow_vlan_hdr) * FLOW_MAX_VLAN_HEADERS);
     data_pull(datap, sizep, ETH_ADDR_LEN * 2);
 
     eth_type = *datap;
@@ -343,11 +339,11 @@ parse_vlan(const void **datap, size_t *sizep, struct 
flow_vlan_hdr *vlan_hdrs)
     for (encaps = 0;
          eth_type_vlan(*eth_type) && encaps < FLOW_MAX_VLAN_HEADERS;
          encaps++) {
-        if (OVS_LIKELY(*sizep
-                       >= sizeof(struct qtag_prefix) + sizeof(ovs_be16))) {
-            const struct qtag_prefix *qp = data_pull(datap, sizep, sizeof *qp);
-            vlan_hdrs[encaps].tpid = qp->eth_type;
-            vlan_hdrs[encaps].tci = qp->tci | htons(VLAN_CFI);
+        if (OVS_LIKELY(*sizep >= sizeof(ovs_be32) + sizeof(ovs_be16))) {
+            const ovs_16aligned_be32 *qp = data_pull(datap, sizep, sizeof *qp);
+            vlan_hdrs[encaps].qtag = get_16aligned_be32(qp);
+            vlan_hdrs[encaps].tci |= htons(VLAN_CFI);
+
             eth_type = *datap;
         } else {
             return false;
diff --git a/lib/flow.h b/lib/flow.h
index 4882e26..4851a32 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -735,8 +735,8 @@ static inline uint16_t
 miniflow_get_vid(const struct miniflow *flow, size_t n)
 {
     if (n < FLOW_MAX_VLAN_HEADERS) {
-        uint32_t u32 = MINIFLOW_GET_U32(flow, vlan[n]);
-        return vlan_tci_to_vid(((struct flow_vlan_hdr *)&u32)->tci);
+        union flow_vlan_hdr hdr = { .qtag = MINIFLOW_GET_BE32(flow, vlan[n]) };
+        return vlan_tci_to_vid(hdr.tci);
     }
     return 0;
 }
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 56a6145..6838d7d 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -5560,8 +5560,7 @@ commit_vlan_action(const struct flow* flow, struct flow 
*base,
     for (base_n--, flow_n--;
          base_n >= 0 && flow_n >= 0;
          base_n--, flow_n--) {
-        if (memcmp(&base->vlan[base_n], &flow->vlan[flow_n],
-                   sizeof(struct flow_vlan_hdr))) {
+        if (base->vlan[base_n].qtag != flow->vlan[flow_n].qtag) {
             break;
         }
     }
@@ -5569,7 +5568,7 @@ commit_vlan_action(const struct flow* flow, struct flow 
*base,
     /* Pop all mismatching vlan of base, push thoses of flow */
     for (; base_n >= 0; base_n--) {
         nl_msg_put_flag(odp_actions, OVS_ACTION_ATTR_POP_VLAN);
-        memset(&wc->masks.vlan[base_n], 0xff, sizeof(wc->masks.vlan[base_n]));
+        wc->masks.vlan[base_n] = OVS_BE32_MAX;
     }
 
     for (; flow_n >= 0; flow_n--) {

It seems risky to fail to add a way for match_format() to print out
matches past the first VLAN.  It is likely to cause great confusion at
some point.

This code uses the term "shift" for what is usually termed "push".  A
"shift" can go in either direction.  I'd use "push".

I suggest that flow_{shift,push}_vlan() should take an optional 'wc'
argument like so many other functions that modify flows.  This would
make it more like the MPLS code and thus easier to understand.

I'm not confident about the change to handle_packet_upcall().

struct xvlan xvlan might be easier to use as just a flow_vlan_hdr.

The commit_vlan_action() function would be better if it was more like
commit_mpls_action(), to make it easier to verify correctness.  The
main difference is that at the datapath level VLAN only has push and
pop, not "set", but that's a straightforward change.  I'd introduce
functions to count and find common VLAN tags, like we have for MPLS.

I'll probably have further comments on later versions of this patch.

Thanks again for working on this!
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to