On Fri, Oct 7, 2011 at 1:03 PM, Pravin B Shelar <[email protected]> wrote:
> Almost all current actions can be expressed in the form of
> push/pop/set <field>, where field is one of the match fields. We can
> create three base actions and take a field. This has both a nice
> symmetry and avoids inconsistencies where we can match on the vlan
> TPID but not set it.
> Following patch converts all actions to this new format.
>
> Bug #7115

Since this touches datapath-protocol.h, which is really a kernel
header, this patch needs a signed-off-by line.

Also, since this won't compile without the follow on patch (and since
they have the same subject line as well), they really can't be
considered independent components and need to be combined or at least
divided differently.

> diff --git a/include/openvswitch/datapath-protocol.h 
> b/include/openvswitch/datapath-protocol.h
> index 6c89411..071f02a 100644
> --- a/include/openvswitch/datapath-protocol.h
> +++ b/include/openvswitch/datapath-protocol.h
> @@ -288,6 +288,7 @@ struct ovs_flow_stats {
>
>  enum ovs_key_type {
>        OVS_KEY_ATTR_UNSPEC,
> +       OVS_KEY_ATTR_NONE,

I'm not sure why we can't use UNSPEC instead of having to add NONE as well.

> +       OVS_ACTION_ATTR_OUTPUT,                 /* Output to switch port. */
> +       OVS_ACTION_ATTR_USERSPACE,              /* Send copy to userspace. */
> +       OVS_ACTION_ATTR_PUSH,                   /* push pakcet header. */
> +       OVS_ACTION_ATTR_POP,                    /* Pop packet headerheader. */
> +       OVS_ACTION_ATTR_SET,                    /* Set packet field(s). */
> +       OVS_ACTION_ATTR_SET_PRIORITY,           /* Set skb->priority. */
> +       OVS_ACTION_ATTR_RESTORE_PRIORITY,       /* Restore skb->priority. */
> +       OVS_ACTION_ATTR_SAMPLE,                 /* Sample action */
>  };

There are a number of typos/errors in the comments:
 * The description for PUSH should be captialized.
 * There's a typo in "pakcet" also with PUSH.
 * A typo with "headerheader" in POP.
 * The previous description for SAMPLE was much more descriptive.

> -#define OVS_ACTION_ATTR_MAX (__OVS_ACTION_ATTR_MAX - 1)

I'm not sure why you dropped this.

> + * OVS ACTIONS are expressed using enum ovs_action_type and enum 
> ovs_key_type.
> + * enum ovs_action_type is like base action and ovs_key_type specifies which
> + * field to act on. e.g. push/pop/set <field>.
> + * This has both a nice symmetry and avoids inconsistencies where we can 
> match
> + * on a key but could not set it.
> + */

This comment seems very out of place - I would just describe what is
going on and skip the editorializing.

> +#define OVS_ACT_TYPE(type, key_id)     ((type << 8) | key_id)

This compacted format never really thrilled me - it seems overly
aggressive about compacting information without much benefit and
seeing the code that uses it reinforces that belief.  Strictly
speaking, it doesn't actually lose information but it definitely makes
it more difficult to deal with and prevents effectively sharing code
with the match fields.  For example, PUSH and SET could have the
corresponding Netlink attribute as their data and POP would have just
the u16 type.

> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index af62bf0..b5b805d 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -702,48 +702,48 @@ dpif_netdev_validate_actions(const struct nlattr 
> *actions,
> -        case OVS_ACTION_ATTR_PUSH_VLAN:
> +        case OVS_ACT_TYPE(OVS_ACTION_ATTR_PUSH, OVS_KEY_ATTR_8021Q):
>             *mutates = true;
> -            if (nl_attr_get_be16(a) & htons(VLAN_CFI)) {
> +            q_key = nl_attr_get_unspec(a, sizeof(*q_key));
> +            if (!q_key || q_key->q_tci & htons(VLAN_CFI)) {

It's not necessary to check !q_key since it just returns an offset
from the point you gave it.

There are now a few other modifications where we support modifying
some fields in the key structure but not others.  For example, the
vlan TPID must currently be 0x8100 and the IPv4 protocol can't be set.

In addition, there are also quite a few other action types that can
now be expressed but are not supported such as modifications to IPv6
fields.  In this context, I'm not sure that there's any reason to
explicitly call out some unsupported actions rather than just a
catch-all.

Finally, one of Ben's patches removes the userspace action validation
on the assumption that you should be able to trust what it is
receiving.  These considerations are important for the kernel side of
things but it may not be worth updating the userspace validation.

> +dp_netdev_set_nw_addr(struct ofpbuf *packet, const struct flow *flow,
> +                      const struct ovs_key_ipv4 *ipv4_key)

Since this function can also set the ToS, I think it probably should be renamed.

> +static void
> +dp_netdev_set_tcp_port(struct ofpbuf *packet, const struct flow *flow,
> +                       const struct ovs_key_tcp *tcp_key)
> +{
> +    if (is_ip(packet, flow)) {

You need to verify that there's actually a TCP header present here.

> +dp_netdev_set_udp_port(struct ofpbuf *packet, const struct flow *flow,
> +                       const struct ovs_key_udp *udp_key)
>  {
> +    if (is_ip(packet, flow)) {
> +        struct udp_header *uh = packet->l4;

And need to check for a UDP header here.

> diff --git a/lib/dpif.c b/lib/dpif.c
> index c6940b1..7b99ba1 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -958,8 +958,15 @@ dpif_execute(struct dpif *dpif,
>     if (!(error ? VLOG_DROP_WARN(&error_rl) : VLOG_DROP_DBG(&dpmsg_rl))) {
>         struct ds ds = DS_EMPTY_INITIALIZER;
>         char *packet = ofp_packet_to_string(buf->data, buf->size, buf->size);
> +        struct flow flow;
> +
>         ds_put_format(&ds, "%s: execute ", dpif_name(dpif));
> -        format_odp_actions(&ds, actions, actions_len);
> +        if ((error = odp_flow_key_to_flow(key, key_len, &flow))) {
> +            ds_put_cstr(&ds, "invalid key");

I would put this in parentheses or otherwise mark it off as an error.

> +            return error;
> +        }

If this fails, it would be nice to still print out the contents of the
packet (perhaps even more important in this situation).

This also leaks both ds and packet on error.

> @@ -1188,8 +1195,13 @@ log_flow_message(const struct dpif *dpif, int error, 
> const char *operation,
>         dpif_flow_stats_format(stats, &ds);
>     }
>     if (actions || actions_len) {
> +        struct flow flow;
>         ds_put_cstr(&ds, ", actions:");
> -        format_odp_actions(&ds, actions, actions_len);
> +        if (odp_flow_key_to_flow(key, key_len, &flow)) {
> +            ds_put_cstr(&ds, "invalid key");
> +            return;

This leaks ds as well on error.

> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index 7f5158f..d537d09 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
>  int
>  odp_action_len(uint16_t type)
>  {
> +    switch (type) {
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_OUTPUT, OVS_KEY_ATTR_NONE):
> +        return sizeof(uint32_t);
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_USERSPACE, OVS_KEY_ATTR_NONE):
> +        return sizeof(uint64_t);
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_PUSH, OVS_KEY_ATTR_8021Q):
> +        return sizeof(struct ovs_key_8021q);
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_POP, OVS_KEY_ATTR_8021Q):
> +        return 0;
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_ETHERNET):
> +        return sizeof(struct ovs_key_ethernet);
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_IPV4):
> +        return sizeof(struct ovs_key_ipv4);
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_TCP):
> +        return sizeof(struct ovs_key_tcp);
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_UDP):
> +        return sizeof(struct ovs_key_udp);
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_TUN_ID):
> +        return sizeof(ovs_be64);
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_SET_PRIORITY, OVS_KEY_ATTR_NONE):
> +        return sizeof(uint32_t);
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_RESTORE_PRIORITY, OVS_KEY_ATTR_NONE):
> +        return 0;
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_SAMPLE, OVS_KEY_ATTR_NONE):
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_UNSPEC, OVS_KEY_ATTR_NONE):

Seeing as these are now all match types, it would be nice if we could
just one list of the lengths for both of them, at least for the
non-exceptions.

> @@ -91,7 +98,8 @@ format_generic_odp_action(struct ds *ds, const struct 
> nlattr *a)
>  }
>
>  static void
> -format_odp_sample_action(struct ds *ds, const struct nlattr *attr)
> +format_odp_sample_action(struct ds *ds, const struct nlattr *attr,
> +                         struct flow *flow)
>  {
>     static const struct nl_policy ovs_sample_policy[] = {
>         [OVS_SAMPLE_ATTR_PROBABILITY] = { .type = NL_A_U32 },
> @@ -117,20 +125,20 @@ format_odp_sample_action(struct ds *ds, const struct 
> nlattr *attr)
>     ds_put_cstr(ds, "actions(");
>     nla_acts = nl_attr_get(a[OVS_SAMPLE_ATTR_ACTIONS]);
>     len = nl_attr_get_size(a[OVS_SAMPLE_ATTR_ACTIONS]);
> -    format_odp_actions(ds, nla_acts, len);
> +    format_odp_actions__(ds, nla_acts, len, flow);
>     ds_put_format(ds, "))");

Since we are now comparing actions to their current values, it's
possible for this to misrepresent the actions that are applied when
sampling is used.  Since it always parses all of the actions, it
effectively treats all sampling actions as having 100% probability.

Consider this set of actions:
SAMPLE(50%, set_nw_source(X)), set_nw_source(X), output

In this case, all packets will have their source set to X and the
SAMPLE action is irrelevant.  However, this will actually print out:
SAMPLE(50%, set_nw_source(X)), set_nw(error), output

Now, this is not particularly useful in practice.  However, since this
is primarily used for debugging it's pretty important that it reflect
what is actually going on, even if that doesn't make sense.

> +static void
> +format_odp_action(struct ds *ds, const struct nlattr *a,
> +                  struct flow *flow)
>  {
> +    case OVS_ACT_TYPE(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_TUN_ID):
>         ds_put_format(ds, "set_tunnel(%#"PRIx64")",
>                       ntohll(nl_attr_get_be64(a)));

This isn't consistent with the others in that it doesn't compare the
existing value to the new one.

That being said, I'm not sure that these comparisons are really
necessary at all.  As I said before, this is primarily a debugging
facility so it seems best to just print out what is actually being
done and not try to do any kind of interpretation.  For one thing, it
misses any of the fields which should not be set (and if they are, it
would indicate a bug).

> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 9e8b1fb..62f4bc6 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> +commit_vlan_action(struct action_xlate_ctx *ctx, ovs_be16 new_tci)
[...]
> +        type = OVS_ACT_TYPE(OVS_ACTION_ATTR_PUSH, OVS_KEY_ATTR_8021Q);
> +        q_key.q_tpid = 0;

TPID should be htons(ETH_TYPE_VLAN), not 0.

> +        q_key.q_tci = new_tci & ~htons(VLAN_CFI);

I wonder if we should keep the CFI bit for consistency with the match structure.

> +static void
> +commit_set_port_action(const struct flow *flow, struct flow *base,
> +                       struct ofpbuf *odp_actions)
[...]
> +    if (flow->nw_proto == IPPROTO_TCP) {
> +        struct ovs_key_tcp port_key;

I think we want to go further and check that there is a valid TCP/UDP
header here (by validating that the L4 fields in the flow aren't all
zero).  Then we can enforce this in the kernel at validate time and
avoid run-time packet length checks.
[...]
> +   } else {
> +        VLOG_WARN_RL(&rl, "Unknow protocol : %"PRIu16, flow->nw_proto);
> +   }

This is legal in OpenFlow (and can legitimately happen with wildcarded
flows) so I think we should let it go silently.  Regardless, you also
have a typo in "Unknow".

> +static void
> +commit_set_nw_action(const struct flow *flow, struct flow *base,
> +                     struct ofpbuf *odp_actions)
> +{
> +    struct ovs_key_ipv4 ipv4_key;
> +    uint16_t type;
> +
> +    if (base->nw_src == flow->nw_src &&
> +        base->nw_dst == flow->nw_dst &&
> +        base->nw_tos == flow->nw_tos) {
> +        return ;

There's a extra space after return.

Shouldn't we also be checking that this is IP (and also that the IP
header is present, as I mentioned before)?

> +static void
> +commit_odp_actions(struct action_xlate_ctx *ctx)
> +{
> +    const struct flow *flow = &ctx->flow;
> +    struct flow *base = &ctx->base_flow;
> +    struct ofpbuf *odp_actions = ctx->odp_actions;
> +
> +    commit_set_tun_id_action(flow, base, odp_actions);
> +    commit_set_nw_action(flow, base, odp_actions);
> +    commit_set_port_action(flow, base, odp_actions);
> +    commit_set_ether_addr_action(flow, base, odp_actions);
> +    commit_priority_action(ctx);
> +    commit_vlan_action(ctx, flow->vlan_tci);

It doesn't really matter but I would probably order these in some
logical form like layers: tunnel, Ethernet, vlan, IP, port, etc.

> @@ -4643,7 +4734,8 @@ ofproto_unixctl_trace(struct unixctl_conn *conn, const 
> char *args_,
>         ds_put_char(&result, '\n');
>         trace_format_flow(&result, 0, "Final flow", &trace);
>         ds_put_cstr(&result, "Datapath actions: ");
> -        format_odp_actions(&result, odp_actions->data, odp_actions->size);
> +        format_odp_actions(&result, odp_actions->data, odp_actions->size,
> +                           &trace.ctx.base_flow);

Isn't base_flow going to be updated as we generate output actions so
the printed result will be not the same as if we compare it to the
original packet flow?

> diff --git a/utilities/ovs-dpctl.c b/utilities/ovs-dpctl.c
> index 4d0d3c2..aab0695 100644
> --- a/utilities/ovs-dpctl.c
> +++ b/utilities/ovs-dpctl.c
> @@ -569,12 +570,19 @@ do_dump_flows(int argc OVS_UNUSED, char *argv[])
>     dpif_flow_dump_start(&dump, dpif);
>     while (dpif_flow_dump_next(&dump, &key, &key_len,
>                                &actions, &actions_len, &stats)) {
> +        struct flow flow;
> +
>         ds_clear(&ds);
>         odp_flow_key_format(key, key_len, &ds);
>         ds_put_cstr(&ds, ", ");
>         dpif_flow_stats_format(stats, &ds);
>         ds_put_cstr(&ds, ", actions:");
> -        format_odp_actions(&ds, actions, actions_len);
> +        if (odp_flow_key_to_flow(key, key_len, &flow)) {
> +            ds_put_cstr(&ds, "invalid key");
> +            printf("%s\n", ds_cstr(&ds));
> +            continue;
> +        }
> +        format_odp_actions(&ds, actions, actions_len, &flow);
>         printf("%s\n", ds_cstr(&ds));

I think if you restructured this, you could remove some duplicate code
(like the extra printf).

One of the things that concerns me the most is that certain actions
can't be expressed when there is a sample action due to the ambiguity
of what the flow state is after that.  I think the only way to deal
with that is to convert the sample action to behave more like a
subroutine instead of an inline if statement.  This would mean that it
restores the packet back to its original state at the end of the
action list and as a result userspace would always know the exact
state of the flow at any given time.  I think this is a fairly small
change and doesn't affect sFlow at all since it doesn't modify the
packet.

There are some things that it will no longer be possible to express
using the sample action but if they are necessary in the future we can
add an "else" clause to the sample action, which will restore the full
expressibility plus more.  It's not necessary to do that now though.
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev

Reply via email to